# Simulating the Spread of a Disease Using a Computer

The rapid spread of COVID-19 around the globe has caused a great deal of worry, and for good reason. At-risk segments of the population, including the elderly and those with preexisting health issues, face life-threatening dangers as the disease spreads. Although younger and healthier people tend to fare far better with COVID-19, the fact that they can carry the disease so easily to others who are much more impacted by it have led universities, entertainment venues, and even an entire country to abandon business as usual in an abundance of caution. Why do diseases such as COVID-19 spread, and why do they do so with such virulence? Computer simulations can give us some valuable insights.

The SIR (Susceptible-Infected-Recovered) model provides a common and effective approach to studying the spread of a disease. SIR divides a population into at least three categories: those susceptible to the disease, those infected by the disease, and those who have recovered from it. Some percentage of those who have “recovered” may have done so by passing on to the next life, a rather macabre twist on the notion of recovery, but one that gives rise to a fourth population, the deceased. Those who are susceptible to the disease can transition to the infected camp by coming in contact with an infected person, but not all such interactions will pass the sickness to the susceptible person. Rather, some fraction – call it alpha – of those who encounter an infected person will contract the disease. With this, alpha represents the rate of infection and is an important measure of how easily a disease spreads. Once infected, we will assume that person will either become part of the recovered group – those permanently immune to the disease – or to the deceased group. Suppose the rate of recovery is defined as beta, and that the death rate of those who have “recovered” from the disease is denoted as gamma.

We’ve seen a lot of definitions so far, including a few Greek letters thrown in just to complicate things. But we’re getting close to being able to use all these things. Suppose we are dealing with a population of 50,000 people, 10 of whom are already part of the “I” group (infected), and 49,990 of which are currently part of the “S” group (susceptible). No one is part of the the R (recovered) or D (deceased) groups yet, so their initial populations are 0. A key step in modeling the various segments of this population is to identify how and why the sizes of the S, I, R, and D groups will change over time. In other words, we need to identify and quantify those forces that will make S, I, R, and D increase, as well as those forces that will make them decrease. Let’s consider each of these, in turn:

S: The number of susceptible people can only decrease, because some of them will become infected. How many of them become infected by an interaction with an infected person is controlled by the rate alpha. Once infected and recovered, we assume a person is immune and cannot become susceptible again, so there is nothing that will make S increase.

I: The number of infected people will increase as a result of susceptible people coming into contact with a currently infected person, although not all such interactions will result in a new infection. The number of infected people will decrease as a result of an infected person recovering, which they will do at the beta rate. Although some of these recovered people will die (the fraction gamma), such unfortunate cases still result in a decrease in the number of infected people, as, to put it bluntly and obviously, someone who has died is no longer infected.

R: An infected person can recover and will do so at the rate beta, and this results in an increase in the number of people who have recovered. Unfortunately, some of these people – the fraction gamma – who have recovered will pass, and these people result in a decrease in the size of the R population.

D: The number of deceased can only grow over time as a percentage of people who have “recovered” from the infection. Since a person cannot come back to life, no factors can result in an increase in the D population.

Let’s get back to some math. The rate of change of a quantity that changes over time can always be expressed as

rate_of_change_of_quantity =
changes_that_result_in_an_increase – changes_that_result_in_a_decrease

We have four quantities – the populations S, I, R, and D – and a lot of words that describe their rates of change in terms of the things that make them increase and the factors that make them decrease. Let’s reduce these words to the following differential equations. As you read these equations, compare them with the verbal explanations above and see if they make sense. For example, the rate of change of S is due entirely to a decrease: the number of susceptible people will decrease because they will become infected. How quickly? It depends on the number of people who are susceptible and the number of people who are already infected, because they will mingle with each other, and some fraction alpha of those interactions will actually pass the disease. That is what the first of these equations represent. See if you can interpret the other equations in an identical manner.

rate_of_change_of_S = – alpha * S * I

rate_of_change_of_I = alpha * S * I – beta * I

rate_of_change_of_R = beta * I – beta * gamma * I

rate_of_change_of_D = beta*gamma*I

So now we have equations – differential equations – that describe how slowly or quickly the sizes of the various populations change. What do we do with them? Differential equations like these are useful because they help us learn the values of the various quantities – in this case, population sizes – as time moves on. To solve a differential equation means to consider these kinds of rates of change, together with the initial values of the quantities before they start changing, to calculate the actual values of those quantities as time advances. Undergraduate math, physics, and engineering majors typically take a course in differential equations, but they learn how to solve only certain categories of rather specific and relatively simple differential equations. More complicated differential equations require numerical techniques to solve them. That’s where computers come in especially handy.

Computers solve differential equations using a technique called numerical integration. There are many ways to perform numerical integration. The simplest way is to use a technique called Forward Euler. The Forward Euler approach calculates the next value of a quantity – in other words, its value at the next moment in time – from its current value and its rate of change over the time that elapsed. In other words:

the_next_value = the_current_value + the_rate_of_change * time_gone_by

For example, since we know the rate of change of S, the susceptible population, we can compute how S changes over time using Forward Euler like so:

the_next_value_of_S = the_current_value_of_S + the_rate_of_change_of_S *
time_gone_by

But we know that “the_rate_of_change_of_S” is – alpha * the_current_value_of_S * the_current_value_of_I. So, plug this into the equation above, and don’t forget that leading minus sign.

the_next_value_of_S =
the_current_value_of_S – alpha * the_current_value_of_S *
the_current_value_of_I

We can apply exactly the same approach to I, R, and D, the other three populations.

the_next_value_of_I =
the_current_value_of_I + alpha * the current_value_of_S *
the_current_value_of_I – beta * the current_value_of_I

the_next_value_of_R =
the_current_value_of_R + beta * I – beta * gamma * I

the next_value_of_D =
the_current_value_of_D + beta * gamma * I

We then have a system of four differential equations to solve. Assuming we knew the initial values of S, I, R, and D (and we do – S = 49,990, I = 10, R = 0, and D = 0 at the start), we could use these four equations to predict how S, I, R, and D vary over time. We could then determine if a community would be able to weather the spread of the disease as it worked its way through the population.

The good news is that this approach is rather easy to computerize. Although Forward Euler is very susceptible to the size of the time step you use and can go unstable and grossly inaccurate if you use too large a time step, it is particularly easy to program. Here, for example, is Python code that computes the four populations over time and then plots them.

``````# Ray Klump
import matplotlib.pyplot as plt
alpha = 0.000005
beta = 0.025
chi = 0.1
t = 
s0 = 49990
i0 = 10
r0 = 0
d0 = 0
sus = [s0]
inf = [i0]
rec = [r0]
for i in range(1,100):
t.append(i)
sus.append(sus[i-1] - alpha * sus[i-1]*inf[i-1])
inf.append(inf[i-1] + alpha*sus[i-1]*inf[i-1] - beta*(1-chi)*inf[i-1])
rec.append(rec[i-1] + beta*(1-chi)*inf[i-1])
plt.plot(t,sus,"r")
plt.plot(t,inf,"k")
plt.plot(t,rec,"g")
plt.xlabel("Day")
plt.ylabel("Number of people")
plt.title("Populations")
plt.show()
``````

Running this code will produce the following plot of the four populations over time.

You can simulate various strategies, such as quarantining, deploying masks, or just doing business-as-usual, by adjusting the parameters alpha, beta, and gamma. For example, if the rate of infection (alpha) doubles, the spread occurs more frequently, with the disease firmly taking hold in the population by day 20, and more than 42,000 contract the disease, up from about 35,000 with the original infection rate. Plots like these could demonstrate to a skeptical public the benefits to be had by canceling public gatherings and other events that could enhance the disease’s ability to spread.

Of course, there is guesswork involved in adjusting the parameters in ways that accurately capture the effects of these strategies. Still, simulations like these can help you understand how a disease like COVID-19 spreads, why it poses such significant risks, and which mitigating measures might be most effective in stopping it. 1. 