A Brief Introduction to Infectious Disease Modelling: Simulating a Hypothetical COVID-19 Outbreak

By Enrique M. Saldarriaga


The COVID-19 pandemic has boosted interest in mathematical models of infectious diseases. In this entry, I will briefly introduce some of these models and provide R code to simulate an outbreak of COVID-19.

These models synthesize multiple sources of information into equations that aim to describe the evolution of a disease and make predictions. When used correctly, they can be incredibly powerful tools: they can explain a very chaotic and complex reality, evaluate policy options to inform decision-making, and reveal hidden mechanisms that drive an epidemic, among other uses.

Infectious diseases do not occur in isolation in each person. They are transmitted through contact with a pathogen. Thus, we need to understand how a susceptible person comes to establish effective contact (i.e. contact that results in transmission; sexually transmitted diseases are a good example) with someone who is infected with that pathogen. At the population level, disease prevalence is a risk factor for incidence: the higher the proportion of people living with a disease, the higher the likelihood that an infected person comes into contact with a susceptible person. This relationship between incidence and prevalence can be characterized using dynamic models. Here, the probability of getting infected depends on the probability of contact with an infectious person (or animal, in the case of vector-borne diseases like malaria), which is given by the prevalence. A contact that results in an infection is called a susceptible-infected effective contact.

Infectious and non-communicable disease models have substantial similarities: both can be compartmental or agent-based (microsimulation), and deterministic (fixed transition probabilities) or stochastic (transition probabilities are random draws from a specified distribution). In either case, the choice of model is determined by the scope and purpose of the analysis and, often, by the target audience for the results.
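To make the deterministic/stochastic distinction concrete, here is a minimal Python sketch (the code accompanying this post is in R; Python is used here only for illustration). The cohort size, the 0.05 recovery probability, and the Beta(5, 95) distribution are hypothetical values chosen for the example, not parameters from any study.

```python
import random

FIXED_RECOVERY_PROBABILITY = 0.05  # hypothetical value, for illustration only


def deterministic_probability():
    """Deterministic model: the transition probability is the same in every run."""
    return FIXED_RECOVERY_PROBABILITY


def stochastic_probability(rng, alpha=5.0, beta_param=95.0):
    """Stochastic model: the transition probability is drawn from a distribution.

    Beta(5, 95) is an assumed distribution with mean 5 / (5 + 95) = 0.05.
    """
    return rng.betavariate(alpha, beta_param)


rng = random.Random(42)  # fixed seed so the stochastic runs are reproducible
cohort = 1000            # hypothetical number of infectious people

# Expected number of people leaving the compartment in each of three runs.
deterministic_runs = [cohort * deterministic_probability() for _ in range(3)]
stochastic_runs = [cohort * stochastic_probability(rng) for _ in range(3)]

print(deterministic_runs)  # identical values in every run
print(stochastic_runs)     # values scattered around the deterministic one
```

Repeating the stochastic version many times gives a distribution of outcomes rather than a single trajectory, which is what allows uncertainty in the results to be reported.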

In the following section I will describe compartmental, deterministic, closed-cohort models. In a closed-cohort model, we assume no births or deaths, so the population remains constant over time.

Model Types

The Susceptible-Infectious (SI) Model. This is the most basic infectious disease model. It is characterized by two state variables or compartments: Susceptible (S) and Infectious (I). Here we model a single transition, from S to I, and once all susceptibles are infected, the epidemic is over (there are no deaths in this model). The transition is driven by the transmission coefficient, often called the force of infection in the literature. This is a very important concept because, regardless of model type, this parameter determines the rate at which people get infected. It is usually denoted by lambda, λ, and it is the product of the infectivity or probability of transmission per contact (ρ), the contact rate over a given period (c), and the prevalence of infection (I/N, where N is the total population): λ = c * ρ * I/N. At any point in time, and for all model options, susceptibles leave the S compartment at rate λ, so the number of susceptibles decreases by λ * S per unit of time.
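The SI dynamics can be sketched in a few lines of Python (an illustration only; the post's own code is in R). The function names, the parameter values (10 contacts per day, a 5% transmission probability per contact, a population of 1,000), and the forward-Euler integration with a small time step are all assumptions made for this example.

```python
def force_of_infection(c, rho, infectious, population):
    """lambda = c * rho * I / N, as defined in the text."""
    return c * rho * infectious / population


def simulate_si(c, rho, population, initial_infectious, days, dt=0.01):
    """Simulate a closed-cohort SI model with small forward-Euler steps.

    Returns a list of (day, susceptible, infectious) tuples, one per day.
    """
    s = float(population - initial_infectious)
    i = float(initial_infectious)
    steps_per_day = round(1 / dt)
    history = [(0, s, i)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            lam = force_of_infection(c, rho, i, population)
            new_infections = lam * s * dt  # people moving from S to I
            s -= new_infections
            i += new_infections
        history.append((day, s, i))
    return history


# Hypothetical parameters, chosen only to show the shape of the epidemic.
si_history = simulate_si(c=10, rho=0.05, population=1000,
                         initial_infectious=1, days=60)
for day, s, i in si_history[::10]:
    print(f"day {day:2d}: S = {s:7.1f}, I = {i:7.1f}")
```

Because there is no recovery, the infectious count can only grow, and it approaches the whole population as the susceptible pool empties.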

The Susceptible-Infectious-Recovered (SIR) Model. In addition to susceptible and infectious, the SIR model includes the recovered (R) compartment. R includes people who were infected and overcame the disease. The rate of transition from I to R is given by the inverse of the disease duration, also known as the recovery rate (γ). Some diseases confer immunity after infection (e.g. measles), but others do not. For the latter, a SIRS (susceptible-infectious-recovered-susceptible) model is more appropriate, because it allows those who don’t develop immunity to transition back to susceptible.
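The difference between SIR and SIRS can be sketched in Python as follows (for illustration only; this is not the R code that accompanies the post). The `waning` parameter, the rate at which recovered people return to the susceptible compartment, is a name introduced here and does not appear in the text; setting it to zero gives the plain SIR model. All numeric values are hypothetical.

```python
def simulate_sir(beta, gamma, population, initial_infectious, days,
                 waning=0.0, dt=0.01):
    """Simulate SIR (waning == 0) or SIRS (waning > 0) with forward-Euler steps.

    beta   : c * rho, contact rate times transmission probability per contact
    gamma  : recovery rate, the inverse of the disease duration
    waning : assumed rate at which recovered people become susceptible again
    Returns a list of (day, S, I, R) tuples, one per day.
    """
    s = float(population - initial_infectious)
    i = float(initial_infectious)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            lam = beta * i / population
            infections = lam * s * dt
            recoveries = gamma * i * dt
            lost_immunity = waning * r * dt
            s += lost_immunity - infections
            i += infections - recoveries
            r += recoveries - lost_immunity
        history.append((day, s, i, r))
    return history


# Hypothetical parameters: a 10-day disease duration and, for SIRS,
# immunity lasting 100 days on average.
sir = simulate_sir(beta=0.5, gamma=0.1, population=1000,
                   initial_infectious=1, days=200)
sirs = simulate_sir(beta=0.5, gamma=0.1, population=1000,
                    initial_infectious=1, days=200, waning=0.01)
print(f"SIR  infectious on day 200: {sir[-1][2]:.4f}")
print(f"SIRS infectious on day 200: {sirs[-1][2]:.4f}")
```

In the SIR run the epidemic burns out once the susceptible pool is depleted, whereas in the SIRS run returning susceptibles keep the infection circulating.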

The Susceptible-Exposed-Infectious-Recovered (SEIR) Model. This model adds an exposed (E) compartment. The exposed are all persons who have been infected but are not yet symptomatic and, more importantly, not yet infectious. Infectious persons are the only ones capable of spreading the disease; hence, an accurate count of them is very important. In a SEIR model, the transition from S to E is given by lambda (λ), and the transition from E to I is given by σ, the inverse of the latency or incubation period.


COVID-19 Outbreak Example

I am going to simulate a COVID-19 outbreak using a SEIR model, depicted in the figure below. All parameters have been obtained from the MIDAS Network repository – an excellent and publicly available compilation of COVID-19 parameters.

Let’s model the transitions between compartments over an increment of one time step:
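Written out from the definitions of λ, σ, and γ given earlier, and assuming a time step of one unit, the transitions of the SEIR model take the following form. This is a reconstruction from the text and may differ in notation from the original figure.

```latex
\begin{aligned}
\lambda_t &= c \,\rho\, \frac{I_t}{N} \\
S_{t+1} &= S_t - \lambda_t S_t \\
E_{t+1} &= E_t + \lambda_t S_t - \sigma E_t \\
I_{t+1} &= I_t + \sigma E_t - \gamma I_t \\
R_{t+1} &= R_t + \gamma I_t
\end{aligned}
```

Every term that leaves one compartment enters the next, so S + E + I + R = N at every time step, as required in a closed cohort.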

Letting the time step shrink to zero, we obtain the rate of change of every compartment at any point in time:
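In continuous time, and using the same definitions of λ, σ, and γ as in the model descriptions, the SEIR system can be written as the following ordinary differential equations. This is a reconstruction from the text and may differ in notation from the original figure.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\lambda S \\
\frac{dE}{dt} &= \lambda S - \sigma E \\
\frac{dI}{dt} &= \sigma E - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
\qquad \text{with } \lambda = c \,\rho\, \frac{I}{N}
```

The four right-hand sides sum to zero, which confirms that the total population N stays constant.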


With this in mind, let’s go to the R-code to see how to implement the simulation.
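For readers who prefer Python, a minimal sketch of the same kind of simulation is given below. It is not the R code referred to above. It assumes that the single initial case starts in the infectious compartment, and it integrates the model with small forward-Euler steps rather than a dedicated ODE solver, so the exact peak values may differ slightly from those reported in the next section. The function name `simulate_seir` is introduced here for illustration.

```python
def simulate_seir(beta, sigma, gamma, population, initial_infectious,
                  days, dt=0.01):
    """Simulate a closed-cohort SEIR model with forward-Euler steps.

    beta  : c * rho, contact rate times transmission probability per contact
    sigma : inverse of the latency (incubation) period
    gamma : recovery rate, the inverse of the disease duration
    Returns a list of (day, S, E, I, R) tuples, one per day.
    """
    s = float(population - initial_infectious)
    e = 0.0
    i = float(initial_infectious)  # assumption: the first case is infectious
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(0, s, e, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            lam = beta * i / population
            new_exposed = lam * s * dt       # S -> E
            new_infectious = sigma * e * dt  # E -> I
            new_recovered = gamma * i * dt   # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((day, s, e, i, r))
    return history


history = simulate_seir(beta=1.5, sigma=1 / 4.2, gamma=1 / 20,
                        population=1_000_000, initial_infectious=1, days=365)
peak_day, _, _, peak_infectious, _ = max(history, key=lambda row: row[3])
print(f"Peak: {peak_infectious:,.0f} infectious on day {peak_day}")
```

The daily rows in `history` can be passed to any plotting library to draw the epidemic curve for each compartment.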

COVID-19 Example Results

We model an outbreak for 1 year, using the following parameters: c * ρ = 1.5, σ = 1/4.2, and γ = 1/20 (with time measured in days), for a population of 1 million in which 1 person was already infected. The following image describes the outbreak.


We can see a very steep increase in the number of infected, which peaks at 625,095 infections on the 37th day of the outbreak. As is often pointed out, such a rapid increase in cases can overload health systems, leaving many people unable to access care.

How can we flatten the curve? One intervention to contain the COVID-19 pandemic was to increase the physical distance between people. The objective was to reduce the probability of an effective susceptible-infected contact. In modelling terms, this directly reduces c * ρ and therefore λ.

The following image shows the results of reducing c * ρ to 0.6 instead of 1.5.


The peak of infection occurs later, on day 65, and at a lower count: 550,446. This is an example of how effective behavioral changes can be in reducing the severity of an outbreak.
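The comparison between the two scenarios can be sketched in Python as follows (an illustration, not the post's R code). The helper `seir_peak` is a name introduced here. It assumes the initial case starts as infectious and uses forward-Euler steps, so its output should be close to, but not necessarily identical to, the figures quoted above.

```python
def seir_peak(beta, sigma=1 / 4.2, gamma=1 / 20, population=1_000_000,
              initial_infectious=1, days=365, dt=0.01):
    """Return (day, count) of the highest end-of-day infectious count."""
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    steps_per_day = round(1 / dt)
    best_day, best_count = 0, i
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_exposed = beta * i / population * s * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        if i > best_count:
            best_day, best_count = day, i
    return best_day, best_count


baseline = seir_peak(beta=1.5)    # c * rho before the intervention
distancing = seir_peak(beta=0.6)  # c * rho after physical distancing
for label, (day, count) in [("c*rho = 1.5", baseline),
                            ("c*rho = 0.6", distancing)]:
    print(f"{label}: peak of {count:,.0f} infectious on day {day}")
```

Changing `sigma` or `gamma` in the same way shows how the other parameters shift the timing and height of the peak.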

In this example we changed only one parameter. But one thing that amazes me about infectious disease modelling is that (almost) every parameter driving the outbreak is susceptible to change given the right intervention. You can now use the R-code to see how variations in other parameters affect the outbreak and think about what kinds of interventions might produce such changes.

Suggested Readings

Vynnycky, E. & White, R. G. An introduction to infectious disease modelling. (Oxford University Press, 2010).

Garnett, G. An introduction to mathematical models in sexually transmitted disease epidemiology. Sex Transm Infect 78, 7–12 (2002).

Kretzschmar, M. Disease modeling for public health: added value, challenges, and institutional constraints. J Public Health Pol 41, 39–51 (2020).