an R package for deterministic compartmental models with flexible dwell time distributions
Authors: Thinh Ong, Anh Phan, Lam Ha, Marc Choisy
Presenter: Anh Phan
For mathematical simplicity, compartmental models are typically formulated as a system of ordinary differential equations (ODE) , where the rate of transitioning from one compartment to another is either:
dependent on the simulation time (force of infection, seasonal forcing, etc.)
or constant
Consider the classical SIR model
\[ \begin{cases} \frac{dS}{dt} = - \frac{\beta I}{N}S \\ \frac{dI}{dt} = \frac{\beta I}{N}S - \gamma I\\ \frac{dR}{dt} = \gamma I \\ \end{cases} \]
When \(\gamma = 1/7\), it implies an exponentially distributed infectious period where rate = 1/7
When \(\gamma = 1/7\), it implies an exponentially distributed infectious period where rate = 1/7
While a more realistic distribution looks like the following
The distribution shape of the infectious period can greatly affect the model’s trajectory (Wearing, Rohani, and Keeling 2005; Keeling and Grenfell 1998; Kenah and Miller 2011).
Wrong assumptions for distribution of incubation/infectious period may lead to inaccurate estimations for important measures (e.g., disease persistency, basic reproductive ratio, etc.)
Several methods to relax the exponential distribution assumption were proposed.
The simplest one being Linear Chain Trick (LCT) to model Erlang distributed infectious period. The general idea being:
Infectious period (\(I\) compartment) is divided into sub-compartments, the number of which is the shape parameter of the Erlang distribution.
The rate of transitioning from one sub-compartment to another is the rate parameter of the Erlang distribution.
For example: ODE formulation for (\(T_I \sim E_{3}(0.2)\))
\[ \begin{cases} dS = -\beta \frac{I_1 + I_2 + I_3}{N} S \\ dI_1 = \beta \frac{I_1 + I_2 + I_3}{N} S - 0.2I_1 \\ dI_2 = 0.2I_1 - 0.2I_2 \\ dI_3 = 0.2I_2 - 0.2I_3 \\ dR = 0.2I_3 \\ \end{cases} \]
However:
Formulation can get complex as shape parameter grows
There are still many instances when this does not provide a great fit to the data
For arbitrarily distributed infectious period, formulation for compartmental model involves a system of delay integro-differential equation (DIDE).
Example:
\[ \begin{cases} \frac{dS(t)}{dt} = - N_{S \rightarrow I}(t) \\ \frac{dI(t)}{dt} = N_{S \rightarrow I}(t) - \int_0^{t} f(\tau) N_{S \rightarrow I}(t - \tau) d \tau \\ \frac{dR(t)}{dt} = \int_0^{t} f(\tau) N_{S \rightarrow I}(t - \tau) d \tau \\ \end{cases} \]
Where:
\(N_{S \rightarrow I}(t)\) is the number of individuals going from \(S\) to \(I\) at time \(t\), typically defined as \(N_{S \rightarrow I}(t) = \beta \frac{I(t)}{N}S(t)\)
\(f(\tau)\) is the probability distribution of infectious period
However, there is no solver for delay integro-differential equations
To manually implement a solver would lead to:
Time consuming process
Complex code
Our work denim
:
Propose a numerical approach to simulate compartmental model with arbitrarily distributed dwell-time
Provide an R package that implements the proposed approach
The central idea of our algorithm is to divide a compartment into a chain of sub-compartments, from which the out-going population is computed.
To demonstrate this idea, consider an SIR model where the time spent in \(I\) compartment follows a discrete distribution \(P_I = \{p_1, p_2, ..., p_n\}\)
We would divide \(I\) into \(n\) sub-compartments denoted \(I_k\), \(k \in \{1, 2, .., n\}\) where \(I_k\) represents individuals that have stayed in \(I\) for \(k\) time steps.
During each time step, a proportion \(q_k\) of sub-compartment \(I_k\) will transition to \(R\) while the remaining \(1 - q_k\) will transition to sub-compartment \(I_{k+1}\)
Our goal is to derive a formulation for \(q_k\) such that the proportion of individuals in \(I\) leaving from each sub-compartment \(I_k\) follows the distribution \(P_I\).
Our goal is to derive a formulation for \(q_k\) such that the proportion of individuals in \(I\) leaving from each sub-compartment \(I_k\) follows the distribution \(P_I\).
The formulation for \(q_k\) that satisfies said requirement is as followed
\[ q_k = \frac{p_k}{1 - \sum_{j=1}^{k-1} p_j} \]
We can also derive \(q_k\) from a survival analysis perspective.
Consider a survival analysis problem where the event of interest is “transitioning from \(I\) to \(R\)”
\(p_k = \int_{(k-1) \Delta t}^{ k \Delta t } f(\tau) d \tau\) is the discrete estimation of the probability distribution
\(I_k\) represents the population who have “survived” till time point \(k \Delta t\)
\(q_k\) - the instantaneous probability the event happen, conditioning on survival till time step \(k\) is thus the hazard rate, usually given by \(h(\tau) = \frac{f(\tau)}{S(\tau)} = \frac{f(\tau)}{1 -\int_0^{\tau} f(\tau)}\)
The algorithm can also be intepretted as the Euler method to solve the following system of delay integro-differential equation
This is a reformulation of the equation from the Background section, written in terms of \(h(\tau)\) instead of \(f(\tau)\)
\[ \begin{cases} \frac{dS(t)}{dt} = - N_{S \rightarrow I}(t) \\ \frac{dI(t)}{dt} = N_{S \rightarrow I}(t) - \int_0^{\mathcal{T}} h(\tau) I(t, \tau) d \tau \\ \frac{dR(t)}{dt} = \int_0^{\mathcal{T}} h(\tau) I(t, \tau) d \tau \\ \end{cases} \]
Where:
\(N_{S \rightarrow I}(t)\) is the number of individuals going from \(S\) to \(I\) at time \(t\), \(N_{S \rightarrow I}(t) = \beta \frac{I(t)}{N}S(t)\).
\(I(t, \tau)\) is sub-population of \(I\) at time \(t\) that have stayed for \(\tau\) period of time (i.e., \(I(t, \tau) = N_{S \rightarrow I}(t-\tau)S(\tau)\) where \(S(\tau)\) is survival function).
In denim
, a model structure is defined using denim
’s domain specific language (DSL) where each line of code must be a transition, in the following syntax.
from -> to = [transition]
Where [transition]
can be either:
a mathematical equation for differential equation
or, built-in distribution to specify the distribution of dwell time
Example: the classic SIR model
Example: the classic SIR model
library(denim)
# ---- Model specification
simple_SIR <- denim_dsl({
S -> I = beta * S * I/N
I -> R = d_exponential(gamma)
})
# ---- Define parameters and initial values
init_vals <- c(S = 999, I = 1, R = 0)
parameters <- c(beta = 0.8, gamma = 1/7, N = 1000)
# ---- Run simulation
out <- sim(
simple_SIR,
parameters = parameters,
initialValues = init_vals,
simulationDuration = 30,
timeStep = 0.1
)
Example: the classic SIR model
Time S I R
1 0.0 999.0000 1.000000 0.00000000
2 0.1 998.9201 1.065736 0.01418416
3 0.2 998.8349 1.135786 0.02930072
4 0.3 998.7442 1.210433 0.04541089
5 0.4 998.6474 1.289977 0.06257986
6 0.5 998.5444 1.374738 0.08087710
7 0.6 998.4346 1.465058 0.10037660
8 0.7 998.3175 1.561298 0.12115722
9 0.8 998.1929 1.663846 0.14330292
10 0.9 998.0600 1.773113 0.16690318
Example: the classic SIR model
Currently supported parametric distributions
Function | Parameters |
---|---|
d_exponential() |
rate |
d_gamma() |
rate, shape |
d_lognormal() |
mu, sigma |
d_weibull() |
scale, shape |
denim
also accept a vector of numeric values as a distribution via nonparametric()
function, where each value correspond to 1 time step.
Comparison between model definition in denim
and deSolve
, where infectious period is Erlang distributed
library(deSolve)
# --- Model definition in deSolve
transition_func <- function(t, state, param){
with(as.list( c(state, param) ), {
dS = -beta*S*(I1 + I2 + I3)/N
dI1 = beta*S*(I1 + I2 + I3)/N - rate*I1
dI2 = rate*I1 - rate*I2
dI3 = rate*I2 - rate*I3
dR = rate*I3
list(c(dS, dI1, dI2, dI3, dR))
})
}
# ---- Model configuration
parameters <- c(beta = 0.3, rate = 1/3, N = 1000)
initialValues <- c(S = 999, I1 = 1, I2=0, I3=0, R=0)
# ---- Run simulation
times <- seq(0, 75) # simulation duration
ode_mod <- ode(
y = initialValues,
times = times,
parms = parameters,
func = transition_func
)
# --- Format output
ode_mod <- as.data.frame(ode_mod)
ode_mod$I <- rowSums(ode_mod[, c("I1", "I2", "I3")])
library(denim)
# --- Model definition in denim
denim_mod <- denim_dsl({
S -> I = beta*S*(I/N)
I -> R = d_gamma(rate = rate, shape = shape)
})
# ---- Model configuration
init_vals <- c(S = 999, I = 1, R = 0)
parameters <- c(beta = 0.3, N = 1000, rate = 1/3, shape = 3)
# ---- Run simulation
denim_out <- sim(
denim_mod,
initialValues = init_vals,
parameters = parameters,
simulationDuration = 75,
timeStep = 0.05
)
denim
also support various ways to handle multiple out-going transitions depending on what information modelers have.
For example, consider a simple SIRD model, the information we can have includes:
When we only have \(f_{I \rightarrow R}(\tau)\) and \(f_{I \rightarrow D}(\tau)\): \(I \rightarrow R\) and \(I \rightarrow D\) can be modeled as independent competing risks
Under the assumption of independence, the overall hazard rate is the sum of hazard rates of the individual risks
In terms of sub-compartmental structure, the total proportion of people leaving each sub-compartment \(I_k\) is the sum of probabilities of leaving due to each risk.
Example: SIRD model with \(I \rightarrow R\) and \(I \rightarrow D\) as competing risks
When we have either:
\(f_{I \rightarrow R}(\tau)\), \(f_{I \rightarrow D}(\tau)\) and \(w_R\), \(w_D\)
or \(f_I(\tau)\) and \(w_R\), \(w_D\)
\(I \rightarrow R\) and \(I \rightarrow D\) can be modeled using multinomial framework in denim
\(I \rightarrow R\) and \(I \rightarrow D\) can be modeled using multinomial framework in denim
Both scenarios are implemented by having multiple sub-compartment chains — one chain per outcome.
Scenario 2 is equivalent to setting \(f_{I \rightarrow R}(\tau) = f_{I \rightarrow D}(\tau) = f_I(\tau)\)
Example: SIRD model with \(I \rightarrow R\) and \(I \rightarrow D\) under multinomial assumption
Example SIRD model with \(I \rightarrow R\) and \(I \rightarrow D\) under multinomial assumption
Example: a more complex model structure
library(denim)
initialValues <- c(S = 999, I = 1, R = 0, V = 0, IV = 0, D = 0)
modelStructure <- denim_dsl({
S -> I = beta * S * (I + IV) / N
S -> V = d_exponential(0.01)
0.1 * I -> D = d_lognormal(2, 0.5)
0.9 * I -> R = d_gamma(1/3, 2)
V -> IV = 0.2 * beta * V * (I + IV) / N
IV -> R = nonparametric(iv_r_dist)
IV -> D = d_weibull(scale = 2, shape = 1.5)
})
parameters <- list(
beta = 0.9,
N = 1000,
iv_r_dist = c(0, 0.15, 0.15, 0.05, 0.2, 0.2, 0.25)
)
simulation <- sim(transitions = modelStructure,
initialValues = initialValues,
parameters = parameters,
simulationDuration = 30,
timeStep = 0.1)
Draw backs
Due to the sub-compartment structure, denim
will have longer run time compared to ODE solvers.
Since the simulation algorithm is in discrete time, larger time step would lead to less accurate result.
Future plans:
Improve run time
Add support for structured compartment (for stratification by age, gender, location, etc.)
Add support for stochasticity
Extend the range of supported parametric distributions: Pareto, inverse Gaussian, Gompertz, log-logistic
The denim package is available on CRAN
For more information, visit denim website using the QR code
Or via the link: https://drthinhong.com/denim/
A preprint on our package is also available on bioRxiv