library(deSolve)
library(tidyverse)
<- function(t, state, param){
transition_func with(as.list( c(state, param) ), {
= - rate*I1
dI1 = rate*I1 - rate*I2
dI2 = rate*I2 - rate*I3
dI3 = rate*I3
dR
list(c(dI1, dI2, dI3, dR))
})
}
<- c(
desolveInitialValues I1 = 100,
I2 = 0,
I3 = 0,
R = 0
)
# ====== settings ========
<- c(
parameters rate = 1/4 # rate for Erlang distribution
)
Linear Chain Trick crash course
The goal of this blog post:
Introduce Linear Chain Trick
Explain Linear Chain Trick in a more “digestable” way (hopefully)
What this blog post does NOT provide:
Detailed mathematical proof for Linear Chain Trick
More advanced use cases (e.g., transition to multiple compartments)
Who is this for?
Those who are relatively new to compartmental modeling (like me) but already know the basics (e.g., able to understand the SIR model)
And, of course, those who are interested to learn and understand Linear Chain Trick
Background
Traditionally, SIR model is formulated using the following ODE
\[ \begin{cases} \frac{dS}{dt} = - \beta \frac{I}{N} S \\ \frac{dI}{dt} = \beta \frac{I}{N} S - \gamma I \\ \frac{dR}{dt} = \gamma I \end{cases} \tag{1}\]
With the underlying assumption that the infectious period (time to move from I to R or dwell-time in I compartment) is exponentially distributed (with rate \(\gamma\) and mean infectious time \(\frac{1}{\gamma}\) ). This was proven to be inadequate to capture many disease dynamics, as such, several formulations were suggested to incorporate other types of distributions.
Linear Chain Trick (LCT) in particular, is for modeling Erlang distributed infectious period (i.e., Gamma distribution with integer shape).
Linear Chain Trick
Formulation
The generalized formulation for \(I \rightarrow R\) transition where infectious period is Erlang distributed is as followed
\[ \begin{cases} \frac{dS}{dt} = - \beta \frac{I}{N} S \\ \frac{dI_1}{dt} = \beta \frac{I}{N} S - rI_1 \\ \frac{dI_2}{dt} = rI_1 - rI_2 \\ \vdots \\ \frac{dI_k}{dt} = rI_{k - 1} - rI_k \\ \frac{dR}{dt} = rI_k \end{cases} \tag{2}\]
Where:
\(k\) is the shape parameter of the Erlang distribution
\(r\) is the rate parameter of the Erlang distribution
For Erlang distribution with rate=1/4
, shape = 3
, the formulation would be
\[ \begin{cases} \frac{dS}{dt} = - \beta \frac{I}{N} S \\ \frac{dI_1}{dt} = \beta \frac{I}{N} S - rI_1 \\ \frac{dI_2}{dt} = \frac{1}{4}I_1 -\frac{1}{4}I_2 \\ \frac{dI_3}{dt} = \frac{1}{4}I_2 -\frac{1}{4}I_3 \\ \frac{dI_4}{dt} = \frac{1}{4}I_3 \end{cases} \]
The intuition of what is going on: The key idea is to delay the transition from I to R by introducing sub-compartment(s) i.e, \(I_1\), \(I_2\), …, \(I_k\).
LCT works by exploiting a property of Poisson processes: the time to \(k_{th}\) event under a homogeneous Poisson process at rate \(r\) is Erlang distributed with shape \(k\) and rate \(r\) (Hurtado and Kirosingh 2019).
Another way to think of it is each event is preceded by a length of time that is exponentially distributed with rate \(r\), and thus the time to \(k_{th}\) event is the sum of \(k\) independent and identical exponential random variables. The distribution of this sum turns out to be the Erlang distribution, with rate \(r\) and shape \(k\).
In the context of the example SIR model ( Equation 2 ), time to \(k_{th}\) event is the time going from sub-compartment \(I_1\) to \(I_k\). The rate of transitioning from \(I_n\) to \(I_{n+1}\) is constant (\(r\)) which implies the time of transitioning is exponentially distributed.
Implementing LCT
Consider a very simple scenario where there is no incoming population for I (i.e., no S->I
transition) and the infectious period is Erlang distributed with rate=1/4
and shape (k) = 3
<- 50
simulationDuration <- seq(0, simulationDuration)
times
<- ode(y = desolveInitialValues, times = times, parms = parameters, func = transition_func)
ode_mod <- as.data.frame(ode_mod) ode_mod
library(odin)
<- odin({
model_generator deriv(I1) = - rate*I1
deriv(I2) = rate*I1 - rate*I2
deriv(I3) = rate*I2 - rate*I3
deriv(R) = rate*I3
<- user(0)
init_I1 initial(I1) <- init_I1
<- user(1/4)
rate
initial(I2) <- 0
initial(I3) <- 0
initial(R) <- 0
})
<- 50
simulationDuration <- seq(0, simulationDuration)
times
<- model_generator$new(init_I1 = 100, rate = 1/4)
odin_mod <- odin_mod$run(times)
odin_mod <- as.data.frame(odin_mod) odin_mod
Visualization
Another good way to have an intuitive understanding is to visualize what is happening.
Population in sub-compartments
The following plot shows how the population in each sub-compartment and the population of I (dashed line) changes over time. Note that we’re using the model from the previous section, with Erlang(r=1/4, k=3)
distributed infectious period.
Code for plotting
<- ode_mod %>%
i_plot mutate(
I = I1 + I2 + I3
%>%
) ggplot() +
geom_line(aes(x = time, y = I1, color = "I1")) +
geom_line(aes(x = time, y = I2, color = "I2")) +
geom_line(aes(x = time, y = I3, color = "I3")) +
geom_line(aes(x = time, y = I), color = "red", linetype = "dashed") +
scale_color_manual(
values = c("I1" = "cornflowerblue", "I2" = "darkblue", "I3" = "blueviolet")
+
)labs(
title = "Infectious population over time",
x = "Time",
y = "Proportion"
) i_plot
Looks familiar? If not, here is the Erlang distribution with rate = 1/4
, shape=1, 2 and 3
Code for plotting
library(patchwork)
<- data.frame(
gamma_dists x = seq(0, 50),
shape1 = dgamma(seq(0, 50), rate = 1/4, shape = 1),
shape2 = dgamma(seq(0, 50), rate = 1/4, shape = 2),
shape3 = dgamma(seq(0, 50), rate = 1/4, shape = 3)
%>%
) ggplot() +
geom_line(aes(x=x, y=shape1, color="k=1")) +
geom_line(aes(x=x, y=shape2, color="k=2")) +
geom_line(aes(x=x, y=shape3, color="k=3")) +
scale_color_manual(
values = c("k=1" = "cornflowerblue", "k=2" = "darkblue", "k=3" = "blueviolet")
+
)labs(
x = "Time",
y = "Probability",
title = "Erlang distribution with rate=1/4"
)
/ gamma_dists i_plot
Visually, we can also see that the distribution for dwell-time of \(I_n\) follows \(Erlang(rate=r, shape=n)\) distribution. To change the shape parameter of the Erlang distribution, we can simply change the number of sub-compartments.
Sum of k i.i.d. exponential random variables
One of the reason LCT works is the fact that “the sum of \(k\) identical, independent exponential random variables with rate \(r\) follows \(\text{Erlang}(r,k)\) distribution”.
To visualize this, we can try computing distribution of sum using convolution and compare it with an Erlang distribution.
Function to compute sum of k i.i.d. exp variables
<- function(rate=1/2, k=2, n = 100){
compute_dist_sum_exp # divide by k so sum of k identical distribution can go up to n
<- seq(0, n/k, by=0.05)
x_range <- dexp(x_range, rate = rate)
curr_density
if(k>1){
sapply(
2:k,
\(curr_k){# compute distribution of sum of 2 exponential random variable using convolution
<<- convolve(
curr_density
curr_density,rev(dexp(x_range, rate = rate)),
type = "open"
)
# adjust x_range for length of the convolution
<<- seq(0, n/k, length.out = length(curr_density))
x_range
}
)
}
data.frame(
x = seq(0, n, length.out = length(curr_density)),
density = curr_density/sum(curr_density)
) }
For this demonstration, rate = 1/2
is used, while value for \(k\) (i.e. shape of Erlang) can be adjusted below.
Compute data for plotting
<- 20
n
<- bind_rows(
sum_k_exp lapply(
1:10,
\(curr_k){compute_dist_sum_exp(k=curr_k) %>%
filter(x<=n) %>%
mutate(k=curr_k)
}
)
)
<- bind_rows(
erlang_dist lapply(
1:10,
\(curr_k){data.frame(
x = seq(0, n, 0.05),
density = dgamma(seq(0, n, 0.05), rate=1/2, shape=curr_k)
%>%
) mutate(
# normalize
density = density/sum(density),
k = curr_k
)
}
)
)
write_csv(sum_k_exp, "./data/sum_k_exp.csv")
write_csv(erlang_dist, "./data/erlang_dist.csv")
Recap
Discussed the goal of Linear Chain Trick
Formulation and implementation for Linear Chain Trick in R (using deSolve package)
(Hopefully) provide a simple explanation for the formulation