Prerequisite

To run the Python code below, the notebook needs access to the covid library from the Covid-19 repository https://framagit.org/benjaminpillot/covid-19 Either place the notebook in the same folder as a working copy of the repository, or use a symbolic link to the covid folder of the repository.

SBIR Model

We propose a simple modification of the classical SIR model, which reads: $$ d_t S=-\beta SI/N\\ d_t I=\beta SI/N-\alpha I\\ d_t R = \alpha I\\ S+I+R=N $$

Here the variables are dimensional. If all populations are normalized by the total population $N$, this becomes:

$$ d_t S=-\beta SI\\ d_t I=\beta SI-\alpha I\\ d_t R = \alpha I\\ S+I+R=1 $$

We introduce an extra compartment $B$, and define the Blob as $B+I+R$. This compartment gathers people through a diffusion-like process: you enter the Blob while still Susceptible, simply by diffusion through social relations (work, family, transportation). At early times, this diffusion is linear in time (like the area covered by a diffusive tracer); at large times, the Blob covers the whole population. This can be written $d_t B=\kappa S$, since $S\simeq 1$ at the early stages of the epidemic (linear increase in time). In the absence of epidemic, $B$ increases from 0 to 1 and $S$ decreases from 1 to 0 through this simple diffusion process.
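
At this stage (before the exit term introduced below), this diffusion can be integrated in closed form: in the absence of epidemic, $I=R=0$, hence $S=1-B$ and $$ d_t B=\kappa(1-B) \quad\Rightarrow\quad B(t)=1-e^{-\kappa t}, $$ which indeed grows linearly as $\kappa t$ at early times and tends to 1 at large times.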

Now, the SIR model can be applied inside the Blob. The idea is that in the SIR model you need a very high level of infection to reach collective immunity, but does that make sense at the scale of a country? We propose instead to follow this evolution inside the Blob itself, which grows with time by diffusion, with a dynamics that differs from that of the epidemic. The only new parameter introduced here is $\kappa$, which describes the diffusion (growth) of the Blob.

Finally, the last thing to consider is how you exit the Blob. We simply assume that if you leave the infected group over a timescale $1/\alpha$, you also leave the Blob (if not infected) over this same timescale. This gives:

$$ d_t S=-\kappa S+\alpha B\\ d_t B=\kappa S-\alpha B -\beta BI/(1-S)\\ d_t I=\beta BI/(1-S)-\alpha I\\ d_t R = \alpha I\\ S+B+I+R=1 $$

Please note that, whereas $N$ appeared explicitly in the bilinear term before, here it is $N(B+I+R)=N(1-S)$ that must be considered, since the SIR model develops inside the Blob only. This is why, in nondimensional variables, the factor $1/(1-S)$ appears. And this is fundamental: collective immunity will occur inside the Blob, not at the scale of the country.
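
As a minimal sketch of these equations (not the fitting procedure used below; the parameter values and the initial seeding of the Blob are purely illustrative assumptions), the SBIR system can be integrated with scipy:

In [ ]:
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative, non-fitted parameters (assumptions)
alpha = 0.1     # exit rate from I (and from the Blob)
beta = 0.35     # contact rate inside the Blob
kappa = 0.01    # diffusion (growth) rate of the Blob

def sbir(t, y):
    S, B, I, R = y
    infection = beta * B * I / (1.0 - S)   # SIR bilinear term, restricted to the Blob
    return [-kappa * S + alpha * B,
            kappa * S - alpha * B - infection,
            infection - alpha * I,
            alpha * I]

# Start with a tiny seeded Blob containing a small infected fraction
y0 = [1.0 - 2e-5, 1e-5, 1e-5, 0.0]
sol = solve_ivp(sbir, (0, 300), y0, max_step=1.0)  # sol.y rows are S, B, I, R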

Step 1: extracting data from JHU with the covid-19 Python library

We will use two elements provided in the JHU daily timeseries, namely the confirmed cases and the recovered cases.

  • Recovered cases $r$ in the JHU data are daily recoveries. You need to integrate them to get $R(t)=\int r(t)dt$ or, if deaths are included in $R$, $R(t)=\int (r(t)+d(t))dt$.
  • Confirmed cases $c$ correspond, in the SIR model, to the daily increase of infected cases: $c(t)=\beta SI\,dt/N$ with $dt=1\,\mathrm{day}$ (or $c(t)=\beta SI\,dt$ in normalized units).

Normally, at large times we should have $R(t)=\int (r(t)+d(t))dt=\int c(t)dt$, because all the recovered $R$ (or recovered + deaths) come from the infected $I$. If $I(t)=0$ at the end of the epidemic, the two integrals should be equal.

Let's load some libraries to check this:

In [13]:
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
from covid.plotting import plot_country_cases, get_world_map, plot_country_pace
from covid.database import WorldCovidMatcher, CountryCollection                                                                                                             
import pandas as pd
import os
from scipy import interpolate
import country_converter as coco

Let's connect to the database and create the matchers for confirmed, recovered and death cases:

In [2]:
cm_c = WorldCovidMatcher('confirmed')                                                                                                                                           
cm_c.build_database()
cm_r = WorldCovidMatcher('recovered')                                                                                                                                           
cm_r.build_database()
cm_d = WorldCovidMatcher('deaths')                                                                                                                                           
cm_d.build_database()

We extract $c$, $r$ and $d$.

In [3]:
country_list=["Austria", "Italy", "Germany", "France", 
                    "Spain", "Denmark","Greece",
                    "Belgium"]
c = cm_c.get_cases(CountryCollection(country_list), normalized=True, cumulative=False)
r = cm_r.get_cases(CountryCollection(country_list), normalized=True, cumulative=False)
d = cm_d.get_cases(CountryCollection(country_list), normalized=True, cumulative=False)
for country in country_list:
    c[country].values[0]=0
    d[country].values[0]=0
    r[country].values[0]=0

Let's check the assumption for Austria:

In [4]:
country='Austria'

plt.figure()
plt.plot(np.cumsum(d[country].values[1:]+r[country].values[1:]))
plt.plot(np.cumsum(c[country].values[1:]))
Out[4]:
[figure: cumulative recovered + deaths vs cumulative confirmed cases for Austria]

Unfortunately, the agreement is not so good for all countries... some people are lost in the data! We will therefore try to fit the data without estimating $I$ directly.

We will first try to obtain an estimate of $\beta-\alpha$ during the growth period. At the beginning, the linear growth of the Blob is faster than the exponential growth of the infections, hence $B+I+R\simeq B$ and: $$d_t I=\beta BI/(B+I+R)-\alpha I\simeq (\beta-\alpha)I$$ We thus have $I=I_0e^{(\beta-\alpha)t}$, and this is even more clearly visible on $\int I\,dt=\mathrm{const}+I_0/(\beta-\alpha)\,e^{(\beta-\alpha)t}$. We look for this exponential behavior in the cumulative confirmed cases of the JHU data. Here is a selection of countries with similar behavior. The time origin is chosen at a prevalence of 0.1 per 100 000.

This also has to be reproduced by the SBIR model.
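
As a rough illustration of this estimate (a sketch only: the `daily_cases` array, the `growth_rate` helper and the 25-day window are assumptions; the actual fit on the JHU data is done further below), the slope can be obtained from a linear fit of the log of the cumulative cases:

In [ ]:
import numpy as np

def growth_rate(daily_cases, n_days=25):
    # daily_cases: 1-D array of daily confirmed cases, starting at the time origin
    cum = np.cumsum(daily_cases[:n_days])
    # slope of log(cumulative cases) ~ beta - alpha during the exponential phase
    slope, intercept = np.polyfit(np.arange(n_days), np.log(cum + 1e-3), 1)
    return slope

# the doubling time follows as np.log(2) / growth_rate(daily_cases)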

In [5]:
fig = go.Figure()
for country in country_list:
    I=np.cumsum(c[country].values-r[country].values-d[country].values)
    offset_date = c[country][I >= 0.1].index[0]
    cases=c[country].copy()
    cases.index -= offset_date                                                                                                                                
    fig.add_trace(go.Scatter(x=cases.index.days,y=I, mode="lines",
                                         name=country))
days_doubling=3.1
fig.add_trace(go.Scatter(x=[0,20],y=[20,20*np.exp2(20/days_doubling)], mode="lines", 
                                         name='doubling every %0.1f days' % days_doubling))
fig.update_layout(title='Rough estimate of 𝛽-α=%0.3f' % (np.log(2)/days_doubling),
                   xaxis_title='Days',
                   yaxis_title='∫𝐼𝑑t (per 100000)')
beta_min_alpha=np.log(2)/days_doubling
fig.update_xaxes(range=(-20,100))
fig.update_yaxes(type="log")
fig.show()

Now we will observe the decrease phase.

In [6]:
plot_country_cases(country_list, 
           case_type="confirmed cases", cumulative=False, use_plotly=True)

We clearly see an exponential decay, with a timescale different from that of the growth. Note that this cannot be obtained with the SIR model unless you change the coefficients of the equations. This exponential decay is even clearer when plotting $\int_t^\infty c(t)dt$:

In [7]:
fig = go.Figure()
sum=0
for country in country_list:
#    Icum=np.cumsum(np.cumsum(c[country].values-r[country].values-d[country].values))
    Icum=(np.cumsum(c[country].values))
    offset_date = c[country][Icum >= 0.1].index[0]
    cases=c[country].copy()
    cases.index -= offset_date                                                                                                                                
    sum+=Icum[-1]-Icum
    fig.add_trace(go.Scatter(x=cases.index.days, y=Icum[-1]-Icum, mode="lines", legendgroup="group2",name=country))    

fig.add_trace(go.Scatter(x=cases.index.days, y=sum, mode="lines", legendgroup="group2",name='Sum'))    
days_doubling=14.0
fig.add_trace(go.Scatter(x=[20,100],y=[3000,3000*np.exp2(-80/days_doubling)], mode="lines", legendgroup="group2",
                                         name='halved every %0.1f days' % days_doubling))    
fig.update_layout(title='Decay estimate of 𝜆=%0.3f' % (np.log(2)/days_doubling),
                   xaxis_title='Days',
                   yaxis_title='∫𝑡→∞ c(t)𝑑𝑡')
fig.update_yaxes(type="log")
fig.show()

Now, let's reproduce this with the SBIR model. We find $\beta-\alpha$ from the increase period, and deduce $\alpha$ from the decrease period after some algebra. Finally, $\kappa$ is obtained by fitting the value of the peak maximum. We will now try to estimate this peak value.

In [66]:
from scipy.interpolate import UnivariateSpline
for country in ['Italy','Spain','Germany','France','Denmark']:
    fig = go.Figure()
    cases=c[country].copy()
    offset_date = cases[cases.values >= 0.1].index[0]
    cases.index-=offset_date
    spl = UnivariateSpline(np.arange(0,60),cases.rolling(7).mean()[(cases.index.days>0) & (cases.index.days<61)])
    spl.set_smoothing_factor(65)
    Imax=np.max(spl(np.arange(0,60)))
    fig.add_trace(go.Scatter(x=cases.index.days, y=cases, mode="markers", legendgroup="group2",name="confirmed"))    
    fig.add_trace(go.Scatter(x=np.arange(10,50), y=spl(np.arange(10,50)), legendgroup="group2",name="spline"))    
    fig.add_trace(go.Scatter(x=[0,100],y=[Imax,Imax],name='Imax=%f' % Imax))
    fig.update_yaxes(type="log")
    fig.update_layout(title='%s, t=0 corresponds to %s' % (country,offset_date.strftime('%Y %m %d')),
                     yaxis_title='Daily confirmed cases (per 100 000)') 
    fig.show()

    
  

Here is what we obtain:

In [72]:
for country in ['Italy','Spain','Germany','France','Denmark']:
    fig = go.Figure()
    cases=c[country].copy()
    offset_date = cases[cases.values >= 0.1].index[0]
    cases.index-=offset_date
    recovered=r[country].copy()
    recovered.index-=offset_date
    deaths=d[country].copy()
    deaths.index-=offset_date
    p1=np.polyfit(np.arange(0,25),np.log(np.cumsum(1e-3+cases[(cases.index.days>0) & (cases.index.days<26)])),1)
    beta_minus_alpha=p1[0]
    cases2=np.cumsum(c[country].values)
    cases2=cases2[-1]-cases2
    p2=np.polyfit(np.arange(40,80),np.log((cases[(cases.index.days>40) & (cases.index.days<81)])),1)
    lambd=p2[0]
    p2=np.polyfit(np.arange(25,60),np.log((cases2[(cases.index.days>25) & (cases.index.days<61)])),1)
    lambd=p2[0]
    fig.add_trace(go.Scatter(x=cases.index.days, y=cases, mode="markers", legendgroup="group2",name="confirmed"))    
#    fig.add_trace(go.Scatter(x=np.arange(0,40), y=0.1*np.exp(beta_minus_alpha*np.arange(0,40)), mode="lines",
#                  name='𝛽-α=%0.2f' % beta_minus_alpha))
#    fig.add_trace(go.Scatter(x=np.arange(0,100), y=np.exp(p2[1])*np.abs(p2[0])*np.exp(lambd*(np.arange(0,100))), mode="lines",
#                            name='𝜆=%0.2f' % lambd))
    t=np.linspace(0,100,1000)
    t0=t[np.argmin(np.abs(0.1*np.exp(beta_minus_alpha*t)-np.exp(p2[1])*np.abs(p2[0])*np.exp(lambd*t)))]
    #fig.add_trace(go.Scatter(x=[t0], y=[np.exp(p2[1])*np.abs(p2[0])*np.exp(lambd*t0)]))
    try:
        df=pd.read_csv(os.path.join('data','%s.fort' % country),delimiter=r'\s+').transpose()
        fig.add_trace(go.Scatter(x=df.values[0], y=df.values[5]*1e5,name='SBIR'))
    except:
        # no SBIR result file for this country: skip
        pass
        
    fig.update_yaxes(type="log")
    fig.update_layout(title='%s, t=0 corresponds to %s' % (country,offset_date.strftime('%Y %m %d')),
                     yaxis_title='Daily confirmed cases (per 100 000)') 
    fig.show()

    
  

Now, we will try to estimate $\kappa$ from the GAFAM mobility data.

In [167]:
df=pd.read_csv(os.path.join('data','Global_Mobility_Report.csv'),low_memory=False)

for country in country_list:
    data=df[(df.country_region_code == coco.convert(country,to='ISO2')) & (df.sub_region_1.isna())]
    fig = go.Figure()
    for k in ['retail_and_recreation_percent_change_from_baseline',
           'grocery_and_pharmacy_percent_change_from_baseline',
           'parks_percent_change_from_baseline',
           'transit_stations_percent_change_from_baseline',
           'workplaces_percent_change_from_baseline',
           'residential_percent_change_from_baseline']:
        fig.add_trace(go.Scatter(x=data.date, y=data[k].rolling(7).mean(), mode="markers", legendgroup="group2",name=k))
    fig.update_layout(title='Google mobility data for %s' % country) 

    fig.show()
In [171]:
df=pd.read_csv(os.path.join('data','Global_Mobility_Report.csv'),low_memory=False)
fig = go.Figure()

for country in country_list:
    data=df[(df.country_region_code == coco.convert(country,to='ISO2')) & (df.sub_region_1.isna())]
    for k in [
           'retail_and_recreation_percent_change_from_baseline']:
        fig.add_trace(go.Scatter(x=data.date, y=data[k].rolling(15).mean(),
                                 mode="markers", legendgroup="group2",name=country))    
fig.update_layout(title=k) 
fig.show()
In [158]:
df.keys()
Out[158]:
Index(['country_region_code', 'country_region', 'sub_region_1', 'sub_region_2',
       'iso_3166_2_code', 'census_fips_code', 'date',
       'retail_and_recreation_percent_change_from_baseline',
       'grocery_and_pharmacy_percent_change_from_baseline',
       'parks_percent_change_from_baseline',
       'transit_stations_percent_change_from_baseline',
       'workplaces_percent_change_from_baseline',
       'residential_percent_change_from_baseline'],
      dtype='object')
In [9]:
%%javascript
//hack to fix export
require.config({
  paths: {
    d3: 'https://cdnjs.cloudflare.com/ajax/libs/d3/5.9.2/d3',
    jquery: 'https://code.jquery.com/jquery-3.4.1.min',
    plotly: 'https://cdn.plot.ly/plotly-latest.min'
  },

  shim: {
    plotly: {
      deps: ['d3', 'jquery'],
      exports: 'plotly'
    }
  }
});
In [11]:
cases.index.days
Out[11]:
Int64Index([-44, -43, -42, -41, -40, -39, -38, -37, -36, -35,
            ...
            109, 110, 111, 112, 113, 114, 115, 116, 117, 118],
           dtype='int64', length=163)
In [165]:
coco.convert('France',to='ISO2')
Out[165]:
'FR'