steampunk heart

How the Modelers Went Wrong – Ferguson’s ICL study [PDF & Original code-GitHub]

Sweden’s shortfall from this expected death toll is good news for supporters of its comparatively mild and voluntary social distancing approach to the pandemic.

But it also represents a complete failure of the predictive model that jarred much of the world, including the United States and the United Kingdom, into imposing draconian shelter-in-place policies that have now persisted for over two months.

Back in early April, a team of epidemiologists at Uppsala University adapted the Imperial College model designed by crystal physicist Neil Ferguson to Sweden in an attempt to dissuade the country’s government from its hands-off approach.

Their results, like the more famous US and UK iterations of the ICL model, predicted disaster if the country did not change course immediately.

Ferguson’s ICL study is the most famous and influential example of a relatively recent type of epidemiology model known as an agent-based simulation.

Briefly summarized, the ICL approach purports to take known or approximated data about a virus’s infection and fatality rates and subject them to a simulation of expected transmission within a country or region, calibrated to its population characteristics and related demographic inputs.

The model then runs a succession of computer simulations that allegedly calculate how quickly the virus spreads given what they assume about the frequency of human social interactions. Since the models are probabilistic, they’re usually carried out in repetition so the final result reflects multiple runs of the simulation.
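To make the idea of repeated probabilistic runs concrete, here is a minimal toy sketch of an agent-based simulation. It is not the ICL code and bears no resemblance to it in scale or detail. Every name and number in it (`run_once`, `run_many`, 500 agents, 8 contacts a day, a 5% transmission chance per contact, 5 infectious days) is an assumption chosen purely for illustration.

```python
import random
import statistics


def run_once(rng, n_agents=500, contacts_per_day=8, p_transmit=0.05,
             days_infectious=5, n_seed=5, n_days=120):
    """One stochastic run; returns how many agents were ever infected.

    All default values are illustrative assumptions, not estimates.
    """
    # State per agent: 0 = susceptible, >0 = infectious days left, -1 = recovered
    state = [0] * n_agents
    for idx in rng.sample(range(n_agents), n_seed):
        state[idx] = days_infectious

    for _day in range(n_days):
        infectious = [i for i, s in enumerate(state) if s > 0]
        if not infectious:
            break  # the outbreak has died out
        newly_infected = set()
        for i in infectious:
            # Each infectious agent meets a fixed number of random agents
            for _contact in range(contacts_per_day):
                j = rng.randrange(n_agents)
                if state[j] == 0 and rng.random() < p_transmit:
                    newly_infected.add(j)
        # Advance the clock for agents who were already infectious
        for i in infectious:
            state[i] -= 1
            if state[i] == 0:
                state[i] = -1  # recovered
        for j in newly_infected:
            state[j] = days_infectious

    return sum(1 for s in state if s != 0)


def run_many(n_runs=20, seed=0):
    """Repeat the simulation so the result reflects multiple random runs."""
    rng = random.Random(seed)
    return [run_once(rng) for _ in range(n_runs)]


if __name__ == "__main__":
    totals = run_many()
    print("mean ever infected:", statistics.mean(totals))
    print("range across runs:", min(totals), "to", max(totals))
```

Because each run draws different random contacts, the totals differ from run to run, which is why such models report a summary over many repetitions rather than a single outcome.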

To render the simulations useful for policymakers, modelers such as the ICL team then adjust their runs to account for a variety of proposed scenarios. While the first set of predictions might reflect a “do nothing” course of no interventions and an uncontrolled pandemic, a second scenario might include what they predict to happen if schools and large sporting events are closed. A third might predict adding voluntary social distancing to the mix. And a fourth might predict a full mandatory lockdown.
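The scenario step can be sketched in a few lines. The example below uses a simple deterministic SIR-style calculation rather than an agent-based model, and each scenario is nothing more than an assumed multiplier on the contact rate. The `Scenario` class, the multipliers, and the reproduction number of 2.4 are all hypothetical values picked for illustration, not figures taken from the ICL tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    contact_multiplier: float  # 1.0 = no change in contacts (assumed value)


def sir_total_infected(r0, contact_multiplier, infectious_days=5.0,
                       population=1_000_000, seed_cases=10, n_days=730):
    """Daily-step SIR model; returns the cumulative number infected."""
    beta = r0 * contact_multiplier / infectious_days  # transmission rate per day
    gamma = 1.0 / infectious_days                     # recovery rate per day
    s = float(population - seed_cases)
    i = float(seed_cases)
    for _day in range(n_days):
        new_infections = min(s, beta * s * i / population)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
    return population - s


# Hypothetical scenarios: the multipliers are round-number guesses
SCENARIOS = [
    Scenario("do nothing", 1.00),
    Scenario("close schools and large events", 0.85),
    Scenario("add voluntary distancing", 0.65),
    Scenario("mandatory lockdown", 0.40),
]

if __name__ == "__main__":
    for sc in SCENARIOS:
        total = sir_total_infected(2.4, sc.contact_multiplier)
        print(f"{sc.name}: about {total:,.0f} infected")
```

Note that the ranking of the scenarios is built in by the multipliers themselves: the model can only return what those assumed inputs imply.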

Under perfect knowledge of both a virus and human behavior in response to the virus and full understanding of how each affects transmission, a computer simulation of this type could at least – in theory – approximate an actual pandemic. Indeed, modelers such as Ferguson and the ICL team seem to believe they possess such knowledge and can accurately control for such complex scenarios to the point that it yields accurate predictive information about viruses.

In a sense it’s an epidemiological approach that treats the world as a real-life version of the old SimCity computer game, and reports what happens if you play that game in repetition under the conditions of a pandemic.

The simulation approach has a severe deficiency though in that the assumptions and inputs it takes for granted are both unknown and sufficiently complex to render them unknowable. Modelers have to fill in substantial gaps in their knowledge by imposing assumptions into their code – assumptions about the transmissibility of the virus, assumptions about its duration and fatality rate, assumptions about the effectiveness of policy responses, and even assumptions about the rate that people will comply with or abide by those policies.

If we look to a key table from the ICL COVID model, we quickly find that these assumptions are little more than guesswork – particularly when it comes to the effectiveness of the proposed policy interventions. The table shows four modeled policy responses that purport to contain COVID-19. Note that all four adopt nice, even, round numbers as their parameters for modeling the proposed intervention: a 70% compliance rate with X, a 50% compliance rate with Y, and a 25% reduction in behavior Z, all allowing precise predictions of how the pandemic will supposedly play out.

Curiously, there appears to be little effort in these models to test and verify whether the underlying assumptions are correct. They also do very little to account for behavioral changes during the course of the pandemic that will almost certainly alter factors such as compliance rates with policy directives, or even voluntary behaviors that people undertake on their own to mitigate the risks of the disease (think about increased hand-washing). Instead, we have parameters that essentially amount to guesswork, all of it hard-coded into the model. And if any one of those underlying assumptions ends up being wrong, it potentially throws off the entire predictive ability of the model itself.
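A rough way to see how much can hinge on one hard-coded parameter is the textbook final-size relation for a simple epidemic, z = 1 - exp(-R z), where z is the share of the population eventually infected and R is the effective reproduction number. The sketch below is not drawn from the ICL model. The functions `effective_r` and `final_attack_rate`, the baseline of 2.4, and the 75% contact reduction among compliers are all assumptions made for the sake of the example.

```python
import math


def effective_r(r0, compliance, reduction_if_compliant):
    """Effective reproduction number if a share of people cut their contacts."""
    return r0 * (1.0 - compliance * reduction_if_compliant)


def final_attack_rate(r_eff, tol=1e-12, max_iter=10_000):
    """Solve z = 1 - exp(-r_eff * z) for its nonzero root by fixed-point iteration.

    Returns 0.0 when r_eff <= 1, where no large outbreak occurs in this
    simplified setting.
    """
    if r_eff <= 1.0:
        return 0.0
    z = 0.5
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r_eff * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


if __name__ == "__main__":
    # Sweep the assumed compliance rate while holding everything else fixed
    for compliance in (0.5, 0.6, 0.7, 0.8):
        r_eff = effective_r(2.4, compliance, 0.75)
        share = final_attack_rate(r_eff)
        print(f"compliance {compliance:.0%}: R = {r_eff:.2f}, "
              f"share infected = {share:.1%}")
```

In this toy setup, moving the assumed compliance rate from 50% to 80% takes the result from more than half the population infected to no large outbreak at all, so the choice of a round number for that single input largely determines the headline figure.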

As the Swedish application of the ICL model reveals, one or more of its underlying assumptions about the coronavirus and the effectiveness of proposed policy interventions were clearly in error.

Its predictions were therefore wholly implausible, and have already been invalidated by reality as the June 1st numbers demonstrate.

In fact, we may see evidence that the ICL model was wholly unsuited to the coronavirus by looking into its history.

Ferguson and the other ICL authors first developed this model in the mid-2000s as one of the two major contributions to the computer simulation approach for pandemics.

The second came from a team led by Robert J. Glass, a modeler at the Los Alamos laboratory who adopted a similar approach.

Ferguson and Glass were both major figures in the epidemiology community’s shift away from traditional disease-mitigation strategies and toward wide-scale lockdowns, as first proposed in 2006 following a succession of government research inquiries into influenza pandemics and the threat of bioterrorism.

The introduction of their modeling approach precipitated a debate among epidemiologists over the effectiveness of unproven strategies such as society-wide lockdowns, as well as lesser interventions such as school closures and event cancellations.

Several figures in the medical wing of the discipline expressed doubts about top-down society-wide approaches such as lockdowns at the time, noting the lack of empirical evidence behind the modeling assumptions as well as the complete absence of a causal identification strategy for the claimed effectiveness of its policy prescriptions. The modeling approach of Ferguson and Glass caught the ears of public health officials at the time though, and has since come to dominate the COVID-19 response.

Yet if we look back to Ferguson’s original paper from 2006 in which he laid out the model that he later adapted to the current pandemic, we find another stunning revelation in its penultimate paragraph:

“Lack of data prevent us from reliably modelling transmission in the important contexts of residential institutions (for example, care homes, prisons) and health care settings; detailed planning for use of antivirals, vaccines and infection control measures in such settings are needed, however. We do not present projections of the likely impact of personal protective measures (for example, face masks) on transmission, again due to a lack of data on effectiveness.”

Among its many shortcomings, the original ICL model lacked a means of accounting for the transmission of viruses in residential institutions such as nursing homes and similar long-term care facilities.

As the last several months have shown, however, nursing homes are acutely susceptible to the coronavirus and may be the single largest factor in explaining its high fatality rate. In many countries and US states, nursing homes even account for more than half of all coronavirus fatalities. In virus hotspots such as New York, the nursing home problem was compounded even further by likely undercounting of deaths and an emerging scandal over Gov. Andrew Cuomo’s order forcing nursing homes to admit known coronavirus carriers as a way of mitigating hospital capacity strains that were never actually realized.

Returning to the question of epidemiology modeling, the nursing home issue may reveal a fatal flaw in the ICL model’s underlying assumptions.

If indeed it had no means of accounting for viral transmissions in residential institutions, as Ferguson’s 2006 paper indicates, the ICL model completely missed what now turns out to be the single greatest vulnerability point for the coronavirus pandemic. (The ICL team has thus far resisted calls to release its original code from the COVID simulation, and public versions of the same code are riddled with bugs and errors. The released version of its COVID paper does not give any indication, though, that they modified the 2006 study to account for nursing homes.)

Such an oversight would further imply that the predictive scenarios of the ICL model’s policy interventions are not only misdirected away from the primary vulnerability points and onto society at large, but also that the forecasted mortality ranges of its milder scenarios under the lockdown have little basis in reality.

If nursing homes account for the lion’s share of COVID mortality as the statistics now show, the realized death rates in countries that went under lockdown may only be said to follow the ICL model’s milder scenarios as a result of coincidence. Even when the total numbers in some countries appear to match up with the ICL’s milder scenarios, the deaths predicted by the model are not the same types of deaths we are seeing in reality.

When it comes to predicting the actual mechanisms of the pandemic, including the danger it poses to nursing homes, the modeling approach appears to be functionally useless and catastrophically off the mark.

Source: Phillip W. Magness – AIER

Note: This article was first published on 1 June, 2020

For download: COVID-19 CovidSim Model – GitHub (zip – master – 90.1 MB)

Please also check this page to meet those who developed this “masterpiece”

Basically there are 4 people behind it:

Neil Ferguson, Matthew Gretton-Dann, Ian Lynagh, “dlaydon”


“This is the COVID-19 CovidSim microsimulation model developed by the MRC Centre for Global Infectious Disease Analysis hosted at Imperial College, London.

CovidSim models the transmission dynamics and severity of COVID-19 infections throughout a spatially and socially structured population over time. It enables modelling of how intervention policies and healthcare provision affect the spread of COVID-19. It is used to inform health policy by making quantitative forecasts of (for example) cases, deaths and hospitalisations, and how these will vary depending on which specific interventions, such as social distancing, are enacted.

With parameter changes, it can be used to model other respiratory viruses, such as influenza.

The model is written in C++.

The primary platforms it has been developed and tested on are Windows and Ubuntu Linux.

It should build and run on any other POSIX compliant Unix-like platform (for example macOS, other Linux distributions). However, no active development occurs on them.

Running the model for the whole of the UK requires approximately 20GB of RAM. Other regions will require different amounts of memory (some up to 256GB).

It is strongly recommended to build the model with OpenMP support enabled to improve performance on multi-core processors. 24 to 32 core Xeon systems give optimal performance for large (e.g. UK, US) populations.

See for detailed build instructions.”

This masterpiece was developed from March 25 by Ian Lynagh (igfoo), followed on April 1 by Neil Ferguson and from April 2, 2020 by Matthew Gretton-Dann and “dlaydon”.

Last but not least, meet the MRC Centre for Global Infectious Disease Analysis team:

John Lees *johnlees – Research Fellow at Imperial MRC GIDA. Works on statistical genetics/bacteria

Rich FitzJohn *richfitz – Member of MRC Centre for Global Infectious Disease Analysis

Swapnil Mishra *s-mishra – Post-doc at School of Public Health, Imperial College London

It would be nice if these teams would answer some questions in front of a commission of inquiry.