Exaggerated estimates from epidemiological modelling were growing exponentially during the 'pandemic'
Dashboards reflecting reality or manufacturing it?
Acting in ‘journalist mode’, Martin recently wrote an article for the Conservative Woman online magazine summarising an original article, The Dashboard that Ruled the World, written by Thomas Verduyn about the Johns Hopkins University (JHU)1 dashboard. The thrust of Verduyn’s claim is that the Covid-19 dashboards were using epidemiological computer models as surrogates for, or as supplements to, actual case/death data.
Verduyn cites the data from New York City as proof that the output from these epidemiological models was used to simulate a pandemic, i.e. simulated death data was being used as if it were actual deaths ‘on the ground’. This potentially explains why certain cities/regions experienced spikes in deaths whilst neighbouring cities/regions experienced nothing much at all (New York City and Bergamo, Italy, for instance, whilst their neighbouring regions were unaffected).
This got us reflecting on what the epidemiological models were producing as output and how good the quality of the input data was. Verduyn cites an X/Twitter post by jockthedog showing that JHU published simulated (modelled) infection data for various countries dated 29th January 2020 (one week after the dashboard went live). This is shown in column (i) of the table below.
You will notice that the estimated number of infected cases is huge, even for countries where the number of confirmed cases is tiny, as shown in column (ii). So, for South Korea we have 4 confirmed cases and 44,232 estimated infected cases. This pattern is repeated for the other countries.
We can compare this with another dashboard, OurWorldInData’s Covid-19 dashboard, which provided two measures of confirmed cases on 29th January 2020 for the same countries, shown in columns (iii) and (iv) (it probably won’t surprise you that OurWorldInData actually have two different sets of confirmed case numbers in different parts of their website).
You will notice that the table shows three different values for confirmed cases, two from OurWorldInData and one from JHU, reflecting a huge amount of uncertainty about these figures and suggesting chaotic differences in data definitions and data collection standards, which will also be affected by administrative delays etc. A good article describing this problem in data collection was written by Justin Hart in April 2020 and can be found here.
So, for South Korea we have either 4, 3 or 0 confirmed cases. Yet JHU was estimating infections in column (i), from their computer model, at 44,232. The Australian confirmed cases were not much different, at either 7, 4 or 0, yet the JHU estimated infections were roughly one-fifth the size at 9,084 (perhaps the model took into account differences in population density or suchlike). But the key message here is the exaggerated scale of the predicted cases compared to the actual cases.
It’s difficult to make predictions, especially about the future
Attributed to Niels Bohr
Clearly these epidemiological models appear to ‘overshoot’ reality. An analogy would be finding a wasps’ nest in your attic and assuming that the nest will double in size every hour, like bacteria in a petri dish, until it takes over every other room in the house. Plainly wasps’ nests don’t operate like that, and we can see that the Covid-19 pandemic didn’t either.
These models not only produce extremely large values but also values that are suspiciously precise. Statisticians would expect to see confidence intervals here, to reflect the inevitable uncertainties, but none are provided. Furthermore, given the input data includes zero confirmed cases for some countries, any honest confidence interval should include zero infections, but these estimates do not. Hence, they have a predilection to exaggerate.
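To make the point concrete, here is a minimal sketch (all numbers are hypothetical, not taken from any of the dashboards) of what an honest uncertainty interval looks like when you extrapolate exponentially from a handful of confirmed cases with an unknown doubling time: the interval includes zero and spans orders of magnitude, which is exactly what a single precise point estimate hides.

```python
import random

def projected_infections(seed_cases, doubling_time_days, horizon_days=30):
    """Naive exponential extrapolation: cases double every `doubling_time_days`."""
    return seed_cases * 2 ** (horizon_days / doubling_time_days)

random.seed(0)
# Hypothetical inputs: 0-4 confirmed cases; doubling time anywhere from 3 to 10 days.
samples = sorted(
    projected_infections(random.randint(0, 4), random.uniform(3.0, 10.0))
    for _ in range(10_000)
)
low, high = samples[249], samples[9749]   # rough 95% interval
print(f"95% interval for infections in 30 days: {low:,.0f} to {high:,.0f}")
```

The lower bound comes out at zero (some inputs have zero confirmed cases) while the upper bound runs to thousands, so quoting any single number from such a projection is misleading.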
This problem can also be found in the computer models used by OurWorldInData. Column (v) shows the output from one of their computer models (they used three SEIR (Susceptible, Exposed, Infectious, Recovered) models; this column shows the output from one of them). Based on their confirmed case counts, 0 or 3, they estimated 69 infections in South Korea on 29th January 2020. So JHU estimated 44,232 and OurWorldInData estimated 69, which is a whopping difference. The question arises: if you worked in public health, which one would you pick to communicate to decision makers?
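A gulf of that size between two models is unsurprising once you see how sensitive SEIR outputs are to their assumed parameters. Below is a minimal discrete-time SEIR sketch (all parameter values are hypothetical and chosen for illustration; they are not the settings of the JHU or YYG models): a modest change in the assumed reproduction number multiplies the infection estimate many times over.

```python
def seir_infections(r0, days, pop=10_000_000, incubation=5.0, infectious=5.0, seed=10):
    """Minimal discrete-time (daily step) SEIR integration.
    Returns cumulative infections after `days`."""
    beta = r0 / infectious          # transmission rate per infectious person per day
    s, e, i = pop - seed, 0.0, float(seed)
    cumulative = float(seed)
    for _ in range(days):
        new_exposed = beta * i * s / pop
        new_infectious = e / incubation
        recovered = i / infectious
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - recovered
        cumulative += new_exposed
    return cumulative

# Identical seed data, slightly different assumed R0 (values illustrative only):
low_estimate = seir_infections(r0=2.0, days=60)
high_estimate = seir_infections(r0=2.6, days=60)
print(f"R0=2.0 -> {low_estimate:,.0f} cumulative infections after 60 days")
print(f"R0=2.6 -> {high_estimate:,.0f} cumulative infections after 60 days")
```

With everything else held fixed, raising the assumed R0 from 2.0 to 2.6 inflates the 60-day infection estimate several-fold, which is why two teams looking at the same sparse case counts can publish wildly different numbers.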
If we look back, the public were being told the pandemic was ‘going exponential’ at various stages throughout 2020, 2021 and 2022. The New York Times even took special care to ensure the public knew what this meant. As usual with the media there was no attempt to caveat or question the exponential assumption; it was simply taken for granted.
As we said, OurWorldInData use three epidemiological models. The above table contains the results from one of them, the infection estimates produced by the YYG model. Here is a comparison of the estimates from the other two OurWorldInData computer models (IHME and ICL (Imperial)) on 31st December 2021. Notice how much they differ from each other and how much they inflate confirmed cases: by up to 25 times (5 million versus approx. 200k).
Clearly, the modelled estimates are so variable as to be not only unusable for prediction purposes but also of such poor quality that they should never have been used as surrogates for the ‘inevitable future’ (and we are leaving aside the ethical questions for another day).
Moreover, and it probably will not surprise readers, the accuracy of these models had not previously been established using data from past ‘pandemics’. What is more, there was no gold standard to determine whether they could ever be trusted, given the known issues with PCR testing and the myriad unverified assumptions that such models rely on (social distancing, size of social network, virus reproduction number, etc.).
It is worth reflecting on the fact that some people were suspicious of the dashboards right from the start in early 2020. Perhaps the earliest expression of concern is recorded in this article again by Justin Hart. In it he quotes John Ioannidis, professor of disease prevention, Stanford University:
“The data collected so far on how many people are infected and how the epidemic is evolving are utterly unreliable.”
We couldn’t have expressed the sentiment better ourselves.
Notice we chose to look at model predictions of infections before a pandemic had even been declared. These were huge exaggerations on 29th January 2020 but by March 2020 the exaggerations had themselves grown exponentially.
Ferguson’s model - exponential growth in exaggerations
Imperial College’s serial exaggerator Neil Ferguson, along with a bevy of coauthors, pulled out his now infamous 13+ year-old software program for predicting infectious disease spread and cases and used it to model infections, hospitalisations and mortality from Covid-19. In this report, Ferguson et al predicted 510,000 deaths in the UK, and over 2.2 million in the USA, within eight weeks, based on an infection fatality ratio of 0.9%.2
Ferguson’s model was programmed with inputs that included an arbitrarily assumed virus reproduction number referenced to two prior works:
A paper by two Swiss researchers that estimated the reproductive number in China using a simulation model, ‘estimations of uncertainty’ and a process of elimination using sparse available data, a ‘wide range of parameter combinations’ and the incredibly named variable ‘comparison to past emergencies’; and
A paper from Chinese researchers who acknowledged they lacked key epidemiological data and instead guesstimated several important variables, outputting a weakly estimated reproductive number and the only definitive conclusion of their paper - that human-to-human transmission had occurred.
It is interesting to note that Ferguson’s model assumes values for the reproduction number that are respectively 10% and 20% higher than those of the two papers specifically cited for that value.
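These uplifts are not marginal: because projected cases compound multiplicatively each generation, a 10-20% higher assumed reproduction number inflates the projection several-fold within a few serial intervals. A toy calculation (hypothetical baseline R of 2.0, ten generations; these are not the actual cited values):

```python
def projected_cases(r0, generations, seed=1):
    """Cases compound multiplicatively: seed * R0**generations."""
    return seed * r0 ** generations

base   = projected_cases(2.0, 10)   # hypothetical cited value of R
plus10 = projected_cases(2.2, 10)   # 10% higher
plus20 = projected_cases(2.4, 10)   # 20% higher
print(f"{plus10 / base:.1f}x more cases with R 10% higher")   # ~2.6x
print(f"{plus20 / base:.1f}x more cases with R 20% higher")   # ~6.2x
```

So even a seemingly small upward tweak to an assumed reproduction number translates into a multiplied-up case projection within weeks.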
It was shocking to see the unfailingly doomsday predictions of Neil Ferguson being used yet again as a false premise to direct important government health policy. Ferguson presented his now infamous Covid-19 report, which almost everyone now agrees was erroneous and exaggerated, despite a history of failures: he had already delivered numerous wildly exaggerated predictions for disease transmission and death that were the cause of controversial and expensive UK policymaking (here, here, here, here, here and here).
Here he is on Twitter on 26th March 2020 confirming his doomsday predictions (without confidence intervals or caveats of any kind!):
This was an input to policy (or an output of policy, if you are cynically and sceptically minded) and supposedly formed the basis for the ‘flatten the curve’ lockdown pushed by UK governmental advisor Dominic Cummings, which was supposed to reduce Covid-19 deaths by half.
With 500+ UK cases, Whitty says, many in No 10 still did not realise how fast things would move:
“This was a lot of people really not getting what exponential growth was going to mean.”
These policies led to numerous presentations by scientific advisors and government officials where they exulted in the dangers of exponential growth. Here is the Guardian waxing nostalgic about the ‘best bits’ of the ‘pandemic’.
How well did the models predict Covid-19 deaths?
When we examine the total number of Covid-19 deaths for 2020 (that is, not just Ferguson’s eight-week period but the whole of March to December 2020), the variation and inaccuracies in the Covid-19 data that prominent researchers had previously identified several times (here, here and here) persisted.
No two UK government data sources agreed on any single figure. For Covid-19 deaths in 2020, the UK Govt Coronavirus Data website reports 86,765, the ONS reports 69,711 and the UKHSA reports 72,178. The number reported to OurWorldInData is higher still, peaking at 94,998.
In any event, every one of these numbers is only a fraction of Ferguson’s prediction of 510,000, and of the halved figure Ferguson gave assuming the lockdowns were effective.
A close look at the mortality data for 2020 makes it clear the UK did not face a pandemic any worse than a bad historical flu season. This is shown in the chart, where the age-standardised mortality rate in 2020 was comparable to that in 2008. Compared to the 5 years preceding 2020, the excess mortality was circa 8%.
The complete absence of exponential growth in deaths had already been revealed by April 19th by the Times of Israel, which showed that many countries, most notably Sweden, did not experience any exponential growth in cases or deaths despite not enacting major lockdowns (except in care homes)3. This was also reported throughout the press, such as here, but government policy was immune to this refutation.
It is also notable that the Flaxman paper, coauthored with Ferguson, unashamedly stated that “our model relies on fixed estimates of some epidemiological parameters (such as the infection fatality rate)”, despite the fact that these should have been treated as uncertain estimates. Applied to Sweden, this fixed fatality rate would have predicted 90,000 deaths by May 2020, when in fact there were only 4,350.
The FatEmperor compared Sweden’s experience, managed by Anders Tegnell using the WHO 2019 pandemic guidelines, with the outcomes predicted by Ferguson’s model and found that ‘Sweden won’ (here).
The Spectator magazine produced its own dashboards for the UK SAGE group’s modelling, accessible here. The one showing UK Chief Scientific Advisor Patrick Vallance’s prediction of Covid-19 cases, made on 21st September 2020, shows just how wrong the exponential exaggerations can be.
And here is the one for Covid-19 deaths in 2020 (and many of the actual deaths are themselves an exaggeration and not even due to a spreading virus):
If the UK didn’t go exponential, could a ship?
So clearly the epidemiological models failed to predict the number of deaths in a large number of countries, some of which (Sweden) were simply ignored. If they couldn’t predict what happened in a country, could they do better in a smaller, more controlled and isolated environment, such as that presented by a cruise ship?
Luckily such a ship was available for study - The Diamond Princess.
Dr. Michael Levitt carefully examined events on the Diamond Princess cruise ship and used this to refute Ferguson’s computer model estimates. The full report can be found here.
What he found was stunning. The cruise ship offered the perfect environment to model the spread of the virus and its impact on an elderly population in quarantine:
“Cruise ships are ideal for studying characteristics of infectious disease, both in terms of population and environment. There are many features similar to a community taking place in small spaces with close contact: gambling rooms, theaters, performances, pools, and shared daily facilities like buffets, toilets, spas, elevators, and narrow hallways.”
The ‘epidemic’ began when one passenger developed symptoms on 19 January 2020 and boarded the Diamond Princess the next day. From then on, all passengers were repeatedly exposed and hence had a high contact rate.
He found that the actual Covid-19 deaths were a fraction of those that might be estimated using Ferguson’s computer model, and that the mortality burden would only be approximately equal to an additional one month of normal mortality.
Levitt found that:
Ferguson predicted 59 deaths when in fact there were 7.
The deaths were ‘with’ Covid-19 not ‘of’ Covid-19.
He had to tweak an unrevealed ‘fudge factor’, or the herd immunity threshold, to get Ferguson’s model to match his published estimates for the USA, UK and Diamond Princess.
The inaccuracy in Ferguson’s model was calculated by dividing the actual deaths by the predicted deaths (actual were 12% of predicted).
When applied to the UK the revised estimate was 61,310 deaths.
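Levitt’s back-calculation, as described in the list above, is simple enough to reproduce. A sketch (we get roughly 60,500 rather than the 61,310 quoted, presumably because of rounding in the inputs Levitt actually used):

```python
# Scale Ferguson's UK prediction by the Diamond Princess observed/predicted ratio.
predicted_ship_deaths = 59
actual_ship_deaths = 7
ratio = actual_ship_deaths / predicted_ship_deaths      # ~0.12, i.e. 12%

ferguson_uk_prediction = 510_000
revised_uk_estimate = ferguson_uk_prediction * ratio
print(f"observed/predicted ratio: {ratio:.0%}")
print(f"revised UK estimate: {revised_uk_estimate:,.0f} deaths")
```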
Levitt took these calculations to the UK SAGE group and Professor Sir David Spiegelhalter in March 2020, after Spiegelhalter had published an article on Covid-19 mortality risk. Levitt claims Spiegelhalter’s estimates also turned out to be exaggerations, but in any case, Levitt was simply ignored and warned off.
Levitt believed the Ferguson (ICL-Ferguson) model was mainly wrong because:
“Part of the reason ICL-Ferguson's estimates were so wrong may be that they used a CFR as an IFR. Despite accusations levelled at those who opposed lockdowns that they were confusing infections with cases, it was in fact Neil Ferguson and ICL themselves who appear to have confused or conflated the two. An IFR should be lower than a CFR, and as Dr. Levitt knew from his work on China, the ICL-Verity IFR was far too high. It is actually much closer to case fatality ratio, not an infection fatality ratio.”
Additionally, the fact that it ignored asymptomatics, didn’t properly account for different infection rates in different population age strata and miscalculated herd immunity also led to overestimates.
Despite these issues lessons were not learnt. The Ferguson (Imperial) model, and other models used by OurWorldInData, as demonstrated in the previous section, continued to massively differ and exaggerate the level of infections and deaths and did so beyond 2020.
No one thought to look back at the earlier estimates being produced by such models. If we look again at the estimates from JHU, we had almost 7 million infected in China on 29th January 2020 and 116,000 in Thailand. Applying Ferguson’s 1% infection fatality rate would predict over a thousand deaths in Thailand and 70,000 deaths in China. Yet Thailand would experience no deaths throughout 2020, and China’s cumulative total peaked at just over 5,700 and did not reach that until 2023!
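The back-of-envelope check above is just the modelled infection counts multiplied by the assumed infection fatality rate:

```python
# Apply an assumed 1% infection fatality rate to JHU's modelled infection counts
# for 29th January 2020 (figures as quoted above).
ifr = 0.01
jhu_estimated_infections = {"China": 7_000_000, "Thailand": 116_000}
for country, infections in jhu_estimated_infections.items():
    print(f"{country}: {infections * ifr:,.0f} implied deaths")
    # China -> 70,000 implied deaths; Thailand -> 1,160 implied deaths
```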
We think Levitt was being kind. But he himself was overconfident in his own prognostications. After all, how is one supposed to know the exact parameter values needed to predict a pandemic right at the start of that pandemic? And how uncertain should they be assumed to be?
Also, how do you know you even have a pandemic on your hands? Levitt assumed the label ‘Covid-19’ was meaningful and independent of the PCR test and that the symptoms of Covid-19 were in some way differentiable from other respiratory illnesses. We now know this is not the case.
If a ship can’t go exponential, how about a single person?
If epidemiological models fail spectacularly in complex systems, such as those presented by countries and ships, surely they could at least predict viral infection of a single person, or transmission between two people?
Jonathan Engler has written about a SARS-CoV-2 virus challenge experiment, reported here, in which researchers tried to give people ‘Covid-19’ and failed. The experiment included vaccinated and unvaccinated volunteers, each of whom was given the virus via nasal droplets. None developed a sustained infection (the paper reports being “unable to induce sustained infection in any volunteers”).
Unfortunately, this isn’t new. Jonathan Engler points out that Rosenau could not force transmission of influenza between people in 1918. Sure, there are studies that demonstrated flu shedding (here and here) but, critically, they did not demonstrate transmission.
If infection cannot be forced in a controlled environment how can epidemiologists hope to model viral transmission dynamics in complex societies with mixed populations?
Neil and Fenton’s epidemiological model
As an aside, it is worth mentioning that we ourselves attempted to estimate the infection prevalence and fatality rates for Covid-19 and reported our model in this article, published in the Journal of Risk Research in May 2020. Our estimate of the infection fatality rate was 0.3%-0.5%, which wasn’t ‘too bad’ and perhaps comparable to the flu. However, as we have discussed here, we were naïve and too trusting of the authorities, and at that time we were also blind to the iatrogenic healthcare effects and the issues with PCR testing. Additionally, from the estimated community prevalence rates in different countries and regions it was clear there was no evidence of exponential spread.
Conclusion
We are left with the conclusion that epidemiological models of respiratory pathogens produce hopelessly exaggerated estimates based on assumptions about parameters that are either unknowable or are so difficult to measure that their credibility is highly suspect.
Given there are likely as many models as there are PhD-qualified epidemiologists, it is difficult to imagine which model might be trusted a priori. Even ‘averaging’ the results of many models looks fruitless given the wide uncertainties in the inherent inaccuracy of outputs produced by each.
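The point about averaging can be made with a small simulation (all numbers hypothetical): if every model in an ensemble shares the same structural bias, averaging any number of them only cancels the independent noise and leaves the shared bias untouched.

```python
import random

# If every model in an ensemble shares the same structural bias, averaging
# removes only the independent noise and retains the bias. Hypothetical numbers:
random.seed(1)
truth = 100.0
shared_bias = 400.0                 # structural error common to all models
model_outputs = [truth + shared_bias + random.gauss(0, 50) for _ in range(30)]
ensemble_mean = sum(model_outputs) / len(model_outputs)
print(f"truth = {truth:.0f}, ensemble mean = {ensemble_mean:.0f}")
```

The ensemble mean lands near 500, five times the true value, no matter how many such models are averaged.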
Moreover, the fundamental mechanism at play - infection and transmission - is something we know next to nothing about, and for some inexplicable reason it appears to be met with ongoing disinterest by virologists, epidemiologists and public health wonks. Likewise, as well as being blind to the harmful effects of the non-pharmaceutical interventions being modelled, they also neglected the impact of harmful medical interventions and policies.
Finally, these computer models are built on the assumption that a pandemic has started. Unfortunately, this provides an inevitable hidden incentive to confirm the belief that there is indeed a pandemic to be modelled and a reputation to be made.
The highly influential Professor Inglesby, Director of the Center for Health Security at Johns Hopkins, published the seminal paper “Disease Mitigation Measures in the Control of Pandemic Influenza”, in which he argued against masks outside of clinical settings, against social distancing and against quarantine, the justification being that the forecast economic and social costs outweigh the medical benefits. Despite this, he went on to chair the notorious Event 201 pandemic simulation exercise in October 2019 (co-sponsored by the Bill and Melinda Gates Foundation) and then directly contradicted his own academic conclusions in April 2020 by recommending masks and social distancing on Fox News in the USA.
Other papers by Ferguson et al calculated the infection fatality rates, such as this Lancet paper published in June 2020. These did have confidence intervals and reported an infection fatality ratio of 0.657%, less than the March 2020 estimate. Over time the infection fatality estimate will naturally converge on a much lower value as more cases are reported, for a stabilising cumulative number of deaths, but Ferguson neglected to take into account this asymptotic relationship.
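The asymptotic relationship mentioned in this footnote is easy to see: with cumulative deaths stabilising, the naive estimate IFR = deaths / infections must fall as the estimated number of infections grows. A sketch (the figures below are hypothetical, not Ferguson’s):

```python
# With cumulative deaths stabilising, the naive IFR estimate (deaths / infections)
# must fall as more infections are identified. Hypothetical figures:
deaths = 40_000
for infections in (4_000_000, 6_000_000, 10_000_000, 20_000_000):
    print(f"{infections:>11,} infections -> IFR estimate {deaths / infections:.2%}")
```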
Many countries saw no change in mortality at all, never mind exponential growth. Even within the UK, Scotland and Northern Ireland saw no significant increase in mortality over the critical period.
...and didn't Feynman also say something to the effect that no matter how fancy your models are, if they don't reflect reality, they're wrong?
>"Even ‘averaging’ the results of many models looks fruitless given the wide uncertainties in the inherent inaccuracy of outputs produced by each."
It’s never a good idea to average the results of models unless you can be confident regarding the representativeness of the sample. If I may quote Prof. Eric Winsberg of the University of South Florida (a philosopher who specialises in the treatment of uncertainty in mathematical models):
“Ensemble methods assume that, in some relevant respect, the set of available models represent something like a sample of independent draws from the space of possible model structures. This is surely the greatest problem with ensemble statistical methods. The average and standard deviation of a set of trials is only meaningful if those trials represent a random sample of independent draws from the relevant space—in this case the space of possible model structures. Many commentators have noted that this assumption is not met by the set of climate models on the market…Perhaps we are meant to assume, instead, that the existing models are randomly distributed around the ideal model, in some kind of normal distribution, on analogy to measurement theory. But modeling isn’t measurement, and so there is very little reason to think this assumption holds.”
He was talking about climate model ensembles but the problem is no less relevant with epidemiological models.