UK Covid Testing data: Remarkable relationship between number of tests and positivity rate when we drill down to regions
23 Dec update: This article provides further explanation for these observed results
Following on from this post where I expressed concern at the unavailability of testing data broken down into regions, the data are now available at https://coronavirus.data.gov.uk/details/testing. And the results reveal something startling that you simply cannot see by looking at just the UK data overall.
First of all here is the overall (not very informative) plot of the number of daily tests versus percentage of positive tests (also called the positivity rate) for the whole UK:
Now, even if there were no genuine increase in the population infection rate, then when you massively increase the number of people being tested (as happened in early September) you would expect to see a corresponding increase in the number of 'cases' (i.e. positive test results), simply because of the probability of false positives. Hence, many argued that the increase in cases observed when testing was ramped up in September was explained largely by the false positive rate. However, if this hypothesis were correct, we would not see an increase in the positivity rate - unless the false positive rate itself increased (e.g. due to increased human error) as more people were tested. The fact that there was an increase in the positivity rate in September therefore seems to confirm a major increase in the infection rate during that period.
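This point can be sketched numerically. The figures below are entirely made up for illustration (the assumed prevalence, sensitivity and false positive rate are not taken from any real data), but they show why a fixed false positive rate inflates the number of 'cases' as testing scales up while leaving the positivity rate unchanged:

```python
# Sketch (with invented numbers) of why scaling up testing alone, with a
# fixed false positive rate, raises the count of 'cases' but not the
# positivity rate.

def positivity_rate(n_tests, prevalence, sensitivity, false_positive_rate):
    """Expected fraction of tests returning positive."""
    true_pos = n_tests * prevalence * sensitivity
    false_pos = n_tests * (1 - prevalence) * false_positive_rate
    return (true_pos + false_pos) / n_tests

# Hypothetical values: 0.5% of those tested infected, 80% sensitivity,
# 1% false positive rate.
for n in (10_000, 100_000, 500_000):
    print(f"{n:>7} tests -> positivity {positivity_rate(n, 0.005, 0.8, 0.01):.4%}")
```

Under these assumptions the absolute number of positives grows fifty-fold across the three runs, but the positivity rate printed is identical each time - which is why a *rising* positivity rate cannot be explained by increased testing with a constant false positive rate.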
Since early November there has been no obvious correlation between the number of tests and the positivity rate. Hence, it seems justifiable to conclude that the number of tests performed has no impact on the positivity rate (or vice versa), and hence that the recent (slight) increase in positivity rate since early December represents a genuine increase in the spread of the virus that may justify the renewed lockdowns.
However, it turns out the overall UK graph is hiding what is happening at the regional level. The following regional graphs of number of tests versus positivity rate are taken straight from https://coronavirus.data.gov.uk/details/testing (note that the website only has the regional breakdown for England; for Scotland, Wales and N Ireland we do not even have the overall number of tests):
The trend plots in these main regions are different, with increases and decreases in positivity rates occurring at different times. Yet, in each region since mid-September, the positivity rate very closely correlates with the number of tests irrespective of the direction of movement.
But it doesn't stop there. The website also now provides the same data at a finer level of granularity - namely local health authorities. And, when you drill down to those, you can see from these examples, chosen at random, that (despite some very different trend plots at these local levels) the correlation between the number of tests and the positivity rate is almost perfect:
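The strength of this kind of relationship can be quantified with a Pearson correlation coefficient. A minimal sketch, using illustrative figures (not the real Ashford data) in which tests and positivity move together:

```python
# Quantifying a tests-vs-positivity relationship with the Pearson
# correlation coefficient. The data below are illustrative only.

from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented daily tests and positivity (%) rising together, as in the
# Ashford example described below.
tests      = [900, 1100, 1400, 1800, 2300, 2900]
positivity = [2.1, 2.6, 3.4, 4.2, 5.5, 6.9]
print(round(pearson(tests, positivity), 3))
```

A coefficient near +1 (or -1 when both series fall together) is what "almost perfect correlation" means here; a value near 0 would correspond to the no-correlation picture seen in the overall UK plot since early November.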
For example, both the number of tests and the positivity rate for Ashford have recently been steadily increasing while in North Tyneside both the number of tests and the positivity rate have been steadily decreasing.
So, what can we conclude from these remarkable local correlations, and why did they start in September? [UPDATE: see this article for explanation] They suggest some systematic, non-natural factor independent of the virus. Otherwise, two possible (causal) hypotheses are:
1. That the testing is highly accurate and that since September people have only got tested when they think they might have the virus. As the infection rate increases (resp. decreases), the number of people choosing to get tested increases (resp. decreases).
2. That as the number of people tested increases (resp. decreases), the false positive rate increases (resp. decreases) due to the increased (resp. decreased) possibility of human error.
One definite problem with hypothesis 1 is that, if it were true, we would expect to see similar (but delayed) trend plots in hospital admissions and deaths, which does not seem to be the case. Another definite problem with hypothesis 1 is that it relies on the assumption that people only get tested when they either have symptoms or have been in recent contact with a person who tested positive. But we know this is not the case: a very large proportion of people tested since September (including tens of thousands of students) have had neither symptoms nor a recent positive contact.
One possible problem with hypothesis 2 is that it fails to explain what happened between July and September, when testing increased but there was no real increase in the positivity rate. On the one hand, testing levels then were still sufficiently low for there to be plenty of checks in place (including confirmatory tests for positive results); that may not have been possible with such a big increase in testing in the autumn. Also, nobody doubts that, while the virus had almost gone in the summer, there was an increase in infections in the autumn, so it is inevitable that the positivity rate would start rising then. But, in the absence of some systematic non-natural factor independent of the virus, the fact that we see remarkable correlations involving both rises and falls since September suggests that the pattern must at least in part be explained by hypothesis 2. In other words, if a local authority decides to increase testing then it will see an increase not just in the number of 'cases' but also in the positivity rate. In that case a decision simply to increase testing could lead to a region being moved to a higher lockdown 'tier'.
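Hypothesis 2 can also be sketched numerically. All of the parameters below are invented for illustration - in particular, the assumption that the false positive rate grows linearly with testing volume is exactly the assumption hypothesis 2 makes, not an established fact:

```python
# Sketch of hypothesis 2 (all parameters invented): if the false positive
# rate grows with testing volume - e.g. through more human error at higher
# throughput - then positivity tracks the number of tests even though the
# true infection rate never changes.

FIXED_PREVALENCE = 0.005   # true infection rate held constant
SENSITIVITY = 0.8

def volume_dependent_fpr(n_tests, base=0.005, per_100k=0.01):
    """Assumed: false positive rate rises linearly with volume."""
    return base + per_100k * (n_tests / 100_000)

def positivity(n_tests):
    fpr = volume_dependent_fpr(n_tests)
    rate = FIXED_PREVALENCE * SENSITIVITY + (1 - FIXED_PREVALENCE) * fpr
    return rate  # expected fraction of tests that come back positive

for n in (50_000, 150_000, 300_000):
    print(f"{n:>7} tests -> positivity {positivity(n):.2%}")
```

Under these assumptions positivity rises and falls with the number of tests while the infection rate stays flat - reproducing the local-level pattern without any change in the spread of the virus, which is precisely why the pattern alone cannot distinguish the hypotheses.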
Of course, because of the massive limitations of what can be concluded from simple data like the number of tests and the number of positive results (as explained here), there is massive uncertainty about any of the above conclusions. The analysis certainly supports the need for the kind of additional data for which I have been arguing for a long time; in particular, we need to know not just the number of people being tested but also the number of people tested who are asymptomatic and - of those testing positive - the number who subsequently developed real symptoms. Only then might we be able to accurately determine whether the infection rate is really increasing or decreasing.
See also
How to explain an increase in proportion testing positive if there is no increase in infection rate
Covid19 hospital admissions data: evidence of exponential increase?
A privacy-preserving Bayesian network model for personalised COVID19 risk assessment and contact tracing
Covid-19: Infection rates are higher, fatality rates lower than widely reported
Coronavirus: country comparisons are pointless unless we account for these biases in testing