Pooling samples to increase SARS-CoV-2 testing
You could check each coin one by one, which could need up to eight weighings if you're unlucky (see Figure 1A). Another way would be to “pool” the coins into two sets of 4 coins and weigh one of the pools. One of these pools will weigh 40g and the other less, and because only one coin is fake, a single weighing tells you which is which, so you will already know that 4 coins are real. Then you could split the coins in the lighter pool into two sets of 2 and weigh one of them, and finally weigh one coin from the set of 2 that weighed less than 20g. This would take 3 weighings to identify the fake coin (see Figure 1B).
A similar idea can be used to pool samples to test for SARS-CoV-2, the virus that causes COVID-19. Instead of coins, we have samples taken from the nose, throat or saliva of the people we are testing. Instead of weighing them, we use an RT-qPCR assay to test for the presence of the virus.
Suppose we had samples from 8 people of whom one was infected. If we tested each sample individually, in the worst case we would need 8 tests to identify the infected person. But if we pooled the samples into sets of 4, and then sets of 2, as with the coins, then we would need merely 3 tests! Pooling samples can thus save tests. This can save time, reagents and plastics, staff effort, money and, most importantly, lives, because it enables more testing.
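The halving strategy can be sketched in a few lines of Python. This is only an illustration under the puzzle's assumption that exactly one sample is positive; the function names and the `pool_is_positive` callback (a stand-in for one weighing or one RT-qPCR run on a pool) are hypothetical.

```python
def find_positive_by_halving(samples, pool_is_positive):
    """Locate the single positive item by repeatedly testing one half.

    Assumes exactly one item is positive. `pool_is_positive` is a
    hypothetical stand-in for one weighing or one RT-qPCR run on a pool.
    Returns (positive item, number of tests used).
    """
    candidates = list(samples)
    tests = 0
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        tests += 1
        # Only one half needs testing: if it is negative, the positive
        # item must be in the other half.
        candidates = first if pool_is_positive(first) else second
    return candidates[0], tests


def find_positive_one_by_one(samples, pool_is_positive):
    """Test items individually until the positive one turns up."""
    tests = 0
    for sample in samples:
        tests += 1
        if pool_is_positive([sample]):
            return sample, tests
    raise ValueError("no positive item found")


people = list(range(8))
infected = 7  # the unlucky case for one-by-one testing


def is_positive(pool):
    return infected in pool


print(find_positive_by_halving(people, is_positive))  # (7, 3)
print(find_positive_one_by_one(people, is_positive))  # (7, 8)
```

With 8 samples the halving version always uses 3 tests, whichever sample is positive, while one-by-one testing uses anywhere from 1 to 8.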
Testing for SARS-CoV-2 is more involved than the setting of this puzzle. For starters, we do not know how many people are infected. We could modify the coin puzzle so that we do not know how many coins are fake either. In that case too, pooling coins reduces the number of weighings needed as long as the number of fake coins is much smaller than the number of real coins. So pooled testing of SARS-CoV-2 should help even if we do not know beforehand how many people are infected.
Another difference is that the RT-qPCR assay has a limit to its sensitivity. Though this is one of the most remarkable assays known to humans (it can reliably catch a single molecule present in the test tube), it can still fail to detect very small amounts of the virus. This is because the RT-qPCR machine works with a fixed volume of sample. It might happen that the virus is so dilute in the sample that the volume of sample that found its way into the assay had no virus molecules. Then the assay has correctly reported that there were no virus molecules found in the tube. However, the implication that the patient is not infected turns out to be incorrect.
This happens rarely enough when a single sample is tested at a time. But the problem can get amplified when multiple samples are pooled. This is because we take a smaller amount of each sample, which further dilutes a sample that might have contained molecules from the virus.
So pooling samples dilutes them and a sample with a low viral load may fall below the level detectable by RT-qPCR. It is as if the coin weighing machine could fit only one coin. So to weigh a set of 4 we would have to cut out a quarter piece of each coin and weigh the 4 pieces, rather than 4 full coins. For a large pool, say 10 coins, perhaps a weighing machine with limited sensitivity may not be able to tell the difference between all pieces being real and one piece being fake.
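To put rough numbers on the dilution effect, one common simplification is to treat the number of viral RNA molecules that end up in the assay as Poisson distributed. That is an assumption made here for illustration, not something taken from the testing data, and the viral loads below are made up. Under this model the chance of drawing no molecule at all is exp(-mean), and pooling equal volumes of several samples divides each sample's mean by the pool size.

```python
import math


def prob_no_molecule(mean_molecules_in_assay, pool_size=1):
    """Chance that the portion of sample entering the assay has no viral RNA.

    Assumption: molecule counts are Poisson distributed, and pooling
    `pool_size` samples in equal volumes cuts each sample's contribution
    by a factor of `pool_size`.
    """
    return math.exp(-mean_molecules_in_assay / pool_size)


# Illustrative (made-up) viral loads, expressed as the expected number of
# molecules in the assay when the sample is tested on its own.
for mean in (3, 30, 300):
    for pool_size in (1, 5, 10):
        p_miss = prob_no_molecule(mean, pool_size)
        print(f"mean={mean:>3}  pool of {pool_size:>2}: P(no molecule) = {p_miss:.3g}")
```

Under this model, a sample expected to contribute 3 molecules is missed about 5% of the time when tested alone and about 74% of the time in a pool of 10, while a sample expected to contribute 300 molecules is essentially never missed either way.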
With samples from people, it is quite possible for a sample to have very few virus particles, close to the limit of detection by RT-qPCR. This depends on many things that are impossible to control, like how sick the person was from whom the sample was taken, or precisely how much material was taken from the nose, throat or saliva of the person, or how long the sample was stored before testing. From more than 30,000 samples that tested positive at the testing centre at InStem (Institute for Stem Cell Science and Regenerative Medicine) and NCBS-TIFR (National Centre for Biological Sciences, Tata Institute of Fundamental Research), Bangalore, we found that the viral load ranged across 5 to 6 orders of magnitude.
Another difference works to the advantage of virus testing. A single RT-qPCR machine can test a little under 100 pools simultaneously in one “round” of testing. It is as if we had 100 weighing machines that we could load with coins.
Dorfman pooling is a variant of the coin weighing strategy described above: every pool is tested once in a first round, and in a second round each sample from a positive pool is tested individually. Its advantage is that it needs just two rounds with a single RT-qPCR machine, as long as the number of pools is a bit less than 100. So if there were 10 infected people amongst 200, we would need at most 90 tests done in two rounds (for example, with pools of 5: 40 pool tests followed by at most 50 individual tests), which would take one day. If instead we split the 200 into two pools of 100 each, then pools of 50 each and so on, we would need fewer tests but it could take 8 rounds, which would take several days.
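The two figures in this comparison can be reproduced with a short sketch. The pool size of 5 is an assumption (the text does not state one, though it is consistent with the 90-test figure), and the function names are hypothetical.

```python
import math


def dorfman_worst_case_tests(n_people, n_infected, pool_size):
    """Upper bound on tests for two-round (Dorfman) pooling.

    Round 1 tests every pool once. Round 2 retests every member of each
    positive pool. The worst case is each infected person sitting in a
    different pool.
    """
    n_pools = math.ceil(n_people / pool_size)
    positive_pools = min(n_infected, n_pools)
    return n_pools + positive_pools * pool_size


def halving_rounds(n_people):
    """Rounds needed if pools are halved until each holds one sample."""
    rounds, size = 0, n_people
    while size > 1:
        size = math.ceil(size / 2)
        rounds += 1
    return rounds


# pool_size=5 is an assumption; the text does not state the pool size.
print(dorfman_worst_case_tests(200, 10, pool_size=5))  # 90
print(halving_rounds(200))                             # 8
```

The halving count follows the pool sizes 100, 50, 25, 13, 7, 4, 2 and 1, one round per size.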
Is it better to save on tests at the cost of time, or save on time at the cost of more tests? There is no one correct answer to this question, but it’s important to remember that treatment of a person is based not on whether they test positive but on the symptoms they exhibit. Testing is important for contact tracing and quarantining and thereby limiting the spread of the infection. So, one advantage of getting test results quickly is that any infected person will spend less time unknowingly meeting others. Any contact tracing effort will find it easier to identify the people who might have come into contact with the infected person because there will be fewer such people. Similarly, any uninfected person will more quickly have their anxiety lifted and can get back to work without fear of infecting others.
Simple pooling saves substantially only when relatively few people are infected. Taking the example of 200 people as above, suppose now 30 people were infected instead of 10 (15% instead of 5% prevalence). In that case, simple pooling could need as many as 190 tests (for example, with pools of 5: 40 pool tests plus up to 150 individual tests). One might as well test all 200 individually! In the NCBS-TIFR and InStem testing centre, we reached this situation in July and therefore stopped doing simple pooling. Many hospitals are reporting that 15-20% of their suspected COVID-19 patients are testing positive. Amongst health care workers, the prevalence of infection is reaching similar levels. Even large-scale surveys of several cities, such as Delhi, Mumbai and Pune, are reporting that 20-40% of people tested (using rapid antibody tests) have had SARS-CoV-2. In such a scenario, are there cleverer ways of using pooled testing successfully?
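How quickly the savings from simple pooling shrink can be seen by sweeping the number of infected people in the 200-person example. This is a rough worst-case count, again assuming pools of 5 (a value not stated in the text) and a hypothetical helper function.

```python
def worst_case_pooled_tests(n_people, n_infected, pool_size):
    """Pool tests plus individual retests, if every positive lands in a separate pool."""
    n_pools = -(-n_people // pool_size)  # ceiling division
    return n_pools + min(n_infected, n_pools) * pool_size


n_people, pool_size = 200, 5  # pool size 5 is assumed, not stated in the text
for n_infected in (2, 10, 20, 30, 32):
    pooled = worst_case_pooled_tests(n_people, n_infected, pool_size)
    saving = n_people - pooled
    print(f"{n_infected:>2} infected ({n_infected / n_people:.0%}): "
          f"up to {pooled} pooled tests, saving at least {saving}")
```

Under these assumptions the guaranteed saving falls from 150 tests at 1% prevalence to 10 tests at 15%, and disappears at 32 infected people (16%).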
In Tapestry pooling, we use not just the positive or negative result of each test, but also the estimate of the total viral load of each pool that the RT-qPCR provides. Again, using the analogy to the coin puzzle, the weighing machine does not just tell you which pool of coins is heavier or lighter than expected; it gives you the actual weight of the pool. Similarly, the RT-qPCR machine reports not just whether the sample or pool tested was positive or negative; it also gives an estimate of the viral load in that test. This additional information can be used to work out which individuals are positive or negative with more accuracy. It allows us to obtain accurate results with a small number of tests in scenarios where simple pooling would not work, such as when the prevalence rate rises beyond 5%.
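A toy example shows why quantitative readings carry more information than positive/negative ones. This is not the Tapestry algorithm itself (the preprints describe that); it is a minimal sketch with a made-up design in which each of 6 samples goes into a distinct pair of 4 pools, and in which pool readings are idealised as noise-free sums of the loads of their members. All names and numbers are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

N_POOLS = 4
# Hypothetical design: 6 samples, each placed in a distinct pair of pools.
# Sample 0 -> pools (0, 1), 1 -> (0, 2), 2 -> (0, 3), 3 -> (1, 2), ...
DESIGN = dict(enumerate(combinations(range(N_POOLS), 2)))


def pool_loads(sample_loads):
    """Idealised, noise-free readings: each pool reports the sum of its members."""
    loads = [Fraction(0)] * N_POOLS
    for sample, pools in DESIGN.items():
        for p in pools:
            loads[p] += sample_loads.get(sample, 0)
    return loads


def solve(A, b):
    """Gauss-Jordan elimination for a small invertible square system."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            factor = M[r][col]
            if r != col and factor != 0:
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[-1] for row in M]


def decode(loads):
    """Estimate each candidate sample's load from the pool readings."""
    negative_pools = {p for p, y in enumerate(loads) if y == 0}
    # Step 1: any sample that sits in a negative pool must be negative.
    candidates = [s for s, pools in DESIGN.items()
                  if not negative_pools & set(pools)]
    positive_pools = [p for p in range(N_POOLS) if p not in negative_pools]
    # Step 2: solve "pool reading = sum of member loads" for the candidates.
    # This toy only handles the case where that system is square and invertible.
    A = [[1 if p in DESIGN[s] else 0 for s in candidates] for p in positive_pools]
    b = [loads[p] for p in positive_pools]
    return dict(zip(candidates, solve(A, b)))


true_loads = {0: 100, 3: 5}        # samples 0 and 3 are positive
readings = pool_loads(true_loads)  # pools 0, 1, 2 are positive; pool 3 is negative
print(decode(readings))            # samples 0, 1, 3 -> loads 100, 0, 5
```

With only positive/negative results, pools 0, 1 and 2 being positive is consistent with several explanations: samples 0 and 1, samples 0 and 3, samples 1 and 3, or all three. The pool loads single out samples 0 and 3 and clear sample 1, using 4 tests for 6 samples. Real readings are noisy, so a practical method cannot rely on exact equations like this.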
More details about how Tapestry pooling works and some of the tests used to validate it can be found in the preprints https://arxiv.org/abs/2005.07895 and https://www.medrxiv.org/content/10.1101/2020.04.23.20077727v2.
Glossary of terms:
RT-qPCR assay: This is a method to detect the presence of a specific sequence of RNA or DNA in a sample. For the SARS-CoV-2 virus we look for a sequence in its RNA that we know is not present in other viruses.
Viral load: The number of copies of the virus per millilitre of sample.
Contact tracing: Finding the contacts of a person who has tested positive, and testing them for the virus.
Prevalence: The number of infected people divided by population size.
We thank C. S. Anirudh and Uma Ramakrishnan for their inputs on the essay.