Wednesday, May 6, 2020

Sensitivity, specificity and Bayes

Now that we're ramping up testing, including serology, you're going to see words like sensitivity and specificity thrown around. Here's a brief primer.

Rather than give a bunch of definitions right away, I'll start by asking a question and then show how we can answer it by building a table, based on three numbers (really, ratios).

I just got a positive test back for X.  Do I really have X? 

We have to recognize that there is some uncertainty here.  Peter Norvig:
suppose I have a test machine that determines whether the subject is a flying leprechaun from Mars. I'm told the test is 99% accurate. I put person after person into the machine and the result is always negative (correctly). Finally one day, I put someone (say, Tom Hanks) into the machine and the light comes on that says "Flying Leprechaun!" Would you believe the machine? Of course not: that would be ridiculous, so we conclude that we just happened to hit the 1% where the test errs. 
We find it easy to completely reject a test result when it predicts something impossible (even if the test is very accurate); now we have to train ourselves to almost completely reject a test result when it predicts something almost completely impossible (even if the test is very accurate).
I used the following example for my students:  a person walks into the doctor's office with a very bad sore throat and a rapid Strep test (for Streptococcus pyogenes) is administered.  A positive result is obtained.  What is the probability that the patient really has a strep throat?

Let's construct a table:
                        strep    No strep   total
positive test                                      pos tests
negative test                                      neg tests
total w/ or w/o strep

The question we are asking is equivalent to the following:  take the number of people with a positive test that have Strep (that's the first number in row 1) and divide by the total number of people with a positive test (the third number in row 1).

Well, how do you do that?

Sensitivity is an umpire who sees a real strike and calls "strike."  It is the proportion of people who actually have the disease that also have a positive test result.  For rapid Strep tests, this number is about 90-95%, let's call it 90%.  So out of every 100 people who have Strep, we write

                        strep    No strep   total
positive test              90                       pos tests
negative test              10                       neg tests
total w/ or w/o strep     100                           

Another (compound) term for those 10 people who have Strep but got a negative test is false negatives.

Specificity is an umpire who sees a pitch that is actually a ball, and calls "ball".  It is the proportion of people who do not have the disease that have a negative test result.  For rapid Strep tests, this number is claimed to be about 98%.  So out of every 100 people who do not have Strep, 98 test negative and 2 test positive.
                        strep    No strep   total
positive test              90          2            pos tests
negative test              10         98            neg tests
total w/ or w/o strep     100        100                 

Those 2 people who do not have Strep but got a positive test are false positives.

Now, to get the total number of positive tests can I just add the two values in the first row?  No!!!

The reason is that the totals on the bottom need to be adjusted.  In our table so far, we have equal numbers of people that have Strep or don't have it.  

But in the real world, the actual proportion of people who walk in complaining of sore throat that have Strep is about 10%.  This is called the prevalence (or incidence if you want to be picky).  It is determined by doing further tests (streaking a throat swab onto blood agar) and other things such as PCR.

To fix this, we scale all the values in column 2 so that the total on the bottom is 900.  Then the prevalence will be right.  We do that by multiplying each value by 9.

                        strep    No strep   total
positive test              90          18     108   pos tests
negative test              10         882     892   neg tests
total w/ or w/o strep     100         900    1000         

So the probability that you actually have Strep, given a positive test, is about 90/108 = 83%.

Now, you may think this is no big deal.  83% is still a lot, even if it's not 90 or 98%.  But this is a very good test, and the prevalence is reasonably high.

Suppose we have a different example with the same sensitivity and specificity but the prevalence is 1%.

                      disease  No disease   total
positive test               9          20      29   pos tests
negative test               1         970     971   neg tests
total w/ or w/o disease    10         990    1000         

That's not so good.  Now only about one-third (9/29 = 31%) of people who receive a positive test result actually have the disease, even with a very good test.
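To compare the two examples side by side, here is a minimal Python sketch.  The function name and its arguments are my own, not from any library.  It works with fractions of the population rather than whole people, so the 1% case comes out slightly different from 9/29, because the table rounds 19.8 false positives up to 20.

```python
def prob_disease_given_positive(prevalence, sensitivity, specificity):
    """Fraction of positive tests that come from people who have the disease."""
    # Fraction of the whole population that has the disease AND tests positive.
    true_pos = prevalence * sensitivity
    # Fraction of the whole population that is disease-free AND tests positive.
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Same test (90% sensitivity, 98% specificity), two different prevalences.
for prevalence in (0.10, 0.01):
    p = prob_disease_given_positive(prevalence, 0.90, 0.98)
    print(f"prevalence {prevalence:.0%}: P(disease | positive test) = {p:.1%}")
```

The only thing that changes between the two lines of output is the prevalence; the test itself is identical.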

Sensitivity is about the ump calling strikes correctly. Specificity is about the ump calling balls correctly. Prevalence is how often the pitcher throws a strike.

Prevalence has a big influence and this is often forgotten. Since the prevalence of people who have had COVID is extremely low (first written early April, 2020), even sensitivities and specificities in the high 90s will be problematic for population-level serology testing. I leave making the table for that as an exercise...

What's the connection to Bayes?  Bayes' theorem leads to a system for probability which we can think of as starting with some prior probability for a particular statement and then updating it as evidence becomes available.

In this case our prior for the hypothesis that a patient has the disease is the prevalence.  When updated by the test result we get the final (posterior) probability.
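Written out for the Strep example, the update looks like this.  The symbols are my own shorthand: D means "has the disease" and + means "positive test", so P(+ | D) is the sensitivity, P(+ | not D) is one minus the specificity, and P(D) is the prevalence.

```latex
P(D \mid +)
  = \frac{P(+ \mid D)\,P(D)}
         {P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)}
  = \frac{0.90 \times 0.10}{0.90 \times 0.10 + 0.02 \times 0.90}
  = \frac{0.09}{0.108}
  \approx 0.83
```

The numerator and denominator are just the first row of the table (90 and 108) divided by the population of 1000.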

[ To do the math more easily, start with the population and first calculate disease/no disease using the prevalence.  Then use the sensitivity and specificity to get the other numbers. ]
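The recipe in brackets can be sketched in a few lines of Python.  This is an illustration only: the function name, the dictionary keys, and the choice to round to whole people are my own assumptions, made so that the output matches the tables above.

```python
def build_table(population, prevalence, sensitivity, specificity):
    """Fill in the four cells of the table as whole-person counts."""
    # Step 1: split the population into disease / no disease using the prevalence.
    diseased = round(population * prevalence)
    healthy = population - diseased
    # Step 2: sensitivity splits the disease column into positive / negative tests.
    true_pos = round(diseased * sensitivity)
    false_neg = diseased - true_pos
    # Step 3: specificity splits the no-disease column the same way.
    true_neg = round(healthy * specificity)
    false_pos = healthy - true_neg
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
    }


# The Strep example: 1000 patients, 10% prevalence, 90% sensitivity, 98% specificity.
table = build_table(1000, 0.10, 0.90, 0.98)
pos_tests = table["true_pos"] + table["false_pos"]
print(table)
print(f"P(strep | positive test) = {table['true_pos']}/{pos_tests}"
      f" = {table['true_pos'] / pos_tests:.0%}")
```

Calling build_table(1000, 0.01, 0.90, 0.98) gives the cells of the 1% prevalence table in the same way.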