Anyone who's gone to med school knows about Bayes' theorem. Medical students learn how it helps us interpret laboratory test results. Less well appreciated is how Bayesian methods, in an analogous way, can be applied to the interpretation of research findings. In this series of posts I will try to explain why Bayesian methods are superior to the frequentist statistical methods currently used to interpret clinical trials, and give examples of appropriate and inappropriate uses of Bayesian analysis.
First a little background. If you've forgotten a lot of the biostatistics you once learned, as I have, the math behind Bayes theorem can be pretty stiff. But here are some resources for those who want to drill down:
Wikipedia article here.
Science Based Medicine posts by Kimball Atwood here, here and here.
Companion pieces in Annals of Internal Medicine here and here.
JACC article here.
To appreciate this in the proper depth, read and understand these resources. (For me that will mean several re-reads and referring to some background material.)
For today's purposes, though, here's a primer on Bayes' theorem for poets, surgeons and the rest of us. The non-quantitative aspects of Bayes' theorem are easy to understand because they conform to how clinicians think intuitively. Simply put, Bayesian analysis seeks to interpret new information in light of what was known before. A judge, for example, when sentencing an offender, takes into account prior offenses.
Let's start with the more familiar area of diagnostic testing and consider Bayesian analysis of ECG stress tests. Simple ECG stress testing is not commonly used today, but it's a well studied and easy to understand example for our discussion. Bayesian analysis of treadmill stress test results was described in the classic paper by Diamond and Forrester. Suppose you choose 1.5 mm of ST segment depression as your threshold for a positive result. The specificity, determined from studies comparing treadmill testing to the gold standard of coronary angiography, has been estimated at around 80%. Now here's where people trip up. It's common to confuse specificity with positive predictive value. The usual misperception is that specificity is the probability that a patient with a positive test has the disease. That's what positive predictive value is, but specificity tells us nothing of the sort. Specificity assumes the patient does not have the disease and tells us the probability of a correctly negative test (the true negative rate). The specificity, which is solely a test characteristic, may be very high. The positive predictive value of a test with high specificity, on the other hand, may be high or low depending on patient characteristics that were known before (or independently of) the performance of the test (remember that Bayesian analysis always takes into account what was known before!).
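To see why an 80% specificity does not mean an 80% chance of disease after a positive test, it helps to count patients in a hypothetical 2x2 table. The numbers below are invented for illustration (they are not Diamond and Forrester's data), and the 70% sensitivity is an assumed value; only the 80% specificity comes from the discussion above.

```python
# Hypothetical population of 1,000 patients with 10% pretest prevalence.
diseased, healthy = 100, 900
sensitivity, specificity = 0.70, 0.80  # sensitivity assumed for illustration

true_pos = diseased * sensitivity        # 70 diseased patients test positive
false_pos = healthy * (1 - specificity)  # 180 healthy patients also test positive
true_neg = healthy * specificity         # 720 healthy patients test negative

# Specificity: true negatives among the healthy (fixed by the test itself).
print(f"specificity = {true_neg / healthy:.0%}")   # 80% by construction

# PPV: diseased patients among all who tested positive.
ppv = true_pos / (true_pos + false_pos)
print(f"PPV         = {ppv:.0%}")                  # only 28% in this population
```

Even with the test's specificity fixed at 80%, most positive results in this low-prevalence group are false positives, which is exactly the specificity-versus-PPV confusion described above.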
Data from Diamond's article and other sources illustrate how patient characteristics influence positive predictive value (often referred to as the post-test probability of disease) at a fixed specificity of 80%. For a 50 year old male, for example, with atypical chest pain (some features of angina but not classic) who has a positive test at 80% specificity, the positive predictive value is around 90%. In other words he has a 90% probability of disease. A 35 year old woman with atypical chest pain and that same test result, on the other hand, would have a positive predictive value of 20%. In both cases the specificity of the test is 80%. (If those numbers seem low, remember that these data were collected in the 1970s and preceded our current epidemic of obesity and type 2 diabetes. Application of this analysis today might require additional patient characteristics beyond age, gender and chest pain pattern. Nevertheless it suits the purposes of our discussion.) Using this same model, consider a 35 year old woman whose chest pain is right inframammary, localized to an area the size of a quarter, and pleuritic (non-anginal chest pain). Again assuming 80% specificity and that same positive test result, her positive predictive value (post-test probability of disease) would be next to zero. For that patient a diagnosis of “chest pain due to coronary artery disease” would be an extraordinary claim indeed. It would take much stronger evidence than a positive treadmill test, even with a specificity of 80%, to diagnose CAD in that patient. In fact, nothing short of the gold standard of a positive coronary arteriogram would be convincing. That brings us to a corollary in our primer on Bayesian analysis for poets, surgeons and the rest of us: extraordinary claims require extraordinary proof!
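The three patients above can be run through Bayes' theorem directly. This is a minimal sketch, not the published Diamond-Forrester calculation: the 80% specificity comes from the discussion above, but the 70% sensitivity and the pretest probabilities assigned to each patient are hypothetical values chosen only to show how the same positive test yields wildly different post-test probabilities.

```python
def post_test_probability(pretest, sensitivity, specificity):
    """P(disease | positive test) by Bayes' theorem."""
    true_pos = pretest * sensitivity            # diseased AND positive
    false_pos = (1 - pretest) * (1 - specificity)  # healthy AND positive
    return true_pos / (true_pos + false_pos)

# Pretest probabilities below are illustrative assumptions, not Diamond's data.
patients = [
    ("50 y/o man, atypical chest pain",    0.70),
    ("35 y/o woman, atypical chest pain",  0.05),
    ("35 y/o woman, non-anginal pain",     0.005),
]

for label, pretest in patients:
    p = post_test_probability(pretest, sensitivity=0.70, specificity=0.80)
    print(f"{label}: pretest {pretest:.1%} -> post-test {p:.1%}")
```

With these assumed inputs the post-test probabilities come out near 89%, 16% and 2% respectively, mirroring the pattern in the text: the identical positive test is nearly diagnostic in the high-pretest patient and close to meaningless in the low-pretest one.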
In a subsequent post I'll illustrate how that little maxim applies to clinical research (hint: can you say woo?).