Wednesday, September 19, 2007

Responsible "Science"? (And Other Interesting Links)

Try out this madlib:

A (Prestigious Academic Institution's Name) University study found that (food item of your choice) is associated with a greater/lesser risk of (disease of your choice).

How often do you hear such studies cited in the popular press? For most of you, the answer will be "quite often." Now, how often are the results of these studies quickly overturned or discredited? Again, quite often.

The New York Times Magazine recently put out an interesting article about whether or not we really know anything about the determinants of health and disease, and about the research that generates the often well-publicized variants of the above madlib (thanks to Brian and Erika for the link). I'm really glad this article was written, because it basically validates this long-standing rant of mine: the vast majority of these oft-quoted medical studies suck.

Why do they suck? Mainly because the researchers who produce them have no idea how to do empirical research and make all sorts of obvious errors. I call this entire branch of pseudo-science "armchair epidemiology" - and the whole enterprise is as dangerous as that phrase is innocuous.

First, here is why I think these studies are of such poor quality:

1) Researchers who put out such studies have no concept of uncertainty and chance: The big thing in statistics is to find results with a p-value of less than 0.05. To abuse the language a bit, this means that we are more than 95% "certain" that our result is not due to random chance.

Here's the rub: if you test 20 coefficients in a regression, chances are pretty good that at least one of them will be significant at the 5% level. If the tests were independent and there were no real effects at all, the chance of at least one "significant" finding is 1 - 0.95^20, or roughly 64%. (See the short simulation at the end of this point.)

To put it differently, you can test a bunch of different variables and find some "significant" result about 5% of the time per test, even when there is no real association between the variables and no ex ante theoretical reason to believe they would be correlated in the real world. For example, you may find that wearing shoes is associated with pancreatic cancer. Is there any theoretical reason to believe that these two things should be correlated? (No!)

Unfortunately, many researchers will take such findings and publish them without checking whether they pass the "huh?!" test. And the public will listen and stop wearing shoes.
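To make the multiple-testing point concrete, here is a minimal sketch (my own illustration, not from any of the studies above; the 20-predictor setup and sample sizes are arbitrary) that simulates "studies" where the outcome has no real relationship to any of the 20 exposures, and counts how often at least one exposure still comes out significant at the 5% level.

```python
# Minimal sketch: how often does pure noise look "significant"?
# Assumes numpy and scipy are available; all numbers here are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects, n_predictors, n_studies = 200, 20, 2000

false_positive_studies = 0
for _ in range(n_studies):
    x = rng.normal(size=(n_subjects, n_predictors))  # 20 unrelated "exposures"
    y = rng.normal(size=n_subjects)                  # outcome with no real association
    # Test each exposure against the outcome (simple correlation test here)
    p_values = [stats.pearsonr(x[:, j], y)[1] for j in range(n_predictors)]
    if min(p_values) < 0.05:
        false_positive_studies += 1

print(f"Share of 'studies' with at least one significant finding: "
      f"{false_positive_studies / n_studies:.2f}")  # roughly 0.64
```

The printed share hovers around 64%, which matches the 1 - 0.95^20 back-of-the-envelope figure: run enough tests and something will look publishable by chance alone.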

2) Researchers engage in heavy data mining without correcting for this in their statistical procedures: Raise your hand if you think this is good statistics: take a data set and keep running tons of regressions until you find statistically significant results. Present the best results.

This is called data mining and I'm not a big fan of it. If you use the data to inform further tests, you have to do a correction in order to obtain the true p-value. Think about it: if you beat the stuffing out of your data (i.e., run lots and lots of regressions), by random chance you are more likely to "find stuff" in the dataset. It makes sense that you should correct for this statistically.

If you are going to data mine (i.e., your statistical investigation is not hypothesis- or theory-driven), the correct procedure is to run all those regressions, obtain the results, and then either correct your p-values (where the correction takes the mining of the data into account) or try to replicate the results in another data set. The latter is called out-of-sample testing, and most people don't do it. (A rough sketch of both fixes follows below.)
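Here is a rough sketch of what I mean by both fixes (again my own illustration, not anyone's published procedure; the Bonferroni-style correction and the 50/50 data split are just one simple way to do it): mine one half of the data, multiply the p-values by the number of tests before declaring anything significant, and re-test whatever "wins" on the held-out half.

```python
# Sketch of two fixes for data mining (illustrative thresholds and split):
# (1) Bonferroni-correct the p-values from the mined regressions.
# (2) Re-test whatever survives on a held-out sample (out-of-sample testing).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, k = 400, 20
x = rng.normal(size=(n, k))
y = rng.normal(size=n)  # no true association anywhere, by construction

# Split the data: mine on the first half, validate on the second half.
x_mine, x_test = x[: n // 2], x[n // 2 :]
y_mine, y_test = y[: n // 2], y[n // 2 :]

raw_p = np.array([stats.pearsonr(x_mine[:, j], y_mine)[1] for j in range(k)])
bonferroni_p = np.minimum(raw_p * k, 1.0)  # crude correction for running k tests

mined_hits = np.where(raw_p < 0.05)[0]          # looks "significant" in the mined half
corrected_hits = np.where(bonferroni_p < 0.05)[0]

# Out-of-sample check: do the mined hits replicate on the held-out half?
replicated = [j for j in mined_hits
              if stats.pearsonr(x_test[:, j], y_test)[1] < 0.05]

print("Hits before correction:", mined_hits.tolist())
print("Hits after Bonferroni:", corrected_hits.tolist())
print("Hits that replicate out of sample:", replicated)
```

Since the data contain no real associations, the uncorrected pass will occasionally flag a "hit," while the corrected p-values and the out-of-sample check almost always (and rightly) come up empty.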

And I haven't even mentioned (3) the whole "correlation doesn't imply causation" bit.

Fair enough, you might say, but why is this dangerous? Who cares? Here are my answers to that:

1) People care about their health, more so than most other things. And because of that, people (through the news, through the papers, whatever) want to know about the latest research findings. When they do find out, they will listen. In this context, the wrong information can be dangerous (the NYT article linked above has a particularly salient example of this).

2) Bad epidemiology of the "armchair" variety crowds out good epidemiology, and there is a sizable amount of the latter. People will ultimately get tired of the bad science, conclude that medical studies are typically bad because the results flip back and forth, and give up on public health research. When all is said and done, the public might not have patience even for the best of studies.

Final question: if such studies are so obviously bad, why aren't they weeded out (in favor of better studies)? Again, here are my views:

1) It's not obvious to most people that these are bad studies. I think the general public has a pretty poor understanding of basic statistics. This is why every high school/college kid should be made to take a stats course. (And they should take an intro microecon course while they are at it.)

2) The incentives in the medical research market encourage bad studies. After all, it's publish or perish at most medical schools. Quantity often carries the day, and each researcher is incentivized to publish as much as possible, either by turning one study into three smaller papers or by putting out trash like "shoes cause cancer."

A related point: there is more data, more computing power, and more easy-to-use stats packages out there than ever before. While this reduces the cost of doing both good and bad studies, good studies still require a lot more time and care, the very things that go by the wayside when incentives to publish favor quantity.

3) People want to learn about health-related things. This (plus point 1) creates a huge public demand for such work.

4) Peer reviewed medical journals do not weed out these studies. Rather, they actually encourage trash research by publishing it. At times, some of the top medical journals are guilty of this as well.

I think this happens because a) many reviewers don't know good statistics from bad statistics, b) it pays to publish freaky results, and c) accepting other people's trash makes room in the literature for one's own trash.

Now that my rant is over, I'll direct you to some other interesting links:

1) Steve Levitt at the Freakonomics blog has a great post about the Iraq War surge and how good econometrics and data can inform us about the impact of various policies. Recommended reading.

2) Santosh Anagol of BMB fame has started a new series on how "non-economics people misunderstand basic micro-economics." It's great stuff for those with and without an economics background: either way, you'll learn something.

3) Like Seinfeld? You'll love this clip. Trust me: it's worth watching. (Credit to Xiaobo for the link.)
