## Nothing To Do with Urinalysis

The coin of the realm in experimental science is the p-value (if that’s not news, perhaps this discussion will be more pertinent). To put it perhaps too simply, the p-value tells a researcher how likely it is that results at least as striking as the ones observed would turn up by dumb luck alone, even if the treatment did nothing at all. In other words, it estimates the chance that the findings are just a random fluke.

The way it works is that the scientist makes some measurement, say of blood pressure or crop yield, on a bunch of people or wheat strains or some other sciency stuff. Often the subjects, be they human or agricultural, are assigned to one of two groups. One group receives the special sauce of interest and the other one doesn’t, so the key question is whether the treatment group differs from the control group. Looking at the average values for each group doesn’t quite answer the question, because averages don’t account for the variability in the data. Let’s say the average of the treatment group is higher than the average of the control group. If the averages are pretty close and the data are all over the place, then the chance is pretty good that the difference is really just due to dumb luck; if you did the experiment again, you’d be just about as likely to get the opposite result. Since the p-value derives from both the group averages and the variability, it helps you avoid rash conclusions, so you don’t end up renting a tux before the Nobel committee has even called. The smaller the p-value, the better.

Unfortunately, the p-value only protects you from this kind of mistake if you are careful about how you use it. For example, you can’t run the experiment over and over until you get a p-value you like. That’s called fishing, and it pretty much invalidates any experimental results you have. There is also a bias in science toward publishing research that confirms hypotheses, producing a sort of institutionalized version of p-value fishing.
Technically, this doesn’t call into question the results of individual papers, but when a group of papers is considered as a whole, it means we overestimate the effects. For example, studies of a new medication might suggest that the drug works better than it actually does. In drug and device trials, some combination of publication bias and more nefarious motives has produced a system in which many trials never see the light of day. Given all that is at stake, you’d think these elementary statistical shenanigans would be regulated out of existence, and there has been some progress. Still, a surprising number of clinical studies are never published, and it seems that many of these are the ones that fail to show a positive result for the proposed drug or device. Amazingly, even GlaxoSmithKline (GSK) has joined a campaign to compel more disclosure of clinical trials. Hopefully other producers will follow suit.
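The cost of fishing can be made concrete with a quick simulation. This is a sketch of my own, not from the post: the function names are invented, and a permutation test stands in for the usual t-test so the code needs nothing beyond the standard library. Both groups are drawn from the same distribution, so any “significant” difference is a false positive. An honest single experiment produces one only about 5% of the time, but rerunning the experiment until something sticks produces one far more often.

```python
import random
import statistics

random.seed(42)

def permutation_pvalue(a, b, n_perm=400):
    """Two-sided p-value for a difference in group means, via permutation.

    The p-value is the fraction of random relabelings of the pooled data
    that produce a gap between group means at least as big as the observed
    one -- i.e., how often dumb luck alone would give a gap this big.
    """
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = a + b
    n = len(a)
    hits = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        gap = abs(statistics.fmean(pooled[:n]) - statistics.fmean(pooled[n:]))
        if gap >= observed:
            hits += 1
    return hits / n_perm

def one_experiment(n=20):
    """Treatment and control drawn from the SAME distribution: no real effect."""
    treatment = [random.gauss(0, 1) for _ in range(n)]
    control = [random.gauss(0, 1) for _ in range(n)]
    return permutation_pvalue(treatment, control)

def fishing(max_tries=5, alpha=0.05):
    """Rerun the experiment until a 'significant' p-value appears (or give up)."""
    return any(one_experiment() < alpha for _ in range(max_tries))

runs = 150
honest = sum(one_experiment() < 0.05 for _ in range(runs)) / runs
fished = sum(fishing() for _ in range(runs)) / runs
print(f"false-positive rate, one honest experiment: {honest:.2f}")  # roughly 0.05
print(f"false-positive rate, fishing up to 5 tries: {fished:.2f}")  # roughly 1 - 0.95**5, about 0.23
```

With five tries at the same null experiment, the chance of at least one fluke “discovery” is about 1 − 0.95⁵ ≈ 23%, more than four times the advertised 5% error rate, which is exactly why fishing invalidates the results.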
