The quantitatively inclined will no doubt recognize my reference to the recent book by Nate Silver about the potential and perils of prediction. While not exactly a reference for high-level predictive techniques in statistics, the book was a good introduction for the general reader from a bright guy who is best known for revealing the uselessness of political pundits during recent election cycles.
And accurate prediction is at the heart of the scientific method; it’s what sets that method apart from other ways of knowing about the world. From the movement of the stars to the constituents of atoms, the true test of any scientific hypothesis is not the elegance of its theory (though that is typically held in high regard as well) but its ability to make concrete (typically quantitative) and accurate predictions about events that have either not been observed or not yet happened at all.
But to paraphrase either Niels Bohr or Yogi Berra (or someone completely different), ‘prediction is difficult, especially about the future.’ No less so in neuroscience, with its famously squishy subject matter. Whether you stick an electrode into a neuron and measure its membrane potential or image the combined activity of billions of neurons (and glia, by the way) with an fMRI scanner, there is a lot of variability in the response that seems to persist no matter how meticulously you control the inputs to the system. The typical way of dealing with this problem is to do your experiments over and over again with the expectation that the “noise” in the system (whatever its source) will eventually average out. So, you present a stimulus to a single cell or a network or a whole brain, measure the result, and maybe on that experimental trial the response is a little stronger. You repeat the stimulus. On the next trial, despite keeping everything as precisely identical as you can, the response is a little weaker. Rinse and repeat.
After a while you’ll have enough examples of the response that you can average all these together and expect the ups and downs not associated with your stimulus to balance each other (overall). It’s exactly the same principle as taking the average of all the scores on a given test for all of the students in a class. You expect that the average will tell you something about the performance of the class as a whole independent of the individual background and daily drama of the particular students within the class.
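For the quantitatively inclined, here is a minimal sketch of that averaging logic in Python. Everything in it is an assumption made for illustration: the function names, the "true" response of 1.0, and especially the choice of independent Gaussian noise, which real neurons are under no obligation to respect.

```python
import random
import statistics

def run_experiment(n_trials, true_response=1.0, noise_sd=0.5, rng=None):
    """Average n_trials noisy measurements of the same underlying response."""
    rng = rng or random.Random()
    # Each trial is the stimulus-driven response plus independent Gaussian
    # noise (an assumed noise model, chosen only for illustration).
    trials = [true_response + rng.gauss(0.0, noise_sd) for _ in range(n_trials)]
    return statistics.mean(trials)

def spread_of_averages(n_trials, n_experiments=200, seed=0):
    """Repeat the whole experiment many times and report how much the averages vary."""
    rng = random.Random(seed)
    averages = [run_experiment(n_trials, rng=rng) for _ in range(n_experiments)]
    return statistics.stdev(averages)

if __name__ == "__main__":
    for n in (1, 4, 16, 64, 256):
        print(f"{n:4d} trials per average -> spread {spread_of_averages(n):.3f}")
```

Under these assumptions the spread of the averages shrinks roughly as one over the square root of the number of trials, so quadrupling the trials only halves the wobble. The sketch also makes the hidden premise visible: averaging only helps if the ups and downs really are unrelated to the stimulus and to each other.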
This leads to one of the most important issues with separating the signal from the noise. The difference between the two depends mostly on what information you want to extract. It’s like being at a party and trying to watch something on TV. For you, all that chit-chat is noise, a distraction from what you are interested in, while for someone else at the party that damn TV is interfering with her efforts to follow a conversation. Given a set of data about student grades, a teacher may be interested in the variability that relates to teaching methods, while a demographer might be interested in differences associated with socio-economic status and a policy-maker might be concerned with how differences in funding between schools are reflected in achievement. (Needless to say, any of these people would likely have at least some interest in the other sources of variability as well.)
Still, there are some examples of noise that are not just “shit that doesn’t interest me.” Some of it is “shit I just can’t get my head around.” Imagine a pair of dice, for example. At the macro, everyday, craps table level, they are pretty much unpredictable (random), meaning that all the variability in each throw is unexplained (really no signal there at all, unless you believe you have a “system”). Still, you can imagine that if you had enough information about the mass, precise shape, and molecular composition of the dice (and table), and enough control over the throw, you could at least in principle predict the outcome.
Nonetheless, at the micro (or rather nano, or atto) level, sometimes it’s not even possible in principle to make fully accurate predictions. Quantum theory argues that the very small bits that make up our universe don’t behave in that nice Newtonian billiard ball regime we are so used to. The counter-intuitiveness of that fundamental, intrinsic, elephants-all-the-way-down randomness famously led Einstein to protest that “God doesn’t play dice with the world.” In other words, he thought the indeterminacy of quantum physics reflected “shit I just can’t get my head around” rather than true randomness.
There is one other source of unpredictability: chaos. Chaotic behavior is a feature of some systems that, despite being essentially deterministic, are fundamentally unpredictable, except over very short time horizons. Without going too far into the details, the important point is that the unpredictability of chaotic systems comes not from intrinsic randomness, but from the fact that they can produce wildly erratic behavior from the most infinitesimal differences in starting points.
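To make that concrete without going too far into the details, here is a small Python sketch using the logistic map, a standard textbook example of chaos (my choice of illustration, not a model of any neuron). The function names, the starting value of 0.2, and the one-in-ten-billion nudge are all arbitrary assumptions.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map; r = 4 puts it in its chaotic regime."""
    return r * x * (1.0 - x)

def trajectory(x0, n_steps, r=4.0):
    """Iterate the map from x0 and return every value visited."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(logistic_map(xs[-1], r))
    return xs

def divergence(x0=0.2, nudge=1e-10, n_steps=60):
    """Gap, step by step, between two runs whose starting points barely differ."""
    a = trajectory(x0, n_steps)
    b = trajectory(x0 + nudge, n_steps)
    return [abs(p - q) for p, q in zip(a, b)]

if __name__ == "__main__":
    gaps = divergence()
    for step in range(0, 61, 10):
        print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

There is not a single random number in that code. The rule is completely deterministic, yet the two runs stay close for the first few steps and then end up bearing no resemblance to each other, which is the short-horizon predictability described above.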
Coming back to neuroscience, it turns out that the sources of “noise” in the nervous system can be quite controversial (and with important consequences for computational theories). As I said above, variability between trials using the same stimulus, between different neurons, and between different brains, subjects, or days of the week is all vexingly real in experimental neuroscience. Nonetheless, in many experiments it remains maddeningly unclear whether the variability comes from intrinsic randomness percolating up from the nano-scale fluctuations of individual molecules, from the vast number of unmeasured and uncontrolled variables in any network, or from more strictly defined chaotic dynamics. Kind of like elections. At least we don’t have to worry about the variability caused by the Koch brothers.