The Null Hypothesis
04/08/15 10:10
Lately I’ve been thinking a lot about the null hypothesis. That is, in effect, what we call it when we think we didn’t learn anything new or exciting from the work we have done. Of course that isn’t true: it can’t all be Science and Nature, but we have to sometimes explore the alleys and dead ends to figure out the complete landscape of a problem.
I’m working on a project like this now. Early in the summer, I was pretty excited by it. The initial data suggested a big answer was forthcoming, if we just had a larger sample size. Now, with a larger sample size, the data just suggest an answer. Nothing big, but an answer.
The trick for many of us scientists is that it is really easy to start projects, to get excited about something new, something that will bring respect and funding and inflate our egos. It is really difficult to finish projects - they aren’t done until they are communicated, published, available to be discussed. Even if there isn’t that much to discuss! This is what I was sort of getting at in my last post: there are a lot of projects lurking on hard drives and GitHub and Dropbox in my domain that are about 90% done. The last push can be really difficult when it isn’t exciting anymore.
And yet, if we don’t get it out there.... if we don’t write it down, all the steps, all the ideas that were generated from the data, all the analyses.... all of it never. happened. It never happened as far as anybody outside the lab can tell. They might be interested in trying the same thing, and if they can’t read about what we have done - they will perhaps run into the same dead ends or less exciting data.
You may have heard of “hashing”. It is a run, usually only 5k or so, where the participants don’t know the route and are trying to catch the “hares”, two runners leaving a trail with flour or other temporary markers. There is usually a beer stop. And there are numerous places where the way forward is ambiguous, where the path may lead in any direction. What happens in a hash is that the fast runners tend to get to these points first. They split up and check out the different paths until they figure out which are dead trails and which one is the correct one to follow. ON-ON! they shout, and sometimes they arrange sticks or scuff an arrow in the dirt so the rest of the runners can follow the best path when they reach that intersection.
So, we hit a few dead ends. We support a few null hypotheses. I seem to be good at that these days, and that’s OK. We’re all heading toward the same understanding in the end.