06 February, 2008

Emoto's Emotive Water Crystals

You may have heard about Masaru Emoto before--he's the guy who claims that emotions, when directed at water just prior to freezing, will cause the water molecules to freeze in patterns that are associated with the emotion in question. He is, as you might imagine, completely fucking retarded.
Nevertheless, because the US gov't is also completely fucking insane, it actually funds studies to deal with such issues from time to time (okay, all the time). The one I want to bring to everyone's attention today was sponsored by the National Institutes of Health.
I will quote directly from the abstract:
A group of approximately 2,000 people in Tokyo focused positive intentions toward water samples located inside an electromagnetically shielded room in California. That group was unaware of similar water samples set aside in a different location as controls. Ice crystals formed from both sets of water samples were blindly identified and photographed by an analyst, and the resulting images were blindly assessed for aesthetic appeal by 100 independent judges. Results indicated that crystals from the treated water were given higher scores for aesthetic appeal than those from the control water (P = .001, one-tailed), lending support to the hypothesis.
For those of you who skipped over the primary source quote because of some delusion that secondary sources are better, what this is saying is that they tested whether or not water crystals looked "more aesthetically appealing" after "positive intentions" were directed at the water prior to freezing, and the result was statistically significant.
I bet Emoto peed his pants when he found out the results were in his favor.
But for those of you who think I am showing this because I want to convert you to the Emotive Water Hypothesis point of view, please stop being an idiot. The point, instead, is to explain why the results of even a double-blind experiment are not necessarily conclusive.
Whenever you do an experiment, you get a result. If you do the experiment well, you'll get lots of results, because you'll do things lots of times under exceedingly similar conditions and compare them to control conditions. The idea is that if you do the experiment just once, you may get a result that is out of the ordinary, perhaps because there was some error in how you performed the experiment, or just because you happened to land at the high or low end of the range of results you could have gotten.
By this, what I mean is that if you measure the length of a board only once, you may have accidentally measured incorrectly. Or you may have measured in a specific place which gave the longest possible measure of the board. The only way to really get an accurate measurement is to redo the measuring multiple times. If you get the same result twice in a row, that gives you more confidence in your result. Even better if two separate measurers get the same result. And likely there will be a range of answers--some results will be high, and some low. The 'real' measure is somewhere in between. (For you philosophers out there, the existence of a 'real' measure is actually disputed itself, but that's a topic for a future journal entry.)
Anyway, the hope is that by measuring multiple times, you are more likely to not have all the measures be too high, or too low, or consistently mismeasured. This is why, in the NIH experiment with Emoto's water, you have multiple people on every side of the experiment, all giving results multiple times. They were attempting to make it less likely for all the measures to be consistently incorrect.
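For the code-minded, here is a rough sketch of the board-measuring idea in Python. Everything in it is an assumption for illustration: a 'real' length of 100 units, measurement error that is random and bell-shaped with a spread of 2 units, and 20 measurements per average. It compares how far off a single measurement typically lands with how far off an average of 20 lands.

```python
import random
import statistics

TRUE_LENGTH = 100.0   # hypothetical 'real' board length, in units
NOISE_SD = 2.0        # assumed spread of a single measurement

def measure(rng):
    """One noisy measurement of the board."""
    return rng.gauss(TRUE_LENGTH, NOISE_SD)

def average_of(n, rng):
    """Mean of n independent measurements."""
    return statistics.fmean(measure(rng) for _ in range(n))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Repeat each procedure many times to see how far off it typically lands.
single_errors = [abs(measure(rng) - TRUE_LENGTH) for _ in range(2000)]
averaged_errors = [abs(average_of(20, rng) - TRUE_LENGTH) for _ in range(2000)]

print("typical error, one measurement:", statistics.fmean(single_errors))
print("typical error, average of 20:  ", statistics.fmean(averaged_errors))
```

Under these assumptions the averaged measurement lands much closer to the 'real' length. Note that this only helps with random error. If every measurer uses the same warped ruler, averaging won't save you.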
But even though you have lots of people working together on measuring and remeasuring, there is still the possibility that everyone will, wholly by chance, consistently measure too long a length. We want this chance to be as small as possible, so we say some results are statistically insignificant, even if they give results higher or lower than expected.
For example, let's say the board is 100 units (u) long. If we do the measurement twenty times, then depending on the method of measurement used, we might expect to get results back of 98u, 101u, and maybe even a 105u. 90% of the time, these are the results we would get back from measuring. So if we got these results back, and the hypothesis was that the board was 100u long, we'd say that these results corroborated the hypothesis. But remember I said that these measurements are of the kind you might expect 90% of the time. The other 10% of the time, you might get twenty measurement results of which ALL are 102u and above. This would be a statistically significant difference. If these were the results, we'd say the hypothesis that the board is 100u long is less likely to be true than an alternate hypothesis that said it was 105u long. And we'd say this EVEN IF the 'real' length was just 100u.
Now swap the 90% standard for a 99.9% standard, and the rare results become even more extreme. 0.1% of the time, a multiply repeated experiment might result in concluding that a hypothesis of the board being 125u long is corroborated, even if it is only 100u long. That 0.1% is the same figure as the P = .001 reported in the abstract above. This is a rather extreme example, but you get the idea.
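Here is a minimal simulation of that point, again under made-up assumptions: the board really is 100u, each measurement has a bell-shaped error with a spread of 2u, and an experiment counts as "significant" when the average of its 20 measurements clears a one-tailed cutoff. (That cutoff rule is my stand-in, not the method used in the study.) It counts how often a true hypothesis gets rejected purely by chance at the 90% and 99.9% standards.

```python
import random
from statistics import NormalDist, fmean

TRUE_LENGTH = 100.0  # the board really is 100u long
NOISE_SD = 2.0       # assumed spread of one measurement
N = 20               # measurements per experiment

def false_alarm_rate(alpha, experiments, rng):
    """Fraction of experiments whose mean clears the one-tailed cutoff
    for significance level alpha, even though the board is 100u."""
    z = NormalDist().inv_cdf(1 - alpha)
    cutoff = TRUE_LENGTH + z * NOISE_SD / N ** 0.5
    hits = 0
    for _ in range(experiments):
        mean = fmean(rng.gauss(TRUE_LENGTH, NOISE_SD) for _ in range(N))
        if mean > cutoff:
            hits += 1
    return hits / experiments

rng = random.Random(1)  # fixed seed so the run is repeatable
rate_90 = false_alarm_rate(0.10, 20000, rng)
rate_999 = false_alarm_rate(0.001, 20000, rng)
print("false alarms at the 90% standard:  ", rate_90)
print("false alarms at the 99.9% standard:", rate_999)
```

The first rate should come out near 10% and the second near 0.1%. The stricter standard makes false alarms rarer, but it never makes them impossible.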
The end result is that this NIH study is a lottery winner. It is a true rarity--it gives corroboration to the 125u long hypothesis, even though that hypothesis is wildly incorrect. Over time, if the experiment is repeated again and again, you'd expect the results to get closer to reality. But that would mean the NIH would have to sponsor yet another study on emotive water with American tax dollars.
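To see the lottery-winner effect in miniature, here is one more sketch with the same invented numbers (100u board, 2u measurement spread, 20 measurements per study). It runs 1,000 imaginary studies of the same board, picks out the single luckiest one, and compares it with what you get by pooling all of them.

```python
import random
from statistics import fmean

TRUE_LENGTH = 100.0  # the board really is 100u long
NOISE_SD = 2.0       # assumed spread of one measurement
N = 20               # measurements per study
STUDIES = 1000       # how many times the study is repeated

rng = random.Random(2)  # fixed seed so the run is repeatable

# Each study reports the average of its own 20 measurements.
study_means = [
    fmean(rng.gauss(TRUE_LENGTH, NOISE_SD) for _ in range(N))
    for _ in range(STUDIES)
]

luckiest = max(study_means)  # the one study that would make headlines
pooled = fmean(study_means)  # what all the studies say together

print("luckiest single study:", luckiest)
print("all studies pooled:   ", pooled)
```

The luckiest study overshoots the true length, while the pooled figure sits almost exactly on 100u. If you only ever hear about the luckiest study, you come away with the wrong length.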

Now I'm going to go eat a cold slice of pizza as a reward for actually updating this blog.


  1. This is the first time I've seen you insult your readers! I think it is an engaging technique but makes you less respectable.

  2. I was trying to look up this article (Explore [1550-8307] Radin yr:2006 vol:2 iss:5 pg:408 -11) to see if it really was funded by the NIH, but I wasn't able to. Where did you see that it was NIH sponsored/funded?

  3. To carpo:

    The insult was purely a new way of engaging the reader. In the past, I've repeatedly set up one premise only to switch midstream to another focus, so that part isn't new for me. But you're right in that I've never tried a transition by insulting the reader.

    I think you're right about the less respectable part. But I've been reading a lot of Richard Dawkins lately, especially his responses to the much more tepid atheists like Neil deGrasse Tyson, and I can't help but be enticed by Dawkins's much harsher style. He pushes atheism hard on the reader, while Tyson is always very gentle about it, almost apologizing for his lack of religious beliefs.

    I'm not sure I'll do the whole reader insulting thing again. On rereading, it didn't come out quite the way I wanted it. But Dawkins's harsh technique still captivates me, so expect some add'l experiments along this line later on in some of my entries.

  4. To anonymous:

    To be honest, I didn't check my sources on the whole NIH funding thing. The reason I assumed it was NIH funded was because a friend sent me the link to it on nih.gov, stating that it was funded by the NIH.

    It came up in conversation about Stephen Jay Gould's Streak of Streaks article, where he discusses improbable events like the one that occurred in the above study.

    I apologize for not checking my sources before repeating the info my friends gave me over IM. It was unprofessional of me.