Thursday, April 24, 2014

Testing Between Causal and Spurious Effects

Tom Cruise in the movie Top Gun. The Navy's Top Gun school is now based in Fallon, NV.
In the late 1990s there was a spike in childhood leukemia cases in the town of Fallon NV, the famed home of Top Gun.  What was the cause of the spike?  We still don't know.  

There are three possibilities:
1.  The spike was due to environmental factors such as the arsenic in the drinking water or the exposure to the heavy metal tungsten.
2.  The spike was related to the fact that the town had a large number of Navy personnel or some other set of unknown characteristics of the town's population.
3.  The spike was a statistical fluke.

If we want to determine whether the leukemia spike was due to environmental factors, we can ask whether the relationship between the environment and leukemia is "causal" or "spurious".

Let X represent the environment of Fallon, Y the number of leukemia cases, and U some unobserved cause of both a family's location in Fallon and leukemia.  It could be that X directly determines Y, or that U determines both X and Y.
Causal relationship: X -> Y

Spurious relationship: X <- U -> Y

How could we distinguish between the two possibilities?  In both cases we will see that a family's location choice and its likelihood of having a child with leukemia are correlated.

Judea Pearl argues that we should conduct an experiment.  That is, we should introduce a policy that purposely changes X.  If we move families out of Fallon, or assign them to other locations, and see a reduction in leukemia cases among the families no longer located in Fallon, then we know that Fallon is the cause of the spike.  If we don't see any change in the likelihood that children in these families get leukemia, then we know the relationship between Fallon's environment and leukemia cases is spurious.  That is, we can rule out (1) and conclude that the spike is due to some other cause (2) or a statistical fluke (3).
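To make the logic of the experiment concrete, here is a small simulation sketch.  Everything in it is an assumption made up for illustration: the function name rate_gap, the share of families with the unobserved characteristic U (30%), the way U drives location choice, and the leukemia probabilities (which are far higher than any real rate so the pattern is visible).  It compares the gap in leukemia rates between Fallon and non-Fallon families when families choose their own location and when location is assigned by coin flip.

```python
import random


def rate_gap(model, randomize, n=100_000, seed=0):
    """Leukemia rate among Fallon families minus the rate among other families.

    model:     "causal"   (X -> Y) or "spurious" (X <- U -> Y)
    randomize: True  = location assigned by coin flip (the experiment)
               False = families choose their location (observational data)
    All probabilities below are invented for illustration only.
    """
    rng = random.Random(seed)
    families = [0, 0]  # number of families, indexed by x (0 = elsewhere, 1 = Fallon)
    cases = [0, 0]     # number of leukemia cases, indexed by x
    for _ in range(n):
        u = rng.random() < 0.3  # unobserved characteristic U
        if randomize:
            x = int(rng.random() < 0.5)  # experimenter sets X
        else:
            x = int(rng.random() < (0.8 if u else 0.2))  # U drives location
        if model == "causal":
            p = 0.05 if x else 0.01  # environment X raises risk
        else:
            p = 0.05 if u else 0.01  # only U raises risk
        y = int(rng.random() < p)
        families[x] += 1
        cases[x] += y
    return cases[1] / families[1] - cases[0] / families[0]


for model in ("causal", "spurious"):
    for randomize in (False, True):
        label = "experiment" if randomize else "observational"
        gap = rate_gap(model, randomize)
        print(f"{model:8s} {label:13s} gap in leukemia rate: {gap:+.3f}")
```

Under these assumptions the observational gap is positive in both worlds, so the raw correlation cannot tell them apart.  Once location is randomized, the gap stays positive in the causal world and shrinks toward zero in the spurious one.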

What if there is both a causal relationship and a spurious relationship?  That is, what if something in the environment of Fallon is leading to increases in leukemia, but the magnitude and direction of the effect are being moderated by some unobserved characteristic, such as a family's propensity to be in the Navy?  In this case Pearl's experiment still determines whether there is a directed arrow from X to Y, but we learn nothing about how that relationship is being moderated by U.

If we were able to randomly assign families to Fallon NV, then we could determine whether something in Fallon's environment is increasing the likelihood of a child in the family having leukemia.  What we don't learn is whether there are other factors that either mitigate or amplify the effect of Fallon's environment on the propensity to get childhood leukemia.
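A second sketch illustrates what the randomized experiment leaves hidden.  Again, all names and numbers are hypothetical: here Fallon's environment raises leukemia risk by 0.08 only for families with the unobserved characteristic U (30% of families) and has no effect on anyone else.  The simulation can peek at U; a real experimenter could not.

```python
import random


def randomized_trial(n=200_000, seed=1):
    """Simulate random assignment to Fallon when the effect of X depends on U.

    Returns tallies[(u, x)] = [families, cases].
    All probabilities are invented for illustration only.
    """
    rng = random.Random(seed)
    tallies = {(u, x): [0, 0] for u in (0, 1) for x in (0, 1)}
    for _ in range(n):
        u = int(rng.random() < 0.3)  # unobserved characteristic U
        x = int(rng.random() < 0.5)  # location assigned by coin flip
        p = 0.01 + (0.08 if u else 0.0) * x  # X matters only when U = 1
        y = int(rng.random() < p)
        tallies[(u, x)][0] += 1
        tallies[(u, x)][1] += y
    return tallies


def rate_difference(tallies, groups):
    """Leukemia rate with X=1 minus the rate with X=0, pooled over the given U groups."""
    def rate(x):
        families = sum(tallies[(u, x)][0] for u in groups)
        cases = sum(tallies[(u, x)][1] for u in groups)
        return cases / families
    return rate(1) - rate(0)


tallies = randomized_trial()
average_effect = rate_difference(tallies, (0, 1))  # what the experimenter can compute
effect_with_u = rate_difference(tallies, (1,))     # needs U, which is unobserved
effect_without_u = rate_difference(tallies, (0,))  # needs U, which is unobserved

print(f"average effect of Fallon:    {average_effect:+.3f}")
print(f"effect for families with U:  {effect_with_u:+.3f}")
print(f"effect for families without: {effect_without_u:+.3f}")
```

By construction the true average effect is 0.3 x 0.08 = 0.024, while the true effects in the two subgroups are 0.08 and 0.  The experiment recovers the first number, which establishes the arrow from X to Y, but without observing U it cannot recover the other two.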

Pearl's experiment allows us to determine whether the relationship is causal or spurious.  It does not provide information on the appropriate policy response to the problem.
