Correlation and causation explained

Author: Priyantha Wijayatunga

Photo by Victorgrigas courtesy of wikimedia

The word correlation usually means how two quantities vary together. Perhaps because of the extensive use of Pearson’s correlation coefficient, however, it is often associated with a linear relationship between the two. Even many scientists misinterpret zero correlation as independence of the two quantities, forgetting what it really means: there is no linear association between them.
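The point about zero correlation can be illustrated with a small sketch. The numbers below are made up for illustration only, and `pearson` is a hypothetical helper written here, not a function from any library: Y is completely determined by X, yet the Pearson coefficient is zero because the dependence is not linear.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Illustrative data (an assumption, not from the article):
# x is symmetric around zero and y is an exact function of x.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]

r = pearson(x, y)
print(r)  # zero: no linear association, although y depends entirely on x
```

Zero correlation here says only that no straight line describes the relationship, not that the two quantities are independent.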

Furthermore, many confuse correlation with causation, i.e., they interpret a non-zero correlation as implying a causal relationship. There are many examples of this even in the scientific literature. Our aim here is not to discuss all these misinterpretations, but to look at another property of correlation that is also related to causation in some sense.

In our day-to-day life, causation is transitive: if one thing, say A, causes another, say B, and B in turn causes a third, say C, then we are ready to accept that A causes C. This concept is familiar to most of us and is often called indirect causation. Such a relationship is said to have the property of transitivity.

But what about correlation? More precisely, if A and B are positively correlated and B and C are positively correlated, are A and C also positively correlated? Is positive correlation also transitive? For example, suppose the price of stock B increases along with the price of stock A, and the price of stock C increases with the price of stock B. Is it then always the case that the price of stock C increases with the price of stock A?

One may jump to the conclusion that it does. This is not because one is thinking about Pearson’s correlation coefficient, but because one is thinking along the lines of causation. In fact, someone who knows Pearson’s correlation coefficient well may not conclude so. An article by Langford, Schwertman and Owens in The American Statistician takes a rather deep look at the problem. Thinking in terms of causation, one often concludes that A and C are always positively correlated too, i.e., that the price of stock C always increases with that of A. But since causation and correlation are two different things, there is no assurance that a property of causation also holds for correlation.

Let’s get back to that article. The authors use data for all New York Yankees players with at least 300 “at bats” at the end of the 2000 regular baseball season. An “at bat” means a batter facing a pitcher, so each batter listed has faced pitchers at least 300 times. As the authors describe, the variables X, Y and Z represent the numbers of triples, base hits and home runs, where each row corresponds to one player, whose name is also shown. In baseball, a hit is called a base hit, a triple is a hit on which the batter safely reaches third base, and a home run is a score that the team gets when a batter is able to reach home safely in one play. The data are shown in the following table, and graphs are drawn for each pair of variables to show how they vary together; for example, the graph in the middle shows how the number of home runs (Z) changes with the number of base hits (Y). They tend to increase together, but you may notice that the relationship between them is not a sharp one.

It can be found that the Pearson correlation coefficient between X and Y is 0.526 and that between Y and Z is 0.293, but that between X and Z is -0.096, a negative correlation. That is, the first two correlations are positive but the third is not, so positive correlation is not transitive here. But if one guesses that the fourth player is an outlier (his Z value is rather high for his X value compared with the others) and removes him, the correlations become 0.559, 0.321 and 0.067. That is, when the first two correlations are made somewhat more positive, the third also becomes positive, giving the transitivity property.
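The same pattern, two positive correlations and a negative third, is easy to reproduce on a toy data set. The sketch below does not use the Yankees data, which is not reproduced here; the six values of `x` and `z` are invented for illustration, `y` is simply their sum, and `pearson` is a hypothetical helper defined in the block.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)

# Invented illustrative data (an assumption, not the article's table).
x = [1, 2, 3, 4, 5, 6]
z = [4, 5, 2, 6, 1, 3]               # tends to be low when x is high
y = [a + b for a, b in zip(x, z)]    # y shares variation with both x and z

r_xy = pearson(x, y)
r_yz = pearson(y, z)
r_xz = pearson(x, z)

print(f"r_xy = {r_xy:.3f}, r_yz = {r_yz:.3f}, r_xz = {r_xz:.3f}")
print("transitive here?", r_xy > 0 and r_yz > 0 and r_xz > 0)
```

Because `y` is built from both `x` and `z`, it correlates positively with each, while `x` and `z` themselves move in opposite directions.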

So one may think that every such correlation triplet can be made to follow the transitivity property, at least when there are enough data cases to delete suspected outliers so that the first two correlations can be improved. Indeed, transitivity holds when the linear relationships in the first two pairs, namely X and Y, and Y and Z, are adequately strong, i.e., when both are rather high positive correlations. The authors prove a theorem that says $\rho_{xy}^2 + \rho_{yz}^2 + \rho_{xz}^2 \le 1 + 2\rho_{xy}\rho_{yz}\rho_{xz}$, where $\rho_{xy}$ stands for the correlation coefficient between X and Y, and similarly for the others.
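Treating the theorem's inequality as a quadratic in the X–Z correlation gives the range of values that correlation can take once the other two are fixed: the product of the two known correlations, plus or minus the square root of (1 minus the first squared) times (1 minus the second squared). The sketch below computes that range; the function name `third_correlation_bounds` is hypothetical, and the rearrangement is ours rather than a quotation from the paper.

```python
from math import sqrt

def third_correlation_bounds(r_xy, r_yz):
    """Range of the X-Z correlation allowed by the inequality,
    given the X-Y and Y-Z correlations."""
    centre = r_xy * r_yz
    half_width = sqrt((1 - r_xy ** 2) * (1 - r_yz ** 2))
    return centre - half_width, centre + half_width

# The two positive correlations reported for the baseball data.
low, high = third_correlation_bounds(0.526, 0.293)
print(f"allowed X-Z correlation: {low:.3f} to {high:.3f}")

# Two stronger correlations (illustrative values, an assumption).
strong_low, strong_high = third_correlation_bounds(0.8, 0.8)
print(f"allowed X-Z correlation: {strong_low:.3f} to {strong_high:.3f}")
```

For 0.526 and 0.293 the allowed range runs from roughly -0.66 to 0.97, so the observed -0.096 is perfectly possible. When both known correlations are positive, the lower end of the range is above zero exactly when their squares sum to more than 1, which is the sense in which "adequately strong" correlations force the third to be positive.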

Note that a correlation coefficient between two quantities measures how close the data points on the two quantities are to a hypothesized linear relationship between them. When you have three quantities and two of the correlations are positive but rather weak, meaning that the data points are not “that” close to their hypothesized linear relationships, the third correlation can be negative.



Ashish Soni

This is really a nice article. I had many of the 'default' assumptions that you have mentioned in this article. But I am happy to read your articles and correct my misconceptions.

Thanks for the article.

Ashish Soni
...your statistical partner...



Of course, the difference between a positive and a negative correlation is not one of "causality" but one of scale, since scale (origin, direction and length) is arbitrary. There are, for instance, two measures of size: hugeness (a number gets bigger as the size gets bigger) or smallness (a number gets bigger as the size gets smaller).

In fact, a correlation is (in one of its many definitions) the cosine of the angle between two vectors. As such, one can play an experiment where we take 2 wires about 8 inches in length, bend them in the middle and tie the 2 bent wires together along one of the "axes". When we twist the 2 wires around the axis, this gives us a sense of the range of correlations possible (the cosine of the angle). The higher the correlation between A and B (the cosine of the angle on one wire), and the higher the correlation between B and C (the cosine of the second angle where B is the common axis), the more restrained is the correlation between A and C.

The sense of what this means in terms of "causality" becomes fuzzier when we throw in error measurement and displacement over time. For example, if event A actually caused event B 8 minutes later (in our thought experiment) but we measured them at the same time, our measure might (or might not) reflect that caused event.

Very interesting topic!!! 




zakaria aden salad hashi

good enough explanation


An economist

It's a little weird for you to claim that all economists want unsustainable growth -- there are whole fields of economics devoted to including the "most important bits." 
