When one variable (in this case, the number of comments) changes in a particular way when another variable (in this case, the length of the blog post) changes, we call this a correlation. The correlation between two variables can be measured with a simple statistical calculation. If the correlation were perfect (in our case, -1.00), it would mean that as the length of the blog post increased, we would see a perfectly consistent, straight-line decrease in the number of comments. (Our example is a negative, or inverse, correlation because we are suggesting that as one variable increases the other decreases.) Hardly ever do things change so precisely in real life, so in most cases the correlation will fall well short of a perfect 1.00 or -1.00. The correlation statistic can be anything from 1.00 to -1.00, and the closer it is to zero, the weaker the relationship between the variables, in our case length and number of comments.
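To make the statistic a little more concrete, here is a minimal sketch of how a correlation like this might be computed, assuming the usual Pearson correlation coefficient is the test in question. The function name `pearson_r` and all of the numbers are made up for illustration; they are not data from this blog.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables move together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return co_variation / math.sqrt(spread_x * spread_y)

# Hypothetical numbers, invented purely for illustration.
lengths = [100, 200, 300, 400, 500]          # words per post
comments_perfect_inverse = [10, 8, 6, 4, 2]  # drops by exactly 2 per 100 words
comments_noisy = [9, 10, 5, 6, 2]            # same general trend, less tidy

r_perfect = pearson_r(lengths, comments_perfect_inverse)
r_noisy = pearson_r(lengths, comments_noisy)
print(r_perfect, r_noisy)
```

With the tidy made-up numbers the result sits at the -1.00 end of the scale, while the noisier set lands somewhere between -1.00 and 0, which is the kind of thing we would expect from real life.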
So, let's try it. Here's how I'm going about it. First, I need a random sample. The best approach would be to pick blog posts at random from a number of different sources and tabulate the length and the number of comments for each one. For this little pretend project, I'm just going to take a few examples from this blog, tabulate their lengths and comment counts, and see what the correlation is. Yes, I understand that my sample is not truly random, but this is a demonstration, so bear with me.
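For anyone who wants to do the sampling step by machine rather than by hand, here is a rough sketch, assuming the posts are already collected in a list. The `posts` list, its field names, and the `tabulate_sample` helper are all hypothetical placeholders, and post length is measured in words, which is just one possible choice.

```python
import random

# Hypothetical stand-in for posts gathered from several blogs.
posts = [
    {"title": "Post A", "text": "one two three four five", "comments": 3},
    {"title": "Post B", "text": "one two three", "comments": 7},
    {"title": "Post C", "text": "one two three four five six seven", "comments": 1},
    {"title": "Post D", "text": "one two", "comments": 4},
    {"title": "Post E", "text": "one two three four", "comments": 5},
]

def tabulate_sample(posts, sample_size, seed=None):
    """Pick posts at random and return (word_count, comment_count) pairs."""
    rng = random.Random(seed)  # a fixed seed makes the pick repeatable
    sample = rng.sample(posts, sample_size)  # sampling without replacement
    return [(len(p["text"].split()), p["comments"]) for p in sample]

rows = tabulate_sample(posts, sample_size=3, seed=1)
for word_count, comment_count in rows:
    print(word_count, comment_count)
```

Each row in the output is one post: its length, then its number of comments. Those two columns are what the correlation gets calculated from.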
(graphics from iannoon.wordpress.com)










