Correlation vs. Causation: Differences & Definition
Correlation vs. Causation
Brandy works in a clothing store. As she is restocking shelves, she notices that the sweaters are completely gone. She goes into the inventory area of the store and finds the sweater boxes. In the meantime, she gets a call: another one of her coworkers is calling in sick. That’s the third person this week! As she restocks the sweaters, Brandy has a thought. Are the sweater sales causing her coworkers to become ill? Brandy is faced with a common problem: correlation versus causation.
In this lesson, you will learn about correlation and causation, the differences between the two, and how to tell whether a relationship is a correlation or a causation.
First, correlation and causation both involve an independent variable and a dependent variable. An independent variable is a condition or piece of data in an experiment that can be controlled or changed. A dependent variable is a condition or piece of data in an experiment that is controlled or influenced by an outside factor, most often the independent variable.
If there is a correlation, it is tempting to assume that the dependent variable changes solely because the independent variable changes. This is where the debate between correlation and causation occurs. There is a difference between cause and effect (causation) and relationship (correlation), and the two are easily confused when analyzing data.
Defining Correlation
You probably know that a correlation is a relationship between two variables that can be used to describe or predict information. The emphasis here is on relationship. Sometimes we can use correlation to find causality, but not always. Remember that a correlation can be either positive or negative.
Graph 1 shows a positive correlation, where the dependent and independent variables in a data set increase or decrease together.
This means that there is a positive relationship between the number of sweaters sold at Brandy’s store and how often Brandy’s coworkers get sick.
If the numbers slope downward, like the line in Graph 2, then you have a data set with a negative correlation, where the dependent and independent variables in a data set move in opposite directions.
That means if the independent variable decreases, then the dependent variable increases, and vice versa. In this example, Brandy notices that the more shorts that are sold, the fewer illnesses there are, but the more vacation time her coworkers use.
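The strength and direction of relationships like these are often summarized with Pearson's correlation coefficient, r, which runs from -1 (perfect negative correlation) to +1 (perfect positive correlation). The following is a minimal sketch, not part of Brandy's story: the weekly figures and the names `sweaters_sold`, `shorts_sold`, and `sick_days` are invented purely for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly figures, invented for illustration only.
sweaters_sold = [5, 12, 20, 31, 40, 52]
shorts_sold = [48, 41, 30, 22, 11, 4]
sick_days = [0, 1, 1, 2, 3, 4]

r_sweaters = pearson_r(sweaters_sold, sick_days)
r_shorts = pearson_r(shorts_sold, sick_days)

print(f"sweaters vs. sick days: r = {r_sweaters:.2f}")  # positive, close to +1
print(f"shorts vs. sick days:   r = {r_shorts:.2f}")    # negative, close to -1
```

A value of r near +1 or -1 says only that the two lists move together or in opposite directions. It says nothing about whether one of them causes the other.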
So the question is: do shorts or sweater sales cause illnesses or vacations? You might have guessed that it isn’t the clothing that is causing this change; these things are correlated, but one does not cause the other.
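One common reason two things are correlated without one causing the other is that a third factor drives both. The sketch below assumes, purely for illustration, that cold weather is that third factor for sweater sales and sick days; the lesson itself does not name one, and all numbers are invented. When the weeks are split by weather, the relationship between sweaters and sick days disappears.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: each tuple is (sweaters sold, sick days) for one week.
# The cold/warm split is an assumed third variable, not taken from the lesson.
cold_weeks = [(40, 3), (45, 4), (50, 4), (55, 3)]
warm_weeks = [(5, 0), (10, 1), (15, 1), (20, 0)]

def r_for(weeks):
    """Correlation between sweaters sold and sick days for a list of weeks."""
    sweaters = [w[0] for w in weeks]
    sick = [w[1] for w in weeks]
    return pearson_r(sweaters, sick)

r_all = r_for(cold_weeks + warm_weeks)  # all weeks pooled together
r_cold = r_for(cold_weeks)              # cold weeks only
r_warm = r_for(warm_weeks)              # warm weeks only

print(f"all weeks:  r = {r_all:.2f}")
print(f"cold weeks: r = {r_cold:.2f}")
print(f"warm weeks: r = {r_warm:.2f}")
```

In this made-up data set the pooled correlation is strongly positive, yet within the cold weeks and within the warm weeks there is no relationship at all. Selling more sweaters in a given kind of week does not go along with more sick days; the weather moves both numbers.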
Defining Causation
Causation, also known as cause and effect, is when one event or action brings about a second event or action. For example, I bought a brand new bed comforter and placed it in my washing machine to be cleaned. After the wash, my washing machine stopped working. I may assume that the first event, washing the comforter, caused the second event, the broken washing machine.
Brandy decides to rearrange the inventory on her floor. She puts the athletic wear and shoes in a prominent spot in the store, puts the swimwear next to the front register and moves the business attire to a less conspicuous spot. Over the next few weeks she notices a change in her employees. They are more active, eat healthier and take walks on their breaks. Could the athletic wear in a prominent spot be motivating the employees to be healthier? She tries an experiment, swapping the athletic wear and the business wear. Over the next few weeks, Brandy doesn’t notice a change in the employees’ behavior. She asks them what caused them to suddenly want to work out and live a healthier lifestyle. Was it the athletic wear? No, they tell her. It was the swimsuits by the front register reminding them that spring break was just around the corner.
Identifying Correlation or Causation
Unfortunately, there is no tried and true way of identifying causation from a correlation alone. We can find many correlations in research, but establishing causation often requires a separate experiment. For example, Brandy did not know whether the athletic wear was the cause or just a correlation until she rearranged the inventory a second time. However, you can identify instances of likely causation. Let’s look at a few examples.