Confounder Variable
Confounder Variable
A confounder is a variable that is related to both the independent and dependent variable, and partially, or even entirely, accounts for the relationship between these two.
An important thing to note about confounders, is that they are not included in the hypothesis and they are generally not measured. This makes it impossible to determine what the actual effect of a confounder was.
The only thing to do is to repeat the study, and control the confounder by making sure it takes on the same value for all participants.
Example
If we want to estimate house price and in our data set we have those variables :
- Size
- Number of bedrooms
- Address
- ZipCode
- Age of house
We might find that size is strongly correlated to the price of the house, but in our dataset, the size also correlated to the zip code. and if we loop deeper we might find that the size of the house is correlated to the zip code, for example, larger houses exist more where there is a school nearby, so families usually buy those houses, and the size it's not the main variable that influence the price.
So What
In case we miss such variable (Confounder), we might get the wrong conclusion, for example, the larger the house the higher the price.
If we are consulting company we might advise our clients to build larger houses regardless the location or zip code and that will not be a profitable investment for them our for ourselves if we the client in this case, this could become a more serious issue with huge loses then analytic fault.
So if we repeat the study and make sure the zip code is identical in our data set we probably will end with different results.
Finally, it's important to remember correlation does not imply causation.
Finally, it's important to remember correlation does not imply causation.
Comments
Post a Comment