Why customers should not expect their Marketing Mix Models (MMM) to have a very high R squared value.
Somehow, over the years, two myths have been propagated:
▪ High R squared value = good
▪ R squared is a sign of a model's predictive power
I guess we statisticians are partially to blame for propagating these myths.
Let’s break these myths:
1) R squared value is not a measure of predictive power
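Despite the common misconception, R squared is more about retrodiction than prediction. It measures ‘goodness of fit’, that is, how well the model reproduces the data it was fitted on. It says little about how well the model will predict data it has not seen.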
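To make the retrodiction point concrete, here is a minimal sketch on synthetic data (not a real MMM). The names `spend`, `junk` and `sales`, the sample sizes and the coefficients are all assumptions chosen for illustration. One model with many irrelevant regressors is fitted on a small training set, and then the same R squared formula is applied to fresh data from the same process.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R squared = 1 - residual sum of squares / total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n_train, n_holdout, n_junk = 30, 200, 20  # assumed sizes, illustration only

def make_data(n):
    # Hypothetical process: sales depend on spend only; junk columns are pure noise.
    spend = rng.normal(size=n)
    junk = rng.normal(size=(n, n_junk))
    sales = 2.0 * spend + rng.normal(size=n)
    X = np.column_stack([np.ones(n), spend, junk])
    return X, sales

X_train, y_train = make_data(n_train)
X_holdout, y_holdout = make_data(n_holdout)

# Ordinary least squares on the training data only.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

r2_in_sample = r_squared(y_train, X_train @ coef)    # fit to data already seen
r2_holdout = r_squared(y_holdout, X_holdout @ coef)  # same model, unseen data

print(f"in-sample R squared: {r2_in_sample:.2f}")
print(f"holdout R squared:   {r2_holdout:.2f}")
```

The in-sample figure describes how well the model retraces the past. Only the holdout figure says anything about prediction, and in a setup like this it should come out noticeably lower.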
2) There is no ideal R squared value
Because of the myth that R squared is a measure of predictive power,
many clients develop the expectation that they need to have a ‘high R squared value’, or else the model is not useful.
I have seen cases where a vendor developed and presented to a client an MMM with an R squared of 0.9 (a highly overfit model), simply out of fear that a model with a relatively lower R squared would not be accepted!
A high R squared value does not necessarily mean the model is good. What counts as a reasonable value depends on the data and the domain.
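A small simulation shows how much R squared depends on the data rather than on the model. In this sketch the fitted model is exactly the right one in both runs, with the same true slope and the same noise level. Only the spread of the input changes. The variable names and all numbers are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000  # assumed sample size, large so the estimates are stable

def fit_r_squared(spend_sd):
    """Fit sales ~ intercept + spend by least squares; return (R squared, slope)."""
    spend = rng.normal(scale=spend_sd, size=n)
    # True process is linear with slope 1 and noise sd 1 in both runs.
    sales = 1.0 + 1.0 * spend + rng.normal(scale=1.0, size=n)
    X = np.column_stack([np.ones(n), spend])
    coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
    resid = sales - X @ coef
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((sales - sales.mean()) ** 2)
    return r2, coef[1]

r2_narrow, slope_narrow = fit_r_squared(spend_sd=0.2)  # input barely varies
r2_wide, slope_wide = fit_r_squared(spend_sd=3.0)      # input varies a lot

print(f"narrow spread: R squared = {r2_narrow:.2f}, slope = {slope_narrow:.2f}")
print(f"wide spread:   R squared = {r2_wide:.2f}, slope = {slope_wide:.2f}")
```

Both runs recover a slope close to the true value of 1, yet the R squared values differ widely. For this linear setup the population R squared is slope² · Var(spend) / (slope² · Var(spend) + noise variance), so it moves with the variance of the input even when the model itself is correct.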
Sadly, the fixation on a high R squared value becomes a manifestation of Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure.”
In the pursuit of high R squared values, one might end up with overfitted and poorly specified MMM models.
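The sketch below illustrates how easy it is to chase R squared once it becomes the target. It uses synthetic data with one real driver and then keeps appending regressors that are pure noise. The 52-week length, the number of junk columns and the coefficients are assumptions, not values from any real MMM.

```python
import numpy as np

rng = np.random.default_rng(2)
n_weeks = 52  # assumed: one year of weekly data

spend = rng.normal(size=n_weeks)
sales = 2.0 * spend + rng.normal(scale=2.0, size=n_weeks)

def in_sample_r2(X, y):
    """In-sample R squared of an ordinary least squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

# Start with the honest specification: intercept plus the one real driver.
X = np.column_stack([np.ones(n_weeks), spend])
r2_path = [in_sample_r2(X, sales)]

# Append 40 columns of pure noise, one at a time, refitting after each.
for _ in range(40):
    X = np.column_stack([X, rng.normal(size=n_weeks)])
    r2_path.append(in_sample_r2(X, sales))

print(f"R squared with the real driver only: {r2_path[0]:.2f}")
print(f"R squared after 40 junk regressors:  {r2_path[-1]:.2f}")
```

With least squares and an intercept, adding a regressor can never lower the in-sample R squared, so the number only goes up even though none of the added columns has anything to do with sales.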
It is prudent not to fixate on R squared values.
P.S.: I would urge readers to read the lecture notes by Dr. Cosma Shalizi. The notes expand on some of the popular myths about R squared. Link in resources.
Resources:
Lecture notes by Dr. Cosma Shalizi: https://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/10/lecture-10.pdf
In case the above link does not work, please refer to this link instead: https://www.stat.cmu.edu/~cshalizi/TALR/TALR.pdf
It points to the book “The Truth About Linear Regression”. The same notes about R squared can be found on pp. 179-182.