Stepwise Regression and MMM

Don’t Stepwise Regression your MMM model

Recently, a client hired us to build MMM models for them after their attempts to in-house the MMM capability had failed.

Earlier, their in-house machine learning engineers (MLEs), who had no statistics background, had built the MMM models on the assumption that MMM is just ‘linear regression’.

We sat down with the MLEs and wanted to know how they went about building the models.

The MLEs said they chose variables by ‘trial and error’, which turned out to be backward stepwise regression. If you want a refresher on stepwise regression, see the resources at the end of this post.

Further, we noticed that they had not applied proper adstock to any of their media variables (but that could be a topic for another post).

But coming back to the topic, stepwise regression is a recipe for disaster if your goal is attribution and inference.

MMM is all about attribution.

I will first highlight the problems stepwise regression causes in your model. The points below are taken from Frank Harrell’s book Regression Modeling Strategies.

1. It yields R-squared values that are badly biased to be high.

2. The F and chi-squared tests quoted next to each variable on the printout do not have the claimed distribution.

3. The method yields confidence intervals for effects and predicted values that are falsely narrow; see Altman and Andersen (1989).

4. It yields p-values that do not have the proper meaning, and the proper correction for them is a difficult problem.

5. It gives biased regression coefficients that need shrinkage (the coefficients for remaining variables are too large; see Tibshirani [1996]).

6. It has severe problems in the presence of collinearity.

7. It is based on methods (e.g., F tests for nested models) that were intended to be used to test prespecified hypotheses.

8. Increasing the sample size does not help very much; see Derksen and Keselman (1992).

Finally, this one is my favorite:

9. It allows us to not think about the problem.

As the list above shows, one of the main problems caused by stepwise regression is bias. Bias is the kryptonite of attribution. If your estimates are biased, you no longer have an accurate picture of your marketing ROI.
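To make the bias concrete, here is a minimal simulation sketch. It is not the client's model or any particular MMM tool. It assumes one predictor with a small true coefficient (0.2), nine pure-noise predictors, and a hand-rolled backward elimination rule that drops the variable with the smallest |t| until every survivor has |t| >= 2. All names (`ols`, `backward_stepwise`, `simulate`) and all settings are hypothetical choices made for illustration.

```python
import numpy as np


def ols(X, y):
    """Fit y ~ X by least squares; return coefficients and their t-statistics."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, beta / se


def backward_stepwise(X, y, t_min=2.0):
    """Backward elimination: repeatedly drop the predictor with the smallest
    |t| until every remaining predictor has |t| >= t_min.
    Column 0 is the intercept and is never dropped."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        _, t = ols(X[:, keep], y)
        abs_t = np.abs(t[1:])          # skip the intercept
        worst = int(np.argmin(abs_t))
        if abs_t[worst] >= t_min:
            break
        del keep[worst + 1]            # +1 because the intercept was skipped
    beta, _ = ols(X[:, keep], y)
    return keep, beta


def simulate(n_sims=500, n=50, n_noise=9, true_beta=0.2, seed=0):
    """Column 1 is the only real driver; columns 2 and up are pure noise."""
    rng = np.random.default_rng(seed)
    kept_estimates = []
    noise_kept = 0
    for _ in range(n_sims):
        Z = rng.normal(size=(n, 1 + n_noise))
        X = np.column_stack([np.ones(n), Z])
        y = true_beta * Z[:, 0] + rng.normal(size=n)
        keep, beta = backward_stepwise(X, y)
        if 1 in keep:                  # the real driver survived selection
            kept_estimates.append(beta[keep.index(1)])
        noise_kept += sum(1 for j in keep if j >= 2)
    return float(np.mean(kept_estimates)), noise_kept / n_sims


mean_kept_beta, avg_noise_kept = simulate()
print("true coefficient: 0.2")
print(f"mean estimate when the variable survives selection: {mean_kept_beta:.3f}")
print(f"average number of pure-noise variables kept per run: {avg_noise_kept:.2f}")
```

Under these assumptions, the real variable only survives when its estimate happens to be large, so the average reported coefficient overstates the true effect, and some noise variables are kept as if they mattered. In an MMM setting, that would translate into an overstated ROI for the surviving channels.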

Our humble suggestions to companies looking to in-house MMM are these:

▪ Hire trained statisticians with experience in MMM and good business understanding.

▪ MMM is not just linear regression.

▪ MMM can’t be built without proper statistical knowledge.

▪ MMM is not just a ‘click-one-button-get-all-results’ exercise. Anybody selling you such a tool is fooling you.

If you have trouble in-housing MMM, just hire experienced hands like Aryma Labs. We know our MMMs.

Resources on stepwise regression and alternatives for it are below.

Resources:

Problems with Stepwise regression – https://stats.stackexchange.com/a/20856/394729

Stopping stepwise: Why stepwise selection is bad and what you should use instead – https://towardsdatascience.com/stopping-stepwise-why-stepwise-selection-is-bad-and-what-you-should-use-instead-90818b3f52df

Stepwise regression (Wikipedia) – https://en.wikipedia.org/wiki/Stepwise_regression
