Baseline vs Media Trade-offs

One of the most confusing experiences in marketing mix modelling (MMM) is this:

  • two specifications can fit the target series almost equally well
  • both can have acceptable diagnostics
  • yet they can assign very different amounts of the target to media versus baseline

This is not necessarily a bug in the software. It is a structural feature of the problem.

This page explains how that trade-off appears in Abacus and why you should expect it.

1. The decomposition problem

At a high level, Abacus builds the expected target from several additive components.

In the retained PanelMMM build path, the mean function can include:

  • intercept_contribution
  • channel_contribution
  • control_contribution, if you configure control_columns
  • mundlak_contribution, if use_mundlak_cre=True
  • yearly_seasonality_contribution, if yearly_seasonality is enabled
  • additional additive effects you attach before build, such as events or trend effects

The likelihood sees only the sum of these pieces. It never observes a “ground-truth baseline” versus “ground-truth media” split directly.

That means the total fit can be easier to identify than the decomposition.
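The point can be written as a short worked equation. This is a sketch in illustrative notation, not Abacus's internal naming: b_t stands for the sum of all configured non-media terms, m_t for the sum of channel contributions, and the model is assumed to be additive on the scale of the mean.

```latex
\mu_t = \underbrace{\alpha_t + s_t + x_t^{\top}\gamma + \dots}_{\text{baseline } b_t}
      + \underbrace{\sum_{c} f_c(\text{spend}_{c,t})}_{\text{media } m_t}

\mu_t = (b_t + \delta_t) + (m_t - \delta_t)
```

Any shift δ_t that both sides of the model are flexible enough to represent leaves μ_t, and therefore the likelihood, unchanged. Only the priors and the functional forms of the components limit how far such a shift can go.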

2. Why the trade-off exists

Suppose revenue rises every December and TV spend also rises every December.

Several stories can fit the same revenue data reasonably well:

  • December uplift is mostly seasonality
  • December uplift is mostly TV
  • December uplift is partly both

If the model includes both a seasonal term and media terms, they will compete to explain the same observed movement.

This is the core baseline-versus-media trade-off:

the data often identify total explained variation better than they identify which component deserves the credit

Classical econometricians know this as collinearity and, when a relevant term is left out, omitted-variable bias. Bayesian MMM does not make that problem disappear. It makes the uncertainty around it more explicit.
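The December example can be reproduced with a small simulation. The sketch below uses plain NumPy least squares rather than Abacus, and every number in it (effect sizes, noise levels, four years of monthly data) is an assumption chosen for illustration. It fits three specifications to the same synthetic revenue series and records both fit and the share of fitted revenue credited to TV.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def r_squared(y, y_hat):
    """Coefficient of determination for fitted values y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
month = np.tile(np.arange(12), 4)              # four years of monthly data
december = (month == 11).astype(float)         # seasonal dummy

# Assumed data-generating process: TV spend peaks every December,
# and revenue responds to both the season and the spend.
tv = 0.2 + 0.8 * december + 0.05 * rng.normal(size=month.size)
revenue = 10.0 + 3.0 * december + 2.5 * tv + 0.2 * rng.normal(size=month.size)

ones = np.ones_like(tv)
# Each spec is (design matrix, index of the TV column or None).
specs = {
    "seasonality only": (np.column_stack([ones, december]), None),
    "tv only": (np.column_stack([ones, tv]), 1),
    "both": (np.column_stack([ones, december, tv]), 2),
}

results = {}
for name, (X, tv_col) in specs.items():
    beta = fit_ols(X, revenue)
    y_hat = X @ beta
    media = 0.0 if tv_col is None else float(np.sum(beta[tv_col] * X[:, tv_col]))
    results[name] = {
        "r2": float(r_squared(revenue, y_hat)),
        "media_share": media / float(np.sum(y_hat)),
    }

for name, res in results.items():
    print(f"{name:>17}: R2={res['r2']:.3f}  media share={res['media_share']:.3f}")
```

Under these assumptions the two single-term specifications should fit almost equally well while giving TV either none of the credit or all of the December uplift. The specification with both terms sits in between, with a TV coefficient that is estimated only from the small part of spend that does not move with the season.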

3. What counts as “baseline” in Abacus

In Abacus, the baseline side comes from the terms you specify inside the PyMC graph.

Depending on configuration, that can include:

  • a static intercept
  • a time-varying intercept
  • yearly Fourier seasonality
  • controls
  • events
  • trend-like additive effects
  • Mundlak CRE adjustments in panel settings

So when people say “baseline absorbed the effect”, they usually mean one or more of those components, not a separate external decomposition engine.

4. How media can lose attribution

Media can lose attribution when the non-media side of the model is too good at explaining the same movements.

Common cases:

  • a flexible time-varying intercept captures medium-run swings that media could also explain
  • strong seasonal terms absorb repeating peaks that coincide with campaign timing
  • control variables proxy for media timing or market conditions too strongly
  • event effects explain demand spikes that were previously being picked up by channel coefficients

In each case, the model may still predict well. The question is how the variation is partitioned.

5. How media can steal attribution from baseline

The reverse failure is also common.

If the baseline side is under-specified, media channels can absorb variation that is not truly incremental media response.

Examples:

  • missing seasonality leaves recurring annual structure for media to explain
  • missing controls leave competitor, pricing, or macro effects for media to explain
  • missing events leave spikes for channels to absorb
  • insufficient baseline flexibility forces media to act as a trend proxy

This usually inflates media contribution. Because budget optimisation builds on the estimated media response, it also makes optimisation outputs look more favourable than the evidence supports.
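The last case, media acting as a trend proxy, is easy to demonstrate on synthetic data. The following is a minimal sketch using NumPy least squares, not Abacus. The linear organic trend, the growth in spend, and the true media effect of 1.5 are all assumptions made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 104                                   # two years of weekly data
t = np.arange(n) / n                      # time index scaled to [0, 1)

TRUE_MEDIA_EFFECT = 1.5                   # assumed incremental effect per unit of spend

# Assumed data-generating process: spend grows over time,
# and revenue also has an organic upward trend of its own.
spend = 1.0 + 2.0 * t + 0.3 * rng.normal(size=n)
revenue = 20.0 + 8.0 * t + TRUE_MEDIA_EFFECT * spend + 0.5 * rng.normal(size=n)


def media_coefficient(columns):
    """Fit revenue on an intercept plus the given columns; spend must be the last column."""
    X = np.column_stack([np.ones(n), *columns])
    beta, *_ = np.linalg.lstsq(X, revenue, rcond=None)
    return float(beta[-1])


coef_without_trend = media_coefficient([spend])      # under-specified baseline
coef_with_trend = media_coefficient([t, spend])      # baseline includes the trend

print(f"true effect:            {TRUE_MEDIA_EFFECT:.2f}")
print(f"estimate without trend: {coef_without_trend:.2f}")
print(f"estimate with trend:    {coef_with_trend:.2f}")
```

With the trend omitted, the spend coefficient absorbs the organic growth and lands well above the assumed true effect. Adding the trend term brings it back close to 1.5, at the cost of a wider sampling spread.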

6. Why good fit does not settle the argument

You might hope that whichever specification predicts better must also have the more trustworthy attribution split.

Unfortunately, that does not follow.

A model can reproduce the observed target series very well while still having ambiguous attribution. Predictive adequacy is necessary, but it is not enough to identify the correct media decomposition.

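That is why fit comparisons alone cannot choose between specifications. They need to be paired with checks on how stable the decomposition is.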

7. Signs that the trade-off is driving your result

Be cautious when you see any of the following:

  • very similar model fit with materially different channel contributions
  • large channel swings after adding or removing a seasonal or trend term
  • media ROI rankings that flip after adding controls or events
  • one highly flexible baseline term dominating decomposition while media contributions collapse
  • implausibly smooth media contributions paired with a very wiggly baseline, or vice versa

These are not proofs of misspecification, but they are strong prompts for sensitivity analysis.
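Some of these checks can be automated once you have summary numbers for two fitted specifications. The helper below is a hypothetical sketch, not part of Abacus: `compare_specs`, its argument names, and the default tolerances are all assumptions, and the inputs are plain dictionaries you would fill from your own decomposition outputs.

```python
def compare_specs(fit_a, fit_b, shares_a, shares_b, roi_a, roi_b,
                  fit_tol=0.02, share_tol=0.10):
    """Flag warning signs when comparing two specifications.

    fit_*    : a scalar fit metric on a common scale (for example R-squared)
    shares_* : component name -> share of total fitted target
    roi_*    : channel name -> ROI estimate
    The tolerances are arbitrary defaults and should be set per project.
    """
    similar_fit = abs(fit_a - fit_b) <= fit_tol

    # A component missing from one specification counts as a share of zero.
    components = sorted(set(shares_a) | set(shares_b))
    shifts = {c: shares_b.get(c, 0.0) - shares_a.get(c, 0.0) for c in components}
    large_shifts = {c: d for c, d in shifts.items() if abs(d) > share_tol}

    # Compare ROI rankings over the channels present in both specifications.
    channels = sorted(set(roi_a) & set(roi_b))
    rank_a = sorted(channels, key=lambda c: roi_a[c], reverse=True)
    rank_b = sorted(channels, key=lambda c: roi_b[c], reverse=True)
    ranking_flipped = rank_a != rank_b

    return {
        "similar_fit": similar_fit,
        "large_shifts": large_shifts,
        "ranking_flipped": ranking_flipped,
        "needs_sensitivity_check": similar_fit and (bool(large_shifts) or ranking_flipped),
    }


# Made-up numbers: the same data fitted without and with yearly seasonality.
flags = compare_specs(
    fit_a=0.91,
    fit_b=0.92,
    shares_a={"intercept": 0.55, "tv": 0.30, "search": 0.15},
    shares_b={"intercept": 0.50, "yearly_seasonality": 0.22, "tv": 0.12, "search": 0.16},
    roi_a={"tv": 2.4, "search": 1.8},
    roi_b={"tv": 1.0, "search": 1.9},
)
print(flags)
```

In this made-up case the fit barely moves, TV loses most of its share to the seasonal term, and the ROI ranking flips, so the helper asks for a sensitivity check. A flag here is a prompt to investigate, not a verdict on either specification.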

8. What to do in practice

A disciplined Abacus workflow is usually more productive than arguing in the abstract about the “right” split.

Recommended approach:

  1. Start with a specification that has the minimum baseline structure you can defend.
  2. Add seasonal, control, event, or time-varying terms only when you can justify them substantively or diagnostically.
  3. Refit and compare decomposition stability, not just target fit.
  4. Report instability when attribution changes materially across defensible specifications.
  5. Where possible, bring in external evidence such as lift tests or calibration.

The important point is not to force one narrative prematurely. It is to show which attribution conclusions remain stable after reasonable specification changes.
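Steps 3 and 4 can be supported by a small report that summarises how each channel's share moves across the specifications you consider defensible. The function below is a hypothetical sketch, not an Abacus API: `stability_report`, the 0.05 tolerance, and the example contribution totals are assumptions for illustration.

```python
def stability_report(contributions_by_spec, channels, tol=0.05):
    """Summarise the range of each channel's share across specifications.

    contributions_by_spec : spec name -> {component name -> total contribution}
    channels              : channel names to report on
    tol                   : largest share range still treated as stable (arbitrary default)
    """
    report = {}
    for channel in channels:
        shares = []
        for contributions in contributions_by_spec.values():
            total = sum(contributions.values())
            # A channel absent from a specification contributes zero there.
            shares.append(contributions.get(channel, 0.0) / total)
        low, high = min(shares), max(shares)
        report[channel] = {
            "min_share": low,
            "max_share": high,
            "stable": (high - low) <= tol,
        }
    return report


# Made-up totals from two refits of the same data.
contributions_by_spec = {
    "minimal baseline": {"intercept": 600.0, "tv": 300.0, "search": 100.0},
    "plus seasonality": {
        "intercept": 550.0,
        "yearly_seasonality": 200.0,
        "tv": 140.0,
        "search": 110.0,
    },
}

report = stability_report(contributions_by_spec, channels=["tv", "search"])
for channel, row in report.items():
    status = "stable" if row["stable"] else "UNSTABLE"
    print(f"{channel}: {row['min_share']:.2f} to {row['max_share']:.2f} ({status})")
```

Reporting the range rather than a single number keeps an unstable channel visible to whoever reads the results. In this example the search share barely moves between specifications, while the TV share depends heavily on whether seasonality is included.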

9. Abacus-specific interpretation

In Abacus, you should treat the decomposition outputs as conditional on the configured structure:

  • the chosen controls
  • whether yearly_seasonality is on
  • whether the intercept is time-varying
  • whether media effects are time-varying
  • whether you added events or other additive effects
  • whether use_mundlak_cre=True

Change the structure, and the attribution can change even when predictive fit does not move much.

That is normal. It is the software telling you where the data alone are not decisive.

10. Bottom line

Baseline-versus-media trade-offs are unavoidable in MMM because the observed target only reveals the sum of the contributing processes.

Abacus makes this explicit by fitting all configured terms inside one additive Bayesian graph. That is a strength, but it also means you need to read the decomposition as a conditional statement:

given this model structure, priors, and data, this is the most plausible attribution split

That is much more defensible than pretending the split is uniquely observed in the data.