Adstock and Saturation

In classical econometrics, you model diminishing returns by taking the logarithm of spend: $\log(\text{spend})$ enters the regression, and the coefficient captures an elasticity. Carryover effects, if considered at all, are handled with lagged dependent variables or Koyck distributed lags. These approaches are simple and familiar. They are also, for media measurement, inadequate.

This document explains the two non-linear transformations at the heart of most modern Marketing Mix Models, adstock (carryover) and saturation (diminishing returns), and shows why they are more flexible, more interpretable, and more economically grounded than the classical alternatives. We also address a subtle but important modelling decision: whether to apply adstock before saturation, or saturation before adstock.


1. The Problem with Log-Linear Specifications

The classical $\log(\text{spend})$ specification makes a single, rigid assumption: the marginal return to an additional pound of media spend is proportional to the reciprocal of current spend. As a result, doubling spend from £100 to £200 produces the same incremental effect as doubling from £1,000 to £2,000. The curvature is fixed by the functional form; only the overall scale is estimated. You cannot learn the shape from the data.
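To make the fixed curvature explicit, write the response as $y = \gamma \log(\text{spend})$, where the symbol $\gamma$ for the regression coefficient is introduced here purely for illustration. Then

$$\frac{\partial y}{\partial\, \text{spend}} = \frac{\gamma}{\text{spend}}, \qquad \gamma \log(2s) - \gamma \log(s) = \gamma \log 2 \quad \text{for every } s > 0.$$

The gain from doubling does not depend on the starting level $s$, and the only free parameter, $\gamma$, rescales the curve without changing its shape.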

This creates two problems in practice.

The first is that the log transform cannot capture saturation at high spend levels. If a channel is already saturated — say, you have bought every available TV slot in the UK — the log transform will still predict positive incremental returns for every additional pound. The curve never flattens. In reality, the marginal return from saturated media is effectively zero, and you need a function that can reach a ceiling.

The second is that the log transform says nothing about carryover. A TV advertisement aired in week 10 does not affect sales only in week 10. Viewers remember the ad. Brand salience persists. The effect decays over subsequent weeks. A pure $\log(\text{spend}_t)$ specification attributes the entire effect to the week the money was spent, ignoring the temporal diffusion of advertising impact. You can add lagged terms manually ($\log(\text{spend}_{t-1})$, $\log(\text{spend}_{t-2})$, and so on), but each lag consumes a degree of freedom, and you must choose the lag length arbitrarily.

Abacus replaces both of these ad hoc treatments with two purpose-built, parameterised transformations whose shapes are learned jointly from the data inside the Bayesian graph.

2. Adstock: Modelling Carryover

Adstock captures a simple economic intuition: advertising has a lingering effect. A pound spent on TV in week 10 generates some response in week 10, a smaller response in week 11, a still smaller response in week 12, and so on until the effect has fully decayed.

The default implementation in Abacus is geometric adstock. The transformation takes the raw weekly spend series and replaces each observation with a weighted sum of current and past spend, where the weights decay geometrically:

$$x^*_t = x_t + \alpha \cdot x^*_{t-1}$$

The parameter $\alpha$ (between 0 and 1) controls the rate of decay. When $\alpha$ is close to zero, the effect is concentrated in the week of exposure: the ad is forgotten almost immediately. When $\alpha$ is close to one, the effect persists for many weeks: the brand impression lingers. Unrolling the recursion gives $x^*_t = \sum_{k \ge 0} \alpha^k x_{t-k}$, so spend from $k$ weeks ago carries weight $\alpha^k$. The maximum lag length l_max truncates this sum at a finite horizon for computational efficiency.
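The recursion and its truncated form can be sketched in a few lines of plain Python. This is an illustrative sketch, not the Abacus implementation: the function names are hypothetical, and the convention that l_max counts lags 0 to l_max - 1 is an assumption.

```python
def geometric_adstock(spend, alpha, l_max):
    """Truncated geometric adstock: sum of alpha**k * x_{t-k} for k = 0..l_max-1.

    Assumption: l_max counts the current week as lag 0.
    """
    weights = [alpha ** k for k in range(l_max)]
    out = []
    for t in range(len(spend)):
        total = 0.0
        for k, w in enumerate(weights):
            if t - k < 0:
                break  # no spend is observed before the start of the series
            total += w * spend[t - k]
        out.append(total)
    return out


def recursive_adstock(spend, alpha):
    """Untruncated recursion from the text: x*_t = x_t + alpha * x*_{t-1}."""
    out, carry = [], 0.0
    for x in spend:
        carry = x + alpha * carry
        out.append(carry)
    return out


# A single burst of 100 in week 0, then nothing.
spend = [100.0, 0.0, 0.0, 0.0, 0.0, 0.0]
truncated = geometric_adstock(spend, alpha=0.5, l_max=4)
untruncated = recursive_adstock(spend, alpha=0.5)
print(truncated)
print(untruncated)
```

The two series agree for the first l_max weeks. After that the truncated version drops to zero while the recursion keeps halving.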

Econometricians will recognise this as the geometric lag structure of the Koyck distributed lag model, but with two critical differences. First, the decay parameter $\alpha$ is not estimated from lagged dependent variables (which can introduce Nickell bias in short panels). It is estimated directly as a parameter of the transformation, with its own Bayesian prior: by default a Beta(1, 3) distribution, which has mean 0.25 and gently favours shorter decay while allowing the data to push toward longer persistence if warranted. Second, you do not need to choose the lag length by hand. You set l_max as a generous upper bound (say, 8 or 12 weeks), and the geometric decay structure ensures that distant lags receive negligible weight automatically, as long as $\alpha$ is not so close to one that meaningful weight remains beyond l_max.
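Two quantities help when checking these choices: how much prior mass Beta(1, 3) places on short decay, and how much of the geometric weight is cut off by a given l_max. The sketch below uses closed forms only. The helper names are hypothetical, and the truncation share assumes unnormalised weights $\alpha^k$ over lags 0 to l_max - 1.

```python
def beta_1_b_cdf(x, b=3.0):
    """CDF of a Beta(1, b) distribution, which has the closed form 1 - (1 - x)**b."""
    return 1.0 - (1.0 - x) ** b


def truncated_weight_share(alpha, l_max):
    """Share of the infinite geometric weight sum lying beyond lag l_max - 1.

    sum_{k >= l_max} alpha**k divided by sum_{k >= 0} alpha**k equals alpha**l_max.
    """
    return alpha ** l_max


prior_mean = 1.0 / (1.0 + 3.0)        # mean of Beta(a, b) is a / (a + b)
mass_below_half = beta_1_b_cdf(0.5)   # prior probability that alpha < 0.5

print(prior_mean, mass_below_half)
for alpha in (0.25, 0.5, 0.8):
    print(alpha, truncated_weight_share(alpha, l_max=12))
```

For moderate decay the truncated share is tiny, but it grows quickly as $\alpha$ approaches one, which is the case where a generous l_max matters.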

Abacus also provides alternative adstock functions, including Weibull PDF and Weibull CDF adstock. In the usual parameterisation, the Weibull PDF form allows non-monotonic patterns (an effect that peaks one or two weeks after exposure rather than immediately), while the Weibull CDF form keeps the effect strongest at exposure but lets the rate of decay vary over time. The delayed peak captures the empirical reality that some channels, particularly upper-funnel brand advertising, take time to build mental availability before generating measurable sales response.
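A delayed peak is easy to see by computing lag weights from the Weibull density directly. This is a minimal sketch under stated assumptions: the function names are hypothetical, and evaluating the density at t = 1, ..., l_max and normalising the weights to sum to one are illustrative choices, not a description of how Abacus parameterises its Weibull adstock.

```python
import math


def weibull_pdf_weights(shape, scale, l_max):
    """Lag weights proportional to the Weibull PDF at t = 1, ..., l_max.

    Assumption: lag k is evaluated at t = k + 1 and weights are normalised.
    """
    raw = []
    for lag in range(l_max):
        t = lag + 1
        density = (shape / scale) * (t / scale) ** (shape - 1) \
            * math.exp(-((t / scale) ** shape))
        raw.append(density)
    total = sum(raw)
    return [w / total for w in raw]


def peak_lag(weights):
    """Index of the lag that carries the largest weight."""
    return max(range(len(weights)), key=lambda k: weights[k])


immediate = weibull_pdf_weights(shape=1.0, scale=3.0, l_max=8)  # exponential decay
delayed = weibull_pdf_weights(shape=2.0, scale=3.0, l_max=8)    # builds, then fades
print(peak_lag(immediate), peak_lag(delayed))
```

With shape 1 the weights decay from the first lag, much like geometric adstock. With shape 2 the largest weight falls one lag later.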

3. Saturation: Modelling Diminishing Returns

Saturation captures the second economic intuition: each additional pound of spend on a channel is worth less than the last. The first £10,000 of TV spend reaches new audiences and generates substantial incremental sales. The next £10,000 reaches many of the same people again and generates less. Eventually, you have saturated the available audience, and further spend generates almost nothing.

The default implementation in Abacus is logistic saturation:

$$f(x) = \beta \cdot \frac{1 - e^{-\lambda x}}{1 + e^{-\lambda x}}$$

Two parameters govern the shape. The parameter $\lambda$ controls the steepness of the curve — how quickly diminishing returns set in. A large $\lambda$ means the channel saturates rapidly (steep initial response, early flattening). A small $\lambda$ means the channel has a long runway before saturation (gradual response, late flattening). The parameter $\beta$ controls the asymptotic maximum — the ceiling of the response, representing the maximum possible contribution from this channel regardless of spend.
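The roles of the two parameters can be checked numerically. The sketch below implements the formula above in plain Python. The function names are hypothetical, and x is assumed to be non-negative spend in whatever units the series has been scaled to.

```python
import math


def logistic_saturation(x, lam, beta=1.0):
    """beta * (1 - exp(-lam * x)) / (1 + exp(-lam * x)), for x >= 0."""
    e = math.exp(-lam * x)
    return beta * (1.0 - e) / (1.0 + e)


def half_saturation_point(lam):
    """Spend at which the response reaches half of beta: x = ln(3) / lam."""
    return math.log(3.0) / lam


for lam in (0.5, 2.0, 8.0):
    x_half = half_saturation_point(lam)
    print(lam, x_half, logistic_saturation(x_half, lam))
```

Solving $f(x) = \beta / 2$ gives $e^{-\lambda x} = 1/3$, which is where the half-saturation point $\ln(3)/\lambda$ comes from. The same function can be written as $\beta \tanh(\lambda x / 2)$, which makes the ceiling at $\beta$ easy to see.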

Compare this to the classical $\log(\text{spend})$ specification. The logistic saturation curve has a genuine asymptote: beyond a certain spend level, the curve is effectively flat. The log specification has no such ceiling. The logistic curve also has a tunable scale (governed by $\lambda$): the response is concave for all positive spend and reaches half of its ceiling at $x = \ln(3)/\lambda$, so the data determine how early diminishing returns start to bite. The log curve always bends at the same relative rate.
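The contrast shows up clearly when spend is repeatedly doubled. The following sketch compares the incremental response under each specification. The spend levels, the unit coefficient on the log, and the values of lam and beta are all illustrative assumptions.

```python
import math


def logistic_saturation(x, lam, beta=1.0):
    """beta * (1 - exp(-lam * x)) / (1 + exp(-lam * x)), for x >= 0."""
    e = math.exp(-lam * x)
    return beta * (1.0 - e) / (1.0 + e)


spend_levels = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]

# Incremental response from doubling spend at each starting level.
log_gains = [math.log(2 * s) - math.log(s) for s in spend_levels]
sat_gains = [
    logistic_saturation(2 * s, lam=1.0) - logistic_saturation(s, lam=1.0)
    for s in spend_levels
]

for s, lg, sg in zip(spend_levels, log_gains, sat_gains):
    print(f"spend {s:>5}: log gain {lg:.4f}, logistic gain {sg:.7f}")
```

Every doubling earns the same increment under the log. Under logistic saturation the increment collapses toward zero once spend is well past the half-saturation point.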

The default priors in Abacus encode mild economic beliefs. The prior on $\lambda$ is Gamma(3, 1), which centres mass on moderate saturation rates while allowing the data to push toward very steep or very gradual curves. The prior on $\beta$ is HalfNormal(sigma=2), which keeps the channel contribution positive and moderately scaled.
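To get a feel for what these priors imply, one can draw from them with the standard library and look at the implied half-saturation point $\ln(3)/\lambda$. This is a sketch, not Abacus code: it assumes Gamma(3, 1) means shape 3 and rate 1 (so scale 1 as well), and how the numbers should be read depends on how spend is scaled, which is not specified here.

```python
import math
import random

random.seed(0)
n = 20_000

# random.gammavariate takes (shape, scale); with rate 1 the scale is also 1.
lam_draws = [random.gammavariate(3.0, 1.0) for _ in range(n)]
# A HalfNormal(sigma=2) draw is the absolute value of a Normal(0, 2) draw.
beta_draws = [abs(random.gauss(0.0, 2.0)) for _ in range(n)]

# Spend level (in scaled units) at which each draw reaches half its ceiling.
half_sat = sorted(math.log(3.0) / lam for lam in lam_draws)

lam_mean = sum(lam_draws) / n
beta_mean = sum(beta_draws) / n
median_half_sat = half_sat[n // 2]

print(f"mean lambda       ~ {lam_mean:.2f}  (analytic mean: 3)")
print(f"mean beta         ~ {beta_mean:.2f}  (analytic mean: 2*sqrt(2/pi))")
print(f"median half-sat x ~ {median_half_sat:.2f}")
```

A prior simulation of this kind is a quick way to check whether the defaults are sensible for a given spend scaling before fitting anything.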

4. Joint Estimation Inside the Bayesian Graph

Here is the critical difference between the Abacus approach and classical pre-processing. In many legacy MMM implementations (and in some textbook treatments), the adstock and saturation transformations are applied as a pre-processing step: the analyst picks fixed values for $\alpha$ and $\lambda$ (perhaps through grid search or “expert judgement”), transforms the raw spend data, and then runs a linear regression on the transformed data.

This approach severs the chain of uncertainty. The regression treats the transformed spend as a known quantity, ignoring the fact that $\alpha$ and $\lambda$ were estimated (or guessed). The standard errors on the media coefficients are conditional on the pre-selected transformation parameters being exactly correct. They are too narrow.
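A noise-free toy example shows how strongly the coefficient depends on the pre-selected decay. The sketch below generates sales from a known $\alpha$ and slope, then re-estimates the slope by least squares under three fixed values of $\alpha$. Saturation is left out to keep the example small, and the data, function names, and parameter values are all hypothetical.

```python
def recursive_adstock(spend, alpha):
    """Geometric adstock via the recursion x*_t = x_t + alpha * x*_{t-1}."""
    out, carry = [], 0.0
    for x in spend:
        carry = x + alpha * carry
        out.append(carry)
    return out


def ols_through_origin(x, y):
    """Least-squares slope for y = b * x with no intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


spend = [10.0, 40.0, 0.0, 25.0, 60.0, 5.0, 0.0, 30.0, 20.0, 50.0]
true_alpha, true_b = 0.5, 2.0
sales = [true_b * s for s in recursive_adstock(spend, true_alpha)]  # no noise

# Slope estimate conditional on each pre-selected decay value.
estimates = {
    a: ols_through_origin(recursive_adstock(spend, a), sales)
    for a in (0.2, 0.5, 0.8)
}
for a, b_hat in estimates.items():
    print(f"fixed alpha = {a}: estimated slope = {b_hat:.3f}")
```

Only the correct $\alpha$ recovers the true slope. Too little carryover inflates the slope and too much deflates it, yet each regression would report a standard error as if its $\alpha$ were known.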

In Abacus, the adstock parameter $\alpha$, the saturation parameters $\lambda$ and $\beta$, and the media coefficient are all estimated simultaneously inside a single PyMC model. The MCMC sampler explores the joint posterior over all parameters at once. When the sampler draws a high value of $\alpha$ (long carryover), it simultaneously adjusts $\lambda$ and the media coefficient to maintain consistency with the observed data. The resulting posterior credible intervals for media contribution honestly reflect uncertainty about the transformation shape, the coefficient magnitude, and their interactions.

This is analogous to the distinction between two-stage least squares (where the second stage uses first-stage fitted values, so the naive second-stage standard errors must be corrected) and full-information maximum likelihood (where all parameters are estimated jointly). The Bayesian joint estimation in Abacus is closer in spirit to FIML, but with the added benefit of prior regularisation.

5. The Ordering Decision: Adstock First or Saturation First

When you initialise a PanelMMM in Abacus, you choose adstock_first=True (the default) or adstock_first=False. This decision controls the order in which the two transformations are composed, and it encodes a substantive economic assumption about how the media channel operates.

When adstock_first=True, the pipeline is: raw spend → adstock → saturation. The economic interpretation is that carryover accumulates first in the consumer’s memory (brand salience builds up over multiple weeks of exposure), and only then does the accumulated stock of impressions hit diminishing returns. This makes sense for brand-building channels like TV, outdoor, and sponsorship, where the advertising effect is cumulative and the saturation constraint applies to the total accumulated exposure rather than to a single week’s spend.

When adstock_first=False, the pipeline is: raw spend → saturation → adstock. The economic interpretation is that diminishing returns apply immediately to each week’s spend (this week’s audience is saturated by this week’s spend alone), and only then does the already-saturated response carry over into future weeks. This makes sense for direct-response channels like paid search or performance display, where each week’s impressions hit a ceiling independently (you can only capture so many searches in a week), but the conversion effect persists.

The distinction matters quantitatively. Under adstock-first, the model allows a sequence of moderate spend weeks to accumulate into a heavily saturated state, even if no single week was high-spend on its own. Under saturation-first, each week’s spend is evaluated on the saturation curve independently, so a steady moderate spend sits at the same point on the curve every week, however long it has been sustained.
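The two orderings can be compared directly on a steady spend series. This is a minimal sketch using the geometric recursion and logistic saturation formula from earlier sections. The function names and the values of alpha, lam, beta, and spend are illustrative assumptions, and the code does not call Abacus.

```python
import math


def recursive_adstock(series, alpha):
    """Geometric adstock via the recursion x*_t = x_t + alpha * x*_{t-1}."""
    out, carry = [], 0.0
    for x in series:
        carry = x + alpha * carry
        out.append(carry)
    return out


def logistic_saturation(x, lam, beta=1.0):
    """beta * (1 - exp(-lam * x)) / (1 + exp(-lam * x)), for x >= 0."""
    e = math.exp(-lam * x)
    return beta * (1.0 - e) / (1.0 + e)


alpha, lam, beta = 0.7, 1.0, 1.0
spend = [1.0] * 20  # steady, moderate weekly spend

# raw spend -> adstock -> saturation
adstock_first = [logistic_saturation(s, lam, beta)
                 for s in recursive_adstock(spend, alpha)]
# raw spend -> saturation -> adstock
saturation_first = recursive_adstock(
    [logistic_saturation(x, lam, beta) for x in spend], alpha)

# Position on the saturation curve in the final week, as a share of beta.
share_adstock_first = adstock_first[-1] / beta
share_saturation_first = logistic_saturation(spend[-1], lam, beta) / beta

print(f"adstock-first:    {share_adstock_first:.3f} of the ceiling")
print(f"saturation-first: {share_saturation_first:.3f} of the ceiling each week")
print(f"saturation-first carried-over total: {saturation_first[-1]:.3f}")
```

Under adstock-first the accumulated stock pushes the channel close to its ceiling. Under saturation-first each week stays below half of it. With the unnormalised recursion used here, the carried-over total under saturation-first can exceed $\beta$, because the ceiling applies to each week's response before carryover is added.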

In practice, most MMM practitioners default to adstock-first for all channels, which is why Abacus sets adstock_first=True as the default. But if you have strong prior knowledge that a particular channel exhibits immediate per-week saturation (because the audience pool is fixed and refreshes weekly), switching the order is a principled modelling choice.

6. Why This Matters for Econometricians

The adstock-saturation framework replaces several ad hoc classical specifications with a coherent, jointly estimated non-linear model. To summarise the mapping:

The classical Koyck lag model is replaced by geometric adstock with a Bayesian prior on the decay rate. You no longer need to choose lag lengths manually or worry about Nickell bias from lagged dependent variables.

The classical $\log(\text{spend})$ specification is replaced by logistic saturation with learnable steepness and ceiling parameters. You gain a genuine asymptote (something $\log$ cannot provide) and data-driven curvature (something $\log$ fixes by assumption).

The classical two-stage approach (transform then regress) is replaced by joint Bayesian estimation. Your credible intervals honestly propagate uncertainty from the transformation parameters through to the media contribution estimates.

The result is a media response model that is more flexible than the classical specifications it replaces, more honest about uncertainty, and grounded in the same economic intuitions that econometricians have always recognised: carryover and diminishing returns. The difference is that Abacus lets the data determine the shape of these phenomena rather than imposing it through functional form.