Abacus Documentation

Abacus is a Bayesian marketing mix modelling (MMM) library built on PyMC and PyTensor. It provides PanelMMM for multi-market panel modelling, structured YAML-driven configuration, a staged pipeline runner, and budget optimisation, all designed for reproducible, production-grade MMM workflows.

Where to start

| You want to… | Start here |
| --- | --- |
| Install and run your first model | Installation, Quickstart: Python |
| Build a model from YAML config | Quickstart: YAML, YAML Configuration |
| Run the full staged pipeline | Quickstart: Runner, Runner Overview |
| Understand the model equation | Model Overview, Adstock and Saturation |
| Interpret results and outputs | Post-Modeling, Response Curves |
| Optimise a budget allocation | Budget Optimisation, Interpreting Results |
| Learn the econometrics behind it | FAQ (explainers for practitioners) |

Getting Started

This section helps you install Abacus and run your first model.

Start here if you want to:

  • set up a local environment from this repository
  • fit PanelMMM directly from Python
  • build a model from YAML
  • run the structured pipeline against one of the bundled demo configs

Installation

These instructions assume you are working from a local checkout of the Abacus repository.

Prerequisites

| Item | Notes |
| --- | --- |
| Python | The package requires Python 3.11 or later. The repo development environment uses Python 3.12. |
| Local checkout | Install from the repository root, not from a published package index. |
| Writable temp/cache directory | Useful for PyTensor compiledir and local verification commands. |

The supported local development path for the repository is the Conda environment defined in environment.yml:

conda env create -f environment.yml
conda activate abacus-dev
python3 -m pip install -e .

This gives you:

  • the repo-managed development environment from environment.yml
  • an editable install, so local code changes are picked up immediately

Minimal pip install from source

If you do not want the full Conda environment, you can still install Abacus directly from the repository root.

Standard install

python3 -m pip install .

Editable install

python3 -m pip install -e .

Use the editable install if you are changing code, configs, or docs locally.

Optional extras

Abacus defines a small set of optional extras in pyproject.toml.

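| Extra | Install command | Use when you need |
| --- | --- | --- |
| lint | python3 -m pip install ".[lint]" | Ruff, MyPy, and related local linting tools |
| test | python3 -m pip install ".[test]" | Pytest and test-only dependencies |
| planner | python3 -m pip install -e ".[planner]" | Dash and Plotly for the scenario planner surfaces |

Quote the extras so that shells such as zsh do not treat the square brackets as a glob pattern.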

If you created the environment from environment.yml, most development dependencies are already present.

Verify the install

A quick smoke check from the repository root:

python3 -c "from abacus.mmm.panel import PanelMMM; print(PanelMMM.__name__)"

For a real end-to-end verification path, use the repo smoke target:

make smoke_mmm

If you are working on the repo itself, the main local verification commands are:

make test
make verify_local
make verify_package

Runtime defaults for restricted environments

Some local runs need writable cache directories. If you hit PyTensor compiledir or cache-permission issues, export the same defaults used by the repo verification scripts:

export PYTENSOR_FLAGS="base_compiledir=/tmp/pytensor,linker=py"
export JAX_PLATFORMS=cpu
export XDG_CACHE_HOME=/tmp
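
If exporting shell variables is inconvenient, for example in a notebook, the same defaults can be set from Python. This is a minimal sketch that assumes it runs before anything imports PyTensor or JAX. The values are the ones listed above, and the RUNTIME_DEFAULTS name is only illustrative.

```python
import os

# Same defaults as the shell exports above. setdefault keeps any value
# that is already present in the environment.
RUNTIME_DEFAULTS = {
    "PYTENSOR_FLAGS": "base_compiledir=/tmp/pytensor,linker=py",
    "JAX_PLATFORMS": "cpu",
    "XDG_CACHE_HOME": "/tmp",
}

for name, value in RUNTIME_DEFAULTS.items():
    os.environ.setdefault(name, value)

# Import Abacus (and therefore PyTensor) only after this point.
```

Using setdefault rather than plain assignment means an explicit export in your shell still wins.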

Quickstart: Python API

This page shows the fastest direct path from a pandas dataset to a fitted PanelMMM.

If you have not prepared your dataset yet, read Data Preparation first.

Load a dataset

The repository includes bundled demo datasets under data/demo/. The timeseries bundle is the simplest starting point because it has no extra panel dimensions.

import pandas as pd

dataset = pd.read_csv("data/demo/timeseries/dataset.csv")
dataset["date"] = pd.to_datetime(dataset["date"])

X = dataset.drop(columns=["revenue"])
y = dataset["revenue"].rename("revenue")

Construct PanelMMM

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    target_column="revenue",
    channel_columns=[
        "channel_1",
        "channel_2",
        "channel_3",
        "channel_4",
        "channel_5",
        "channel_6",
    ],
    yearly_seasonality=2,
    adstock=GeometricAdstock(l_max=4),
    saturation=LogisticSaturation(),
)

This example uses a plain timeseries. If your dataset has panel dimensions such as geo or brand, add them with dims=(...) and keep those columns in X.

Fit the model

You can call fit() directly. If the model graph has not been built yet, Abacus builds it for you.

idata = mmm.fit(
    X,
    y,
    draws=200,
    tune=200,
    chains=2,
    cores=2,
    progressbar=False,
    compute_convergence_checks=False,
    random_seed=42,
)

fit() returns an arviz.InferenceData object and also stores it on the model instance as mmm.idata.

Prior and posterior predictive checks

You can sample prior predictive draws before fitting:

prior = mmm.sample_prior_predictive(
    X=X,
    y=y,
    samples=50,
    random_seed=42,
)

After fitting, you can sample posterior predictive draws:

post = mmm.sample_posterior_predictive(
    X=X,
    progressbar=False,
    random_seed=42,
)

By default, this also stores posterior predictive draws on mmm.idata.

When to call build_model()

Call build_model(X, y) explicitly when you want to inspect or modify the PyMC graph before sampling.

For example, you might build first so that you can add deterministic variables that are stored in original (unscaled) target units:

mmm.build_model(X, y)
mmm.add_original_scale_contribution_variable(
    var=["channel_contribution", "y"]
)

After that, fit the already-built model:

idata = mmm.fit(
    X,
    y,
    draws=200,
    tune=200,
    chains=2,
    cores=2,
    progressbar=False,
    compute_convergence_checks=False,
    random_seed=42,
)

Basic outputs

After fitting, common next steps are:

mmm.save("mmm.nc")
fig, axes = mmm.plot.posterior_predictive()

You can also inspect:

  • mmm.posterior
  • mmm.posterior_predictive
  • mmm.summary
  • mmm.diagnostics

Quickstart: YAML Builder

Use the YAML builder when you want the model specification to live in a config file instead of Python code.

The builder entry point is:

from abacus.mmm.builders.yaml import build_mmm_from_yaml

Smallest useful workflow

The bundled demo config at data/demo/timeseries/config.yml is a working starting point. It already points to a combined dataset with data.dataset_path.

import pandas as pd

from abacus.mmm.builders.yaml import build_mmm_from_yaml

dataset = pd.read_csv("data/demo/timeseries/dataset.csv")
X = dataset.drop(columns=["revenue"])
y = dataset["revenue"].rename("revenue")

mmm = build_mmm_from_yaml(
    "data/demo/timeseries/config.yml",
    X=X,
    y=y,
)

build_mmm_from_yaml(...) returns a PanelMMM instance with the PyMC graph already built.

Minimal config structure

At minimum, the YAML config needs data, target, and media blocks that describe the dataset, the target, and the media specification. The fit block in this example is optional and sets sampler defaults.

data:
  dataset_path: dataset.csv
  date_column: date

target:
  column: revenue
  type: revenue

media:
  channels:
    - channel_1
    - channel_2
    - channel_3
  adstock:
    type: geometric
    l_max: 4
  saturation:
    type: logistic

fit:
  draws: 1000
  tune: 1000
  chains: 4
  random_seed: 42

How data loading works

The builder supports two data-loading patterns.

| Pattern | What you provide |
| --- | --- |
| Combined dataset | data.dataset_path in YAML, or X and y already split in Python |
| Separate files | data.x_path and data.y_path in YAML |

If you use data.dataset_path, the target column must be present in that file. The builder splits it out into X and y before building the model.

The builder also normalises X[date_column] with pd.to_datetime(...) after loading the data.

Configured relative paths are resolved relative to the YAML file location.
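
The resolution rule can be pictured with pathlib. This is only a sketch of the idea, not the builder's code: resolve_config_path is a hypothetical helper, and passing absolute paths through unchanged is an assumption.

```python
from pathlib import Path


def resolve_config_path(config_file: str, configured: str) -> Path:
    """Resolve a path from the YAML relative to the YAML file's directory.

    Hypothetical helper for illustration; absolute paths are returned as-is.
    """
    candidate = Path(configured)
    if candidate.is_absolute():
        return candidate
    return Path(config_file).parent / candidate


dataset_path = resolve_config_path("data/demo/timeseries/config.yml", "dataset.csv")
print(dataset_path.as_posix())
```

With these inputs, dataset.csv resolves to data/demo/timeseries/dataset.csv, regardless of the directory you launch Python from.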

Fit after building

The builder does not fit the model for you. Fit it in the usual way:

idata = mmm.fit(
    X,
    y,
    draws=200,
    tune=200,
    chains=2,
    cores=2,
    progressbar=False,
    compute_convergence_checks=False,
    random_seed=42,
)

The builder can load the data itself from data.dataset_path, but fit(...) still needs X and y. If you rely on data.dataset_path, load and split the combined dataset once in Python, then pass the same X and y to both build_mmm_from_yaml(...) and fit(...).

Optional top-level YAML blocks

The builder recognises several optional top-level sections in addition to data, target, and media.

| Key | Purpose |
| --- | --- |
| dimensions | Panel-dimension columns such as geo or brand |
| scaling | Optional scaling rules for target and channels |
| effects | Additive effects to attach before model build |
| priors | Model-level priors passed into PanelMMM |
| fit | Sampler defaults used by the runner or by Python overrides |
| holidays | Holiday/event configuration applied before build |
| original_scale_vars | Add original-scale deterministic variables after build |
| inference_data | Attach existing inference data if the file exists |
| calibration | Apply calibration steps after the model is built |

Override config values from Python

Use model_kwargs when you want to keep most settings in YAML but override a subset from Python.

For example, you can override the fit config for a lighter quickstart run:

mmm = build_mmm_from_yaml(
    "data/demo/timeseries/config.yml",
    X=X,
    y=y,
    model_kwargs={
        "sampler_config": {
            "draws": 200,
            "tune": 200,
            "chains": 2,
            "cores": 2,
            "progressbar": False,
            "compute_convergence_checks": False,
            "random_seed": 42,
        }
    },
)

model_kwargs takes precedence over the YAML defaults.

Quickstart: Pipeline Runner

Use the pipeline runner when you want a full staged run instead of only an in-memory model fit.

The runner writes:

  • a run manifest
  • copied and resolved config files
  • fitted model artefacts
  • posterior predictive assessment outputs
  • decomposition, diagnostics, and response-curve artefacts

Fastest first run: bundled demo

From the repository root, the quickest way to see a real structured run is the demo launcher:

python3 runme.py --demo timeseries

Other bundled demos are:

  • geo_panel
  • geo_brand_panel

List them explicitly with:

python3 runme.py --list-demos

runme.py is a convenience wrapper around the structured pipeline. It resolves the demo config under data/demo/<demo_name>/config.yml and runs the pipeline for you.

Run the pipeline from Python

The direct Python API is:

from pathlib import Path

from abacus.pipeline import PipelineRunConfig, run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("data/demo/geo_panel/config.yml"),
        output_dir=Path("results"),
        run_name="geo_panel_quickstart",
        prior_samples=10,
        draws=200,
        tune=200,
        chains=2,
        cores=2,
        random_seed=42,
        curve_samples=50,
        curve_points=50,
    )
)

print(result.run_dir)
print(result.manifest_path)

If the YAML config already contains data.dataset_path, you do not need to pass dataset_path again.

Run the thin CLI directly

The pipeline also exposes a thin CLI in abacus.pipeline.runner:

python3 -m abacus.pipeline.runner \
  --config data/demo/geo_panel/config.yml \
  --output-dir results \
  --run-name geo_panel_quickstart \
  --prior-samples 10 \
  --draws 200 \
  --tune 200 \
  --chains 2 \
  --cores 2 \
  --random-seed 42 \
  --curve-samples 50 \
  --curve-points 50

The CLI prints the final run directory when the pipeline completes.

Override data paths

Use one of these patterns:

| Pattern | Arguments |
| --- | --- |
| Combined dataset override | dataset_path= in Python or --dataset-path in the CLI |
| Separate feature and target files | x_path= and y_path= in Python or --x-path and --y-path in the CLI |
| Target column override | target_column= in Python or --target-column in the CLI |

Configured relative paths are resolved relative to the YAML config directory.

If you want the diagnostics stage (50_diagnostics) to use different warn/fail cutoffs, add a runner-only diagnostics.thresholds block to the YAML. See YAML Configuration.

What you get back

run_pipeline(...) returns a PipelineRunResult with:

  • run_dir
  • manifest_path

The output directory contains stage folders such as:

  • 00_run_metadata
  • 20_model_fit
  • 30_model_assessment
  • 50_diagnostics
  • 60_response_curves

60_response_curves includes three complementary curve families:

  • saturation-only transformation artefacts
  • forward-pass direct contribution artefacts built from scaled observed history
  • adstock carryover artefacts

When to use the runner

Choose the runner when you want:

  • a reproducible run directory on disk
  • structured metadata and manifest files
  • staged artefacts for diagnostics and reporting
  • a config-driven workflow for repeated runs

If you only need to fit a model interactively in a notebook or script, start with Quickstart: Python API or Quickstart: YAML Builder.

Data Preparation

This section explains how to prepare X and y for PanelMMM. It covers the required columns, how panel rows are organised when you use dims, and how Abacus scales channels and the target before fitting.

Pages

  • Input Data Requirements - Required X and y inputs, column roles, alignment rules, and common input errors.
  • Panel Data Layout - How to structure rows for no panel dims, one dim such as geo, or multiple dims such as geo and brand.
  • Scaling and Preprocessing - What Abacus scales automatically, how Scaling works, and what to preprocess yourself.

Input Data Requirements

Use this page together with Panel Data Layout and Scaling and Preprocessing when you prepare a dataset for PanelMMM.

Core contract

For direct Python use, PanelMMM expects:

  • X as a pandas.DataFrame
  • y as a pandas.Series named target_column, or a one-dimensional NumPy array of the same length as X

X must contain the date column, all media columns, and any configured control_columns or dims columns. y carries only the target values.

| Role | Where it must be present | Required | Notes |
| --- | --- | --- | --- |
| date_column | X | Yes | Normalise to datetimes or parseable date strings. |
| channel_columns | X | Yes | Every listed channel column must exist in X. |
| target_column | y | Yes | y.name should match target_column. |
| control_columns | X | No | If configured, every listed control column must exist in X. |
| dims | X | No | One column per configured panel dimension, such as geo or brand. |

X and y

When you call fit(X, y) or build_model(X, y):

  • Keep the target out of X.
  • Keep X and y row-aligned.
  • If both are pandas objects, keep the same index on both. The shared regression builder checks index equality before fitting.
  • If you pass y as a NumPy array, its length must match len(X).
  • For panel models, each date_column + dims combination must appear exactly once. Duplicate rows are rejected.

Abacus uses target_column as the target name throughout the panel reshape. If y is a Series, its name must match target_column.
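
The rules above can be checked before you call fit(). The following is a minimal pre-flight sketch that uses only pandas. check_inputs is a hypothetical helper written for this page, not part of Abacus, and it covers only the rules listed in this section.

```python
import pandas as pd


def check_inputs(X, y, date_column, target_column, dims=()):
    """Return a list of problems found in X and y (empty if none)."""
    problems = []
    if target_column in X.columns:
        problems.append(f"target column {target_column!r} is still in X")
    if len(X) != len(y):
        problems.append("X and y have different lengths")
    if isinstance(y, pd.Series):
        if y.name != target_column:
            problems.append(f"y.name is {y.name!r}, expected {target_column!r}")
        if not X.index.equals(y.index):
            problems.append("X and y have different indexes")
    # Each date + dims combination must appear exactly once.
    keys = [date_column, *dims]
    if X.duplicated(subset=keys).any():
        problems.append(f"duplicate rows for {keys}")
    return problems


X = pd.DataFrame(
    {
        "date": pd.to_datetime(["2025-01-06", "2025-01-06", "2025-01-13"]),
        "geo": ["UK", "UK", "US"],
        "tv": [120.0, 150.0, 125.0],
    }
)
y = pd.Series([820.0, 910.0, 835.0], name="revenue")

for problem in check_inputs(X, y, "date", "sales", dims=("geo",)):
    print(problem)
```

In this toy input the Series is named revenue rather than sales and the UK row for 2025-01-06 appears twice, so the helper reports two problems.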

Date column

date_column is required in X.

Abacus expects calendar dates, not integer date codes. In practice:

  • Use datetime64[ns] where possible.
  • Parse string dates with pd.to_datetime(...) before fitting when you use the Python API.
  • Do not rely on numeric date values such as 0, 1, 2. By default, pd.to_datetime(...) reads integers as nanosecond offsets from the Unix epoch (1970-01-01), which is usually not what you want.

The YAML builder normalises X[date_column] with pd.to_datetime(...) after loading the dataset. Direct Python use does not add an equivalent preprocessing step for you.
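
The numeric-date pitfall is easy to reproduce with pandas alone. In the sketch below, the start date and the weekly spacing used to repair the codes are assumptions chosen for illustration; use whatever calendar your codes actually represent.

```python
import pandas as pd

codes = pd.Series([0, 1, 2])

# Integers are read as nanoseconds since 1970-01-01, so every value
# lands on the same calendar day.
as_epoch = pd.to_datetime(codes)
print(as_epoch.dt.date.unique())

# If the codes are really week indexes, map them onto real dates explicitly.
# The start date and weekly frequency here are assumptions for illustration.
start = pd.Timestamp("2025-01-06")
as_weeks = start + pd.to_timedelta(codes * 7, unit="D")
print(as_weeks.tolist())
```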

Channel columns

channel_columns is a required constructor argument and must be a non-empty list.

Each listed channel:

  • must be present in X
  • must be fully observed for every row you pass into fit or posterior prediction; Abacus does not silently convert missing channel values to zero
  • should represent the raw media variable that you want the adstock and saturation transformations to consume
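
Because missing channel values are not zero-filled, it can help to count them before fitting. A small pandas sketch with made-up data:

```python
import pandas as pd

X = pd.DataFrame(
    {
        "date": pd.to_datetime(["2025-01-06", "2025-01-13", "2025-01-20"]),
        "tv": [120.0, None, 125.0],
        "search": [40.0, 42.0, 45.0],
    }
)
channel_columns = ["tv", "search"]

# Count missing cells per channel and show only the channels with gaps.
missing_counts = X[channel_columns].isna().sum()
print(missing_counts[missing_counts > 0])
```

Whether a gap should become an explicit zero or an imputed value is a modelling decision that depends on what the gap means in your data.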

Target column

target_column names the dependent variable. It defaults to "y", but you can set a different name such as "sales" or "conversions".

For direct Python use:

  • pass the target as y
  • name the Series with target_column
  • keep the target fully observed; missing target values are rejected rather than zero-filled

For combined-file YAML or pipeline flows:

  • keep the target column in the source dataset
  • Abacus splits it out of the combined dataset before fitting

Control columns

control_columns is optional.

If you configure it, every listed control column must be present in X. Controls stay in the design matrix as separate regressors; they are not part of y.

Like channels, configured controls must be fully observed for every row passed into fit or posterior prediction.

Abacus does not automatically scale controls. See Scaling and Preprocessing.

Panel dimensions with dims

dims is optional. Use it when you want a panel model, for example by geo, brand, or market.

If you set dims=("geo", "brand"):

  • X must contain geo and brand columns
  • each row in X represents one date + geo + brand observation
  • each new date must include every fitted panel slice when you later call posterior-predictive methods with new data

Do not use reserved internal names in dims:

  • date
  • channel
  • control
  • fourier_mode

For row layout and rectangularity guidance, see Panel Data Layout.

Supported shapes and alignment

| Workflow | Supported shape |
| --- | --- |
| Direct PanelMMM.fit() / build_model() | X: DataFrame; y: Series or 1D ndarray |
| YAML builder with data.dataset_path | One tabular file containing both predictors and the target column |
| Pipeline runner with dataset_path | Same as above |
| Pipeline runner with x_path and y_path | Separate feature and target files; the runner extracts target_column from the target file |

Abacus also has an internal alignment helper that can work with a MultiIndex target Series indexed by [date_column, *dims], but that is mainly used in fit-data rebuild and load flows. For normal fitting, keep y row-aligned with X.

Python example

import pandas as pd

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

dataset = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2025-01-06", "2025-01-06", "2025-01-13", "2025-01-13"]
        ),
        "geo": ["UK", "US", "UK", "US"],
        "tv": [120.0, 150.0, 125.0, 152.0],
        "search": [40.0, 55.0, 42.0, 58.0],
        "price_index": [1.02, 0.99, 1.01, 1.00],
        "sales": [820.0, 910.0, 835.0, 925.0],
    }
)

X = dataset.drop(columns=["sales"])
y = dataset["sales"].rename("sales")

mmm = PanelMMM(
    date_column="date",
    channel_columns=["tv", "search"],
    target_column="sales",
    control_columns=["price_index"],
    dims=("geo",),
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
)

mmm.fit(X, y)

YAML note

If you use a combined dataset in YAML, the file at data.dataset_path must contain every configured column:

  • date_column
  • every entry in channel_columns
  • every entry in control_columns, if any
  • every entry in dims, if any
  • target_column

Example:

data:
  dataset_path: panel_dataset.csv
  date_column: date

target:
  column: sales
  type: revenue

dimensions:
  panel: [geo]

media:
  channels: [tv, search]
  controls: [price_index]
  adstock:
    type: geometric
    l_max: 8
  saturation:
    type: logistic

Common pitfalls

  • Missing date_column, channel, control, or dimension columns in X
  • Passing a y Series whose name does not match target_column
  • Passing pandas X and y with different indexes
  • Passing a NumPy y with a different length from X
  • Passing duplicate panel rows or incomplete panel slices for a given date
  • Passing missing observed channel, control, or target values and expecting Abacus to treat them as structural zeroes
  • Expecting the YAML builder or pipeline to find a target column that is not present in the combined dataset
  • Leaving date values as numeric codes instead of normalising them first

Panel Data Layout

This page explains how PanelMMM expects panel rows to be organised in X. For the column-level contract, see Input Data Requirements.

What “panel” means in Abacus

In Abacus, a panel dataset repeats the same time axis across one or more categorical dimensions in dims.

Each row represents:

  • one date_column value
  • one combination of dims values, if any
  • one set of channel and optional control values for that slice

With no extra panel dims, each date appears once. With dims=("geo",), each date appears once per geo. With dims=("geo", "brand"), each date appears once per geo + brand combination.

How dims work

Pass panel dimensions when you construct the model:

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    channel_columns=["tv", "search"],
    target_column="sales",
    dims=("geo", "brand"),
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
)

dims columns stay in X. They are not moved into y.

Abacus reserves these names for internal coordinates, so do not use them in dims:

  • date
  • channel
  • control
  • fourier_mode

No extra panel dims

If dims=(), X should have one row per date.

| date | tv | search | sales |
| --- | --- | --- | --- |
| 2025-01-06 | 120 | 40 | 820 |
| 2025-01-13 | 125 | 42 | 835 |
| 2025-01-20 | 130 | 45 | 850 |

Internally, Abacus reshapes this into:

  • channels: (date, channel)
  • target: (date,)
  • controls, if present: (date, control)

Single panel dim example: geo

If dims=("geo",), each date should appear once for each geo value.

| date | geo | tv | search | sales |
| --- | --- | --- | --- | --- |
| 2025-01-06 | UK | 120 | 40 | 820 |
| 2025-01-06 | US | 150 | 55 | 910 |
| 2025-01-13 | UK | 125 | 42 | 835 |
| 2025-01-13 | US | 152 | 58 | 925 |

Internally, Abacus reshapes this into:

  • channels: (date, geo, channel)
  • target: (date, geo)
  • controls, if present: (date, geo, control)

Multiple panel dims example: geo and brand

If dims=("geo", "brand"), each row identifies one date, one geo, and one brand.

import pandas as pd

X = pd.DataFrame(
    {
        "date": pd.to_datetime(
            [
                "2025-01-06",
                "2025-01-06",
                "2025-01-06",
                "2025-01-06",
                "2025-01-13",
                "2025-01-13",
                "2025-01-13",
                "2025-01-13",
            ]
        ),
        "geo": ["UK", "UK", "US", "US", "UK", "UK", "US", "US"],
        "brand": ["A", "B", "A", "B", "A", "B", "A", "B"],
        "tv": [80.0, 55.0, 92.0, 60.0, 82.0, 58.0, 95.0, 63.0],
        "search": [20.0, 18.0, 24.0, 19.0, 21.0, 18.5, 25.0, 20.0],
    }
)

y = pd.Series(
    [510.0, 370.0, 590.0, 405.0, 520.0, 380.0, 605.0, 418.0],
    name="sales",
)

For a rectangular panel, the row count is:

n_dates * n_geo * n_brand

Internal reshape

Abacus converts the pandas inputs into xarray datasets before building the PyMC model.

| Input role | Internal variable | xarray dims |
| --- | --- | --- |
| X[channel_columns] | _channel | (date, *dims, channel) |
| X[control_columns] | _control | (date, *dims, control) |
| y | _target | (date, *dims) |

The channel and control dimensions come from the configured column names, not from row values.
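
To build intuition for the (date, *dims, channel) layout, here is a rough NumPy analogue of the reshape for dims=("geo",). It is not the Abacus implementation, which uses xarray; it only shows how long-format rows map onto a three-axis array, and it assumes a rectangular panel with one row per date + geo cell.

```python
import pandas as pd

X = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2025-01-06", "2025-01-06", "2025-01-13", "2025-01-13"]
        ),
        "geo": ["UK", "US", "UK", "US"],
        "tv": [120.0, 150.0, 125.0, 152.0],
        "search": [40.0, 55.0, 42.0, 58.0],
    }
)
channel_columns = ["tv", "search"]

# Sort so rows vary slowest by date, then by geo. This only works for a
# rectangular panel with one row per date + geo cell.
ordered = X.sort_values(["date", "geo"])
n_dates = ordered["date"].nunique()
n_geo = ordered["geo"].nunique()

# Axis order mirrors (date, geo, channel).
channel_array = ordered[channel_columns].to_numpy().reshape(
    n_dates, n_geo, len(channel_columns)
)
print(channel_array.shape)
```

The last axis has one entry per configured column name, which is why the channel coordinate comes from channel_columns rather than from row values.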

Rectangularity, duplicates, and missing rows

Abacus builds xarray coordinates from the unique values it sees in:

  • date_column
  • each configured dimension column
  • the configured channel or control names

That has four practical consequences:

  • Keep the panel rectangular. Provide one row for every expected date_column + dims combination.
  • Use explicit zeroes for structural no-spend or no-activity rows.
  • Keep declared channel, control, and target values observed within those rows. Abacus rejects missing metric cells instead of silently converting them to zeroes.
  • Do not use missing rows to mean “unknown”. Abacus validates panel shape before reshape and raises an error if panel cells are missing.

Abacus also requires each date_column + dims combination to appear exactly once. It does not aggregate duplicates for you. If you have duplicate rows, deduplicate or aggregate them before fitting or posterior prediction.
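
One way to find gaps before Abacus does is to compare the observed date + dims combinations against the full rectangular grid. The sketch below uses only pandas; missing_panel_cells is a hypothetical helper, not an Abacus function.

```python
import pandas as pd


def missing_panel_cells(X, date_column, dims):
    """Return the date + dims combinations that a rectangular panel would
    contain but that are absent from X."""
    keys = [date_column, *dims]
    expected = pd.MultiIndex.from_product(
        [X[key].unique() for key in keys], names=keys
    )
    observed = pd.MultiIndex.from_frame(X[keys])
    return expected.difference(observed)


X = pd.DataFrame(
    {
        "date": pd.to_datetime(["2025-01-06", "2025-01-06", "2025-01-13"]),
        "geo": ["UK", "US", "UK"],
        "tv": [120.0, 150.0, 125.0],
    }
)

missing = missing_panel_cells(X, "date", ("geo",))
print(len(missing), "missing cell(s)")
print(missing.tolist())
```

Here the US row for 2025-01-13 is absent, so one cell is reported. Duplicates can be listed in a similar way with X[X.duplicated(subset=keys, keep=False)].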

Sorting and uniqueness

Sort your data before fitting:

  • first by date_column
  • then by each entry in dims

Abacus keeps dates in the order they appear in X, and time-varying features infer time resolution from adjacent rows. A sorted dataset makes the time axis deterministic and easier to reason about.

Also make sure that each date_column + dims combination appears once in the prepared table, and that every expected panel slice is present for every date.
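
A simple way to keep X and y aligned while sorting is to sort the combined table first and split afterwards. A short pandas sketch with made-up rows:

```python
import pandas as pd

dataset = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2025-01-13", "2025-01-06", "2025-01-13", "2025-01-06"]
        ),
        "geo": ["US", "US", "UK", "UK"],
        "tv": [152.0, 150.0, 125.0, 120.0],
        "sales": [925.0, 910.0, 835.0, 820.0],
    }
)

# Sort by date first, then by each panel dimension, and reset the index so
# that X and y share the same clean RangeIndex after the split.
dataset = dataset.sort_values(["date", "geo"]).reset_index(drop=True)

X = dataset.drop(columns=["sales"])
y = dataset["sales"].rename("sales")
```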

DataFrame versus MultiIndex handling

For normal fitting:

  • use a regular DataFrame for X
  • keep date_column and any dims as columns in that DataFrame
  • use a row-aligned Series for y

Abacus does have internal helpers that can align a MultiIndex target Series indexed by [date_column, *dims], but that is not the main user-facing data preparation pattern for fit().

Practical checklist

  • One row per date_column + dims combination
  • No duplicate rows for the same panel cell
  • Same set of dates for every panel slice
  • Explicit zeroes for true zero activity
  • No missing observed channel, control, or target values
  • Sorted rows before fitting

For scaling choices once the layout is correct, see Scaling and Preprocessing.

Scaling and Preprocessing

Abacus scales channels and the target automatically before it builds the PyMC graph for PanelMMM. This page explains what is scaled, how the Scaling configuration works, and what you still need to preprocess yourself.

What Abacus scales automatically

Abacus computes scales from the reshaped xarray dataset immediately before model construction.

| Variable role | Automatic scaling | Notes |
| --- | --- | --- |
| Target (y) | Yes | Divided by target_scale before the likelihood is built. |
| Channels (channel_columns) | Yes | Divided by channel_scale before adstock and saturation. |
| Controls (control_columns) | No | Controls enter the model on their original scale. |
| Date and dims columns | No | These define coordinates, not modelled numeric inputs. |

Abacus stores the resulting scalers in the model as xarray data:

  • _target scaler data in model.scalers["_target"]
  • _channel scaler data in model.scalers["_channel"]

Default behaviour

If you do not pass scaling, PanelMMM uses:

Scaling(
    target=VariableScaling(method="max", dims=dims),
    channel=VariableScaling(method="max", dims=dims),
)

This means:

  • the target is divided by the maximum over date and all configured dims
  • each channel is divided by its maximum over date and all configured dims

With no extra panel dims:

  • target_scale is a scalar
  • channel_scale has dimension channel

With dims=("geo",) and the default scaling:

  • target_scale is still a scalar, because scaling reduces over both date and geo
  • channel_scale still has dimension channel, so each channel is pooled across all geos

If you want per-panel scales instead of pooled scales, set dims=() inside the relevant VariableScaling. See Dimension semantics.

Scaling and VariableScaling

Use abacus.mmm.scaling.Scaling and abacus.mmm.scaling.VariableScaling to control automatic scaling.

| Setting | Purpose | Allowed values |
| --- | --- | --- |
| VariableScaling.method | Reduction used to compute the scale | "max" or "mean" |
| VariableScaling.dims | Extra dimensions to reduce across, in addition to date | String or tuple of strings |
| Scaling.target | Scaling rule for the target | VariableScaling |
| Scaling.channel | Scaling rule for channels | VariableScaling |

Rules enforced by the implementation:

  • date is always assumed in the reduction and must not be listed in VariableScaling.dims.
  • Duplicate scaling dims are not allowed.
  • Target scaling dims must come from the model dims.
  • Channel scaling dims must come from the model dims, with optional inclusion of channel.

You can pass either:

  • a Scaling object
  • a plain dictionary with target and channel keys

If the dictionary omits one side, Abacus fills the missing target or channel rule with the default method="max", dims=dims configuration.

Dimension semantics

VariableScaling.dims tells Abacus which dimensions to reduce across in addition to date. It does not tell Abacus which dimensions to keep.

Assume a model with dims=("geo",) so channel data has dimensions (date, geo, channel) and target data has dimensions (date, geo).

| Configuration | Reduction performed | Resulting scale dims | Meaning |
| --- | --- | --- | --- |
| target.dims=() | over date | (geo,) | One target scale per geo |
| target.dims=("geo",) | over date, geo | () | One pooled target scale |
| channel.dims=() | over date | (geo, channel) | One scale per geo-channel pair |
| channel.dims=("geo",) | over date, geo | (channel,) | One pooled scale per channel |
| channel.dims=("geo", "channel") | over date, geo, channel | () | One pooled scale for all channels |
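
The "reduce across" semantics can be reproduced with plain NumPy axis reductions. This is an illustration of the arithmetic under method="max", not the Abacus implementation; the toy array stands in for channel data with axes (date, geo, channel).

```python
import numpy as np

# Toy channel data with axes (date, geo, channel): 3 dates, 2 geos, 2 channels.
channel = np.arange(1.0, 13.0).reshape(3, 2, 2)

# dims=(): reduce over date only, keeping (geo, channel).
per_geo_channel = channel.max(axis=0)

# dims=("geo",): reduce over date and geo, keeping (channel,).
per_channel = channel.max(axis=(0, 1))

# dims=("geo", "channel"): reduce over everything, leaving a scalar.
pooled = channel.max(axis=(0, 1, 2))

print(per_geo_channel.shape, per_channel.shape, np.shape(pooled))

# Dividing by the scale broadcasts over the reduced axes.
scaled = channel / per_channel
print(scaled.max(axis=(0, 1)))
```

Every axis listed in dims disappears from the scale, which is why adding a dimension produces a more pooled scale rather than a more granular one.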

Python example

This example keeps separate scales for each geo by reducing only over date:

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM
from abacus.mmm.scaling import Scaling, VariableScaling

mmm = PanelMMM(
    date_column="date",
    channel_columns=["tv", "search"],
    target_column="sales",
    dims=("geo",),
    scaling=Scaling(
        target=VariableScaling(method="mean", dims=()),
        channel=VariableScaling(method="max", dims=()),
    ),
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
)

In that configuration:

  • the target is divided by the per-geo mean over time
  • each channel is divided by the per-geo, per-channel maximum over time

YAML example

The YAML builder accepts the same structure through a top-level scaling block:

data:
  date_column: date

target:
  column: y
  type: revenue

dimensions:
  panel: [market]

media:
  channels: [channel_1, channel_2]
  adstock:
    type: geometric
    l_max: 8
  saturation:
    type: logistic

scaling:
  target:
    method: max
    dims: []
  channels:
    method: max
    dims: [market]

In this example:

  • target is scaled separately for each market
  • channel is scaled across date and market, leaving one scale per channel

Original units versus model scale

The model is fit on scaled target and channel data.

That affects downstream interpretation:

  • posterior likelihood and many contribution variables live in scaled target space
  • channel inputs are transformed after scaling, not in raw units

If you want stored deterministics in original target units, add them explicitly after build_model(...):

mmm.add_original_scale_contribution_variable(
    var=["channel_contribution", "y"]
)

The YAML builder supports the same workflow through original_scale_vars:

original_scale_vars:
  - channel_contribution
  - y

original_scale_vars adds extra original-scale deterministic variables. It does not change how the model is fit.
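
The relationship between scaled space and original units is a single multiplication by the stored scale. The sketch below shows the arithmetic with made-up numbers, assuming method="max" and no panel dims so that target_scale is a scalar. It does not call Abacus.

```python
import numpy as np

sales = np.array([820.0, 835.0, 850.0])

# method="max" with no panel dims gives a scalar scale.
target_scale = sales.max()
scaled_sales = sales / target_scale

# A contribution expressed in scaled target space (made-up values).
scaled_contribution = np.array([0.10, 0.12, 0.11])

# Multiplying by the same scale returns target units.
contribution_in_sales_units = scaled_contribution * target_scale
print(contribution_in_sales_units)
```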

What Abacus does not preprocess for you

Abacus does not automatically:

  • scale controls
  • impute missing data in a domain-aware way
  • reinterpret missing observed channel, control, or target values as zeroes
  • sort the dataset for you
  • repair non-rectangular panel layouts
  • tolerate duplicate panel rows or incomplete panel slices
  • coerce Python-API dates to datetimes before fitting

Practical preprocessing advice

Before fitting:

  • normalise date_column with pd.to_datetime(...)
  • sort by date_column and then by dims
  • make panel gaps explicit instead of leaving missing rows
  • ensure every date_column + dims panel cell appears exactly once
  • impute missing observed channel, control, and target values before fitting or posterior prediction instead of relying on implicit zero-fill
  • decide whether controls should be centred, standardised, log-transformed, or otherwise prepared before they go into control_columns
  • choose scaling dims deliberately instead of relying on the default when you use panel data
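
Because controls are not scaled for you, any transformation has to happen in your own preprocessing. As one option among those listed above, this sketch standardises a single control with pandas. Whether to standardise at all, and whether to do it per panel slice, is a modelling choice that this example does not settle.

```python
import pandas as pd

X = pd.DataFrame(
    {
        "date": pd.to_datetime(["2025-01-06", "2025-01-13", "2025-01-20"]),
        "price_index": [1.02, 0.99, 1.01],
    }
)

# Standardise the control: subtract the mean, divide by the sample
# standard deviation (pandas uses ddof=1 by default).
control = X["price_index"]
X["price_index"] = (control - control.mean()) / control.std()
```

Keep the mean and standard deviation from the fitting data if you need to apply the same transformation to new data later.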

Common pitfalls

  • Expecting the default scaling to be per-group when it actually pools across the configured panel dims
  • Adding date to VariableScaling.dims; Abacus rejects this
  • Forgetting that controls are left on their original scale
  • Treating VariableScaling.dims as dimensions to keep rather than dimensions to reduce across
  • Assuming original_scale_vars changes fitting scale rather than adding extra outputs

For the input table shape that scaling operates on, see Panel Data Layout.

Model Specification

This section explains how PanelMMM is defined: the model structure, media transforms, priors, panel dimensions, optional time variation, and calibration hooks.

Pages

  • Model Overview - The actual PanelMMM mean structure, scaled-space formulation, and optional components.
  • Adstock and Saturation - Supported media transforms, their priors, and the adstock_first composition order.
  • Priors and Configuration - Default prior keys, model_config, transform-prior overrides, and directional control priors.
  • Time-Varying Parameters - How time_varying_intercept and time_varying_media use SoftPlusHSGP.
  • Seasonality and Trends - Built-in yearly seasonality plus custom Fourier, trend, and event effects.
  • Panel Dimensions - How dims change the shape of the data and parameters.
  • Calibration - Lift-test and cost-per-target calibration for a built model.

Subsections of Model Specification

Model Overview

PanelMMM is an additive Bayesian marketing mix model built in PyMC. This page describes the model structure that Abacus actually builds.

For input layout, see Data Preparation. For individual configuration surfaces, see the other pages in this section.

Core structure

At fit time, Abacus builds the model mean in scaled target space as:

mu =
  intercept_contribution
  + sum(channel_contribution over channel)
  + sum(control_contribution over control), if control_columns are configured
  + mundlak_contribution, if use_mundlak_cre=True
  + yearly_seasonality_contribution, if yearly_seasonality is enabled
  + any additional mu_effects

The observed target is then attached through the configured likelihood distribution with mu=mu.

What is scaled and what is not

Before the PyMC graph is built:

  • channel data is scaled according to Scaling.channel
  • the target is scaled according to Scaling.target
  • controls are not scaled automatically

That means media and target priors operate on the model scale, not directly on the original business units. For the scaling surface, see Scaling and Preprocessing.

Model components

Component | Built when | Shape
intercept_contribution | Always | effectively ("date", *dims) in the model mean
channel_contribution | Always | ("date", *dims, "channel")
control_contribution | control_columns is set | ("date", *dims, "control")
mundlak_contribution | use_mundlak_cre=True | dims
yearly_seasonality_contribution | yearly_seasonality is set | ("date", *dims)
Additional additive effects | You add entries to mu_effects | ("date", *dims)

Abacus also adds total_media_contribution_original_scale automatically as a deterministic on the original target scale.

Media path

Each channel column goes through the configured media transform path:

  1. scale channel input
  2. apply adstock and saturation through forward_pass(...)
  3. optionally apply a time-varying media multiplier
  4. contribute the result through channel_contribution

See Adstock and Saturation and Time-Varying Parameters.

Controls

Controls enter the model as a separate additive term:

control_contribution = control_data * gamma_control

Use controls for non-media regressors such as price, macro indicators, or competitor measures. Controls are configured with control_columns and use gamma_control priors from model_config.

Panel dimensions

dims adds extra indexing axes such as geo, brand, or market.

With dims=("geo",), the model is indexed over date and geo. With dims=("geo", "brand"), it is indexed over date, geo, and brand.

Abacus does not automatically add hierarchical pooling just because dims is set. By default, parameters are indexed over the configured panel coordinates. If you want hierarchical shrinkage across those coordinates, encode it in the priors you pass to transforms or model_config.

See Panel Dimensions.

Optional components

Need | Main setting
Extra non-media regressors | control_columns
Correlated random effects adjustment | use_mundlak_cre=True
Built-in yearly seasonality | yearly_seasonality=<int>
Time-varying intercept | time_varying_intercept=True or custom HSGPBase
Time-varying media | time_varying_media=True or custom HSGPBase
Additional additive effects | append to mmm.mu_effects or use YAML effects
Calibration | add_lift_test_measurements(...), add_cost_per_target_calibration(...)

What target_type changes

target_type is semantic metadata, not a different likelihood family.

It affects downstream reporting such as the default efficiency metric label:

  • "revenue" -> ROAS
  • "conversion" -> CPA

It does not change the fitted functional form on its own.

Python example

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    target_column="sales",
    channel_columns=["tv", "search"],
    control_columns=["price_index"],
    dims=("geo",),
    yearly_seasonality=2,
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
)

This specification gives you:

  • an intercept
  • transformed media contributions for tv and search
  • a control contribution for price_index
  • yearly Fourier seasonality
  • a panel axis over geo

Adstock and Saturation

PanelMMM requires one adstock transform and one saturation transform. Abacus applies them inside the model graph rather than as a fixed preprocessing step.

For the econometrics framing of these transforms, see Adstock and Saturation for Econometricians.

How the transform path works

Abacus combines the two transforms through forward_pass(...):

  • if adstock_first=True, the order is adstock then saturation
  • if adstock_first=False, the order is saturation then adstock

The transformed result becomes channel_contribution on the model scale.

Adstock options

PanelMMM accepts any AdstockTransformation. The built-in options include:

Class | Key parameter priors by default | Notes
GeometricAdstock | alpha ~ Beta(1, 3) | Standard geometric carryover
BinomialAdstock | alpha ~ Beta(1, 3) | Alternative finite-lag carryover
DelayedAdstock | alpha ~ Beta(1, 3), theta ~ HalfNormal(1) | Allows a delayed peak
WeibullPDFAdstock | lam ~ Gamma(mu=2, sigma=1), k ~ Gamma(mu=3, sigma=1) | Flexible PDF-shaped carryover
WeibullCDFAdstock | lam ~ Gamma(mu=2, sigma=1), k ~ Gamma(mu=3, sigma=1) | Flexible CDF-shaped carryover

All adstock transforms also take:

  • l_max: maximum lag
  • normalize: whether the carryover weights are normalised
  • mode: convolution mode
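
To make l_max and normalize concrete, here is a minimal plain-Python sketch of geometric carryover. It is not the Abacus implementation: it assumes weights of the form alpha**lag over l_max lags and a causal convolution in which spend only affects the current and later periods. The function names are hypothetical.

```python
def geometric_weights(alpha, l_max, normalize=True):
    """Carryover weights alpha**0, ..., alpha**(l_max - 1)."""
    weights = [alpha**lag for lag in range(l_max)]
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


def apply_adstock(x, weights):
    """Causal convolution: each period receives weighted spend from earlier periods."""
    return [
        sum(w * x[t - lag] for lag, w in enumerate(weights) if t - lag >= 0)
        for t in range(len(x))
    ]


spend = [100.0, 0.0, 0.0, 0.0]
raw = apply_adstock(spend, geometric_weights(0.5, l_max=3, normalize=False))
norm = apply_adstock(spend, geometric_weights(0.5, l_max=3, normalize=True))
```

With normalize=False the burst of 100 decays as 100, 50, 25 and then stops because l_max=3. With normalize=True the same shape is rescaled so that the carried-over total equals the original spend.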

Saturation options

PanelMMM accepts any SaturationTransformation. Common built-ins include:

Class | Key parameter priors by default | Notes
LogisticSaturation | lam ~ Gamma(alpha=3, beta=1), beta ~ HalfNormal(2) | Default retained choice
MichaelisMentenSaturation | alpha ~ Gamma(mu=2, sigma=1), lam ~ HalfNormal(1) | Common diminishing-returns form
HillSaturation | slope ~ HalfNormal(1.5), kappa ~ HalfNormal(1.5), beta ~ HalfNormal(1.5) | Flexible Hill curve
HillSaturationSigmoid | sigma ~ HalfNormal(1.5), beta ~ HalfNormal(1.5), lam ~ HalfNormal(1.5) | Sigmoid Hill variant
RootSaturation | alpha ~ Beta(alpha=1, beta=2), beta ~ Gamma(mu=1, sigma=1) | Square-root style curvature
TanhSaturation | b ~ HalfNormal(1), c ~ HalfNormal(1) | Hyperbolic tangent form
TanhSaturationBaselined | x0, gain, r, beta all HalfNormal(1) | Baselined tanh form

Default prior dims

When you pass a transform to PanelMMM, Abacus assigns default prior dims for any transform prior that does not already have explicit dims:

  • adstock priors default to (*dims, "channel")
  • saturation priors default to (*dims, "channel")

If you want a different structure, set the prior dims explicitly on the transform.

Configure transforms in Python

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    channel_columns=["tv", "search"],
    target_column="sales",
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
    adstock_first=True,
)

Example with more customised transforms:

from pymc_extras.prior import Prior

from abacus.mmm import MichaelisMentenSaturation, WeibullCDFAdstock

adstock = WeibullCDFAdstock(
    l_max=12,
    priors={
        "lam": Prior("Gamma", mu=2, sigma=1, dims=("geo", "channel")),
        "k": Prior("Gamma", mu=3, sigma=1, dims=("geo", "channel")),
    },
)

saturation = MichaelisMentenSaturation(
    priors={
        "alpha": Prior("Gamma", mu=2, sigma=1, dims=("geo", "channel")),
        "lam": Prior("HalfNormal", sigma=1, dims="geo"),
    }
)

Configure transforms in YAML

data:
  date_column: date

target:
  column: sales
  type: revenue

media:
  channels: [tv, search]
  adstock:
    type: geometric
    l_max: 8
  saturation:
    type: logistic

Override transform priors through priors

Transform priors also appear in model_config under prefixed variable names. For example:

  • adstock_alpha
  • adstock_lam
  • adstock_k
  • saturation_lam
  • saturation_beta

That means you can override transform priors centrally, through the top-level priors mapping in YAML or through model_config in Python, if you prefer. See Priors and Configuration.

Choose the composition order

adstock_first is part of the model specification, not a plotting choice.

The current public YAML schema does not expose adstock_first; it uses the library default. If you need to change the composition order, use the Python API.

Use adstock_first=True when you want the model to apply carryover before diminishing returns. Use adstock_first=False when you want each period’s spend to saturate before the carryover step.

The code path is explicit:

  • True -> saturation(adstock(x))
  • False -> adstock(saturation(x))
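
The two orders give different curves, which is why the choice belongs to the model specification. The sketch below is a plain-Python illustration under stated assumptions: a normalised geometric adstock and a logistic saturation of the form (1 - exp(-lam * x)) / (1 + exp(-lam * x)), with made-up parameter values. It does not call Abacus.

```python
import math


def adstock(x, alpha=0.5, l_max=3):
    """Normalised geometric carryover over the previous l_max periods."""
    weights = [alpha**lag for lag in range(l_max)]
    total = sum(weights)
    return [
        sum(w / total * x[t - lag] for lag, w in enumerate(weights) if t - lag >= 0)
        for t in range(len(x))
    ]


def saturation(x, lam=2.0):
    """Logistic saturation applied element-wise."""
    return [(1 - math.exp(-lam * v)) / (1 + math.exp(-lam * v)) for v in x]


x = [1.0, 0.0, 0.0, 0.0]  # one burst of scaled spend

adstock_then_saturate = saturation(adstock(x))  # adstock_first=True
saturate_then_adstock = adstock(saturation(x))  # adstock_first=False
```

In this example the first order spreads the burst before it saturates, so the initial response is larger than when the full burst saturates first and is then spread over later periods.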

Common pitfalls

  • Forgetting that l_max is required for adstock classes
  • Assuming dims automatically reshape transform priors that already carry explicit dims; Abacus only assigns default dims to priors without them, so incompatible explicit dims stay as you set them
  • Using adstock_first=False without a substantive reason
  • Treating transform priors as if they were on original business units rather than the model scale

Seasonality and Trends

Abacus supports one built-in seasonality switch on PanelMMM and a broader additive-effect mechanism for custom seasonality, trend, and event terms.

Built-in yearly seasonality

Set yearly_seasonality=<int> to add a yearly Fourier term directly to the main model specification.

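from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    target_column="sales",
    channel_columns=["tv", "search"],
    yearly_seasonality=3,  # Fourier order of the built-in yearly term
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
)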

This creates:

  • fourier_contribution as the unsummed Fourier basis contribution
  • yearly_seasonality_contribution as the additive contribution to mu

The prior for this built-in seasonality comes from model_config["gamma_fourier"].

What yearly_seasonality means

yearly_seasonality is the Fourier order passed to YearlyFourier.

It must be a positive integer. Abacus validates this at construction time.
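
For intuition about what a Fourier order means, the following sketch builds the sine and cosine features for one date and combines them with coefficients. It assumes a 365.25-day period and a "sines then cosines" layout, neither of which is confirmed by this page, and the coefficient values are hypothetical.

```python
import math


def yearly_fourier_features(day_of_year, n_order, days_in_year=365.25):
    """Return n_order sine terms followed by n_order cosine terms."""
    angles = [
        2 * math.pi * k * day_of_year / days_in_year
        for k in range(1, n_order + 1)
    ]
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


def seasonality_contribution(features, gamma):
    """Weighted sum of the Fourier features for one date."""
    return sum(f * g for f, g in zip(features, gamma))


features = yearly_fourier_features(day_of_year=0, n_order=3)
gamma = [0.1, 0.0, 0.0, 0.2, 0.0, 0.0]  # hypothetical coefficients
value = seasonality_contribution(features, gamma)
```

An order of 3 therefore yields six features per date; higher orders allow sharper seasonal shapes at the cost of more coefficients.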

Custom additive effects

For anything beyond built-in yearly seasonality, use mu_effects.

Each effect must add a tensor with dims:

("date", *dims)

Abacus ships three retained additive-effect types:

Effect | Use it for
FourierEffect | Custom seasonal structure such as weekly or monthly Fourier terms
LinearTrendEffect | Piecewise linear trend with changepoints
EventAdditiveEffect | Dated events such as launches, promotions, or holidays

FourierEffect

FourierEffect wraps a FourierBase implementation such as:

  • YearlyFourier
  • MonthlyFourier
  • WeeklyFourier

Example:

from abacus.mmm.additive_effect import FourierEffect
from abacus.mmm.fourier import WeeklyFourier

mmm.mu_effects.append(
    FourierEffect(
        fourier=WeeklyFourier(n_order=3, prefix="weekly_fourier")
    )
)

LinearTrendEffect

LinearTrendEffect wraps LinearTrend, which models piecewise linear trend changes through changepoints.

Example:

from abacus.mmm import LinearTrend
from abacus.mmm.additive_effect import LinearTrendEffect

mmm.mu_effects.append(
    LinearTrendEffect(
        trend=LinearTrend(
            n_changepoints=8,
            include_intercept=False,
            dims=("geo",),
        ),
        prefix="trend",
    )
)
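
The idea of a piecewise linear trend with changepoints can be sketched without Abacus. The function below is a generic illustration, not the LinearTrend implementation: it assumes the slope changes by a fixed amount after each changepoint, and all names and numbers are hypothetical.

```python
def piecewise_linear_trend(t, changepoints, deltas, base_slope=0.0):
    """Trend value at time t; the slope changes by deltas[i] after changepoints[i]."""
    value = base_slope * t
    for s, delta in zip(changepoints, deltas):
        value += delta * max(0.0, t - s)
    return value


changepoints = [2.0, 5.0]
deltas = [1.0, -1.5]
trend = [piecewise_linear_trend(float(t), changepoints, deltas) for t in range(8)]
```

The trend is flat until t=2, rises with slope 1 until t=5, then falls with slope -0.5. In the model, n_changepoints controls how many such slope changes are available.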

Events

For events, the retained public surface on PanelMMM is add_events(...).

Example:

import pandas as pd

from pymc_extras.prior import Prior

from abacus.mmm.events import EventEffect, GaussianBasis

df_events = pd.DataFrame(
    {
        "name": ["promo", "launch"],
        "start_date": pd.to_datetime(["2025-02-01", "2025-03-10"]),
        "end_date": pd.to_datetime(["2025-02-07", "2025-03-14"]),
    }
)

effect = EventEffect(
    basis=GaussianBasis(
        priors={"sigma": Prior("Gamma", mu=7, sigma=1, dims="event")}
    ),
    effect_size=Prior("Normal", mu=0, sigma=1, dims="event"),
    dims=("event",),
)

mmm.add_events(
    df_events=df_events,
    prefix="event",
    effect=effect,
)

The event effect dims must include the dimension named by prefix ("event" in this example) plus the model dims.

When to register effects

Add custom effects before you build or fit the model.

That applies to:

  • mmm.mu_effects.append(...)
  • mmm.add_events(...)

If you build the model first and only then append effects, those new terms are not part of the existing graph.

YAML effects

The YAML builder supports top-level effects: entries. Example:

effects:
  - type: linear_trend
    prefix: trend
    n_changepoints: 8
    include_intercept: false
  - type: weekly_fourier
    order: 3
    prefix: weekly_fourier

The builder appends these effects before calling build_model(...).

Choosing between built-in and custom seasonality

Use yearly_seasonality when you need a compact built-in annual effect.

Use FourierEffect when you need:

  • weekly seasonality
  • monthly seasonality
  • multiple seasonal effects together
  • custom Fourier prefixes or priors

Common pitfalls

  • Adding effects after the model has already been built
  • Using event effect dims that do not include the required prefix
  • Treating yearly_seasonality and a custom yearly Fourier effect as if they were separate concepts when they are both additive seasonal terms

Next steps

  • Read Time-Varying Parameters if you want trend or media behaviour to vary smoothly over time.
  • Read Calibration if you want to constrain the specification with external measurements.

Priors and Configuration

Abacus uses model_config to control priors on the underlying PyMC variables. Transform priors can be configured either on the transform objects themselves or through their prefixed variable names in model_config.

Where configuration lives

Surface | Use it for
model_config | Intercept, likelihood, controls, seasonality, Mundlak terms, time-varying config, and prefixed transform priors
adstock=... and saturation=... | Transform choice plus direct transform-prior overrides
control_impacts and control_sign_policy | Directional expectations for controls

Default model_config

PanelMMM.default_model_config is built from the current model state.

The default keys are:

Key | Default
intercept | Prior("Normal", mu=0, sigma=2, dims=dims)
likelihood | Prior("Normal", sigma=Prior("HalfNormal", sigma=2, dims=dims), dims=("date", *dims))
gamma_control | Prior("Normal", mu=0, sigma=2, dims=(*dims, "control"))
gamma_fourier | Prior("Laplace", mu=0, b=1, dims=(*dims, "fourier_mode"))
gamma_channel_mundlak | Added only when use_mundlak_cre=True
gamma_control_mundlak | Added only when use_mundlak_cre=True
intercept_tvp_config | Added when time_varying_intercept is enabled
media_tvp_config | Added when time_varying_media is enabled

Abacus also merges in the transform-specific config exposed by the selected adstock and saturation objects.

Configure priors in Python

Use pymc_extras.prior.Prior objects when you want explicit control:

from pymc_extras.prior import Prior

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

model_config = {
    "intercept": Prior("Normal", mu=0, sigma=1, dims=("geo",)),
    "likelihood": Prior(
        "Normal",
        sigma=Prior("HalfNormal", sigma=1.5, dims=("geo",)),
        dims=("date", "geo"),
    ),
    "gamma_control": Prior("Normal", mu=0, sigma=1, dims=("geo", "control")),
    "saturation_lam": Prior("Gamma", alpha=3, beta=1, dims=("geo", "channel")),
}

mmm = PanelMMM(
    date_column="date",
    target_column="sales",
    channel_columns=["tv", "search"],
    control_columns=["price_index"],
    dims=("geo",),
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
    model_config=model_config,
)

Configure priors in YAML

YAML config can express the same priors as serialised distribution mappings:

data:
  date_column: date

target:
  column: sales
  type: revenue

dimensions:
  panel: [geo]

media:
  channels: [tv, search]
  adstock:
    type: geometric
    l_max: 8
  saturation:
    type: logistic

priors:
  intercept:
    distribution: Normal
    mu: 0
    sigma: 1
    dims: ["geo"]
  likelihood:
    distribution: Normal
    sigma:
      distribution: HalfNormal
      sigma: 1.5
      dims: ["geo"]
    dims: ["date", "geo"]
  saturation_lam:
    distribution: Gamma
    alpha: 3
    beta: 1
    dims: ["geo", "channel"]

Abacus parses these mappings into runtime Prior or HSGPKwargs objects.

Transform priors and prefixed names

Transform parameters appear in the model under prefixed variable names.

Examples:

  • adstock alpha -> adstock_alpha
  • saturation lam -> saturation_lam
  • saturation beta -> saturation_beta

So you can override transform priors in either of these ways:

  1. pass priors={...} to the transform object
  2. override the prefixed variable in model_config

Use one style consistently within a project if you want the configuration to be easy to read.

Directional control priors

Controls are the right place for exogenous drivers whose effect may be negative, such as competitor spend, competitor price pressure, or supply-side disruptions. By default, control coefficients remain unrestricted.

You can declare expected control directions with:

  • control_impacts
  • control_sign_policy

Allowed impact values:

  • positive
  • negative
  • unrestricted

Allowed policies:

  • soft: bias the prior toward the expected sign
  • strict: use a sign-constrained prior

Python example

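from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    target_column="sales",
    channel_columns=["tv", "search"],
    control_columns=["competitor_spend", "price_index"],
    # Expected direction for each control coefficient
    control_impacts={
        "competitor_spend": "negative",
        "price_index": "negative",
    },
    control_sign_policy="strict",
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
)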

YAML note

The current public YAML schema does not expose control_impacts or control_sign_policy. If you need directional control settings today, use the Python API for that part of the specification.

Constraints for directional controls

When control_impacts is configured, Abacus expects:

  • gamma_control and gamma_control_mundlak to be Normal or TruncatedNormal
  • scalar numeric mu and sigma values for those priors
  • the prior dims to include "control"

If you violate those assumptions, model build fails with a validation error.

Time-varying configuration keys

When you enable a boolean time-varying effect, Abacus uses these model_config keys:

  • intercept_tvp_config
  • media_tvp_config

Those keys can be:

  • an HSGPKwargs instance
  • a dict with HSGPKwargs fields
  • a dict in SoftPlusHSGP.parameterize_from_data(...) style, such as {"ls_lower": 1, "ls_upper": 10}

See Time-Varying Parameters.

Important scope note

Directional control priors apply to control_columns, not channel_columns. Media channels are modelled through the adstock and saturation path.

If you need full manual control over the control prior, override gamma_control and gamma_control_mundlak directly in model_config.

Common pitfalls

  • Putting control priors on media variables instead of using transform priors
  • Forgetting the prefixed transform variable names in model_config
  • Assuming dims automatically create hierarchical priors
  • Using directional control priors with incompatible gamma_control distributions

Panel Dimensions

Use dims when your dataset is a panel rather than a single time series.

Examples of useful panel dimensions:

  • geo
  • brand
  • market
  • country

For the input row layout, see Panel Data Layout.

What dims does

dims tells PanelMMM which extra categorical axes exist alongside date.

With no extra dims, the model is indexed by:

  • date
  • channel
  • optionally control

With dims=("geo",), the model is indexed by:

  • date
  • geo
  • channel
  • optionally control

With dims=("geo", "brand"), it is indexed by:

  • date
  • geo
  • brand
  • channel
  • optionally control

What changes inside the model

Setting dims changes the coordinates and parameter shapes used in the PyMC graph.

Quantity | No extra dims | dims=("geo",)
channel_data | ("date", "channel") | ("date", "geo", "channel")
target_data | ("date",) | ("date", "geo")
channel_contribution | ("date", "channel") | ("date", "geo", "channel")
control_contribution | ("date", "control") | ("date", "geo", "control")
intercept prior dims by default | () | ("geo",)

Reserved names

Do not use these names in dims:

  • date
  • channel
  • control
  • fourier_mode

Abacus rejects them because they are reserved for internal coordinates.
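
The shape table and the reserved-name rule above can be summarised in a small helper. This is a sketch for reasoning about shapes, not part of the Abacus API; the function name `expected_dims` and the error message are hypothetical.

```python
RESERVED = {"date", "channel", "control", "fourier_mode"}


def expected_dims(dims=()):
    """Map each model quantity to its named axes for a given panel dims tuple."""
    dims = tuple(dims)
    clashes = sorted(RESERVED.intersection(dims))
    if clashes:
        raise ValueError(f"reserved names used in dims: {clashes}")
    return {
        "channel_data": ("date", *dims, "channel"),
        "target_data": ("date", *dims),
        "channel_contribution": ("date", *dims, "channel"),
        "control_contribution": ("date", *dims, "control"),
        "intercept": dims,
    }


single = expected_dims()
panel = expected_dims(("geo", "brand"))
```

For example, `panel["channel_contribution"]` is `("date", "geo", "brand", "channel")`, which matches the pattern in the table extended to two panel dims.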

dims does not imply automatic pooling

This is the most important modelling point.

By default, dims gives you parameters indexed by the panel coordinates, but not automatic hierarchical shrinkage across those coordinates.

For example:

  • the default intercept prior is Normal(..., dims=dims)
  • transform priors default to (*dims, "channel")
  • control coefficients default to (*dims, "control")

Those defaults create per-slice parameters. If you want hierarchical pooling across geo, brand, or another dimension, you need to encode that in the priors you supply.

Example: independent panel slices

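from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    target_column="sales",
    channel_columns=["tv", "search"],
    dims=("geo",),  # parameters are indexed by geo, with no pooling
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
)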

With this specification, the default priors are indexed by geo but are not hierarchical.

Example: explicit hierarchical prior

If you want hierarchical structure, define it in the prior itself.

from pymc_extras.prior import Prior

model_config = {
    "intercept": Prior(
        "Normal",
        mu=0,
        sigma=Prior("HalfNormal", sigma=0.3),
        dims="geo",
    ),
}

You can do the same for transform priors and additive effects.

Mundlak CRE and panel dimensions

use_mundlak_cre=True only makes sense when you have at least one panel dim. Abacus enforces that.

When enabled, Abacus builds extra correlated-random-effects terms from training-period means:

  • channel_mundlak_contribution
  • control_mundlak_contribution
  • mundlak_contribution

These terms live on the panel coordinates defined by dims.

Custom HSGP dims

If you use a custom SoftPlusHSGP for time-varying effects, its dims must be compatible with the panel structure.

Examples:

  • no extra dims: ("date",) or ("date", "channel") for media
  • dims=("geo",): ("date", "geo") or ("date", "geo", "channel") for media

See Time-Varying Parameters.

YAML example

data:
  date_column: date

target:
  column: sales
  type: revenue

dimensions:
  panel: [geo, brand]

media:
  channels: [tv, search]
  adstock:
    type: geometric
    l_max: 8
  saturation:
    type: logistic

Your dataset must then contain both geo and brand columns.

Common pitfalls

  • Using reserved names in dims
  • Assuming dims implies automatic partial pooling
  • Enabling use_mundlak_cre with no panel dimensions
  • Forgetting that every date + dims combination must be present in the data

Time-Varying Parameters

Abacus supports time-varying intercept and media effects through SoftPlusHSGP, a Hilbert Space Gaussian Process approximation.

Two ways to enable time variation

For both time_varying_intercept and time_varying_media, you can pass:

  • True to use the config-driven default path
  • a custom HSGPBase instance such as SoftPlusHSGP(...)

Boolean mode

The simplest entry point is a boolean flag:

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    target_column="sales",
    channel_columns=["tv", "search"],
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
    time_varying_intercept=True,
    time_varying_media=True,
)

When you do this, Abacus builds a SoftPlusHSGP internally from:

  • model_config["intercept_tvp_config"]
  • model_config["media_tvp_config"]

Both keys default to an HSGPKwargs instance with:

  • m=200
  • L=None
  • eta_lam=1
  • ls_mu=5
  • ls_sigma=10
  • cov_func=None

What the boolean defaults mean

With boolean mode:

  • time_varying_intercept=True creates intercept_latent_process over ("date", *dims)
  • time_varying_media=True creates media_temporal_latent_multiplier over ("date", *dims)

That second point matters:

Boolean time_varying_media=True gives you one shared temporal multiplier per panel slice, not a separate time-varying multiplier per channel.

If you want channel-specific time variation, pass a custom HSGP with channel in its dims.

Custom SoftPlusHSGP

Use a custom HSGP instance when you need precise control over dims, covariance, or priors.

Example: channel-specific time-varying media in a single time series model.

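import numpy as np

from abacus.mmm import GeometricAdstock, LogisticSaturation, SoftPlusHSGP
from abacus.mmm.panel import PanelMMM

# X is the training DataFrame that you later pass to fit(...)
n_dates = X["date"].nunique()

# Including "channel" in dims gives each channel its own latent process
media_hsgp = SoftPlusHSGP.parameterize_from_data(
    X=np.arange(n_dates),
    dims=("date", "channel"),
    cov_func="matern32",
)

mmm = PanelMMM(
    date_column="date",
    target_column="sales",
    channel_columns=["tv", "search"],
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
    time_varying_media=media_hsgp,
)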

For a panel model with dims=("geo",), valid media HSGP dims include:

  • ("date", "geo")
  • ("date", "geo", "channel")

For the intercept, the custom dims should align with the target axes, typically ("date",) or ("date", *dims).

Supported covariance choices

For SoftPlusHSGP.parameterize_from_data(...), the supported covariance keywords are:

  • "expquad"
  • "matern32"
  • "matern52"

Config formats for boolean mode

The *_tvp_config entries in model_config support two formats.

HSGPKwargs style

from abacus.hsgp_kwargs import HSGPKwargs

model_config = {
    "intercept_tvp_config": HSGPKwargs(
        m=50,
        L=None,
        eta_lam=1.0,
        ls_mu=5.0,
        ls_sigma=10.0,
        cov_func=None,
    )
}

The equivalent dict form is also accepted.

parameterize_from_data style

You can also pass a dict that is forwarded to SoftPlusHSGP.parameterize_from_data(...):

model_config = {
    "intercept_tvp_config": {
        "ls_lower": 1.0,
        "ls_upper": 10.0,
    }
}

Abacus preserves that dict and uses it when constructing the HSGP.

How the latent process enters the model

Time-varying intercept

Abacus creates:

intercept_contribution = intercept_baseline * intercept_latent_process

Time-varying media

Abacus first creates a baseline transformed media contribution, then multiplies it by the temporal latent process:

channel_contribution =
  baseline_channel_contribution * media_temporal_latent_multiplier

If the custom media HSGP dims include channel, the multiplier can vary by channel. Otherwise it is broadcast across channels.
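
The difference between a shared and a channel-specific multiplier is a broadcasting question. The following plain-Python sketch uses nested lists in place of tensors, with hypothetical values and a hypothetical helper name, to show both cases.

```python
def apply_multiplier(baseline, multiplier):
    """Multiply baseline[t][c] by a per-date or per-date-and-channel multiplier."""
    result = []
    for row, m in zip(baseline, multiplier):
        if isinstance(m, (int, float)):
            # dims ("date",): one value per date, shared by every channel
            result.append([value * m for value in row])
        else:
            # dims ("date", "channel"): one value per date and channel
            result.append([value * mc for value, mc in zip(row, m)])
    return result


baseline = [[1.0, 2.0], [1.0, 2.0]]  # rows are dates; columns are tv, search
shared = [1.0, 1.5]  # ("date",)
per_channel = [[1.0, 1.0], [1.5, 0.5]]  # ("date", "channel")

shared_result = apply_multiplier(baseline, shared)
per_channel_result = apply_multiplier(baseline, per_channel)
```

With the shared multiplier, both channels move together on the second date. With the channel-specific multiplier, tv rises while search falls.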

Save and load

Custom SoftPlusHSGP instances round-trip through PanelMMM.save(...) and PanelMMM.load(...).

That includes custom dims such as:

  • ("date",)
  • ("date", "channel")
  • ("date", "geo")
  • ("date", "geo", "channel")

Common pitfalls

  • Expecting time_varying_media=True to create channel-specific media multipliers
  • Using custom HSGP dims that do not align with the model dims
  • Forgetting that boolean mode uses model_config["intercept_tvp_config"] and model_config["media_tvp_config"]

Calibration

Calibration lets you add external evidence to a built PanelMMM.

Abacus currently supports two retained calibration paths:

  • lift-test measurements through add_lift_test_measurements(...)
  • cost-per-target calibration through add_cost_per_target_calibration(...)

General rule

Both calibration methods operate on a built model, not a bare constructor.

Typical sequence:

mmm.build_model(X, y)

# optional calibration step(s) here

idata = mmm.fit(X, y)

If you try to add calibration before the model graph exists, Abacus raises an error.

Lift-test calibration

Use add_lift_test_measurements(...) to add external lift measurements against the modelled saturation behaviour.

import pandas as pd

df_lift_test = pd.DataFrame(
    {
        "channel": ["tv", "search"],
        "x": [100.0, 80.0],
        "delta_x": [20.0, 10.0],
        "delta_y": [15.0, 6.0],
        "sigma": [3.0, 2.0],
    }
)

# Calibration attaches to the built graph, so build the model first
mmm.build_model(X, y)
mmm.add_lift_test_measurements(df_lift_test)

Required columns for lift tests

Lift-test data always needs:

  • channel
  • x
  • delta_x
  • delta_y
  • sigma

It also needs:

  • every configured entry in dims
  • any additional coordinate columns required by the calibrated variables

In practice, time-varying media models usually require date, because the time-varying multiplier is indexed by date.
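
A quick pre-flight check for the column requirements above might look like the sketch below. The helper `missing_lift_test_columns` is hypothetical and only mirrors the lists on this page; Abacus performs its own validation when you call add_lift_test_measurements(...).

```python
LIFT_TEST_COLUMNS = ("channel", "x", "delta_x", "delta_y", "sigma")


def missing_lift_test_columns(columns, dims=(), extra=()):
    """Return the required lift-test columns that are absent, in a stable order."""
    required = [*LIFT_TEST_COLUMNS, *dims, *extra]
    present = set(columns)
    return [name for name in required if name not in present]


columns = ["channel", "x", "delta_x", "delta_y", "sigma"]

# Panel model with dims=("geo",) and time-varying media, which needs "date".
missing = missing_lift_test_columns(columns, dims=("geo",), extra=("date",))
```

With a pandas DataFrame you would pass `df_lift_test.columns` as the first argument.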

What Abacus does

add_lift_test_measurements(...):

  1. validates the mapping columns
  2. scales the lift-test channel and target values to the model scale
  3. maps the rows to the model coordinates
  4. adds a likelihood term named lift_measurements by default

If time_varying_media is enabled, Abacus includes the media temporal multiplier in the calibrated saturation function automatically.
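
Conceptually, the lift-test likelihood compares an observed lift with the lift implied by the saturation curve, both on the model scale. The sketch below illustrates that comparison for one row under strong assumptions: a logistic saturation, no time-varying multiplier, and made-up scale and parameter values. It is not the calibration code that Abacus runs.

```python
import math


def logistic_saturation(x, lam, beta):
    """beta * (1 - exp(-lam * x)) / (1 + exp(-lam * x))."""
    return beta * (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


def scaled_lift_pair(row, channel_scale, target_scale, lam, beta):
    """Return (model-implied lift, observed lift), both on the model scale."""
    x = row["x"] / channel_scale
    x_after = (row["x"] + row["delta_x"]) / channel_scale
    model_lift = logistic_saturation(x_after, lam, beta) - logistic_saturation(
        x, lam, beta
    )
    observed_lift = row["delta_y"] / target_scale
    return model_lift, observed_lift


row = {"channel": "tv", "x": 100.0, "delta_x": 20.0, "delta_y": 15.0, "sigma": 3.0}

# Hypothetical scales and saturation parameters, chosen only for illustration.
model_lift, observed_lift = scaled_lift_pair(
    row, channel_scale=200.0, target_scale=100.0, lam=3.0, beta=0.5
)
```

During sampling, parameter values that bring the model-implied lift closer to the observed lift, relative to sigma, are favoured.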

Practical notes

  • Lift measurements must be monotonic in the sense enforced by the calibration graph helpers.
  • The calibration distribution defaults to pm.Gamma.
  • You can change the registered variable name with name=....

Cost-per-target calibration

Use add_cost_per_target_calibration(...) when you want soft penalties on channel cost-per-target values.

import pandas as pd

# Build first, then expose channel_contribution on the original target scale
mmm.build_model(X, y)
mmm.add_original_scale_contribution_variable(var=["channel_contribution"])

calibration_data = pd.DataFrame(
    {
        "geo": ["UK", "US"],
        "channel": ["tv", "search"],
        "cost_per_target": [30.0, 45.0],
        "sigma": [2.0, 3.0],
    }
)

# data is the spend dataset; its coordinates must match the built model
mmm.add_cost_per_target_calibration(
    data=X,
    calibration_data=calibration_data,
    name_prefix="cpt_calibration",
)

Required prerequisites

Before you add cost-per-target calibration:

  1. build the model
  2. add channel_contribution_original_scale

The second step is required because cost-per-target calibration operates against original-scale channel contribution.

Required columns for calibration_data

calibration_data must include:

  • channel
  • cost_per_target
  • sigma
  • every configured entry in dims

Requirements for data

The data argument is the spend dataset used to compute calibrated cost per target.

After Abacus reshapes it into xarray form, its coordinates must match the built model’s:

  • same shape
  • same coordinate labels
  • same channel list

If the reshaped spend data does not match the model coordinates, Abacus raises a validation error instead of silently reordering it.
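
As a rough guide to the quantity being calibrated, the sketch below computes a cost per target as total spend divided by total original-scale contribution for each channel. That aggregation is an assumption for illustration; this page does not state exactly how Abacus aggregates, and the numbers are made up.

```python
def cost_per_target(spend, contribution):
    """Total spend divided by total original-scale contribution, per channel."""
    return {
        channel: sum(spend[channel]) / sum(contribution[channel])
        for channel in spend
    }


# Values over dates for a single panel slice.
spend = {"tv": [300.0, 300.0], "search": [90.0, 90.0]}
contribution = {"tv": [10.0, 10.0], "search": [2.0, 2.0]}

cpt = cost_per_target(spend, contribution)
```

Values computed this way are what the cost_per_target and sigma columns of calibration_data are meant to constrain.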

YAML calibration

The YAML builder supports calibration through a top-level calibration: list. Each step must provide an explicit method plus a params mapping.

Supported YAML calibration methods:

  • add_lift_test_measurements
  • add_cost_per_target_calibration

Example:

original_scale_vars:
  - channel_contribution

calibration:
  - method: add_lift_test_measurements
    params:
      df_lift_test:
        dataframe:
          data:
            channel: ["channel_1", "channel_2"]
            x: [100.0, 80.0]
            delta_x: [20.0, 10.0]
            delta_y: [15.0, 6.0]
            sigma: [3.0, 2.0]

Important YAML constraints:

  • calibration steps run after build_model(...)
  • original_scale_vars is applied before calibration
  • only the supported calibration methods above are available in YAML
  • dist is not supported in YAML yet for add_lift_test_measurements
  • other calibration actions should be applied through the Python API until they have explicit YAML support

Choose the right calibration path

Use lift tests when you have measured incremental response data for a specific spend change.

Use cost-per-target calibration when you want the fitted channel contribution to stay consistent with observed cost efficiency.

You can use either or both, provided the model has been built first.

Common pitfalls

  • Adding calibration before build_model(...)
  • Forgetting to add channel_contribution_original_scale before cost-per-target calibration
  • Omitting required dims columns from calibration data
  • Assuming YAML supports every Python calibration argument; dist does not currently round-trip through YAML

Next steps

  • Read Model Fitting for the fit workflow once the model has been fully specified.
  • Read Save and Load if you plan to keep calibrated models on disk.

Model Fitting

This section covers the core fitting workflow for PanelMMM: running MCMC, checking priors before fitting, and saving a fitted model for later reuse.

Pages

  • Fitting the Model - How fit() works, how sampler settings are applied, and what you get back.
  • Prior Predictive Checks - How to sample and inspect prior predictive draws before fitting.
  • Save and Load - How to persist a fitted model to NetCDF and rebuild PanelMMM from saved InferenceData.

Subsections of Model Fitting

Fitting the Model

Use this page after you have prepared X and y for PanelMMM. For input requirements, see Data Preparation.

Basic workflow

fit() is the main entry point for posterior sampling.

import pandas as pd

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

dataset = pd.read_csv("data/demo/timeseries/dataset.csv")
dataset["date"] = pd.to_datetime(dataset["date"])

X = dataset.drop(columns=["revenue"])
y = dataset["revenue"].rename("revenue")

mmm = PanelMMM(
    date_column="date",
    target_column="revenue",
    channel_columns=[
        "channel_1",
        "channel_2",
        "channel_3",
        "channel_4",
        "channel_5",
        "channel_6",
    ],
    yearly_seasonality=2,
    adstock=GeometricAdstock(l_max=4),
    saturation=LogisticSaturation(),
)

idata = mmm.fit(
    X,
    y,
    draws=500,
    tune=500,
    chains=2,
    cores=2,
    progressbar=False,
    random_seed=42,
)

fit() returns an arviz.InferenceData object and also stores it on mmm.idata.

What fit() does

When you call fit(X, y), Abacus:

  1. checks that X and y share the same index when both are pandas objects
  2. builds the PyMC graph automatically if it has not been built already
  3. merges sampler settings from the model’s sampler_config and your call-time kwargs
  4. runs pymc.sample(...)
  5. computes deterministic variables and adds them to the posterior group
  6. stores the training data in the fit_data group of the InferenceData
  7. writes model metadata into idata.attrs

That means fitted contribution variables such as channel_contribution, intercept_contribution, and yearly_seasonality_contribution are available in mmm.posterior after fitting when they are part of the configured model.

Configure the sampler

You can configure PyMC sampling in two places:

Where | Use it for | Precedence
sampler_config= in PanelMMM(...) | Stable defaults you want to reuse across fits | Lower
fit(..., **kwargs) | Run-specific overrides such as draws, chains, or random_seed | Higher

Abacus merges them so that explicit fit() kwargs win.

mmm = PanelMMM(
    date_column="date",
    target_column="revenue",
    channel_columns=["channel_1", "channel_2"],
    adstock=GeometricAdstock(l_max=4),
    saturation=LogisticSaturation(),
    sampler_config={
        "draws": 1000,
        "tune": 1000,
        "chains": 4,
        "target_accept": 0.9,
        "progressbar": False,
    },
)

# Overrides draws from sampler_config, keeps target_accept
idata = mmm.fit(X, y, draws=500, random_seed=42)

Common sampler arguments

These are passed through to pymc.sample(...).

Argument | What it controls
draws | Posterior samples kept after tuning
tune | Warm-up or adaptation iterations
chains | Number of MCMC chains
cores | Number of worker processes used by PyMC
target_accept | HMC or NUTS acceptance target
progressbar | Whether PyMC shows a progress bar
random_seed | Sampling reproducibility

If you do not specify progressbar, Abacus defaults it to True unless your sampler_config already sets it.
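The precedence rules above behave like a layered dictionary merge. The sketch below is only an illustration of that behaviour: resolve_sampler_kwargs is a hypothetical helper, not an Abacus function.

```python
def resolve_sampler_kwargs(sampler_config=None, **fit_kwargs):
    """Hypothetical helper: call-time kwargs override constructor defaults."""
    merged = {"progressbar": True}       # default when nothing else sets it
    merged.update(sampler_config or {})  # constructor-level defaults
    merged.update(fit_kwargs)            # explicit fit() kwargs win
    return merged


config = {
    "draws": 1000,
    "tune": 1000,
    "chains": 4,
    "target_accept": 0.9,
    "progressbar": False,
}
resolved = resolve_sampler_kwargs(config, draws=500, random_seed=42)
print(resolved)
```

Here draws comes from the call, target_accept and progressbar come from sampler_config, and progressbar would fall back to True only if neither layer set it.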

When to build first

For a standard workflow, call fit() directly.

Call build_model(X, y) first only when you need to inspect or modify the graph before sampling. For example:

mmm.build_model(X, y)
mmm.add_original_scale_contribution_variable(
    var=["channel_contribution", "y"]
)

idata = mmm.fit(
    X,
    y,
    draws=500,
    tune=500,
    chains=2,
    progressbar=False,
    random_seed=42,
)

This pattern is also useful when you need to add events before fitting. Call add_events(...) before build_model(...) or fit(...).

Inspect fitted results

After fitting, common entry points are:

  • mmm.idata
  • mmm.posterior
  • mmm.model
  • mmm.plot
  • mmm.summary
  • mmm.diagnostics

Example:

posterior = mmm.posterior
channel_mean = posterior["channel_contribution"].mean(dim=["chain", "draw"])

Common pitfalls

  • Leaving the target column inside X
  • Passing pandas X and y with different indexes
  • Changing the model graph after fitting and expecting existing samples to stay valid
  • Assuming constructor sampler_config overrides explicit fit() kwargs; it does not
  • Adding events after the model has already been built

Prior Predictive Checks

Run prior predictive checks before fitting when you want to test whether your configured priors imply plausible target behaviour.

If you want the econometrics framing for this workflow, see Prior Predictive Checks for Econometricians.

Sample prior predictive draws

Use sample_prior_predictive(...) on PanelMMM:

prior = mmm.sample_prior_predictive(
    X=X,
    y=y,
    samples=100,
    random_seed=42,
)

In normal PanelMMM use, pass the same X and y structure that you plan to fit.

sample_prior_predictive(...):

  • builds the model if it has not been built yet
  • samples from pymc.sample_prior_predictive(...)
  • stores prior and prior_predictive on mmm.idata by default
  • returns an extracted xarray.Dataset of prior predictive draws

How many draws Abacus uses

If you do not pass samples=..., Abacus uses:

  • sampler_config["draws"] when that key exists
  • otherwise 500

If you want prior predictive checks to use a different sample count from model fitting, pass samples explicitly.
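The fallback order can be written as a small function. This is a sketch of the rule as described on this page; resolve_prior_samples is a hypothetical name.

```python
def resolve_prior_samples(samples=None, sampler_config=None, fallback=500):
    """Pick the prior predictive sample count using the documented fallback order."""
    if samples is not None:
        return samples  # an explicit samples=... always wins
    return (sampler_config or {}).get("draws", fallback)


print(resolve_prior_samples(100, {"draws": 1000}))
print(resolve_prior_samples(None, {"draws": 1000}))
print(resolve_prior_samples())
```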

Plot prior predictive draws

After sampling, you can use the retained plotting surface:

figure, axes = mmm.plot.prior_predictive(
    var=mmm.output_var,
    hdi_prob=0.85,
)

You can then access the stored groups directly:

prior_group = mmm.prior
prior_predictive_group = mmm.prior_predictive

Trying to access these groups before sampling raises a runtime error.

Example prior predictive output:

[Figure: Prior predictive example]

What to inspect

A useful prior predictive check is about plausibility, not fit.

Check:

  • scale: are draws on roughly the same order of magnitude as the observed target?
  • support: do the draws violate obvious business constraints such as non-negativity?
  • volatility: do the draws imply far more or far less variation than the real series?
  • structure: do the trajectories look broadly plausible for the business and model configuration?

If the prior predictive distribution is implausible, change the model before you fit it.
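Some of these checks can be turned into rough numbers. The sketch below uses plain Python lists in place of the xarray output, and the three ratios are illustrative choices rather than Abacus diagnostics; prior_plausibility is a hypothetical helper.

```python
from statistics import mean, median, pstdev


def prior_plausibility(draws, observed):
    """Summarise prior predictive draws against the observed target.

    `draws` is a list of simulated series; `observed` is the real series.
    """
    flat = [value for series in draws for value in series]
    return {
        # scale: a ratio near 1 means draws and data share an order of magnitude
        "scale_ratio": median(flat) / mean(observed),
        # support: share of simulated values that break non-negativity
        "negative_share": sum(value < 0 for value in flat) / len(flat),
        # volatility: average within-series spread relative to the observed spread
        "volatility_ratio": mean(pstdev(series) for series in draws) / pstdev(observed),
    }


observed = [100.0, 120.0, 110.0, 130.0]
draws = [
    [90.0, 115.0, 105.0, 140.0],   # a plausible trajectory
    [-20.0, 400.0, 50.0, 900.0],   # far too volatile, and negative in places
]
report = prior_plausibility(draws, observed)
print(report)
```

A large volatility_ratio or a non-trivial negative_share for a non-negative target is a signal to revisit the priors before fitting.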

Adjust the model before fitting

Typical changes include:

  • tightening intercept or likelihood priors in model_config
  • revising media transformation priors
  • reducing unnecessary model flexibility
  • checking whether your scaling choices make priors too loose on the model scale

See Priors and Configuration for the configuration surface.

Prior predictive before and after fit

If you run prior predictive checks first and then call fit(), Abacus keeps the existing prior and prior_predictive groups on mmm.idata.

That makes it practical to compare:

  • prior assumptions
  • posterior fit
  • posterior predictive behaviour

within one saved InferenceData object.

Common pitfalls

  • Skipping prior predictive checks and only noticing implausible priors after a long fit
  • Treating prior predictive checks as a substitute for posterior predictive assessment
  • Forgetting that sample_prior_predictive(...) returns extracted predictive draws, while the full prior and prior_predictive groups are stored on mmm.idata

Next steps

After the prior predictive behaviour looks reasonable, fit the model with Fitting the Model.

Save and Load

Use save and load when you want to persist a fitted PanelMMM and rebuild it later without redefining the whole model configuration in code.

Basic round trip

The standard workflow is:

mmm.fit(
    X,
    y,
    draws=500,
    tune=500,
    chains=2,
    progressbar=False,
    random_seed=42,
)

mmm.save("mmm.nc")

loaded = PanelMMM.load("mmm.nc")

save() writes the model’s InferenceData to NetCDF. load() reads that file, recreates the PanelMMM configuration from stored metadata, restores loaded.idata, and rebuilds the PyMC graph from the saved training data.

What Abacus stores

Abacus relies on more than the posterior draws for a full round trip.

Stored item | Why it matters
posterior and other InferenceData groups | Preserve sampled results
fit_data | Rebuild the model graph with the original training data
idata.attrs | Reconstruct PanelMMM init kwargs and validate compatibility

The stored attrs include both the shared model metadata and PanelMMM-specific configuration such as:

  • date_column
  • channel_columns
  • target_column
  • target_type
  • dims
  • control_columns
  • control_impacts
  • adstock and saturation
  • adstock_first
  • yearly_seasonality
  • time_varying_intercept and time_varying_media
  • scaling
  • model_config
  • sampler_config
  • serialised mu_effects

save() behaviour

save(fname, **kwargs) is a thin wrapper over self.idata.to_netcdf(...).

Important constraints:

  • the model must already be fitted
  • self.idata must contain a posterior group
  • any extra kwargs are passed directly to InferenceData.to_netcdf(...)

If you call save() before fitting, Abacus raises:

RuntimeError: The model hasn't been fit yet, call .fit() first

load() and compatibility checks

By default, PanelMMM.load(...) validates that the saved file matches the current model class and configuration:

loaded = PanelMMM.load("mmm.nc", check=True)

With check=True, Abacus verifies:

  • the saved model version
  • the saved model id derived from the serialised configuration

If those checks fail, Abacus raises DifferentModelError.

If you need to bypass those checks, you can set check=False:

loaded = PanelMMM.load("mmm.nc", check=False)

Use that only when you understand why the saved metadata does not match.
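To see why a small configuration change can trip the id check, consider a fingerprint derived from the serialised configuration. The sketch below is an assumption-laden illustration: config_fingerprint is a hypothetical helper, and the use of JSON plus a truncated SHA-256 is a guess at the general idea, not a description of how Abacus computes its model id.

```python
import hashlib
import json


def config_fingerprint(config, length=16):
    """Hypothetical id: a truncated SHA-256 of the JSON-serialised config."""
    payload = json.dumps(config, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:length]


saved = {
    "date_column": "date",
    "channel_columns": ["channel_1", "channel_2"],
    "yearly_seasonality": 2,
}
changed = {**saved, "yearly_seasonality": 4}

same_id = config_fingerprint(saved) == config_fingerprint(dict(saved))
different_id = config_fingerprint(saved) != config_fingerprint(changed)
print(same_id, different_id)
```

Under a scheme like this, any change to a serialised setting produces a different id, so a mismatch usually means the file was written by a differently configured model.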

Load from an in-memory InferenceData

If you already have an InferenceData object, use load_from_idata(...) instead of saving to disk first:

loaded = PanelMMM.load_from_idata(idata, check=True)

This is the same round-trip path that load() uses internally after reading the NetCDF file.

Where build_from_idata() fits

build_from_idata(idata) is the lower-level rebuild step. It:

  1. restores supported serialised mu_effects
  2. reads idata.fit_data
  3. splits that saved training data back into X and y
  4. rebuilds the PyMC graph

You usually do not need to call build_from_idata() yourself because load() and load_from_idata() already do it.

Round-trip limitations

Not every fitted object can be restored fully.

EventAdditiveEffect does not round-trip

Abacus does not deserialise EventAdditiveEffect because the original df_events DataFrame is not stored in the saved attrs. If a saved model includes such an effect, PanelMMM.load(...) fails fast while rebuilding the model.

Do not drop fit_data if you want to reload

Because rebuild uses idata.fit_data, do not save a partial file that omits that group if you want to call PanelMMM.load(...) later.

For example, this is valid NetCDF output:

mmm.save("posterior_only.nc", groups=["posterior"])

But it is not a full PanelMMM round-trip artefact, because the saved file no longer includes the training data needed for build_from_idata(...).

Practical advice

  • Use the default save() behaviour for round trips.
  • Keep check=True unless you have a specific compatibility reason not to.
  • Prefer PanelMMM.load(...) over loading NetCDF manually.
  • Refit or rebuild event effects explicitly rather than expecting saved event state to deserialise.

Next steps

After loading a model, you can go straight to posterior predictive sampling, diagnostics, decomposition, or optimisation using the restored idata and rebuilt graph.

Post-Modeling

Use this section after fitting PanelMMM.

It covers posterior predictive checks, diagnostics, contribution analysis, response curves, efficiency metrics, and the tabular summary surfaces that Abacus exposes from fitted InferenceData.

Pages

  • Posterior Predictive: Sample fitted or future predictions and compare them with observed data where available.
  • Diagnostics: Run design-matrix, MCMC, and predictive diagnostics and export machine-readable reports.
  • Contributions and Decomposition: Inspect channel, baseline, control, seasonality, and event contributions.
  • Response Curves: Sample and summarise posterior saturation and adstock curves, and understand the runner’s forward-pass direct contribution artefacts.
  • ROAS and Metrics: Calculate ROAS, CPA-style metrics, spend tables, and predictive error metrics.
  • Summary and Export: Work with MMMSummaryFactory, HDI settings, time aggregation, and DataFrame export.

Subsections of Post-Modeling

Diagnostics

Abacus exposes diagnostics through mmm.diagnostics.

Use this surface to check the design matrix, posterior sampling quality, and posterior predictive fit. For fitted-value plots and predictive sampling, see Posterior Predictive.

Diagnostic surfaces

mmm.diagnostics provides three groups of checks.

Area | Summary method | Report method | What it covers
Raw input screening | design_summary(X) | design_report(X) | Collinearity, constants, and near-constant regressors on raw input columns
MCMC | mcmc_summary() | mcmc_report() | r_hat, ESS, divergences, BFMI, tree depth, acceptance rate
Predictive | predictive_summary() | predictive_report() | RMSE, MAE, NRMSE, NMAE, CRPS, residual moments

The summary methods return pandas DataFrames. The report methods return typed report objects with a to_dict() method for JSON-ready export.

Raw input screening

Use design_summary(X) on the raw design matrix you want to inspect:

design = mmm.diagnostics.design_summary(X)

By default, Abacus checks:

  • all channel_columns
  • all control_columns, when present

You can limit the check to specific variables:

design = mmm.diagnostics.design_summary(
    X,
    variables=["tv", "search", "price_index"],
    vif_threshold=10.0,
    near_constant_threshold=0.99,
)

The returned table includes:

  • variable
  • mean
  • std
  • n_unique
  • dominant_share
  • is_constant
  • is_near_constant
  • vif
  • high_vif
  • max_abs_corr

design_report(X) returns a compact roll-up with matrix rank, condition number, maximum VIF, maximum absolute correlation, and lists of flagged variables.
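The constant and near-constant flags are easy to reason about on a single column. The sketch below assumes that dominant_share means the share of rows taken by the most frequent value and that a column is flagged when that share reaches near_constant_threshold; both are inferred from the column names, and screen_column is a hypothetical helper.

```python
from collections import Counter


def dominant_share(values):
    """Share of rows taken by the single most frequent value."""
    counts = Counter(values)
    return counts.most_common(1)[0][1] / len(values)


def screen_column(values, near_constant_threshold=0.99):
    """Assumed flag logic for one numeric column."""
    share = dominant_share(values)
    n_unique = len(set(values))
    return {
        "n_unique": n_unique,
        "dominant_share": share,
        "is_constant": n_unique == 1,
        "is_near_constant": share >= near_constant_threshold,
    }


promo_flag = [0.0] * 199 + [1.0]  # a regressor that is almost always zero
screening = screen_column(promo_flag)
print(screening)
```

A column like this carries very little information for the model, which is why it is worth flagging before fitting.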

Screening requirements

Raw input screening requires:

  • all requested columns to exist in X
  • all checked columns to be numeric

Abacus raises a ValueError if a variable is missing or non-numeric.

The methods are still named design_summary() and design_report(), but the pipeline now treats them as raw input screening rather than as checks on the transformed model geometry.

MCMC diagnostics

Use mcmc_summary() after fitting:

mcmc = mmm.diagnostics.mcmc_summary(
    rhat_threshold=1.01,
    ess_threshold=400.0,
)

The summary comes from arviz.summary(..., kind="diagnostics") and adds flag columns such as:

  • high_rhat
  • low_ess_bulk
  • low_ess_tail
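The flag columns amount to threshold comparisons on each parameter's diagnostics. The sketch below shows that logic for a single row; flag_parameter is a hypothetical helper, and the use of strict inequalities is an assumption.

```python
def flag_parameter(r_hat, ess_bulk, ess_tail, rhat_threshold=1.01, ess_threshold=400.0):
    """Assumed flag logic for one row of the MCMC summary."""
    return {
        "high_rhat": r_hat > rhat_threshold,
        "low_ess_bulk": ess_bulk < ess_threshold,
        "low_ess_tail": ess_tail < ess_threshold,
    }


healthy = flag_parameter(r_hat=1.002, ess_bulk=1850.0, ess_tail=1420.0)
suspect = flag_parameter(r_hat=1.04, ess_bulk=210.0, ess_tail=530.0)
print(healthy)
print(suspect)
```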

mcmc_report() adds model-level diagnostics, including:

  • divergence_count
  • divergence_rate
  • max_rhat
  • min_ess_bulk
  • min_ess_tail
  • bfmi_mean
  • bfmi_min
  • max_tree_depth_hits
  • max_tree_depth_observed
  • mean_acceptance_rate

If idata is missing, Abacus raises an error and tells you to fit the model first.

Example MCMC diagnostic output:

[Figure: Trace plot example]

Predictive diagnostics

Predictive diagnostics use the observed target and stored posterior predictive samples:

mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

predictive = mmm.diagnostics.predictive_summary(original_scale=True)

The predictive summary is a one-row DataFrame with:

  • scale
  • num_observations
  • rmse
  • mae
  • nrmse
  • nmae
  • crps
  • residual_mean
  • residual_std

Abacus aligns target and prediction coordinates before flattening. That includes mixed datetime coordinate dtypes when needed.
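The point-error columns can be reproduced by hand from observed values and a point prediction such as the posterior predictive mean. The sketch below is illustrative only: it assumes residuals are observed minus predicted and that NRMSE and NMAE are normalised by the observed range, which may not match Abacus's convention. CRPS is omitted because it needs the full set of predictive draws.

```python
from math import sqrt


def point_error_metrics(observed, predicted):
    """Hand-rolled RMSE, MAE, and range-normalised variants."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    residuals = [obs - pred for obs, pred in zip(observed, predicted)]
    n = len(residuals)
    rmse = sqrt(sum(r * r for r in residuals) / n)
    mae = sum(abs(r) for r in residuals) / n
    spread = max(observed) - min(observed)  # assumed normaliser: observed range
    return {
        "num_observations": n,
        "rmse": rmse,
        "mae": mae,
        "nrmse": rmse / spread,
        "nmae": mae / spread,
        "residual_mean": sum(residuals) / n,
    }


observed = [100.0, 120.0, 110.0, 130.0]
predicted = [98.0, 123.0, 110.0, 126.0]
metrics = point_error_metrics(observed, predicted)
print(metrics)
```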

Example residual diagnostics:

[Figure: Residuals over time]

[Figure: Residual histogram]

[Figure: Residuals versus fitted]

[Figure: Residual autocorrelation]

Export reports

Use the report objects when you want a compact export format:

import json

report = mmm.diagnostics.mcmc_report()
payload = report.to_dict()

with open("mcmc_report.json", "w", encoding="utf-8") as handle:
    json.dump(payload, handle, indent=2)

The same pattern works for design_report(...) and predictive_report().

Pipeline outputs

The pipeline diagnostics stage uses the same retained diagnostic surfaces to write report tables and text summaries. If you run the pipeline, those stage artefacts should match the behaviour documented here.

In the structured pipeline, the raw-input screening rows in diagnostics_report.csv use the phase label raw_input_screening instead of design so the machine-readable output matches the wording here.

Common pitfalls

  • Running mcmc_summary() or mcmc_report() before fitting
  • Running predictive diagnostics before sampling posterior predictive values
  • Passing non-numeric columns into design_summary(X)
  • Treating predictive diagnostics as a substitute for design or MCMC checks

Posterior Predictive Checks

Use posterior predictive draws to check in-sample fit and to generate predictions for new rows that follow the fitted panel layout.

For diagnostic metrics after sampling, see Diagnostics. For table export, see Summary and Export.

Sample posterior predictive draws

Use PanelMMM.sample_posterior_predictive(...) on a fitted model:

posterior_predictive = mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

sample_posterior_predictive(...):

  • requires X
  • uses the fitted posterior stored on mmm.idata
  • reshapes X into the model’s panel xarray layout
  • runs pymc.sample_posterior_predictive(...)
  • returns an extracted xarray.Dataset

By default, combined=True, so the returned dataset uses a sample dimension. If you want separate chain and draw dimensions, set combined=False.

Store or return only

By default, Abacus also writes the predictive samples back to mmm.idata:

posterior_predictive = mmm.sample_posterior_predictive(
    X=X,
    extend_idata=True,
    random_seed=42,
    progressbar=False,
)

With extend_idata=True, Abacus adds:

  • idata.posterior_predictive
  • idata.posterior_predictive_constant_data

If you only want the returned samples and do not want to update mmm.idata, set extend_idata=False.

Check training-fit values against observed data

For an in-sample check, pass the same design matrix you used for fitting. This is the same pattern used by the pipeline’s Stage 30 training-fit assessment:

mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

fit_table = mmm.summary.posterior_predictive(hdi_probs=[0.94])
figure, axes = mmm.plot.posterior_predictive(
    var=[mmm.output_var],
    hdi_prob=0.94,
)

Example posterior predictive output:

[Figure: Posterior predictive example]

[Figure: Fitted versus observed over time]

mmm.summary.posterior_predictive() returns a table with:

  • observed target values
  • posterior predictive mean and median
  • HDI bound columns such as abs_error_94_lower and abs_error_94_upper

You can also access the predictive draws directly:

predictive = mmm.data.get_posterior_predictive(original_scale=True)
errors = mmm.data.get_errors(original_scale=True)

Blocked holdout validation

For the structured pipeline’s Stage 35 validation, Abacus fits a fresh model on the training window and then scores only the holdout dates:

holdout_predictive = validation_mmm.sample_posterior_predictive(
    X=X_holdout,
    include_last_observations=True,
    random_seed=42,
    progressbar=False,
)

That holdout path is different from the in-sample check above:

  • the model is fit on X_train and y_train only
  • the holdout X contains only future dates
  • include_last_observations=True keeps lag history for adstock carryover
  • the returned samples are used to compute holdout metrics such as RMSE, MAE, NRMSE, NMAE, CRPS, bias, and coverage at 50%, 80%, and 94%

The holdout stage is more expensive than the in-sample check because it adds a second fit.

Predict on new dates

For future prediction, pass a new X with the same structural columns as the training data:

future_predictive = mmm.sample_posterior_predictive(
    X=X_future,
    include_last_observations=True,
    random_seed=42,
    progressbar=False,
)

sample_posterior_predictive(...) does not take y. For a holdout or future window, keep the actual target outside the model and align it yourself if you want external evaluation.

Use include_last_observations correctly

Set include_last_observations=True when the forecast window needs lag history for adstock carryover.

When enabled, Abacus:

  • prepends the last adstock.l_max training observations internally
  • samples posterior predictive values on the padded data
  • removes the prepended rows from the returned result

This only works when the input dates do not overlap with the training dates. If they do overlap, Abacus raises a ValueError.
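The padding steps above can be sketched with a toy adstock on plain lists. This is an illustration under stated assumptions, not Abacus's implementation: it uses an unnormalised geometric adstock, string dates, and a hypothetical helper named adstock_future.

```python
def geometric_adstock(spend, alpha, l_max):
    """Unnormalised geometric carryover over a window of l_max periods."""
    return [
        sum((alpha ** lag) * spend[t - lag] for lag in range(l_max) if t - lag >= 0)
        for t in range(len(spend))
    ]


def adstock_future(train_dates, train_spend, future_dates, future_spend, alpha, l_max):
    """Adstock a future window with lag history taken from the training data."""
    if set(train_dates) & set(future_dates):
        raise ValueError("future dates overlap with training dates")
    history = list(train_spend[-l_max:])            # prepend the last l_max training rows
    padded = geometric_adstock(history + list(future_spend), alpha, l_max)
    return padded[len(history):]                    # drop the prepended rows again


train_dates = ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22"]
train_spend = [10.0, 20.0, 30.0, 40.0]
future_dates = ["2024-01-29", "2024-02-05"]
future_spend = [0.0, 0.0]

with_history = adstock_future(train_dates, train_spend, future_dates, future_spend, 0.5, 2)
without_history = geometric_adstock(future_spend, 0.5, 2)
print(with_history, without_history)
```

Even with zero future spend, the first future period still receives carryover from the last training period, which is the effect that the padding preserves.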

Practical guidance

  • Use the training X for fitted-versus-observed checks.
  • Use future-only dates for forward prediction.
  • Use the training-window refit pattern for blocked holdout validation.
  • Keep combined=True if you want a simpler sample dimension.
  • Use combined=False if you need explicit chain and draw dimensions.
  • Call sample_posterior_predictive(...) before using mmm.diagnostics.predictive_summary() or mmm.summary.posterior_predictive().

Common pitfalls

  • Calling sample_posterior_predictive(...) without X
  • Expecting y to be passed into the predictive method
  • Using include_last_observations=True on dates that overlap with training data
  • Forgetting that the returned object is extracted samples, while the stored idata.posterior_predictive group keeps the native posterior predictive structure

Contributions and Decomposition

Abacus stores additive contribution terms for fitted PanelMMM models and exposes them through the data wrapper, summary tables, and plotting suite.

Use this page to inspect media, baseline, control, seasonality, and event effects. For channel efficiency ratios built from media contributions, see ROAS and Metrics.

Contribution surfaces

You can work with contributions at three levels.

Surface | Use it for
mmm.data | Raw xarray contribution samples
mmm.summary | DataFrames with posterior means, medians, and HDIs
mmm.plot | Time-series and waterfall visualisations

Read raw contribution samples

The lowest-level accessor is mmm.data.get_contributions(...):

contributions = mmm.data.get_contributions(
    original_scale=True,
    include_baseline=True,
    include_controls=True,
    include_seasonality=True,
    include_events=True,
)

Depending on the fitted model, the returned dataset can contain:

  • channels
  • baseline
  • controls
  • seasonality
  • events

baseline includes the intercept contribution and any Mundlak contribution when the fitted model uses Mundlak CRE terms.

For media-only contribution samples, use:

channel_contributions = mmm.data.get_channel_contributions(original_scale=True)

Summarise one contribution type

Use mmm.summary.contributions(...) when you want a tidy table with posterior summary statistics:

channel_df = mmm.summary.contributions(
    component="channel",
    hdi_probs=[0.80, 0.94],
)

Supported component values are:

  • channel or channels
  • control or controls
  • seasonality
  • baseline

The returned table includes:

  • identifying columns such as date, channel, control, and any panel dims
  • mean
  • median
  • HDI bound columns such as abs_error_94_lower and abs_error_94_upper

mmm.summary.contributions(...) does not expose event effects. For event effects, use mmm.data.get_contributions(include_events=True) or mmm.summary.mean_contributions_over_time().

Create a wide decomposition table

Use mmm.summary.mean_contributions_over_time(...) when you want one row per time point and panel slice:

decomposition = mmm.summary.mean_contributions_over_time(
    original_scale=True,
)

This table contains posterior means only. It widens the contribution data so that each retained component becomes a column.

Typical output looks like this:

date | geo | TV | Search | baseline | seasonality
2024-01-01 | UK | 1240.5 | 822.1 | 5110.7 | -95.4
2024-01-08 | UK | 1302.8 | 801.6 | 5076.9 | 22.7

When present, the wide table also includes:

  • control columns
  • event columns named from posterior variables that end with _total_effect

Aggregate total contribution by component

Use mmm.summary.total_contribution(...) when you want one row per date and component type after summing across individual channels or controls:

totals = mmm.summary.total_contribution(frequency="monthly")

This is useful when you want a component-level roll-up, for example total media versus baseline.

Inspect change over time

Use mmm.summary.change_over_time(...) for percentage change in channel contributions between consecutive periods:

changes = mmm.summary.change_over_time(frequency="monthly")

This summary requires a date dimension. Do not use frequency="all_time".
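The reason is visible in the arithmetic: a percentage change needs at least two consecutive periods. The sketch below uses a conventional formula on a plain list; percentage_change is a hypothetical helper, and Abacus's handling of the first period and of zero denominators may differ.

```python
def percentage_change(series):
    """Percent change between consecutive periods; None where undefined."""
    changes = [None]  # the first period has no predecessor
    for previous, current in zip(series, series[1:]):
        if previous == 0:
            changes.append(None)  # change from zero is undefined
        else:
            changes.append((current - previous) / previous * 100.0)
    return changes


monthly_tv = [1000.0, 1100.0, 990.0]
changes = percentage_change(monthly_tv)
all_time = percentage_change([3090.0])  # one aggregated period, so nothing to compare
print(changes, all_time)
```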

Plot decomposition outputs

Use the plotting suite for visual inspection:

waterfall_figure, waterfall_axes = mmm.plot.waterfall_components_decomposition(
    original_scale=True,
)

area_figure, area_axes = mmm.plot.media_contribution_over_time(
    original_scale=True,
)

Useful plotting methods are:

  • waterfall_components_decomposition(...)
  • media_contribution_over_time(...)
  • contributions_over_time(...)
  • channel_contribution_share_hdi(...)

Example decomposition output:

[Figure: Waterfall decomposition example]

[Figure: Media contribution over time]

Practical guidance

  • Use original_scale=True when you want business-unit interpretation.
  • Use mmm.summary.contributions(...) for tidy per-component tables.
  • Use mmm.summary.mean_contributions_over_time() for decomposition exports.
  • Use mmm.summary.total_contribution() when you only need component-level totals.

Common pitfalls

  • Expecting mmm.summary.contributions(...) to include event effects
  • Forgetting that baseline can include more than the intercept when Mundlak CRE is enabled
  • Using frequency="all_time" with mean_contributions_over_time() or change_over_time()

Response Curves

Use response curves to inspect the fitted media transformations directly.

Abacus exposes posterior saturation and adstock curves through both the fitted model and mmm.summary. For decomposition of realised contributions over time, see Contributions and Decomposition.

Sample saturation curves

Use sample_saturation_curve(...) on a fitted PanelMMM:

saturation_curve = mmm.sample_saturation_curve(
    max_value=1.0,
    num_points=100,
    num_samples=500,
    random_state=42,
    original_scale=True,
)

The returned xarray.DataArray contains:

  • the curve axis x
  • channel
  • any panel dims
  • a posterior sample dimension

original_scale=True converts the curve’s y-values to original target units. It does not convert the x-axis. x remains in scaled channel units.

If you want to choose max_value from original channel units, divide by the relevant value from mmm.data.get_channel_scale().
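As a worked example of that conversion, the sketch below turns a spend level in original units into a scaled max_value. The scale of 50,000 is made up for illustration, and to_scaled_units is a hypothetical helper; in practice you would read the scale for the relevant channel from mmm.data.get_channel_scale().

```python
def to_scaled_units(original_value, channel_scale):
    """Convert an original-unit spend level to the model's scaled channel axis."""
    if channel_scale <= 0:
        raise ValueError("channel_scale must be positive")
    return original_value / channel_scale


tv_scale = 50_000.0          # illustrative scale for one channel
curve_limit = 75_000.0       # draw the curve up to 75,000 in original spend units
max_value = to_scaled_units(curve_limit, tv_scale)
print(max_value)
```

To label the x-axis in original units afterwards, multiply the curve's x values by the same scale.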

Sample adstock curves

Use sample_adstock_curve(...) to inspect carryover weights:

adstock_curve = mmm.sample_adstock_curve(
    amount=1.0,
    num_samples=500,
    random_state=42,
)

The returned array contains:

  • time since exposure
  • channel
  • any panel dims
  • a posterior sample dimension

The adstock curve is the fitted decay pattern for an impulse of size amount. It does not use an original_scale option because the returned weights are not target-unit contributions.
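For a geometric adstock, the shape of this curve follows directly from the decay rate. The sketch below computes it for a single fixed decay value rather than posterior samples; geometric_adstock_curve and its normalize flag are hypothetical, and whether Abacus normalises the weights depends on the adstock configuration.

```python
def geometric_adstock_curve(alpha, l_max, amount=1.0, normalize=False):
    """Carryover of an impulse of size `amount` over l_max periods."""
    weights = [alpha ** lag for lag in range(l_max)]
    if normalize:
        total = sum(weights)
        weights = [weight / total for weight in weights]
    return [amount * weight for weight in weights]


unit_impulse = geometric_adstock_curve(alpha=0.5, l_max=4)
double_impulse = geometric_adstock_curve(alpha=0.5, l_max=4, amount=2.0)
normalised = geometric_adstock_curve(alpha=0.5, l_max=4, normalize=True)
print(unit_impulse)
print(double_impulse)
print(normalised)
```

Changing amount rescales the whole curve without changing its shape, so curves are only comparable across models when amount matches.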

Runner-generated direct contribution artefacts

If you use the retained pipeline runner, Stage 60_response_curves also writes a forward-pass direct contribution artefact alongside the saturation and adstock transformation curves:

  • forward_pass_contribution_curve.nc
  • forward_pass_contribution_curve_summary.csv
  • forward_pass_contribution_curve.png

This artefact is different from the saturation-only curve:

  • the saturation-only curve shows the fitted saturation transform itself
  • the forward-pass direct contribution curve runs spend through the full fitted model path, including adstock and saturation

The retained Stage 60 forward-pass plot uses one explicit scenario so the curve is interpretable: it rescales the full observed historical spend path from 0% to 200%, then plots total channel spend against total channel contribution in original units. The marker at 100% highlights the fitted total contribution for the observed historical spend path.
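The scenario can be sketched with fixed parameter values in place of posterior draws. Everything in the block below is illustrative: the unnormalised geometric adstock, the logistic saturation form, the parameter values, and the helper names are assumptions, and the real artefact is computed from the fitted model.

```python
from math import exp


def geometric_adstock(spend, alpha, l_max):
    """Unnormalised geometric carryover over a window of l_max periods."""
    return [
        sum((alpha ** lag) * spend[t - lag] for lag in range(l_max) if t - lag >= 0)
        for t in range(len(spend))
    ]


def logistic_saturation(x, lam):
    """Logistic saturation: 0 at x = 0, approaching 1 as x grows."""
    return (1.0 - exp(-lam * x)) / (1.0 + exp(-lam * x))


def forward_pass_curve(spend_path, multipliers, alpha, l_max, lam, beta):
    """Return (total spend, total contribution) for each spend multiplier."""
    points = []
    for multiplier in multipliers:
        scaled = [multiplier * value for value in spend_path]
        carried = geometric_adstock(scaled, alpha, l_max)
        contribution = sum(beta * logistic_saturation(x, lam) for x in carried)
        points.append((sum(scaled), contribution))
    return points


history = [0.2, 0.5, 0.8, 0.4, 0.6]        # scaled historical spend, illustrative
multipliers = [0.0, 0.5, 1.0, 1.5, 2.0]    # 0% to 200% of the observed path
curve = forward_pass_curve(history, multipliers, alpha=0.5, l_max=4, lam=2.0, beta=100.0)
for total_spend, total_contribution in curve:
    print(f"{total_spend:.2f} -> {total_contribution:.1f}")
```

The point at multiplier 1.0 corresponds to the 100% marker, and the flattening beyond it reflects diminishing returns along the whole spend path rather than at a single spend level.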

Summarise curves as DataFrames

If you want tabular summaries, use mmm.summary:

saturation_df = mmm.summary.saturation_curves(
    hdi_probs=[0.80, 0.94],
    num_points=100,
    num_samples=500,
    random_state=42,
    original_scale=True,
)

adstock_df = mmm.summary.adstock_curves(
    hdi_probs=[0.94],
    amount=1.0,
    num_samples=500,
    random_state=42,
)

These methods return DataFrames with posterior mean, median, and HDI bound columns.

saturation_curves(...) includes an x column. adstock_curves(...) uses time since exposure.

MMMSummaryFactory requirement

Curve summaries need access to both the fitted data and the fitted model transformations.

mmm.summary already satisfies that requirement. If you construct MMMSummaryFactory manually, pass model=mmm:

from abacus.mmm.summary import MMMSummaryFactory

summary = MMMSummaryFactory(mmm.data, model=mmm)
curves = summary.saturation_curves()

If you omit model=mmm, Abacus raises a ValueError.

Plot saturation curves

You can plot sampled curves directly:

curve = mmm.sample_saturation_curve(
    num_points=100,
    random_state=42,
    original_scale=True,
)

figure, axes = mmm.plot.saturation_curves(
    curve=curve,
    original_scale=True,
)

You can also inspect the fitted relationship in the observed data with:

figure, axes = mmm.plot.saturation_scatterplot(original_scale=True)

Example curve output:

[Figure: Saturation curve example]

[Figure: Adstock curve example]

Practical guidance

  • Use num_samples to trade off speed against posterior resolution.
  • Use original_scale=True when you want the saturation y-axis in target units.
  • Keep in mind that the saturation x-axis stays on the scaled channel axis.
  • Use the summary methods when you need exportable tables.

Common pitfalls

  • Reading x from saturation curves as original spend units
  • Forgetting to pass model=mmm when manually constructing MMMSummaryFactory
  • Comparing adstock curves across models without matching the amount parameter

ROAS and Metrics

Use this page for channel-efficiency outputs and aggregate predictive metrics.

Abacus separates these into two surfaces:

  • mmm.summary and mmm.data for ROAS and cost-per-target outputs
  • mmm.diagnostics for RMSE, MAE, NRMSE, NMAE, and CRPS

For contribution tables that feed these ratios, see Contributions and Decomposition.

Element-wise ROAS and cost per target

The lowest-level efficiency accessors live on mmm.data:

roas_samples = mmm.data.get_elementwise_roas(original_scale=True)
cost_per_target_samples = mmm.data.get_elementwise_cost_per_target(
    original_scale=True,
)

These are direct ratios built from fitted media contributions and channel spend:

  • ROAS = contribution / spend
  • cost per target = spend / contribution

The arrays are element-wise over time, channel, and any panel dims, with posterior sample dimensions on top.

Abacus returns NaN when it would otherwise divide by zero.
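
The two ratios and the NaN rule can be sketched in plain Python. The `safe_ratio` helper and the numbers below are hypothetical; the real accessors work on posterior arrays rather than lists.

```python
import math


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    if denominator == 0:
        return math.nan
    return numerator / denominator


# Hypothetical contribution and spend for one channel over three periods.
contribution = [500.0, 0.0, 300.0]
spend = [250.0, 100.0, 0.0]

# ROAS = contribution / spend; NaN where spend is zero.
roas = [safe_ratio(c, s) for c, s in zip(contribution, spend)]

# Cost per target = spend / contribution; NaN where contribution is zero.
cost_per_target = [safe_ratio(s, c) for c, s in zip(contribution, spend)]
```

In this sketch the NaN lands in different periods for the two metrics: ROAS is undefined where spend is zero, and cost per target is undefined where contribution is zero.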

Summarise ROAS

Use mmm.summary.roas(...) for a tidy summary table:

roas_df = mmm.summary.roas(
    hdi_probs=[0.80, 0.94],
    frequency="monthly",
    start_date="2024-01-01",
    end_date="2024-06-30",
)

Abacus applies start_date and end_date before any optional aggregation.

The returned table includes:

  • identifying columns such as date, channel, and any panel dims
  • mean
  • median
  • HDI bound columns such as abs_error_94_lower and abs_error_94_upper

Summarise cost per target

For conversion-style targets, use cost_per_target(...):

cpa_df = mmm.summary.cost_per_target(frequency="monthly")

This is the same retained summary surface that mmm.summary.efficiency() uses for target_type="conversion".

Use the default efficiency metric

Abacus chooses the default efficiency metric from the target type:

target_type | mmm.summary.efficiency() returns | Label
revenue | roas() | ROAS
conversion | cost_per_target() | CPA

You can inspect the selected metric with:

metric_key = mmm.summary.efficiency_metric
metric_label = mmm.summary.efficiency_metric_label

Export channel spend

Use channel_spend() when you want the raw spend table with no posterior aggregation:

spend_df = mmm.summary.channel_spend()

This returns the observed channel spend with columns such as date, channel, panel dims, and channel_data.

Predictive error metrics

Predictive metrics live under mmm.diagnostics.predictive_summary():

mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

predictive_metrics = mmm.diagnostics.predictive_summary()

The returned one-row DataFrame includes:

  • rmse
  • mae
  • nrmse
  • nmae
  • crps
  • residual_mean
  • residual_std

These metrics are calculated from the stored posterior predictive samples and the observed target.
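
For orientation, the point-error metrics can be sketched from observed values and point predictions such as posterior predictive means. This page does not state how Abacus reduces the posterior samples, how it normalises NRMSE and NMAE, or which sign and degrees of freedom it uses for residuals, so the range normalisation, the observed-minus-predicted convention, and the population standard deviation below are all assumptions. CRPS needs the full sample distribution and is left out.

```python
import math
import statistics

# Hypothetical observed target and point predictions.
observed = [10.0, 12.0, 9.0, 15.0]
predicted = [11.0, 11.0, 10.0, 13.0]

# Assumed convention: residual = observed - predicted.
residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]

rmse = math.sqrt(sum(r**2 for r in residuals) / len(residuals))
mae = sum(abs(r) for r in residuals) / len(residuals)

# Assumed normalisation: divide by the range of the observed target.
target_range = max(observed) - min(observed)
nrmse = rmse / target_range
nmae = mae / target_range

residual_mean = statistics.fmean(residuals)
residual_std = statistics.pstdev(residuals)
```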

Practical guidance

  • Use roas() for revenue targets.
  • Use cost_per_target() for conversion targets.
  • Use efficiency() when you want target-type-aware reporting.
  • Sample posterior predictive values before using predictive metrics.

Common pitfalls

  • Interpreting Abacus ROAS as something other than contribution divided by spend
  • Forgetting that zero spend or zero contribution produces NaN
  • Using predictive diagnostics before calling sample_posterior_predictive(...)

Summary and Export

mmm.summary is the retained tabular summary surface for fitted PanelMMM models.

It is backed by MMMSummaryFactory and returns pandas or polars DataFrames that you can export with normal DataFrame methods.

For predictive diagnostics and JSON-ready reports, see Diagnostics.

Use mmm.summary

The simplest path is the bound summary factory on the fitted model:

posterior_df = mmm.summary.posterior_predictive()
contributions_df = mmm.summary.contributions(component="channel")
roas_df = mmm.summary.roas(frequency="monthly")

mmm.summary already has access to:

  • mmm.data
  • the fitted PanelMMM
  • the default summary settings

Construct MMMSummaryFactory manually

If you want custom defaults, build the factory yourself:

from abacus.mmm.summary import MMMSummaryFactory

summary = MMMSummaryFactory(
    mmm.data,
    model=mmm,
    hdi_probs=(0.80, 0.94),
    output_format="polars",
)

This is useful when you want one summary object with consistent HDI and output settings across multiple tables.

Common summary methods

Method | What it returns
posterior_predictive() | Posterior predictive summaries aligned to the wrapped target data
contributions() | Tidy contribution summaries by component type
mean_contributions_over_time() | Wide decomposition table
roas() / cost_per_target() / efficiency() | Efficiency summaries
channel_spend() | Raw spend table
saturation_curves() / adstock_curves() | Transformation-curve summaries
total_contribution() | Component-level totals
change_over_time() | Period-on-period percentage change in channel contributions

Choose output format

MMMSummaryFactory supports:

  • output_format="pandas"
  • output_format="polars"

Example:

summary = MMMSummaryFactory(mmm.data, model=mmm, output_format="polars")
roas_df = summary.roas()

If you request polars without Polars installed, Abacus raises an ImportError.

Configure HDI probabilities

Pass HDI probabilities as numbers strictly between 0 and 1:

posterior_df = mmm.summary.posterior_predictive(hdi_probs=[0.80, 0.94])

Do not pass percentages such as 80 or 94.

Summary tables include interval columns named from those probabilities. For example, hdi_probs=[0.94] produces columns such as:

  • abs_error_94_lower
  • abs_error_94_upper

These columns are the current HDI bound columns used by the retained summary surface.
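
A small pre-flight check can catch percentage-style inputs before they reach the summary call. The helper below is hypothetical and not part of Abacus; the column naming follows the 0.94 example above, and the name generated for 0.80 is an assumption extrapolated from that pattern.

```python
def hdi_column_names(hdi_probs):
    """Validate HDI probabilities and build the expected interval column names."""
    names = []
    for prob in hdi_probs:
        # Probabilities must be strictly between 0 and 1, so 94 is rejected.
        if not 0 < prob < 1:
            raise ValueError(
                f"HDI probability must be strictly between 0 and 1, got {prob!r}"
            )
        pct = round(prob * 100)
        names += [f"abs_error_{pct}_lower", f"abs_error_{pct}_upper"]
    return names


columns = hdi_column_names([0.80, 0.94])
```

You could then select those columns from a summary table, or call the helper first so a stray `94` fails early with a clear message.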

Aggregate over time

Many summary methods accept frequency with one of these values:

  • original
  • weekly
  • monthly
  • quarterly
  • yearly
  • all_time

Example:

monthly = mmm.summary.posterior_predictive(frequency="monthly")
quarterly_roas = mmm.summary.roas(frequency="quarterly")

all_time removes the date dimension. That is useful for fully aggregated tables, but date-dependent summaries still need a date axis.

Do not use all_time with:

  • mean_contributions_over_time()
  • change_over_time()

Export tables

Abacus does not add a separate export wrapper on top of the returned DataFrames. Use the normal DataFrame methods from your selected backend:

posterior_df = mmm.summary.posterior_predictive()
posterior_df.to_csv("posterior_predictive.csv", index=False)

With Polars:

summary = MMMSummaryFactory(mmm.data, model=mmm, output_format="polars")
roas_df = summary.roas()
roas_df.write_csv("roas.csv")

Export diagnostic reports

Diagnostic report objects expose to_dict() for JSON-ready export:

import json

report = mmm.diagnostics.predictive_report()
with open("predictive_report.json", "w", encoding="utf-8") as handle:
    json.dump(report.to_dict(), handle, indent=2)

Common pitfalls

  • Expecting a dedicated file-export API on mmm.summary
  • Passing 94 instead of 0.94 in hdi_probs
  • Using saturation_curves() or adstock_curves() from a manual factory without model=mmm
  • Using all_time on summaries that require a date dimension

Optimisation

This section covers Abacus budget optimisation workflows for fitted PanelMMM models. It explains the low-level optimisation wrapper and how to inspect optimisation outputs.

For the higher-level planner service and Dash UI, see Scenario Planning.

Pages

  • Budget Optimisation - How to run PanelBudgetOptimizerWrapper, set bounds and masks, and define spend over a future window.
  • Interpreting Optimisation - How to read the allocation output, inspect simulated response samples, and use the pipeline optimisation artefacts.
  • Scenario Planning - How to compare current, manual, and fixed-budget optimised scenarios with the planner service and optional Dash UI.

Subsections of Optimisation

Budget Optimisation

Use PanelBudgetOptimizerWrapper when you want to optimise spend for a fitted PanelMMM over a future date window.

The wrapper builds a synthetic future dataset for the requested window, swaps the model’s channel_data for an optimisation variable, and then calls the generic BudgetOptimizer. If you want to compare several plans in total horizon spend units, see Scenario Planning.

What the optimiser maximises

For PanelBudgetOptimizerWrapper, optimize_budget() defaults to:

  • response_variable="total_media_contribution_original_scale"
  • utility_function=average_response
  • SciPy SLSQP with ftol=1e-9 and maxiter=1000

The optimiser therefore maximises the average posterior response of the chosen response variable, subject to your budget bounds and constraints.

Budget units

The low-level wrapper uses per-period spend units.

  • budget is the total spend across all optimised cells for one model period.
  • The returned allocation has no date dimension, so Abacus repeats that allocation across the optimisation window.
  • If the window has num_periods=8 and you pass budget=100_000, the simulated spend over the full horizon is 800_000 before any carryover effects are applied.
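
The unit conversion is simple arithmetic, sketched here with the numbers from the list above. The variable names are illustrative only.

```python
num_periods = 8
per_period_budget = 100_000.0

# Spend simulated over the whole window when the allocation is repeated
# every period.
horizon_spend = per_period_budget * num_periods

# Going the other way: convert a total horizon budget into the per-period
# value that the low-level wrapper expects.
total_horizon_budget = 800_000.0
budget_for_wrapper = total_horizon_budget / num_periods
```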

This is different from Scenario Planning, which treats total_budget and manual allocations as total horizon spend and converts them to per-period units internally.

The Stage 70 pipeline optimisation uses the same units as the low-level wrapper because it passes optimization.total_budget directly to PanelBudgetOptimizerWrapper.optimize_budget(...).

Required inputs

Input | What Abacus expects | Notes
model | A fitted PanelMMM with idata.posterior | The optimiser needs posterior draws and model graph variables.
start_date, end_date | A future window at the model’s observed date frequency | Abacus infers num_periods from the training data frequency.
budget | Per-period total spend | See Budget units.
response_variable | A variable available from the fitted optimisation graph | The wrapper default is total_media_contribution_original_scale.

Basic example

This example assumes that mmm is already fitted.

import xarray as xr

from abacus.mmm.panel import PanelBudgetOptimizerWrapper

channels = ["channel_1", "channel_2"]

wrapper = PanelBudgetOptimizerWrapper(
    model=mmm,
    start_date="2025-02-03",
    end_date="2025-03-31",
)

budget_bounds = xr.DataArray(
    [
        [[0.0, 60_000.0], [0.0, 45_000.0]],
        [[0.0, 55_000.0], [0.0, 40_000.0]],
    ],
    dims=("geo", "channel", "bound"),
    coords={
        "geo": ["UK", "FR"],
        "channel": channels,
        "bound": ["lower", "upper"],
    },
)

budgets_to_optimize = xr.DataArray(
    [[True, True], [True, False]],
    dims=("geo", "channel"),
    coords={
        "geo": ["UK", "FR"],
        "channel": channels,
    },
)

allocation, result = wrapper.optimize_budget(
    budget=100_000.0,
    budget_bounds=budget_bounds,
    budgets_to_optimize=budgets_to_optimize,
    response_variable="total_media_contribution_original_scale",
)

print(allocation)
print(result.success, result.message)

allocation is an xarray.DataArray over the non-date budget dimensions. For a model with dims=("geo",), the result dims are typically ("geo", "channel").

Bounds and masks

budget_bounds

Use budget_bounds to cap spend for each optimised cell.

  • If the budget has only one non-date dimension, you can pass a dict such as {"tv": (0.0, 50_000.0), "search": (0.0, 30_000.0)}.
  • For panel budgets, pass an xarray.DataArray with dims (*budget_dims, "bound"), where "bound" contains "lower" and "upper".
  • If you omit budget_bounds, Abacus warns and uses (0, total_budget) for every optimised cell.
  • Abacus reindexes DataArray bounds to the model’s internal coordinate order, so the input coordinate order does not need to match exactly.

budgets_to_optimize

Use budgets_to_optimize to choose which cells can move.

  • The mask must be a boolean xarray.DataArray over the budget dimensions.
  • Unoptimised cells are fixed at zero in the returned allocation.
  • If you omit the mask, Abacus optimises every cell where the fitted model has non-zero historical channel_contribution information.
  • If your mask includes True for a cell where the model has no information, Abacus raises ValueError.

Time distribution across the window

Use budget_distribution_over_period to flight each allocation cell over time instead of repeating the same spend every period.

The object must be an xarray.DataArray with:

  • dims exactly ("date", *budget_dims)
  • one date weight per optimisation period
  • weights that sum to 1 across the date dimension for every budget cell

Example for a two-geo, two-channel weekly window:

budget_distribution = xr.DataArray(
    [
        [[0.50, 0.50], [0.25, 0.25]],
        [[0.30, 0.30], [0.35, 0.35]],
        [[0.20, 0.20], [0.40, 0.40]],
    ],
    dims=("date", "geo", "channel"),
    coords={
        "date": [0, 1, 2],
        "geo": ["UK", "FR"],
        "channel": ["channel_1", "channel_2"],
    },
)

Use the same budget_distribution_over_period again when you call sample_response_distribution(), otherwise you will optimise one spend path and simulate another.

For response simulation through the wrapper, the date coordinates can be:

  • integer positions 0 .. num_periods - 1, or
  • exact dates that match the optimisation window
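
Before you pass a distribution, it can help to confirm that every budget cell's weights sum to 1 across dates. The sketch below uses plain nested lists with the same weights as the example on this page, indexed as [date][geo][channel]; the helper name is hypothetical and this is not how Abacus validates the input internally.

```python
import math

# Same weights as the example distribution, indexed as [date][geo][channel].
weights = [
    [[0.50, 0.50], [0.25, 0.25]],
    [[0.30, 0.30], [0.35, 0.35]],
    [[0.20, 0.20], [0.40, 0.40]],
]
geos = ["UK", "FR"]
channels = ["channel_1", "channel_2"]


def cells_not_summing_to_one(weights, geos, channels, tol=1e-9):
    """Return the (geo, channel) cells whose date weights do not sum to 1."""
    bad = []
    for g, geo in enumerate(geos):
        for c, channel in enumerate(channels):
            total = sum(period[g][c] for period in weights)
            if not math.isclose(total, 1.0, abs_tol=tol):
                bad.append((geo, channel))
    return bad


bad_cells = cells_not_summing_to_one(weights, geos, channels)
```

An empty result means every cell is a valid set of flighting weights.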

Constraints and solver controls

default_constraints=True adds the default equality constraint:

sum(allocation) == budget

This is enabled by default and emits a warning so you can see that the default constraint set is active.

You can also pass:

  • extra scipy.optimize.minimize keyword arguments directly to optimize_budget(...) to tweak the underlying solver call
  • callback=True to get a third return value with per-iteration objective, gradient, and constraint diagnostics

YAML note for the pipeline runner

If you run optimisation through the structured pipeline, configure the optimization block in YAML:

optimization:
  start_date: "2024-11-11"
  end_date: "2025-01-27"
  total_budget: 430000000.0

In this pipeline path, optimization.total_budget uses the wrapper contract described above: it is passed straight to optimize_budget(...) as per-period spend, not total horizon spend.

Common pitfalls

  • Passing a total horizon budget to optimize_budget(...). Divide by wrapper.num_periods first, or use Scenario Planning.
  • Passing dict bounds for a panel budget. Dict bounds only work when the budget dims are just ("channel",).
  • Omitting a budget dimension from budget_distribution_over_period. The distribution must include every budget dim, not just the one you want to vary.
  • Forgetting that response_variable must exist in the fitted optimisation graph.
  • Using one budget distribution for optimisation and a different one for response simulation.

Interpreting Optimisation

After you run budget optimisation, you usually work with three outputs:

  • the allocation DataArray
  • the SciPy OptimizeResult
  • a simulated response dataset from sample_response_distribution()

This page explains how to read each one.

Read the optimiser output

PanelBudgetOptimizerWrapper.optimize_budget(...) returns:

allocation, result = wrapper.optimize_budget(...)

If you set callback=True, it returns a third value:

allocation, result, callback_info = wrapper.optimize_budget(..., callback=True)

allocation

allocation is an xarray.DataArray over the non-date budget dimensions.

Model shape | Typical allocation dims | Meaning
No extra panel dims | ("channel",) | One optimised value per channel
dims=("geo",) | ("geo", "channel") | One value per (geo, channel) cell
dims=("geo", "brand") | ("geo", "brand", "channel") | One value per (geo, brand, channel) cell

The values are in the wrapper’s per-period units. Unoptimised cells are present and set to zero.

result

result is SciPy’s OptimizeResult. The fields you will usually inspect are:

Field | Meaning
success | Whether the solver converged
status | SciPy status code
message | Human-readable solver message
fun | Final objective value
nit | Number of iterations
x | The optimised flat parameter vector

If success is False, Abacus raises MinimizeException unless you opt in to return_if_fail=True on the underlying BudgetOptimizer.

callback_info

When callback=True, Abacus records one entry per solver iteration. Each entry includes:

  • x
  • fun
  • jac
  • constraint_info when constraints are active

Use this when you need to diagnose solver behaviour rather than just consume the final allocation.

Simulate the optimised plan

The optimiser itself returns only the allocation. To estimate spend paths and contributions over the requested window, call sample_response_distribution().

response_samples = wrapper.sample_response_distribution(
    allocation_strategy=allocation,
    noise_level=0.0,
    include_last_observations=False,
    include_carryover=True,
    budget_distribution_over_period=budget_distribution,
)

Set noise_level=0.0 when you want the spend path to match the requested allocation exactly.

What response_samples contains

The wrapper builds a synthetic future dataset, samples posterior predictive draws, and then merges the requested allocation and simulated spend path back into the result.

response_samples therefore contains:

Variable | Source | Meaning
allocation | Added by the wrapper | Requested allocation without a date dimension
One variable per channel | Added by the wrapper | Simulated spend path over the future dates
mmm.output_var | Posterior predictive sample | Model output variable
channel_contribution | Posterior predictive sample | Channel contribution on model scale
total_media_contribution_original_scale | Posterior predictive sample | Total media contribution on the original target scale

If you pass additional_var_names, Abacus also includes those variables when they exist in the model graph.

Carryover and evaluation window

include_carryover=True changes how Abacus builds the synthetic future window.

  • Abacus extends the generated dates by adstock.l_max periods.
  • It then zeroes the tail spend rows after the requested window.
  • The extra dates let posterior predictive sampling include lagged effects from the planned spend.

This is why the simulated dataset can cover a longer evaluated window than the requested start_date to end_date range, while still preserving the same total spend.
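
The window extension can be pictured with a small date sketch. It assumes a weekly model, a hypothetical l_max of 3, and a made-up helper name; Abacus builds the real synthetic dataset internally.

```python
from datetime import date, timedelta


def build_spend_path(start, num_periods, per_period_spend, l_max,
                     include_carryover=True):
    """Return (dates, spend) for a weekly window, optionally extended by l_max."""
    extra = l_max if include_carryover else 0
    dates = [start + timedelta(weeks=i) for i in range(num_periods + extra)]
    # Tail periods after the requested window carry no new spend.
    spend = [per_period_spend] * num_periods + [0.0] * extra
    return dates, spend


dates, spend = build_spend_path(
    date(2025, 3, 3),
    num_periods=4,
    per_period_spend=225_000.0,
    l_max=3,
)
```

Here the requested window ends on 2025-03-24, the evaluated window runs three further weeks, and the total spend is unchanged because the tail rows are zero.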

Plot the result

The plotting helpers under mmm.plot are designed to work directly with the response dataset returned by the wrapper.

fig, ax = mmm.plot.budget_allocation(response_samples, original_scale=True)

fig, ax = mmm.plot.allocated_contribution_by_channel_over_time(
    response_samples,
    original_scale=True,
)

Useful options:

  • dims={...} to filter a panel slice
  • split_by="geo" or another dimension to create separate subplots
  • original_scale=True to prefer original-scale contribution variables when they are available

Example optimisation output:

Figure: Budget allocation example

Figure: Allocated contribution by channel over time

Figure: Budget response curves example

Read the Stage 70 pipeline artefacts

If you run optimisation through python -m abacus.pipeline.runner, Stage 70 writes both the low-level optimiser output and several interpretation files.

File | What it contains
optimized_allocation.nc / optimized_allocation.csv | The allocation returned by the optimiser
response_distribution.nc | The simulated response dataset for that allocation
optimize_result.json | Solver status, message, objective value, and iteration count
budget_summary.csv | Current versus optimised totals
budget_response_points.csv | Per-channel current versus optimised spend, contribution, and efficiency summaries
budget_impact.csv | Delta between current and optimised channel summaries
budget_bounds_audit.csv | Current spend, scaled reference spend, bounds, optimised spend, and bound checks
budget_roi_cpa.csv | Channel efficiency summaries using the model’s efficiency metric
budget_response_curves.csv | Saturation-only response curve summaries
budget_mroi.csv | Marginal efficiency estimates at the current and optimised spend points

The stage also writes plots for allocation, contribution over time, response curves, impact, bounds audit, and ROI or CPA summaries.

These Stage 70 spend figures use the same units as the low-level wrapper: per-period spend, not total horizon spend.

Practical checks

Before you use an optimised plan, check:

  • result.success and result.message
  • whether the allocation matches your intended budget units
  • whether budget_bounds_audit.csv or your own checks show any bound issues
  • how much of the gain comes from reallocation versus carryover assumptions
  • whether the point lies on a sensible part of the response curve, not just on the edge of a bound
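
Some of these checks are easy to script once you have copied the allocation values out of the DataArray. The sketch below works on plain dicts; the helper name, the flat {cell: spend} layout, and the example numbers are assumptions for illustration, not an Abacus API.

```python
import math


def check_allocation(allocation, bounds, budget, tol=1e-6):
    """Return human-readable issues for a flat {cell: spend} allocation."""
    issues = []
    total = sum(allocation.values())
    # The default equality constraint expects the cells to sum to the budget.
    if not math.isclose(total, budget, rel_tol=tol):
        issues.append(f"total {total} does not match budget {budget}")
    for cell, value in allocation.items():
        lower, upper = bounds[cell]
        if value < lower - tol or value > upper + tol:
            issues.append(f"{cell}: {value} outside [{lower}, {upper}]")
        elif (math.isclose(value, lower, abs_tol=tol)
              or math.isclose(value, upper, abs_tol=tol)):
            # A cell pinned to a bound may reflect the bound, not the curve.
            issues.append(f"{cell}: sits on a bound")
    return issues


allocation = {"tv": 50_000.0, "search": 30_000.0, "social": 20_000.0}
bounds = {
    "tv": (0.0, 50_000.0),
    "search": (0.0, 45_000.0),
    "social": (0.0, 40_000.0),
}
issues = check_allocation(allocation, bounds, budget=100_000.0)
```

A cell reported as sitting on a bound is not an error, but it is a prompt to ask whether the bound or the response curve is driving the recommendation.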

For multi-plan comparison in total horizon units, use Scenario Planning.

Scenario Planner

The detailed planner documentation now lives in Scenario Planning.

Use that section for:

  • planner concepts and workflow
  • scenario specification classes
  • Python API examples for ScenarioPlanner
  • comparison output tables
  • the optional Dash UI

The planner is a higher-level surface than Budget Optimisation:

  • PanelBudgetOptimizerWrapper uses per-period spend units
  • ScenarioPlanner uses total horizon spend units

Start with the Scenario Planning section.

Scenario Planning

Use this section when you want to compare historical, manual, and optimised future plans with abacus.scenario_planner.

The scenario planner is a higher-level planning surface than the low-level optimisation wrapper. It works in total horizon spend units, returns structured comparison tables, and includes a supported workspace app for fitted runs.

Pages

  • Supported Surface: The recommended planner entry points, fitted-run contract, persisted workspace state, and beta limits.
  • Overview and Workflow: What the planner does, how it differs from low-level optimisation, and how scenario windows work.
  • Scenario Specifications: The public scenario spec classes, allocation shapes, bounds, and budget distributions.
  • Python API: How to use ScenarioPlanner.evaluate(...) and the supported workspace helpers from Python.
  • Comparison Outputs: The structure and meaning of ScenarioResult, ScenarioComparison, and the output tables.
  • Dash App: How to launch the supported workspace UI from a fitted run, work with saved workspaces, and understand background jobs.

Subsections of Scenario Planning

Overview and Workflow

Use the scenario planner when you want to compare whole plans rather than run a single low-level optimisation call.

The planner combines three things:

  • typed scenario specifications
  • a Python comparison service
  • a supported workspace app for fitted results directories

If you need the low-level optimiser instead, see Budget Optimisation.

For the supported beta entry points and current limits, see Supported Surface.

What the planner compares

The retained planner supports three scenario types:

Scenario type | Purpose | Public spec
Current | Use observed history as a reference plan | CurrentScenarioSpec
Manual allocation | Simulate a user-defined future plan | ManualAllocationScenarioSpec
Fixed-budget optimised | Optimise a future plan at a fixed budget | FixedBudgetOptimizedScenarioSpec

Planner units versus optimiser units

The most important distinction is budget units.

Surface | Public budget contract
PanelBudgetOptimizerWrapper | Per-period spend
ScenarioPlanner | Total spend over the whole scenario horizon

For example, if a four-period scenario has a total budget of 900_000, the planner converts that to per-period units internally before it calls the wrapper or response sampler.
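
The conversion can be sketched for a per-channel plan as follows. It assumes an even spread across periods, with no budget_distribution_over_period, and uses illustrative channel totals that sum to 900_000; the planner's internal conversion code is not shown on this page.

```python
num_periods = 4

# Planner-style inputs: total spend per channel over the whole horizon.
total_allocation = {
    "channel_1": 420_000.0,
    "channel_2": 280_000.0,
    "channel_3": 200_000.0,
}

# Wrapper-style units: spend per channel for a single model period.
per_period_allocation = {
    channel: total / num_periods for channel, total in total_allocation.items()
}

total_budget = sum(total_allocation.values())
per_period_budget = total_budget / num_periods
```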

Requested and evaluated windows

Each scenario has a requested window from start_date to end_date.

For simulated scenarios, the evaluated window can be longer than the requested window when you set include_carryover=True. Abacus extends the synthetic future path so lagged adstock effects can continue after the requested end date.

The planner reports both windows in the metadata output.

Historical overlap for current scenarios

CurrentScenarioSpec is strict about history.

Its requested window must overlap observed data. Abacus does not reinterpret a future-only window as “use the latest history instead”.

Typical workflow

The common workflow is:

  1. Fit PanelMMM.
  2. Build one or more scenario specs.
  3. Either run ScenarioPlanner.compare(...) or launch the workspace app from the fitted run directory.
  4. Inspect the comparison tables, save workspaces, and export the planning outputs you need.

Minimal example

from abacus.scenario_planner import (
    CurrentScenarioSpec,
    ManualAllocationScenarioSpec,
    ScenarioPlanner,
)

planner = ScenarioPlanner(mmm)

comparison = planner.compare(
    [
        CurrentScenarioSpec(
            name="Current baseline",
            start_date="2025-01-06",
            end_date="2025-02-24",
        ),
        ManualAllocationScenarioSpec(
            name="Manual plan",
            start_date="2025-03-03",
            end_date="2025-03-24",
            noise_level=0.0,
            include_carryover=False,
            allocation={
                "channel_1": 420_000.0,
                "channel_2": 280_000.0,
                "channel_3": 200_000.0,
            },
        ),
    ]
)

For the full API, see Python API.

How the planner differs from post-model summaries

The scenario planner does not reuse mmm.summary tables directly. Instead, it builds comparison tables that are specific to planning:

  • totals
  • channels
  • contributions_over_time
  • allocations
  • metadata

See Comparison Outputs.

Common pitfalls

  • Mixing up total horizon spend and per-period spend
  • Using a future-only window in CurrentScenarioSpec
  • Forgetting that carryover can extend the evaluated window

Scenario Specifications

This page documents the public spec classes under abacus.scenario_planner.

Most users create one of the three concrete scenario specs:

  • CurrentScenarioSpec
  • ManualAllocationScenarioSpec
  • FixedBudgetOptimizedScenarioSpec

Abacus also exposes shared base models such as HistoricalReferenceScenarioSpec and SimulatedScenarioSpec, but you do not normally instantiate those directly.

Shared fields

All public scenario specs inherit these core fields:

Field | Meaning
name | Display name for the scenario
start_date | Requested scenario start date
end_date | Requested scenario end date
scenario_id | Stable scenario key used in outputs

If you do not set scenario_id, Abacus derives one by slugifying name.

Scenario IDs must be unique within one ScenarioPlanner.compare(...) call.

CurrentScenarioSpec

Use CurrentScenarioSpec for a historical reference plan.

from abacus.scenario_planner import CurrentScenarioSpec

spec = CurrentScenarioSpec(
    name="Current baseline",
    start_date="2025-01-06",
    end_date="2025-02-24",
)

Requirements:

  • the requested window must overlap observed data
  • no allocation or budget inputs are needed

Shared simulated-scenario fields

ManualAllocationScenarioSpec and FixedBudgetOptimizedScenarioSpec both inherit these fields:

Field | Default | Meaning
budget_distribution_over_period | None | Optional time distribution of the total budget
include_last_observations | False | Passed through to response sampling for lag context
include_carryover | True | Extend the evaluated window to capture lagged effects
noise_level | 0.001 | Response-sampling noise level

Set noise_level=0.0 when you want deterministic realised spend paths.

ManualAllocationScenarioSpec

Use ManualAllocationScenarioSpec when you already know the total allocation you want to simulate.

from abacus.scenario_planner import ManualAllocationScenarioSpec

spec = ManualAllocationScenarioSpec(
    name="Manual reallocation",
    start_date="2025-03-03",
    end_date="2025-03-24",
    noise_level=0.0,
    include_carryover=False,
    allocation={
        "channel_1": 420_000.0,
        "channel_2": 280_000.0,
        "channel_3": 200_000.0,
    },
)

Supported allocation shapes

allocation can be:

  • a dict of {channel: total_budget} for ("channel",) budgets only
  • an xarray.DataArray
  • a DataArraySpec

For panel budgets such as ("geo", "channel") or ("geo", "brand", "channel"), use xarray.DataArray or DataArraySpec.

Dict allocations must match the model’s channel coordinates exactly. Missing or extra keys raise ValueError.
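
If you build dict allocations programmatically, a quick key check before constructing the spec gives a clearer error than a failed comparison run. The helper below is a hypothetical sketch of that rule, not the validation Abacus performs internally.

```python
def validate_dict_allocation(allocation, model_channels):
    """Check that allocation keys match the model channels exactly."""
    missing = sorted(set(model_channels) - set(allocation))
    extra = sorted(set(allocation) - set(model_channels))
    if missing or extra:
        raise ValueError(
            f"allocation keys do not match model channels: "
            f"missing={missing}, extra={extra}"
        )
    # Return the totals in model channel order.
    return [allocation[channel] for channel in model_channels]


model_channels = ["channel_1", "channel_2", "channel_3"]
ordered_totals = validate_dict_allocation(
    {"channel_3": 200_000.0, "channel_1": 420_000.0, "channel_2": 280_000.0},
    model_channels,
)
```

With a fitted planner, `planner.channels` is the natural source for `model_channels`.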

FixedBudgetOptimizedScenarioSpec

Use FixedBudgetOptimizedScenarioSpec when you want Abacus to optimise the allocation.

from abacus.scenario_planner import FixedBudgetOptimizedScenarioSpec

spec = FixedBudgetOptimizedScenarioSpec(
    name="Optimised plan",
    start_date="2025-03-03",
    end_date="2025-03-24",
    noise_level=0.0,
    include_carryover=False,
    total_budget=900_000.0,
)

Optimisation fields

Field | Meaning
total_budget | Total spend over the full scenario horizon
response_variable | Variable used by the optimiser
budget_bounds | Explicit lower and upper bounds
spend_constraint_lower | Relative lower bound when deriving defaults
spend_constraint_upper | Relative upper bound when deriving defaults
default_constraints | Passed through to the underlying optimiser

The default response_variable is "total_media_contribution_original_scale".

Default bound derivation

If you do not pass budget_bounds, Abacus derives them from historical reference spend.

For each omitted relative constraint side, Abacus uses 0.3. That gives the default Meridian-style bounds:

  • lower bound: scaled reference spend × (1 - 0.3)
  • upper bound: scaled reference spend × (1 + 0.3)

If historical reference spend sums to zero, Abacus cannot derive those default bounds and raises ValueError.
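
A worked sketch of that derivation is below. It assumes that "scaled reference spend" means historical spend rescaled so that it sums to total_budget, which this page does not spell out; the helper name and the spend figures are hypothetical.

```python
def default_bounds(reference_spend, total_budget, lower=0.3, upper=0.3):
    """Derive relative bounds around reference spend rescaled to total_budget."""
    reference_total = sum(reference_spend.values())
    if reference_total == 0:
        raise ValueError("historical reference spend sums to zero")
    # Assumption: rescale the historical mix so it adds up to the new budget.
    scale = total_budget / reference_total
    return {
        channel: (spend * scale * (1 - lower), spend * scale * (1 + upper))
        for channel, spend in reference_spend.items()
    }


reference_spend = {
    "channel_1": 400_000.0,
    "channel_2": 300_000.0,
    "channel_3": 100_000.0,
}
bounds = default_bounds(reference_spend, total_budget=900_000.0)
```

In this example channel_1 has a scaled reference spend of 450_000, so its default bounds are roughly 315_000 and 585_000.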

Supported budget_bounds shapes

budget_bounds can be:

  • a dict of {channel: (lower, upper)} for ("channel",) budgets only
  • an xarray.DataArray
  • a DataArraySpec

For xarray or DataArraySpec, the dims must be (*budget_dims, "bound") with "lower" and "upper" values on the bound dimension.

budget_distribution_over_period

Both simulated scenario types support budget_distribution_over_period.

The object must:

  • have dims ("date", *budget_dims)
  • contain one weight per scenario period
  • sum to 1 across date for every budget cell

The date coordinates can be:

  • integer positions 0 .. num_periods - 1, or
  • exact dates that match the requested scenario window

If the dates do not match the scenario window exactly, Abacus raises ValueError.

DataArraySpec

Use DataArraySpec when you want JSON-friendly or YAML-friendly planner inputs.

from abacus.scenario_planner import DataArraySpec

allocation = DataArraySpec(
    values=[[420_000.0, 280_000.0], [300_000.0, 200_000.0]],
    dims=("geo", "channel"),
    coords={
        "geo": ["UK", "FR"],
        "channel": ["channel_1", "channel_2"],
    },
)

Abacus materialises DataArraySpec as an xarray.DataArray before it validates dims and coordinates.

Common pitfalls

  • Reusing the same scenario_id twice in one comparison
  • Using dict allocations or dict bounds for panel budgets
  • Passing an allocation or bounds object with missing coordinates
  • Providing a budget_distribution_over_period that does not sum to 1

Python API

Use ScenarioPlanner when you want to evaluate one scenario or compare multiple scenarios from Python.

The public API lives under abacus.scenario_planner.

For the recommended entry points and supported beta scope, see Supported Surface.

Prerequisite

ScenarioPlanner requires a fitted PanelMMM with idata.

If you construct the planner before fitting, Abacus raises ValueError.

Create a planner

from abacus.scenario_planner import ScenarioPlanner

planner = ScenarioPlanner(mmm)

You can inspect the modelled channel names with:

channels = planner.channels

Supported workspace helpers

If you want to work with the supported app from Python, use these helpers:

Helper | What it returns | Use it when
load_workspace_bundle(...) | run_context, workspace_service, workspace | you want the fitted run context and active workspace without starting Dash
create_app_from_results_dir(...) | app, run_context, workspace_service, workspace | you want to launch or embed the supported app from Python

Example:

from abacus.scenario_planner import create_app_from_results_dir

app, run_context, workspace_service, workspace = create_app_from_results_dir(
    "results/timeseries_20260308_144627",
    workspace_name="Timeseries planning workspace",
)

app.run(host="127.0.0.1", port=8050, debug=False)

These helpers expect a fitted results directory with run_manifest.json and a fit-stage idata artefact.

When they are available, the loader rebuilds the model from the config artefacts saved inside the run, in this order:

  1. 00_run_metadata/config.resolved.yaml
  2. 00_run_metadata/config.original.yaml
  3. the copied config file under 00_run_metadata/
  4. run_manifest.json["config_path"] as fallback

The returned run_context records both config_path and config_provenance_type so callers can tell which source was used.

Residual portability risk remains if the chosen config still points to dataset files outside the saved run directory.
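
To make the preference order concrete, here is a minimal stdlib sketch of the lookup. It is not the Abacus loader: the function name pick_config is hypothetical, the provenance labels are the ones the launcher prints in build-only mode, and treating any other YAML file under 00_run_metadata/ as the copied config is an assumption.

```python
import json
from pathlib import Path


def pick_config(run_dir):
    """Return (config_path, provenance) using the documented preference order."""
    run_dir = Path(run_dir)
    meta = run_dir / "00_run_metadata"
    resolved = meta / "config.resolved.yaml"
    original = meta / "config.original.yaml"

    if resolved.is_file():
        return resolved, "resolved_in_run"
    if original.is_file():
        return original, "original_in_run"

    # Assumption: any other YAML file in the metadata folder is the copied config.
    copied = []
    if meta.is_dir():
        copied = sorted(
            p for p in meta.glob("*.y*ml") if p not in (resolved, original)
        )
    if copied:
        return copied[0], "copied_in_run"

    # Last resort: the path recorded when the run was created.
    manifest = json.loads((run_dir / "run_manifest.json").read_text())
    return Path(manifest["config_path"]), "external_manifest_path"
```

A check like this is useful in your own tooling when you want to warn users that a run will rebuild from an external config path.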

Evaluate one scenario

Use evaluate(...) when you want one scenario result:

from abacus.scenario_planner import ManualAllocationScenarioSpec, ScenarioPlanner

planner = ScenarioPlanner(mmm)

result = planner.evaluate(
    ManualAllocationScenarioSpec(
        name="Manual plan",
        start_date="2025-03-03",
        end_date="2025-03-24",
        noise_level=0.0,
        include_carryover=False,
        allocation={
            "channel_1": 420_000.0,
            "channel_2": 280_000.0,
            "channel_3": 200_000.0,
        },
    )
)

print(result.totals)
print(result.channels)
print(result.metadata)

evaluate(...) returns ScenarioResult with:

  • totals
  • channels
  • contributions_over_time
  • allocation
  • metadata

Compare multiple scenarios

Use compare(...) when you want one combined comparison object:

from abacus.scenario_planner import (
    CurrentScenarioSpec,
    FixedBudgetOptimizedScenarioSpec,
    ManualAllocationScenarioSpec,
    ScenarioPlanner,
)

planner = ScenarioPlanner(mmm)

comparison = planner.compare(
    [
        CurrentScenarioSpec(
            name="Current baseline",
            start_date="2025-01-06",
            end_date="2025-02-24",
        ),
        ManualAllocationScenarioSpec(
            name="Manual plan",
            start_date="2025-03-03",
            end_date="2025-03-24",
            noise_level=0.0,
            include_carryover=False,
            allocation={
                "channel_1": 420_000.0,
                "channel_2": 280_000.0,
                "channel_3": 200_000.0,
            },
        ),
        FixedBudgetOptimizedScenarioSpec(
            name="Optimised plan",
            start_date="2025-03-03",
            end_date="2025-03-24",
            noise_level=0.0,
            include_carryover=False,
            total_budget=900_000.0,
        ),
    ]
)

print(comparison.totals)
print(comparison.allocations)

compare(...) returns ScenarioComparison with:

  • totals
  • channels
  • contributions_over_time
  • allocations
  • metadata

Unlike ScenarioResult, the combined object uses the plural allocations.

Programmatic workspace orchestration

Use WorkspaceService when you want to work with saved planner workspaces from Python.

Common operations include:

  • load_workspace(...)
  • save_workspace(...)
  • clone_workspace(...)
  • update_workspace_metadata(...)
  • create_template_draft(...)
  • replace_draft(...)
  • evaluate_draft(...)
  • run_sensitivity_sweep(...)
  • export_workspace_bundle(...)

Example:

from abacus.scenario_planner import load_workspace_bundle

run_context, workspace_service, workspace = load_workspace_bundle(
    "results/timeseries_20260308_144627",
)

draft = workspace_service.create_template_draft(
    workspace=workspace,
    scenario_type="fixed_budget_optimized",
)
workspace = workspace_service.replace_draft(workspace, draft)
workspace = workspace_service.evaluate_draft(workspace, draft)
workspace_service.save_workspace(
    workspace,
    action="evaluate_draft",
    changed_scenario_ids=[draft.scenario_id],
)

WorkspaceService defaults to synchronous jobs when you instantiate it directly. The supported app uses ThreadedScenarioPlannerJobRunner automatically.

Prepare data for a client UI

Use to_store_payload() when you want a JSON-friendly version of the comparison tables:

payload = comparison.to_store_payload()

This method converts datetime columns to YYYY-MM-DD strings and returns a dict of record lists that works well for client-side UIs such as Dash.
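
The shape of that payload can be pictured with plain Python. The sketch below is an illustration of the idea, not the Abacus implementation: records_payload is a hypothetical helper, and the table contents are made-up example rows.

```python
import json
from datetime import date, datetime


def records_payload(tables):
    """Convert {table_name: [record, ...]} into JSON-friendly record lists."""

    def clean(value):
        # Dates and datetimes become YYYY-MM-DD strings; other values pass through.
        if isinstance(value, (date, datetime)):
            return value.strftime("%Y-%m-%d")
        return value

    return {
        name: [{key: clean(value) for key, value in record.items()} for record in records]
        for name, records in tables.items()
    }


# Illustrative rows only.
tables = {
    "contributions_over_time": [
        {
            "scenario_id": "manual-plan",
            "date": date(2025, 3, 3),
            "channel": "channel_1",
            "contribution_mean": 1250.0,
        }
    ]
}
payload = records_payload(tables)
encoded = json.dumps(payload)  # succeeds because no date objects remain
```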

Background-job helpers

For custom integrations, WorkspaceService also exposes queue/apply methods:

  • submit_draft_evaluation(...)
  • apply_draft_evaluation_job(...)
  • submit_sensitivity_sweep(...)
  • apply_sensitivity_sweep_job(...)

Use these only when you need the same background-job pattern as the supported app. For most scripted flows, the blocking methods are simpler.

Relationship to the low-level wrapper

ScenarioPlanner uses PanelBudgetOptimizerWrapper internally for simulated scenarios, but its public contract is different:

  • you pass total horizon budgets and allocations
  • the planner converts them to per-period units internally
  • the planner returns comparison tables rather than raw optimiser objects
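
The arithmetic behind the second point can be sketched as follows. This assumes weekly periods and an even split across the horizon, which is only one plausible reading when no budget_distribution_over_period is supplied; the helper names are hypothetical.

```python
from datetime import date, timedelta


def weekly_periods(start, end):
    """List weekly period dates from start to end, inclusive."""
    periods = []
    current = start
    while current <= end:
        periods.append(current)
        current += timedelta(weeks=1)
    return periods


def per_period_allocation(total_allocation, num_periods):
    """Split each channel's total horizon budget evenly across periods."""
    return {
        channel: total / num_periods
        for channel, total in total_allocation.items()
    }


periods = weekly_periods(date(2025, 3, 3), date(2025, 3, 24))
allocation = {
    "channel_1": 420_000.0,
    "channel_2": 280_000.0,
    "channel_3": 200_000.0,
}
per_period = per_period_allocation(allocation, len(periods))
```

With the four weekly periods in this window, the 900,000 horizon budget becomes 225,000 per period in total. This is why passing per-period spend into a scenario spec would understate the plan.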

If you want direct access to optimize_budget(...) or sample_response_distribution(...), use Budget Optimisation instead.

Common pitfalls

  • Passing per-period spend into ManualAllocationScenarioSpec or FixedBudgetOptimizedScenarioSpec
  • Expecting duplicate scenario_id values to be allowed in compare(...)
  • Forgetting that result.allocation and comparison.allocations use different attribute names

Comparison Outputs

ScenarioPlanner returns structured planning tables rather than a single optimiser object.

This page explains the output objects and the meaning of each table.

Output objects

Object | Produced by | Tables
ScenarioResult | planner.evaluate(spec) | totals, channels, contributions_over_time, allocation, metadata
ScenarioComparison | planner.compare(specs) | totals, channels, contributions_over_time, allocations, metadata

ScenarioComparison is a row-wise concatenation of the individual scenario results, with scenario identifiers added to every table.
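
A simplified picture of that concatenation, using lists of records instead of DataFrames, looks like this. The function and the input layout are hypothetical, and raising ValueError for a repeated scenario_id is an assumption about the error type.

```python
def combine_results(results):
    """Concatenate per-scenario tables and tag every row with its scenario."""
    seen = set()
    combined = {}
    for result in results:
        scenario_id = result["scenario_id"]
        if scenario_id in seen:
            raise ValueError(f"duplicate scenario_id: {scenario_id}")
        seen.add(scenario_id)
        for table_name, rows in result["tables"].items():
            # The single-scenario table is `allocation`; the combined one is `allocations`.
            target = "allocations" if table_name == "allocation" else table_name
            for row in rows:
                combined.setdefault(target, []).append(
                    {
                        "scenario_id": scenario_id,
                        "scenario_name": result["scenario_name"],
                        **row,
                    }
                )
    return combined


# Illustrative inputs only.
results = [
    {
        "scenario_id": "current",
        "scenario_name": "Current baseline",
        "tables": {"allocation": [{"channel": "channel_1", "allocation": 400_000.0}]},
    },
    {
        "scenario_id": "manual",
        "scenario_name": "Manual plan",
        "tables": {"allocation": [{"channel": "channel_1", "allocation": 420_000.0}]},
    },
]
comparison_tables = combine_results(results)
```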

totals

totals has one row per scenario.

It includes:

  • scenario_id
  • scenario_name
  • scenario_type
  • total_spend
  • contribution_mean
  • contribution_median
  • contribution_hdi_94_lower
  • contribution_hdi_94_upper
  • efficiency_metric
  • efficiency_mean
  • efficiency_median
  • efficiency_hdi_94_lower
  • efficiency_hdi_94_upper

efficiency_metric is ROAS for revenue targets and CPA for conversion targets.
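
The two metrics point in opposite directions, which matters when you rank scenarios. The sketch below uses the standard definitions (ROAS is contribution divided by spend, CPA is spend divided by contribution). The target-type string "conversion", the returned labels, and the example figures are assumptions, and Abacus may compute these per posterior draw before summarising.

```python
def efficiency(target_type, spend, contribution):
    """Return (metric_name, value) using the standard ROAS and CPA definitions."""
    if target_type == "revenue":
        # Higher is better: revenue returned per unit of spend.
        return "ROAS", contribution / spend
    if target_type == "conversion":
        # Lower is better: spend needed per conversion.
        return "CPA", spend / contribution
    raise ValueError(f"unsupported target type: {target_type}")


# Illustrative figures only.
roas = efficiency("revenue", spend=900_000.0, contribution=2_250_000.0)
cpa = efficiency("conversion", spend=900_000.0, contribution=18_000.0)
```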

channels

channels has one row per (scenario, channel).

It includes:

  • scenario identifiers
  • channel
  • spend
  • spend_share
  • spend_per_period
  • contribution summary columns
  • contribution-per-period columns
  • efficiency summary columns
  • efficiency_metric

The planner aggregates non-channel panel dims before it builds this table. For example, a (geo, channel) model still returns one row per channel here.
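
For spend, that aggregation can be pictured as a sum over the non-channel dims. The sketch below reuses the (geo, channel) allocation values from the DataArraySpec example. The helper name is hypothetical and summation is an assumption about how the planner collapses spend.

```python
from collections import defaultdict


def channel_spend(allocation):
    """Collapse {(geo, channel): spend} into per-channel spend and spend share."""
    totals = defaultdict(float)
    for (_geo, channel), spend in allocation.items():
        totals[channel] += spend
    grand_total = sum(totals.values())
    return {
        channel: {"spend": spend, "spend_share": spend / grand_total}
        for channel, spend in totals.items()
    }


panel_allocation = {
    ("UK", "channel_1"): 420_000.0,
    ("UK", "channel_2"): 280_000.0,
    ("FR", "channel_1"): 300_000.0,
    ("FR", "channel_2"): 200_000.0,
}
rows = channel_spend(panel_allocation)
```

Here channel_1 ends up with 720,000 (a 60% share) and channel_2 with 480,000 (40%). If you need the per-geo split, read it from the allocations table instead.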

contributions_over_time

contributions_over_time has one row per (scenario, date, channel).

It includes:

  • scenario identifiers
  • date
  • channel
  • contribution_mean
  • contribution_median
  • contribution_hdi_94_lower
  • contribution_hdi_94_upper

Like channels, this table aggregates non-channel panel dims before summarising.

allocations

allocations keeps the original allocation grain.

It includes:

  • scenario identifiers
  • the allocation dims, such as channel, geo, or brand
  • allocation
  • realized_spend

For current scenarios, allocation is the summed historical spend over the reference window. For simulated scenarios, allocation is the requested total horizon allocation and realized_spend is the realised spend from the response simulation.

metadata

metadata is the audit table for each scenario.

Shared fields include:

  • scenario_id
  • scenario_name
  • scenario_type
  • start_date
  • end_date
  • evaluated_start_date
  • evaluated_end_date
  • num_periods
  • target_type
  • efficiency_metric

Additional fields depend on scenario type.

Current scenario metadata

Current scenarios add:

  • reference_window_dates

Manual scenario metadata

Manual scenarios add:

  • requested_total_budget
  • total_budget
  • reference_window_dates
  • budget_unit

Fixed-budget optimised metadata

Optimised scenarios add:

  • requested_total_budget
  • total_budget
  • optimization_success
  • optimization_status
  • optimization_message
  • optimization_objective_value
  • reference_window_dates
  • budget_unit

Requested versus evaluated windows

The metadata table is the best place to check whether the evaluated window matches the requested window.

When include_carryover=True, the evaluated end date can be later than the requested end_date.
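
A small check like the following can flag scenarios whose evaluated window runs past the requested one. It assumes the metadata dates are available as ISO YYYY-MM-DD strings (as they are after to_store_payload()); the row shown is a made-up example.

```python
from datetime import date


def window_extension_days(metadata_row):
    """Days by which the evaluated end date exceeds the requested end date."""
    requested_end = date.fromisoformat(metadata_row["end_date"])
    evaluated_end = date.fromisoformat(metadata_row["evaluated_end_date"])
    return (evaluated_end - requested_end).days


# Illustrative row only: a four-week extension beyond the requested window.
row = {
    "scenario_id": "manual-plan",
    "end_date": "2025-03-24",
    "evaluated_end_date": "2025-04-21",
}
extra_days = window_extension_days(row)
```

A result of zero means the evaluated window ends on the requested date; a positive value means the totals include periods after end_date.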

Example inspection

comparison = planner.compare(specs)

totals = comparison.totals
metadata = comparison.metadata

optimised_metadata = metadata.loc[
    metadata["scenario_type"] == "fixed_budget_optimized"
].iloc[0]

print(optimised_metadata["optimization_success"])
print(optimised_metadata["optimization_message"])

to_store_payload()

ScenarioComparison.to_store_payload() converts the comparison tables into a JSON-friendly dict of record lists:

payload = comparison.to_store_payload()

This is the payload format consumed by the supported workspace app.

Common pitfalls

  • Reading channels as if it retained non-channel panel dims
  • Ignoring metadata when carryover is enabled
  • Comparing requested allocation with realised spend without checking the allocations table

Supported Surface

Use this page to understand which Scenario Planner entry points Abacus supports for beta evaluation.

The planner has two primary surfaces:

  • a Python comparison API for scripted planning workflows
  • a workspace-based Dash app for interactive scenario editing and review

Recommended entry points

Use these entry points in preference order.

Entry point | Use it when you want to | Notes
ScenarioPlanner | evaluate or compare scenarios from Python | Best fit for notebooks, scripts, and testable planning flows
python -m abacus.scenario_planner | launch the supported interactive app from a fitted run directory | Starts the workspace UI with file-backed persistence
create_app_from_results_dir(...) | embed the supported app in your own Python launcher | Returns app, run_context, workspace_service, and workspace
load_workspace_bundle(...) | load the fitted run and active workspace without starting Dash | Useful for custom wrappers around the supported app
WorkspaceService | work with saved workspaces programmatically | Advanced surface for cloning, saving, evaluating, sweeping, and exporting

Advanced integration surfaces

Abacus also exposes lower-level objects such as:

  • create_scenario_planner_dash_app(...)
  • ThreadedScenarioPlannerJobRunner
  • SynchronousScenarioPlannerJobRunner
  • WorkspaceStore

These are public, but they are more implementation-shaped than the recommended entry points above. Use them when you need to embed the planner into a custom application or override the default job runner or storage behaviour.

Results directory contract

The supported launcher and load_workspace_bundle(...) expect a fitted run directory, not raw modelling inputs.

The run directory must include:

Requirement Why it matters
run_manifest.json Abacus uses it to locate the config and saved artefacts
a fit-stage idata artefact Abacus attaches the saved posterior to the rebuilt model

When metadata-stage config artefacts are present, Abacus prefers those in-run files when it rebuilds the saved PanelMMM:

  • 00_run_metadata/config.resolved.yaml
  • 00_run_metadata/config.original.yaml
  • the copied config file under 00_run_metadata/

The planner falls back to run_manifest.json["config_path"] only when those in-run config artefacts are absent.

That makes the supported loader more portable when the original config path is no longer available, but it does not guarantee full relocation across machines. The chosen config can still reference dataset files outside the run directory.

The planner can also load these optional optimisation artefacts when they are present:

  • 70_optimisation/budget_response_curves.csv
  • 70_optimisation/budget_bounds_audit.csv

When these files are available, the app can show saved saturation-reference response-curve and bounds-audit views.

What the app persists

The workspace app stores its own planning state under the fitted run directory:

Path Contents
scenario_planner/workspaces/<workspace_id>.json full persisted workspace state
scenario_planner/workspaces/<workspace_id>.manifest.json compact workspace manifest
scenario_planner/cache/cache_index.json evaluation cache index
scenario_planner/cache/evaluations/ cached evaluated scenarios
scenario_planner/exports/<workspace_id>/<export_id>/ export bundle contents
scenario_planner/exports/<workspace_id>/<export_id>.zip zipped export bundle
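
Because the layout is file-backed, you can inspect it with ordinary path operations. The sketch below lists saved workspace IDs for a run directory. The helper is hypothetical and relies only on the file naming shown in the table above.

```python
from pathlib import Path


def saved_workspace_ids(run_dir):
    """Return workspace IDs found under scenario_planner/workspaces/."""
    workspaces = Path(run_dir) / "scenario_planner" / "workspaces"
    if not workspaces.is_dir():
        return []  # no planner session has been saved for this run yet
    return sorted(
        path.name[: -len(".json")]
        for path in workspaces.glob("*.json")
        # Skip the compact manifests, which share the .json suffix.
        if not path.name.endswith(".manifest.json")
    )
```

For example, saved_workspace_ids("results/timeseries_20260308_144627") would return the IDs you can pass to the launcher's --workspace-id flag.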

Workspaces persist:

  • workspace metadata such as name, owner, tags, and notes
  • draft metadata such as scenario owner, workflow status, approvals, pinning, notes, and tags
  • evaluated scenarios
  • sensitivity runs
  • revision history
  • job history
  • cache metadata

Background jobs in the supported app

The supported app launches with ThreadedScenarioPlannerJobRunner.

In the current beta:

  • draft evaluation runs as a queued background job
  • sensitivity sweeps run as queued background jobs
  • export remains synchronous, but Abacus still records it in job history

The UI currently tracks one active planner job at a time. Wait for the current evaluation or sweep to finish before starting another one from the app.

Beta scope and current limits

The current supported beta scope is:

  • local use against fitted run directories
  • file-backed workspace persistence inside the run directory
  • interactive drafting, evaluation, comparison, sensitivity sweeps, and export

Current limits to keep in mind:

  • the app does not fit or refit PanelMMM
  • the launcher starts Dash’s built-in server for local evaluation
  • the UI does not yet manage multiple active planner jobs at the same time

Dash App

Abacus includes a supported Dash app for workspace-based scenario planning.

Use it when you already have a fitted run directory and want to inspect, edit, evaluate, compare, sweep, and export scenarios without writing the entire workflow by hand.

The app does not fit PanelMMM. It loads an existing fitted run, reuses the saved idata, and evaluates planner scenarios against that fitted model.

For the recommended entry points and beta scope, see Supported Surface.

Install the optional dependencies

python -m pip install -e ".[planner]"

The planner extra installs the Dash and Plotly dependencies used by the UI.

Launch the supported app

Use the supported module launcher for fitted pipeline results:

python -m abacus.scenario_planner \
  --results-dir results/timeseries_20260308_144627

This launcher is the recommended interactive entry point for beta evaluation. It loads the fitted run, opens or seeds a planner workspace, and starts the app with the threaded job runner used by the supported UI.

Useful flags:

  • --workspace-id to open one previously saved workspace
  • --workspace-name to control the seeded workspace name
  • --current-periods and --future-periods to change the default seeded windows
  • --budget-scale to scale the default future budget
  • --build-only to validate the run and print a summary without starting Dash
  • --host, --port, and --debug for the Dash server

For example:

python -m abacus.scenario_planner \
  --results-dir results/timeseries_20260308_144627 \
  --workspace-id timeseries-20260308-144627-planning-workspace \
  --host 127.0.0.1 \
  --port 8050

Create the app from Python

If you want to embed the UI in your own script, use the supported helper:

from abacus.scenario_planner import create_app_from_results_dir

app, run_context, workspace_service, workspace = create_app_from_results_dir(
    "results/timeseries_20260308_144627",
)

app.run(host="127.0.0.1", port=8050, debug=False)

Abacus still exposes the lower-level create_scenario_planner_dash_app(...) factory for cases where you already have a ScenarioComparison or ScenarioWorkspace.

What the launcher requires

The supported launcher expects a fitted results directory that contains:

  • run_manifest.json
  • a fit-stage idata artefact

When the metadata stage is present, the launcher prefers the in-run config artefacts under 00_run_metadata/ and falls back to run_manifest.json["config_path"] only if those files are absent.

In build-only mode, the launcher prints the selected config path and its provenance so you can see whether the planner loaded:

  • resolved_in_run
  • original_in_run
  • copied_in_run
  • external_manifest_path

This makes the launcher more portable when the original config path no longer exists, but the chosen config can still fail if it references dataset files that are not present on the current machine.

When these optional files are present, the app also loads them for richer UI views:

  • 70_optimisation/budget_response_curves.csv
  • 70_optimisation/budget_bounds_audit.csv

What the UI includes

The current app has five tabs:

  • Plan Setup for run context, workspace metadata, saved workspaces, draft inventory, and the launch path into Scenario Builder
  • Scenario Builder for editing one draft at a time and evaluating it back into the workspace
  • Review for cross-scenario totals, deltas, rankings, movers, and approval/export readiness
  • Explain for response curves, operating-region views, lift comparisons, and diagnostics/audit surfaces
  • Export for reproducible export bundles and deterministic sensitivity output selection

What the app saves

The workspace app persists planning state under the fitted run directory:

Path What Abacus saves
scenario_planner/workspaces/ workspace JSON files and compact manifests
scenario_planner/cache/ cached evaluated scenarios and cache index
scenario_planner/exports/ export bundles and zipped archives

This means a planner session stays attached to one fitted run.

Plan Setup page

The Plan Setup page shows the loaded run context and the active planner workspace. It also lets you:

  • open a different saved workspace for the same run
  • clone the current workspace into a new planning branch
  • edit workspace name, owner, tags, and notes
  • inspect revision history, job history, and evaluation-cache reuse
  • launch the current workspace into Scenario Builder

This page is the planner's launch surface: the planning context is shown first, and operational details sit in collapsed secondary sections.

Scenario Builder page

The Scenario Builder page is interactive. You can:

  • create current, manual_allocation, and fixed_budget_optimized drafts
  • duplicate or delete drafts
  • edit names, dates, carryover, budget, and manual allocations
  • capture scenario owner, workflow status, approvals, pinning, tags, and notes
  • evaluate and save the draft back into the workspace

When a draft has been evaluated, the page shows planned versus realised spend, allocation detail, and scenario metadata. When a draft has changed but has not yet been re-evaluated, the page shows a draft preview instead.

[Figure: Scenario Builder page in the supported Dash app.]

Review page

The Review page focuses on scenario-to-scenario trade-offs and review readiness. It includes:

  • scenario summary cards
  • overview and delta charts
  • channel comparison charts
  • scenario ranking and top-mover tables
  • contribution-over-time comparisons

[Figure: Compare Scenarios page in the supported Dash app.]

Explain and Export pages

The remaining tabs build on the same workspace state:

  • Explain overlays scenario reference points on the saved Stage 70 saturation-only response-curve artefact when available. The plotted marker position follows the saved reference curve at each scenario’s spend-per-period level, and the marker hover text also shows the actual evaluated average contribution, so you can compare the scenario outcome with the reference-curve position.
  • Explain also surfaces scenario warnings, optimiser status, bounds audit, allocation reconciliation, operating-region views, and lift comparisons.
  • Export writes reproducible bundles under the run directory and exposes any saved sensitivity output selections.

Background jobs

The supported app runs draft evaluation and sensitivity sweeps as background jobs.

In the current beta:

  • the app queues draft evaluation and sensitivity sweeps
  • the UI polls the active job and refreshes the workspace when the job completes
  • export runs synchronously, but Abacus still records it in job history

The UI currently tracks one active planner job at a time. Finish the current evaluation or sweep before starting another one.

Practical guidance

  • Launch the app from a fitted results directory, not from raw input data.
  • Use separate cloned workspaces for competing planning narratives.
  • Re-evaluate a draft after changing dates, budget, or allocation values.
  • Check both requested and evaluated windows when carryover is enabled.
  • Review the diagnostics and audit views on the Explain page before exporting or sharing a scenario set.
  • Treat the built-in launcher as a local beta workflow rather than a production deployment surface.

Common pitfalls

  • Launching the app without installing .[planner]
  • Pointing the launcher at a directory without run_manifest.json and fit artefacts
  • Expecting the app to fit a model from scratch
  • Interpreting a draft preview as evaluated output before clicking Evaluate and Save
  • Starting a second evaluation or sweep while another planner job is still running

Pipeline Runner

This section covers the structured abacus.pipeline runner: how it loads a config and dataset, executes the retained stage sequence, and writes reproducible run artefacts to disk.

Pages

  • Runner Overview - How run_pipeline(...) works, which stages run, and when the optimisation stage is skipped.
  • YAML Configuration - Which YAML keys the runner consumes and how they map to model build, data loading, holidays, and optimisation.
  • CLI Reference - The thin python -m abacus.pipeline.runner interface and its supported flags.
  • Output Directory Schema - The run directory layout, manifest schema, stage statuses, and main artefacts.
  • Extending the Runner - How to add a stage or wire in reporting without bypassing the manifest and artifact helpers.

Subsections of Pipeline Runner

Runner Overview

Use the pipeline runner when you want a full disk-backed PanelMMM run instead of only an in-memory fit.

The runner loads a YAML config and a CSV dataset, builds the model, executes a fixed stage sequence, and writes each stage’s artefacts into a structured run directory. When validation is enabled, the runner performs a second train-window fit for the blocked holdout stage, so the run takes longer than a pure full-sample fit.

If you want a quick first run, start with Quickstart: Pipeline Runner.

Public entry points

The public Python API is:

  • abacus.pipeline.PipelineRunConfig
  • abacus.pipeline.run_pipeline
  • abacus.pipeline.PipelineRunResult

The thin CLI wraps the same code path:

python -m abacus.pipeline.runner --config path/to/config.yml

Basic Python example

from pathlib import Path

from abacus.pipeline import PipelineRunConfig, run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("data/demo/geo_panel/config.yml"),
        output_dir=Path("results"),
        run_name="geo_panel_baseline",
        prior_samples=10,
        draws=500,
        tune=500,
        chains=2,
        cores=2,
        random_seed=42,
        curve_samples=100,
        curve_points=100,
    )
)

print(result.run_dir)
print(result.manifest_path)

PipelineRunResult contains:

Field Meaning
run_dir The created run directory
manifest_path The path to run_manifest.json inside that directory

What the runner does

run_pipeline(...) performs these steps:

  1. Load the YAML config with load_yaml_config(...).
  2. Load X and y from CSV using load_pipeline_data(...).
  3. Merge CLI sampler overrides with YAML fit through build_model_kwargs(...).
  4. Create the output directory tree and initialise run_manifest.json.
  5. Run the retained stages in order, updating the manifest after every stage.

The model is built in Stage 00 by build_mmm_from_yaml(...), then stored in the shared PipelineContext for the remaining stages. Runner-only roots such as diagnostics and validation stay on the pipeline context and are stripped before the public MMM builder validates the model YAML.

Stage order

The runner uses a fixed stage list.

Stage key | Directory | Purpose | Optional
metadata | 00_run_metadata | Build the model and write resolved config and dataset metadata | No
preflight | 10_pre_diagnostics | Prior predictive draws and plot | No
fit | 20_model_fit | Fit the model, save InferenceData, write trace and summary | No
assessment | 30_model_assessment | In-sample posterior predictive checks, fitted values, residual outputs | No
validation | 35_holdout_validation | Blocked holdout scoring on a train-window refit | Yes
decomposition | 40_decomposition | Contribution tables and decomposition plots | No
diagnostics | 50_diagnostics | Raw input screening, MCMC, predictive, and residual diagnostics | No
curves | 60_response_curves | Saturation-only, forward-pass direct contribution, and adstock curve artefacts | No
optimisation | 70_optimisation | Budget optimisation artefacts | Yes

The validation stage is marked skipped when the YAML config does not contain validation or it is disabled. The optimisation stage is also optional; it returns None and is marked skipped when the YAML config does not contain an optimization block.

See Output Directory Schema for the stage folders and artefact layout.

Data and model assumptions

The retained runner is designed around PanelMMM.

  • The flow-oriented public YAML is expected to describe a PanelMMM.
  • The data loader reads CSV only.
  • Later stages call PanelMMM plotting, summary, diagnostics, and optimisation methods directly.

If you need the exact YAML keys, see YAML Configuration.

PipelineRunConfig

PipelineRunConfig controls runtime settings that sit outside the YAML model specification.

Field Purpose
config_path YAML file to load
output_dir Root directory under which the run directory is created
run_name Optional run-name override; otherwise the config filename stem
dataset_path Optional combined dataset CSV override
x_path, y_path Optional feature and target CSV overrides
holidays_path Optional holiday CSV override
target_column Target column name used during CSV loading
prior_samples Number of prior predictive samples for Stage 10
draws, tune, chains, cores, random_seed Sampler overrides merged onto YAML fit
curve_samples, curve_points Curve sampling settings for Stage 60

Only sampler settings are merged into model construction. Other overrides are used by the runner itself during data loading, holiday resolution, diagnostics reporting, and output setup.

Run directory naming

The runner creates the run directory as:

<output_dir>/<effective_run_name>_<YYYYMMDD_HHMMSS>

The timestamp is generated in UTC.
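
If you need to predict or parse run directory names in your own tooling, the pattern can be reproduced with the standard library. This is a sketch of the documented naming rule, not the runner's code; the function name and the now parameter are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def run_directory(output_dir, config_path, run_name=None, now=None):
    """Build <output_dir>/<effective_run_name>_<YYYYMMDD_HHMMSS> with a UTC timestamp."""
    # Without an override, the run name defaults to the config filename stem.
    effective_name = run_name or Path(config_path).stem
    now = now or datetime.now(timezone.utc)
    return Path(output_dir) / f"{effective_name}_{now.strftime('%Y%m%d_%H%M%S')}"


example = run_directory(
    "results",
    "data/demo/geo_panel/config.yml",
    run_name="geo_panel_baseline",
    now=datetime(2026, 3, 8, 14, 46, 27, tzinfo=timezone.utc),
)
```

Because the timestamp is in UTC, the directory name can differ from your local wall-clock time.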

All stage directories are created up front, even if a later stage is skipped or the run aborts.

Failure and skip behaviour

If a stage raises an exception:

  • the current stage is marked failed
  • the run manifest is marked failed
  • all still-pending later stages are marked not_reached
  • run_pipeline(...) re-raises the exception

If a stage returns None:

  • the stage is marked skipped
  • the manifest warning records that no configuration was supplied for that optional stage
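
The status bookkeeping described above can be sketched as a small loop. This is an illustration under assumptions, not the runner's implementation: the manifest layout and the "running", "pending", and "completed" status names are hypothetical, while failed, skipped, and not_reached follow the text.

```python
def run_stages(stages, manifest):
    """Run (key, callable) stages in order, recording statuses in `manifest`."""
    manifest["status"] = "running"
    manifest["stages"] = {key: "pending" for key, _ in stages}
    for key, stage in stages:
        try:
            result = stage()
        except Exception:
            manifest["stages"][key] = "failed"
            manifest["status"] = "failed"
            # Everything that has not started yet is marked not_reached.
            for later, status in list(manifest["stages"].items()):
                if status == "pending":
                    manifest["stages"][later] = "not_reached"
            raise  # the caller still sees the original exception
        manifest["stages"][key] = "skipped" if result is None else "completed"
    manifest["status"] = "completed"


def failing_decomposition():
    raise RuntimeError("decomposition failed")


manifest = {}
caught = None
try:
    run_stages(
        [
            ("metadata", lambda: {"ok": True}),
            ("fit", lambda: {"ok": True}),
            ("validation", lambda: None),  # optional stage with no config
            ("decomposition", failing_decomposition),
            ("curves", lambda: {"ok": True}),
            ("optimisation", lambda: None),
        ],
        manifest,
    )
except RuntimeError as exc:
    caught = str(exc)
```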

Reporter hook

run_pipeline(...) accepts an optional reporter that implements the PipelineReporter protocol.

The reporter can observe:

  • pipeline start
  • stage start
  • stage end
  • pipeline end
  • pipeline failure

See Extending the Runner for the callback contract.

YAML Configuration

The pipeline runner reads the same YAML model specification used by build_mmm_from_yaml(...), then adds a small set of runner-specific conventions for data loading, optional blocked holdout validation, and Stage 70 optimisation.

This page documents the keys that the runner actually consumes.

Root keys

Key | Required | Used for
data | Usually | Resolve dataset paths when you do not pass dataset_path, x_path, or y_path through PipelineRunConfig
target | Yes | Define the target column and business target type
dimensions | No | Declare panel-dimension columns such as geo or brand
media | Yes | Define channel/control columns and transform types
scaling | No | Configure target/channel scaling rules
effects | No | Append additive effects in YAML order before build_model(...)
priors | No | Override model-level priors and prefixed transform priors
fit | No | Default sampler settings for Stage 20 fitting
holidays | No | Add holiday events before model build
original_scale_vars | No | Add original-scale contribution variables before fitting
inference_data | No | Attach existing InferenceData when the file exists
validation | No | Enable optional Stage 35 blocked holdout validation
optimization | No | Enable Stage 70 budget optimisation
diagnostics | No | Override Stage 50 runner diagnostics thresholds

Minimal runner config

data:
  dataset_path: dataset.csv
  date_column: date

target:
  column: revenue
  type: revenue

dimensions:
  panel: [geo]

media:
  channels: [channel_1, channel_2]
  adstock:
    type: geometric
    l_max: 4
  saturation:
    type: logistic

fit:
  draws: 1000
  tune: 1000
  chains: 4
  cores: 4
  random_seed: 42

Relative paths in YAML are resolved relative to the YAML file’s directory.
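
In other words, the working directory you launch from does not affect where the runner looks for a relative dataset path. A minimal sketch of that rule, with a hypothetical helper name and the assumption that absolute paths are left untouched:

```python
from pathlib import Path


def resolve_config_path(config_file, value):
    """Resolve a path from the YAML relative to the YAML file's own directory."""
    path = Path(value)
    if path.is_absolute():
        return path  # assumption: absolute paths are used as given
    return Path(config_file).parent / path


dataset = resolve_config_path("data/demo/geo_panel/config.yml", "dataset.csv")
```

Here dataset points at data/demo/geo_panel/dataset.csv, next to the config file.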

diagnostics is runner-only. The structured pipeline reads it, but build_mmm_from_yaml(...) still validates only the public MMM model schema.

validation is also runner-only. The structured pipeline reads it for Stage 35 blocked holdout scoring, but the public MMM YAML builder never sees it.

Core modeling blocks

The runner always builds a PanelMMM, so the public YAML no longer exposes a model.class field. Instead, it reads:

  • data.date_column
  • target.column
  • target.type
  • media.channels
  • media.controls, if any
  • dimensions.panel, if any
  • media.adstock
  • media.saturation
  • fit

data

The runner loads data before building the model. It supports two CSV layouts.

Combined dataset

data:
  dataset_path: "dataset.csv"

The runner reads the CSV, removes the target column from X, and uses that column as y.

Separate feature and target files

data:
  x_path: "X.csv"
  y_path: "y.csv"

When loading y_path:

  • if the configured target column exists, the runner uses that column
  • otherwise, if the file has exactly one column, the runner uses that column and renames it to the target name
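
Those two rules can be sketched with the standard csv module. This is an illustration only: the helper name is hypothetical, the runner's own loader may differ, and raising ValueError when neither rule applies is an assumption.

```python
import csv
import io


def read_target(y_csv_text, target_column):
    """Pick the target column from a y file, renaming a lone column if needed."""
    reader = csv.DictReader(io.StringIO(y_csv_text))
    columns = reader.fieldnames or []
    rows = list(reader)

    if target_column in columns:
        source = target_column  # rule 1: the configured column exists
    elif len(columns) == 1:
        source = columns[0]  # rule 2: a single column is renamed to the target
    else:
        raise ValueError(
            f"cannot find {target_column!r} among columns {columns}"
        )
    return {target_column: [float(row[source]) for row in rows]}


renamed = read_target("sales\n10\n12.5\n", "revenue")
```

In this example the lone sales column is returned under the name revenue.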

Target column resolution

The runner resolves the target column in this order:

  1. PipelineRunConfig.target_column or CLI --target-column
  2. target.column
  3. "y"

Use the CLI override only when you want to change how the runner reads the CSV. Keep it consistent with target.column in YAML.

fit

fit controls Stage 20 fitting because the fit stage calls:

context.model.fit(X=context.X, y=context.y, progressbar=False)

The runner merges these CLI or PipelineRunConfig overrides onto the YAML fit block when they are provided:

  • draws
  • tune
  • chains
  • cores
  • random_seed

The public YAML schema currently supports these fit keys:

  • draws
  • tune
  • chains
  • cores
  • random_seed
  • target_accept
  • progressbar
  • compute_convergence_checks

Unknown fit keys are rejected when the YAML is loaded.
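
The merge and the unknown-key check can be pictured with the following sketch. It is not the runner's implementation: merge_fit_settings and the two constants are hypothetical names, and treating None as "no override provided" is an assumption based on the CLI defaults.

```python
from typing import Any, Mapping

# Keys taken from the two lists above.
ALLOWED_FIT_KEYS = frozenset({
    "draws", "tune", "chains", "cores", "random_seed",
    "target_accept", "progressbar", "compute_convergence_checks",
})
OVERRIDABLE_KEYS = ("draws", "tune", "chains", "cores", "random_seed")


def merge_fit_settings(yaml_fit: Mapping[str, Any], overrides: Mapping[str, Any]) -> dict[str, Any]:
    """Reject unknown YAML fit keys, then layer provided overrides on top."""
    unknown = sorted(set(yaml_fit) - ALLOWED_FIT_KEYS)
    if unknown:
        raise ValueError(f"Unknown fit keys: {unknown}")
    merged = dict(yaml_fit)
    for key in OVERRIDABLE_KEYS:
        value = overrides.get(key)
        if value is not None:  # assumption: None means "not provided"
            merged[key] = value
    return merged
```

Keys such as target_accept have no override in this sketch, so they always come from YAML.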

effects

effects is an optional list of additive effect specifications:

effects:
  - type: linear_trend
    prefix: trend
    n_changepoints: 8
  - type: weekly_fourier
    order: 3

The builder appends each effect to model.mu_effects in YAML order before calling build_model(...).

holidays

The holidays block is optional.

Supported keys used by the builder include:

Key Meaning
path Holiday CSV path
enabled Set to false to disable holiday loading
prefix Prefix for generated holiday effect coordinates
countries Optional country filter for catalogue-style holiday CSV input

Example:

holidays:
  path: "holidays.csv"
  prefix: "holiday"

The CLI or PipelineRunConfig.holidays_path overrides holidays.path.

If you omit both path and the override but still configure holidays, Abacus falls back to the bundled abacus.data:holidays.csv.

original_scale_vars

Use original_scale_vars when you want specific contribution variables to be available on the original target scale:

original_scale_vars:
  - channel_contribution
  - y

The builder applies these through model.add_original_scale_contribution_variable(...) before fitting.

inference_data

inference_data.path is passed through to the YAML builder. If the file exists, Abacus attaches that InferenceData to the built model during Stage 00.

Important: the structured runner still executes Stage 20 and fits the model again. inference_data.path does not currently skip fitting.

optimization

Add an optimization block when you want Stage 70 to run. If this block is absent, Stage 70 is marked skipped.

The YAML builder validates this block when the config is loaded. The required scalar fields below must be present, and unknown top-level optimization keys are rejected.

Required keys:

optimization:
  start_date: "2024-11-11"
  end_date: "2025-01-27"
  total_budget: 1289000000.0

Optional keys read by Stage 70:

Key | Default | Meaning
response_variable | total_media_contribution_original_scale | Optimisation objective variable
budget_distribution_over_period | None | Time weights over the optimisation window
budget_bounds | Derived or default | Explicit spend bounds
spend_constraint_lower | 0.3 when deriving bounds | Relative lower bound around scaled reference spend
spend_constraint_upper | 0.3 when deriving bounds | Relative upper bound around scaled reference spend
default_constraints | true | Whether to add the default equality budget constraint
noise_level | 0.001 | Noise level for simulated response samples
include_last_observations | false | Whether posterior predictive sampling includes trailing observed rows
include_carryover | true | Whether simulated response sampling extends the window for carryover

Important budget-unit note

In the structured pipeline, optimization.total_budget is passed straight to PanelBudgetOptimizerWrapper.optimize_budget(...).

That means Stage 70 uses the wrapper’s per-period spend contract, not the scenario planner’s total-horizon spend contract.
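
If your planning figure is a total for the whole horizon, one way to arrive at a per-period value is sketched below. The sketch assumes weekly periods, inclusive start and end dates, and an even split across periods. None of these are confirmed by this page, and both function names are hypothetical, so check the Budget Optimisation page before relying on the conversion.

```python
from datetime import date


def count_periods(start: date, end: date, step_days: int = 7) -> int:
    """Number of periods from start to end inclusive, at a fixed step."""
    if end < start:
        raise ValueError("end must not be before start")
    return (end - start).days // step_days + 1


def per_period_budget(horizon_total: float, start: date, end: date, step_days: int = 7) -> float:
    """Spread a total-horizon budget evenly over the periods in the window."""
    return horizon_total / count_periods(start, end, step_days)


# The window from the required-keys example, with an illustrative horizon total.
n_periods = count_periods(date(2024, 11, 11), date(2025, 1, 27))
weekly = per_period_budget(1_200_000.0, date(2024, 11, 11), date(2025, 1, 27))
```

Under these assumptions the example window holds 12 weekly periods, so a horizon total of 1,200,000 corresponds to 100,000 per period.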

See Budget Optimisation.

Xarray-like optimisation values in YAML

For panel bounds or time distributions, use the xarray-like mapping shape that Stage 70 expects:

optimization:
  start_date: "2025-02-03"
  end_date: "2025-02-24"
  total_budget: 100000.0
  budget_distribution_over_period:
    values:
      - [[0.25, 0.25], [0.25, 0.25]]
      - [[0.25, 0.25], [0.25, 0.25]]
      - [[0.25, 0.25], [0.25, 0.25]]
      - [[0.25, 0.25], [0.25, 0.25]]
    dims: ["date", "geo", "channel"]
    coords:
      date: [0, 1, 2, 3]
      geo: ["UK", "FR"]
      channel: ["channel_1", "channel_2"]

The same shape works for budget_bounds, but with an additional "bound" dimension containing "lower" and "upper".
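
When these mappings are generated rather than written by hand, a shape check before dumping to YAML can catch mismatched coordinates. The sketch below reproduces the budget_distribution_over_period example with plain Python containers. The helper names are hypothetical, and the equal weighting across dates simply mirrors the example values rather than a documented requirement.

```python
from typing import Any, Sequence


def uniform_time_distribution(dates: Sequence[Any], geos: Sequence[str], channels: Sequence[str]) -> dict:
    """Build a values/dims/coords mapping with equal weight on every date."""
    weight = 1.0 / len(dates)
    values = [[[weight for _ in channels] for _ in geos] for _ in dates]
    return {
        "values": values,
        "dims": ["date", "geo", "channel"],
        "coords": {"date": list(dates), "geo": list(geos), "channel": list(channels)},
    }


def nested_shape(values: Any) -> tuple[int, ...]:
    """Shape of a regular nested list, read along the first element of each level."""
    shape = []
    while isinstance(values, list):
        shape.append(len(values))
        values = values[0]
    return tuple(shape)


def check_mapping(mapping: dict) -> None:
    """Raise if the nested values do not match the coordinate lengths."""
    expected = tuple(len(mapping["coords"][dim]) for dim in mapping["dims"])
    actual = nested_shape(mapping["values"])
    if actual != expected:
        raise ValueError(f"values shape {actual} does not match coords {expected}")


mapping = uniform_time_distribution([0, 1, 2, 3], ["UK", "FR"], ["channel_1", "channel_2"])
check_mapping(mapping)
```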

diagnostics

Use the optional diagnostics block when you want Stage 50 to use different warn/fail thresholds than the retained defaults.

diagnostics:
  thresholds:
    design_max_vif:
      warn: 8.0
      fail: 12.0
    mcmc_max_rhat:
      warn: 1.02
      fail: 1.08

Supported threshold keys:

  • design_max_vif
  • design_condition_number
  • mcmc_divergence_count
  • mcmc_max_rhat
  • mcmc_min_ess_bulk
  • mcmc_bfmi_min
  • predictive_nrmse
  • residual_ljung_box_p
  • residual_max_abs_acf

Validation rules:

  • upper-bound checks require warn <= fail
  • lower-bound checks require warn >= fail
  • omit the block entirely to use the built-in defaults

This block affects only the structured runner. It is stripped before Stage 00 model build so the public MMM YAML schema remains unchanged.
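
The warn/fail ordering rules can be expressed as a small check. In this sketch the caller states whether a metric is an upper-bound or a lower-bound check. Which threshold keys fall into which group is not listed on this page, so the classification used in the usage lines is an assumption based on the key names, and validate_threshold is a hypothetical helper.

```python
def validate_threshold(name: str, warn: float, fail: float, direction: str) -> None:
    """Check warn/fail ordering for one threshold.

    direction is "upper" when larger values are worse (warn <= fail)
    and "lower" when smaller values are worse (warn >= fail).
    """
    if direction == "upper":
        valid = warn <= fail
    elif direction == "lower":
        valid = warn >= fail
    else:
        raise ValueError(f"direction must be 'upper' or 'lower', got {direction!r}")
    if not valid:
        raise ValueError(f"{name}: warn={warn} and fail={fail} are in the wrong order")


# Values from the example block; treated as an upper-bound check (assumption).
validate_threshold("design_max_vif", warn=8.0, fail=12.0, direction="upper")
# Illustrative values; treated as a lower-bound check (assumption).
validate_threshold("mcmc_min_ess_bulk", warn=400.0, fail=100.0, direction="lower")
```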

validation

Use the optional validation block when you want Stage 35 blocked holdout scoring.

validation:
  enabled: true
  holdout_observations: 8
  include_last_observations: true
  coverage_levels: [0.5, 0.8, 0.94]
  sampler:
    draws: 500
    tune: 500
    chains: 2
    cores: 2
    random_seed: 42

Supported keys:

Key Meaning
enabled Set to false to skip Stage 35 while keeping the stage in the manifest
holdout_observations Number of unique dates to reserve for the blocked holdout window
include_last_observations Keep lag history for carryover-sensitive holdout scoring
coverage_levels Coverage levels reported in Phase 10; use the fixed 50, 80, and 94 percent defaults
sampler Optional validation-only sampler overrides for the train-window refit

Phase 10 reports coverage as coverage_50, coverage_80, and coverage_94. Keep those defaults unless the implementation and tests are updated together.

The validation stage builds a clean train-window model for holdout scoring and ignores inference_data.path so the refit does not inherit attached posterior state from Stage 00.
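
To make holdout_observations concrete: it counts unique dates, not rows, so a panel with several geos reserves every row that falls on the held-out dates. The sketch below assumes the blocked window is the trailing block of dates, which this page implies but does not state. It does not model include_last_observations, and split_blocked_holdout is a hypothetical helper.

```python
from datetime import date
from typing import Sequence


def split_blocked_holdout(dates: Sequence[date], holdout_observations: int) -> tuple[list[int], list[int]]:
    """Return (train_rows, holdout_rows) as row positions.

    The last `holdout_observations` unique dates form the holdout block.
    """
    unique_dates = sorted(set(dates))
    if not 0 < holdout_observations < len(unique_dates):
        raise ValueError("holdout_observations must leave at least one training date")
    cutoff = unique_dates[-holdout_observations]
    train_rows = [i for i, d in enumerate(dates) if d < cutoff]
    holdout_rows = [i for i, d in enumerate(dates) if d >= cutoff]
    return train_rows, holdout_rows


# Two geos observed on three weekly dates (synthetic rows for illustration).
rows = [date(2025, 1, 6), date(2025, 1, 6), date(2025, 1, 13),
        date(2025, 1, 13), date(2025, 1, 20), date(2025, 1, 20)]
train_rows, holdout_rows = split_blocked_holdout(rows, holdout_observations=1)
```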

Override precedence

For the runner, precedence is:

Setting | Higher precedence | Lower precedence
Combined dataset path | dataset_path / --dataset-path | data.dataset_path
Split CSV paths | x_path, y_path / --x-path, --y-path | data.x_path, data.y_path
Holiday CSV path | holidays_path / --holidays-path | holidays.path
Sampler settings | PipelineRunConfig or CLI overrides | fit
Target column for CSV loading | target_column / --target-column | target.column, then "y"
Diagnostics thresholds | diagnostics.thresholds | retained Stage 50 defaults

Common pitfalls

  • Using Parquet paths in the pipeline data block. The runner data loader reads CSV only.
  • Providing only one of data.x_path or data.y_path.
  • Treating optimization.total_budget as total horizon spend instead of per-period spend.
  • Assuming diagnostics is part of the public MMM builder schema. It is a runner-only block.
  • Assuming inference_data.path skips Stage 20 fitting. It does not.
  • Forgetting that relative paths are resolved from the YAML file directory, not from the shell working directory.

Output Directory Schema

Each pipeline run creates a timestamped directory under the configured output_dir:

<output_dir>/<run_name>_<YYYYMMDD_HHMMSS>

The timestamp is generated in UTC. The runner creates every stage directory up front, then updates run_manifest.json as stages start, complete, skip, or fail.
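
Downstream tooling that needs to predict or parse run directory names can rebuild the pattern as follows. This is a sketch of the naming scheme only, and build_run_dir is a hypothetical helper rather than the runner's own code.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def build_run_dir(output_dir: str, run_name: str, now: Optional[datetime] = None) -> Path:
    """Compose <output_dir>/<run_name>_<YYYYMMDD_HHMMSS> with a UTC timestamp."""
    # Pass a timezone-aware datetime; a naive one would be read as local time.
    moment = now if now is not None else datetime.now(timezone.utc)
    stamp = moment.astimezone(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return Path(output_dir) / f"{run_name}_{stamp}"


example = build_run_dir(
    "results",
    "geo_panel_baseline",
    now=datetime(2026, 3, 8, 15, 30, 0, tzinfo=timezone.utc),
)
```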

Directory tree

results/
  geo_panel_baseline_20260308_153000/
    run_manifest.json
    00_run_metadata/
    10_pre_diagnostics/
    20_model_fit/
    30_model_assessment/
    35_holdout_validation/
    40_decomposition/
    50_diagnostics/
    60_response_curves/
    70_optimisation/

Stage directories

Stage | Directory | Typical artefacts
metadata | 00_run_metadata | config.resolved.yaml, model_metadata.json, spec_summary.csv
preflight | 10_pre_diagnostics | prior_predictive.nc, prior_predictive.png
fit | 20_model_fit | model.nc, trace.png, posterior_summary.csv
assessment | 30_model_assessment | in-sample posterior predictive checks and residual outputs
validation | 35_holdout_validation | blocked holdout scoring, uncertainty-aware metrics, and residual diagnostics
decomposition | 40_decomposition | contribution CSVs and decomposition plots
diagnostics | 50_diagnostics | raw input screening, MCMC, predictive, and residual diagnostic reports
curves | 60_response_curves | saturation-only, forward-pass direct contribution, and adstock NetCDF, summaries, and plots
optimisation | 70_optimisation | allocation, response, optimisation summary, and bounds audit artefacts

See Runner Overview for the stage order and optionality.

Main artefacts by stage

00_run_metadata

Main files:

  • a copy of the original config under its source filename
  • config.original.yaml
  • config.resolved.yaml
  • session_info.txt
  • dataset_metadata.json
  • model_metadata.json
  • data_dictionary.csv
  • design_matrix_manifest.csv
  • spec_summary.csv
  • holiday_feature_manifest.csv when holidays are configured

config.resolved.yaml normalises configured data and holiday paths to absolute paths and records the effective sampler configuration on the model.

10_pre_diagnostics

Main files:

  • prior_predictive.nc
  • prior_predictive.png

20_model_fit

Main files:

  • model.nc
  • trace.png
  • posterior_summary.csv

30_model_assessment

Main files:

  • posterior_predictive.nc
  • posterior_predictive.png
  • posterior_predictive_summary.csv
  • observed.csv
  • fitted.csv
  • fit_timeseries.png
  • fit_scatter.png
  • residuals.csv
  • residuals_timeseries.png
  • residuals_hist.png
  • residuals_vs_fitted.png

This stage is the in-sample or training-fit assessment. It uses the same data the model was fit on and should not be read as the pipeline’s out-of-sample validation layer.

35_holdout_validation

Main files:

  • validation_metadata.json
  • holdout_posterior_predictive.nc
  • holdout_predictive_summary.csv
  • holdout_predictive_report.json
  • holdout_observed.csv
  • holdout_fitted.csv
  • holdout_residuals.csv
  • holdout_timeseries.png
  • holdout_residuals_acf.png

The holdout summary and report include uncertainty-aware metrics such as crps, bias, and fixed coverage columns for coverage_50, coverage_80, and coverage_94.

This stage is optional. When validation is absent or disabled in YAML, the directory still exists and the stage is marked skipped.

40_decomposition

Main files:

  • waterfall_components_decomposition.png
  • weekly_media_contribution.png
  • channel_contributions.csv
  • baseline_contributions.csv
  • mean_contributions_over_time.csv

50_diagnostics

Main files:

  • design_summary.csv
  • design_report.json
  • vif_report.csv
  • mcmc_summary.csv
  • mcmc_report.json
  • predictive_summary.csv
  • predictive_report.json
  • residual_diagnostics.csv
  • residuals_acf.png
  • diagnostics_report.csv
  • diagnostics_summary.txt
  • chain_diagnostics.txt

The design-oriented files are raw input screening outputs. In particular, diagnostics_report.csv labels the corresponding phase as raw_input_screening rather than design.

60_response_curves

Main files:

  • saturation_curve.nc
  • saturation_curve_summary.csv
  • saturation_curve.png
  • forward_pass_contribution_curve.nc
  • forward_pass_contribution_curve_summary.csv
  • forward_pass_contribution_curve.png
  • adstock_curve.nc
  • adstock_curve_summary.csv
  • adstock_curve.png

These artefacts are intentionally different:

  • saturation_curve.* is the sampled saturation transformation on the scaled channel axis, exported with original-scale contribution values for easier reading. The PNG overlays that saturation-only curve against posterior mean realised contributions.
  • forward_pass_contribution_curve.* is a full-model direct contribution artefact. It rescales the observed historical spend path from 0% to 200%, runs that spend through the fitted adstock and saturation path, and records the resulting total channel contribution in original target units.
  • adstock_curve.* is the sampled carryover-weight profile for one impulse.
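
The forward-pass idea can be illustrated with a toy calculation. The sketch below is not the Stage 60 implementation: it uses single made-up parameter values instead of posterior samples, an unnormalised geometric adstock, a logistic saturation, and no channel or target scaling. All function names and numbers are assumptions chosen for illustration.

```python
import math


def geometric_adstock(spend: list[float], alpha: float, l_max: int) -> list[float]:
    """Unnormalised geometric carryover using the current period plus l_max - 1 lags."""
    carried = []
    for t in range(len(spend)):
        total = 0.0
        for lag in range(min(l_max, t + 1)):
            total += (alpha ** lag) * spend[t - lag]
        carried.append(total)
    return carried


def logistic_saturation(x: float, lam: float) -> float:
    """Logistic saturation: 0 at x = 0, approaching 1 as x grows."""
    return (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))


def forward_pass_curve(spend, multipliers, alpha, l_max, lam, beta):
    """Total contribution when the whole historical spend path is rescaled."""
    totals = []
    for multiplier in multipliers:
        scaled = [multiplier * s for s in spend]
        carried = geometric_adstock(scaled, alpha, l_max)
        totals.append(beta * sum(logistic_saturation(v, lam) for v in carried))
    return totals


spend = [0.2, 0.5, 0.0, 0.8, 0.3]          # made-up spend path
multipliers = [0.0, 0.5, 1.0, 1.5, 2.0]    # 0% to 200% of historical spend
curve = forward_pass_curve(spend, multipliers, alpha=0.5, l_max=4, lam=2.0, beta=10.0)
```

Because the whole spend path is rescaled and then carried over, the resulting curve reflects the historical flighting pattern, which is why it differs from the saturation-only curve evaluated point by point.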

70_optimisation

This directory is present for every run, but the stage is skipped unless the YAML config contains an optimization block.

Main files when the stage runs:

  • optimized_allocation.nc
  • optimized_allocation.csv
  • response_distribution.nc
  • optimize_result.json
  • budget_summary.csv
  • budget_response_points.csv
  • budget_impact.csv
  • budget_bounds_audit.csv
  • budget_roi_cpa.csv
  • budget_response_curves.csv
  • budget_mroi.csv
  • budget_optimisation.json
  • several PNG plots for allocation, contribution over time, response curves, impact, bounds audit, and ROI or CPA

run_manifest.json

The manifest is the machine-readable index for the whole run.

Top-level fields include:

Field Meaning
run_name Effective run name
timestamp UTC run timestamp
config_path Original config path
output_dir Run directory path
status Overall run status
model_class Set after Stage 00 builds the model
data Basic dataset metadata
stages Per-stage manifest records
warnings Run-level warnings
error Run-level failure payload when the pipeline aborts

data includes:

  • x_shape
  • y_length
  • target_column
  • x_columns

Stage records

Each stage record contains:

Field Meaning
directory Stage directory name
status Current stage status
started_at ISO timestamp when the stage started
finished_at ISO timestamp when the stage finished
artifacts Mapping of artefact labels to root-relative paths
warnings Stage warnings
error Error string when the stage fails

The artifacts mapping uses root-relative paths such as 20_model_fit/model.nc.

Stage statuses

Status Meaning
pending Stage has not started yet
running Stage is currently running
completed Stage finished successfully
skipped Stage returned None intentionally
failed Stage raised an exception
not_reached A previous stage failed before this one ran

Common cases:

  • Stage 35 is skipped when validation is missing from YAML or disabled.
  • Stage 70 is skipped when optimization is missing from YAML.
  • Later stages become not_reached after the first failure.

Practical use

Use the run directory when you want:

  • a stable folder for downstream reporting
  • a machine-readable audit trail through run_manifest.json
  • stage-level links to artefacts without hard-coding filenames

If you want to add new artefact types or stages, see Extending the Runner.

CLI Reference

The pipeline exposes a thin CLI through abacus.pipeline.runner.

Entry point

python -m abacus.pipeline.runner --config path/to/config.yml

On success, the CLI prints the final run directory:

Structured pipeline completed: results/my_run_20260308_153000

Arguments

Flag | Required | Default | Meaning
--config | Yes | None | YAML config path
--output-dir | No | results | Root directory for pipeline runs
--run-name | No | Config filename stem | Optional run-name override
--dataset-path | No | None | Combined dataset CSV override
--x-path | No | None | Feature CSV override when not using --dataset-path
--y-path | No | None | Target CSV override when not using --dataset-path
--holidays-path | No | None | Holiday CSV override
--target-column | No | None | Target column used when reading CSV input
--prior-samples | No | 20 | Prior predictive samples for Stage 10
--draws | No | None | Posterior draws override
--tune | No | None | Posterior tuning steps override
--chains | No | None | Posterior chains override
--cores | No | None | Posterior cores override
--random-seed | No | 42 | Shared random seed
--curve-samples | No | 100 | Posterior samples for Stage 60 curves
--curve-points | No | 100 | Number of x-values for saturation curves

Common command patterns

Use the dataset path from YAML

python -m abacus.pipeline.runner \
  --config data/demo/geo_panel/config.yml

Override the combined dataset path

python -m abacus.pipeline.runner \
  --config configs/geo_panel.yml \
  --dataset-path /data/geo_panel_latest.csv \
  --run-name geo_panel_latest

Use separate feature and target files

python -m abacus.pipeline.runner \
  --config configs/panel.yml \
  --x-path /data/X.csv \
  --y-path /data/y.csv \
  --target-column revenue

Override sampler settings for one run

python -m abacus.pipeline.runner \
  --config configs/panel.yml \
  --draws 1000 \
  --tune 1000 \
  --chains 4 \
  --cores 4 \
  --random-seed 42

Override the holiday CSV

python -m abacus.pipeline.runner \
  --config configs/panel.yml \
  --holidays-path /data/holidays_uk_fr.csv

How CLI overrides interact with YAML

The CLI does not replace the full YAML config. It only overrides the runtime fields exposed through PipelineRunConfig.

Important behaviours:

  • --dataset-path takes precedence over data.dataset_path.
  • --x-path and --y-path take precedence over data.x_path and data.y_path.
  • --holidays-path takes precedence over holidays.path.
  • --draws, --tune, --chains, --cores, and --random-seed are merged onto YAML fit.
  • --target-column affects CSV loading. Keep it consistent with target.column in YAML.
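
When the CLI is driven from another Python process, it can help to assemble the argument list programmatically. The sketch below only uses flags from the Arguments table. build_runner_command is a hypothetical helper, and the config path in the usage line is a placeholder.

```python
import sys

# Flags listed in the Arguments table, written as Python keyword names.
KNOWN_OPTIONS = frozenset({
    "output_dir", "run_name", "dataset_path", "x_path", "y_path",
    "holidays_path", "target_column", "prior_samples", "draws", "tune",
    "chains", "cores", "random_seed", "curve_samples", "curve_points",
})


def build_runner_command(config: str, **overrides) -> list[str]:
    """Build an argv list for `python -m abacus.pipeline.runner`."""
    unknown = sorted(set(overrides) - KNOWN_OPTIONS)
    if unknown:
        raise ValueError(f"Unknown runner options: {unknown}")
    command = [sys.executable, "-m", "abacus.pipeline.runner", "--config", str(config)]
    for name, value in overrides.items():
        if value is None:
            continue  # leave the YAML value in place
        command += [f"--{name.replace('_', '-')}", str(value)]
    return command


command = build_runner_command("configs/panel.yml", draws=1000, random_seed=42)
```

The resulting list can be passed to subprocess.run(command, check=True), which raises CalledProcessError when the process exits non-zero.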

Exit behaviour

The CLI exits with status 0 on success. On failure, the process exits non-zero with the underlying exception.

The pipeline stops at the first stage failure. It does not provide flags to:

  • run only a subset of stages
  • continue after a failed stage
  • disable individual built-in stages. The only opt-outs are in YAML: omit the optional optimization block to skip Stage 70, or omit or disable the optional validation block to skip Stage 35

See Runner Overview and YAML Configuration for the execution model and config surface.

Extending the Runner

The retained runner is static, not plugin-based. To add a stage or integrate custom status reporting, extend the existing runner surfaces instead of bypassing them.

Stage contract

A stage function has this contract:

def run_some_stage(context: PipelineContext) -> dict[str, str] | None:
    ...

Return values:

  • return a dict[str, str] of artefact labels to root-relative paths when the stage succeeds
  • return None when the stage is intentionally skipped
  • raise an exception when the stage fails and should abort the run

The runner handles manifest updates around the stage call. Do not update context.manifest directly from a normal stage implementation unless you are changing core runner behaviour.

What is available in PipelineContext

PipelineContext gives each stage access to:

Field Use it for
run_config Runtime settings such as output root, seeds, and curve sample counts
raw_cfg The loaded YAML config as a mutable mapping
X, y Loaded dataset inputs
paths Stage directories and manifest path
manifest Current run manifest
model_kwargs Effective sampler overrides passed into model build
model Built PanelMMM, available after Stage 00

Artifact helpers

Use the helpers in abacus/pipeline/artifacts.py:

  • write_json(...)
  • write_dataframe(...)
  • write_dataset(...)
  • write_idata(...)
  • write_text(...)
  • save_figure(...)
  • copy_file(...)

Use context.paths.relative(path) when building the artefact mapping that the stage returns. The manifest expects root-relative paths, not absolute paths.

Adding a new stage

To add a new built-in stage, update these places:

  1. abacus/pipeline/artifacts.py Add the stage directory name to STAGE_DIRECTORIES.
  2. abacus/pipeline/runner.py Add a PipelineStageSpec to PIPELINE_STAGE_SPECS.
  3. abacus/pipeline/runner.py Add the stage function to the stage_functions mapping inside run_pipeline(...).
  4. abacus/pipeline/stages/__init__.py Export the new stage helper if you want it available from the stage package.

Minimal stage example

from abacus.pipeline.artifacts import write_dataframe


def run_custom_stage(context):
    if context.model is None:
        raise ValueError("Model has not been initialized before the custom stage.")

    stage_dir = context.paths.stage_dirs["custom"]
    output_path = stage_dir / "custom_summary.csv"

    frame = context.model.summary.total_contribution(output_format="pandas")
    write_dataframe(output_path, frame)

    return {
        "custom_summary": context.paths.relative(output_path),
    }

Optional stage pattern

If a stage should only run when a config block is present, follow the same pattern as Stage 70:

def run_optional_stage(context):
    cfg = context.raw_cfg.get("my_optional_block")
    if cfg is None:
        return None
    ...

Returning None is what marks the stage as skipped in the manifest.

Failure semantics

If your stage raises an exception:

  • the stage is marked failed
  • the run is marked failed
  • later pending stages are marked not_reached
  • run_pipeline(...) re-raises the exception

That means stage code should only catch exceptions when it can recover locally and still produce a valid artefact set.
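
The status transitions described above can be reproduced with a toy loop. This is a simplified model for reasoning about the semantics, not the code in abacus/pipeline/runner.py: the stage functions take no context, the statuses live in a plain dict instead of the manifest, and run_stages is a hypothetical name.

```python
from typing import Callable, Optional

StageFn = Callable[[], Optional[dict]]


def run_stages(stages: list[tuple[str, StageFn]], statuses: dict[str, str]) -> None:
    """Run stages in order, recording statuses and re-raising the first failure."""
    statuses.update({name: "pending" for name, _ in stages})
    for name, stage_fn in stages:
        statuses[name] = "running"
        try:
            artifacts = stage_fn()
        except Exception:
            statuses[name] = "failed"
            # Everything that never started is marked not_reached.
            for other, status in statuses.items():
                if status == "pending":
                    statuses[other] = "not_reached"
            raise
        # Returning None means the stage skipped itself intentionally.
        statuses[name] = "skipped" if artifacts is None else "completed"


def failing_fit() -> dict:
    raise RuntimeError("sampler failed")


statuses: dict[str, str] = {}
try:
    run_stages(
        [
            ("metadata", lambda: {"config": "00_run_metadata/config.resolved.yaml"}),
            ("validation", lambda: None),
            ("fit", failing_fit),
            ("curves", lambda: {}),
        ],
        statuses,
    )
except RuntimeError:
    pass  # the caller still sees the original exception
```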

Adding structured reporting

If you want progress callbacks without changing the core stage code, implement a PipelineReporter and pass it to run_pipeline(...).

The reporter protocol methods are:

  • on_pipeline_start(...)
  • on_stage_start(...)
  • on_stage_end(...)
  • on_pipeline_end(...)
  • on_pipeline_error(...)

This is the right extension point for:

  • notebooks or dashboards that want progress updates
  • lightweight orchestration wrappers
  • structured logging around pipeline runs
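
A reporter can be as small as an object that records each callback. The argument lists of the protocol methods are not shown on this page, so the sketch below accepts arbitrary positional and keyword arguments instead of guessing them. RecordingReporter is a hypothetical class, and how it is passed to run_pipeline(...) should be taken from the PipelineReporter definition in the repository.

```python
from typing import Any


class RecordingReporter:
    """Collects pipeline callbacks in order, for logging or progress display."""

    def __init__(self) -> None:
        self.events: list[tuple[str, tuple[Any, ...]]] = []

    def _record(self, event: str, args: tuple[Any, ...]) -> None:
        self.events.append((event, args))

    # Signatures are deliberately generic; the real protocol may be stricter.
    def on_pipeline_start(self, *args: Any, **kwargs: Any) -> None:
        self._record("pipeline_start", args)

    def on_stage_start(self, *args: Any, **kwargs: Any) -> None:
        self._record("stage_start", args)

    def on_stage_end(self, *args: Any, **kwargs: Any) -> None:
        self._record("stage_end", args)

    def on_pipeline_end(self, *args: Any, **kwargs: Any) -> None:
        self._record("pipeline_end", args)

    def on_pipeline_error(self, *args: Any, **kwargs: Any) -> None:
        self._record("pipeline_error", args)
```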

Consuming the manifest programmatically

The manifest is written after every stage transition, so external tools can poll run_manifest.json during execution.

Typical uses:

  • check whether the optimisation stage was skipped
  • discover stage artefact paths without hard-coding filenames
  • detect the first failed stage and its error message
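
A minimal reader for these uses might look like the sketch below. It assumes that the stages field is a mapping from stage name to stage record, which this page does not spell out, and both helper names are hypothetical. The example manifest is synthetic.

```python
import json
from pathlib import Path
from typing import Any


def load_manifest(run_dir: str) -> dict[str, Any]:
    """Read run_manifest.json from the root of a run directory."""
    return json.loads((Path(run_dir) / "run_manifest.json").read_text(encoding="utf-8"))


def summarise_manifest(manifest: dict[str, Any]) -> dict[str, Any]:
    """Pull out the first failure, the skipped stages, and all artefact paths."""
    stages = manifest.get("stages", {})  # assumption: mapping keyed by stage name
    failures = [
        (name, record.get("error"))
        for name, record in stages.items()
        if record.get("status") == "failed"
    ]
    return {
        "first_failure": failures[0] if failures else None,
        "skipped": [n for n, r in stages.items() if r.get("status") == "skipped"],
        "artifacts": {
            label: path
            for record in stages.values()
            for label, path in (record.get("artifacts") or {}).items()
        },
    }


example_manifest = {
    "status": "completed",
    "stages": {
        "fit": {"status": "completed", "artifacts": {"model": "20_model_fit/model.nc"}},
        "optimisation": {"status": "skipped", "artifacts": {}},
    },
}
summary = summarise_manifest(example_manifest)
```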

See Output Directory Schema for the manifest fields and status values.

FAQ

This section collects longer-form answers to recurring MMM, Bayesian, and panel-econometrics questions that come up when practitioners move from classical econometrics to PanelMMM.

The pages are written for technical readers who already understand regression, panel data, and causal inference, but want the Abacus framing.

Core model concepts

Priors and model checking

Computation and comparison

Panel specification

Suggested reading order

If you are new to Bayesian MMM, a practical sequence is:

  1. Bayesian Priors for Econometricians
  2. Prior Predictive Checks for Econometricians
  3. MCMC Diagnostics for Econometricians
  4. Posterior Predictive Checks for Econometricians
  5. Causal Identification in Marketing Mix Modelling

Subsections of FAQ

Bayesian Priors

This page addresses common concerns that econometricians have about Bayesian priors, reframes them using familiar econometric concepts, and discusses the practical trade-offs between “tight” and “loose” prior specifications in the context of Marketing Mix Modelling.


1. Are priors subjective? Don’t they bias the results?

This is the most common objection from econometricians. The short answer is: you are already using priors, you just call them something else.

Priors You Already Use in Classical Econometrics

Every constraint or modelling decision an econometrician makes is, mathematically, a prior belief imposed on the parameter space:

Classical Econometric Practice Bayesian Equivalent
“Media coefficients must be non-negative” (sign restriction) A HalfNormal or truncated prior that places zero probability on negative values
“The intercept should be positive because sales can’t be negative” A LogNormal prior on the intercept
Ridge regression (L2 penalty) A Normal(0, sigma) prior on all coefficients, where sigma controls the penalty strength
LASSO regression (L1 penalty) A Laplace(0, b) prior on all coefficients
Excluding a variable from the model entirely An infinitely tight prior at exactly zero (a point mass)
Including a variable with no constraints A uniform prior over $(-\infty, +\infty)$ — the so-called “non-informative” prior

The difference is not whether you impose assumptions, but whether you are explicit about them. In classical econometrics, these assumptions are hidden inside the model specification (variable selection, functional form, sign restrictions). In Bayesian modeling, they are declared openly as Prior objects, making them auditable, debatable, and reproducible.
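
The ridge row of the table can be verified in a few lines. As a sketch, assume a Gaussian likelihood $y \sim \text{Normal}(X\beta, \sigma_\varepsilon^2 I)$ with known noise scale $\sigma_\varepsilon$ and independent $\text{Normal}(0, \sigma^2)$ priors on the coefficients. The symbols $\sigma_\varepsilon$ and $\lambda$ are introduced here for illustration. The posterior mode is then:

$$
\hat{\beta}_{\text{MAP}}
= \arg\max_{\beta} \left[ -\frac{1}{2\sigma_\varepsilon^2} \lVert y - X\beta \rVert_2^2 - \frac{1}{2\sigma^2} \lVert \beta \rVert_2^2 \right]
= \arg\min_{\beta} \left[ \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 \right],
\qquad \lambda = \frac{\sigma_\varepsilon^2}{\sigma^2}
$$

A smaller prior scale $\sigma$ therefore corresponds to a larger ridge penalty $\lambda$, and letting $\sigma \to \infty$ recovers unpenalised least squares.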

Why “Letting the Data Speak” Is Itself a Prior

When a classical econometrician says “I let the data speak,” they are implicitly choosing a uniform (flat) prior: every parameter value from $-\infty$ to $+\infty$ is equally plausible before seeing the data. This sounds objective, but it has real consequences:

  • It assigns equal prior probability to a media ROI of 0.01 and a media ROI of 10,000,000.
  • In small samples (typical in marketing data: 100–200 weekly observations), this flat prior provides no regularization, leading to extreme, unstable coefficient estimates.
  • It is equivalent to running OLS with no penalty — which econometricians already know is fragile when $p$ is large relative to $N$.

A well-chosen weakly informative prior (e.g., HalfNormal(sigma=2) for media coefficients) does not “bias” the model. It says: “We believe media effects are positive and probably modest, but we are open to being surprised.” If the data strongly disagrees, the posterior will override the prior. If the data is ambiguous (as it often is with 150 weekly observations and 7 correlated media channels), the prior prevents the model from hallucinating absurd coefficient values.


2. How does Abacus specify priors?

In Abacus, priors are declared using Prior objects from the pymc_extras library. These are composable, hierarchical, and fully serializable. Here is a simple example:

from pymc_extras.prior import Prior

# A weakly informative prior for media channel betas:
# "Media effects are positive, probably modest, but could be larger"
beta_channel = Prior("HalfNormal", sigma=2)

# A prior for the intercept:
# "Baseline sales are positive and log-normally distributed"
intercept = Prior("LogNormal", mu=0, sigma=5)

# A prior for adstock decay:
# "Carryover is moderate, skewed toward shorter decay"
alpha = Prior("Beta", alpha=1, beta=3)

Each Prior object is a first-class citizen in the model configuration. It can be inspected, overridden, serialized to YAML, and version-controlled — unlike classical econometric constraints, which are typically buried in code or verbal documentation.


3. What is the difference between “tight” and “loose” priors?

This is one of the most consequential modelling decisions in Bayesian MMM. Two real-world configurations from our repositories illustrate the spectrum.

Tight Priors: The DSAMbayes Approach

In the DSAMbayes R/Stan library, tight priors are implemented via explicit boundary constraints on media coefficients:

# From: DSAMbayes config/blm_synthetic_holidays_dummies.yaml
boundaries:
  overrides:
    - { parameter: m_tv,        lower: 0.0, upper: .Inf }
    - { parameter: m_search,    lower: 0.0, upper: .Inf }
    - { parameter: m_social,    lower: 0.0, upper: .Inf }
    - { parameter: m_display,   lower: 0.0, upper: .Inf }
    - { parameter: m_ooh,       lower: 0.0, upper: .Inf }
    - { parameter: m_email,     lower: 0.0, upper: .Inf }
    - { parameter: m_affiliate, lower: 0.0, upper: .Inf }

priors:
  use_defaults: true  # Package defaults (relatively tight)

What this does: Every media coefficient is hard-bounded to be non-negative. Combined with the package’s default priors (which are relatively concentrated), this creates a model that is strongly constrained. The data can move the coefficients within the allowed region, but the model will never produce a negative media effect.

Pros of tight priors:

  • Stability: Results are robust even with very small sample sizes (e.g., 52 weeks). The model cannot produce economically nonsensical results like “TV advertising reduces sales.”
  • Interpretability: Stakeholders can trust the sign and rough magnitude of every coefficient.
  • Convergence: The MCMC sampler explores a smaller parameter space, converging faster and with fewer divergences.
  • Reproducibility: Different analysts fitting the same data will obtain very similar results because the prior dominates the likelihood in ambiguous regions.

Cons of tight priors:

  • Risk of masking genuine effects: If a media channel truly has zero or negligible effect, a tight positive prior will force the model to assign it some positive contribution, creating a false positive. The model cannot “discover” that a channel is worthless.
  • Prior-data conflict: If the data strongly suggests a negative relationship (e.g., due to confounding — heavy TV spend coincides with a recession), the tight prior will suppress this signal. The analyst will not see the conflict unless they explicitly check for it.
  • Overconfidence: The posterior credible intervals will be artificially narrow, because the prior has eliminated large regions of the parameter space. This can make the model appear more certain than it actually is.

Loose Priors: The AMMM Approach

In the AMMM Python library, priors are specified with wider distributions and fewer hard constraints:

# From: AMMM data-config/demo_config.yml
custom_priors:
  intercept:
    dist: LogNormal
    kwargs:
      mu: 0
      sigma: 5        # Very wide — allows intercept to range enormously

  beta_channel:
    dist: HalfNormal
    kwargs:
      sigma: 1        # Moderately wide positive prior

  alpha:              # Adstock decay
    dist: Beta
    kwargs:
      alpha: 1
      beta: 3         # Weakly informative, skewed toward short decay

  lam:                # Saturation rate
    dist: Gamma
    kwargs:
      alpha: 3
      beta: 1         # Moderately informative

What this does: The priors are “weakly informative” — they encode soft directional beliefs (media effects are positive via HalfNormal, intercept is positive via LogNormal) but with wide spreads that allow the data substantial room to determine the final estimates.

Pros of loose priors:

  • Data-driven: The posterior is dominated by the likelihood, not the prior. Results are closer to what an unconstrained MLE would produce, which may feel more “honest” to econometricians.
  • Discovery: The model can reveal surprising patterns (e.g., a channel with near-zero effect will have a posterior concentrated near zero, rather than being artificially inflated).
  • Honest uncertainty: Posterior credible intervals reflect genuine estimation uncertainty, including uncertainty about effect direction.

Cons of loose priors:

  • Instability in small samples: With only 100–200 weekly observations and 7+ correlated media channels, a loose prior provides insufficient regularization. Coefficients can be wildly unstable across different random seeds or slight data perturbations.
  • Economically nonsensical results: Without strong regularization, the model may produce results that are statistically plausible but economically absurd (e.g., display advertising having a larger effect than TV despite 10x less spend).
  • Harder convergence: The MCMC sampler must explore a vast parameter space, leading to longer runtimes, more divergences, and lower effective sample sizes.

4. Which should we use: tight or loose?

Neither extreme is correct in isolation. The right choice depends on your sample size, number of media channels, and tolerance for false positives vs. false negatives.

The Practical Recommendation

Scenario Recommended Approach
Small sample ($N < 104$ weeks), many channels ($k > 5$) Tight priors. The data simply cannot identify 5+ correlated media effects independently. Without strong regularization, the model is fundamentally underidentified.
Medium sample ($104 \leq N \leq 208$ weeks), moderate channels Weakly informative priors (the Abacus default). Encode directional beliefs (positive media effects) but allow the data to determine magnitude.
Large sample ($N > 208$ weeks), few channels ($k \leq 3$) Loose priors are defensible. The data volume is sufficient to overwhelm even a weak prior, so the choice matters less.
Any sample size, with lift test calibration Loose priors become safer, because the lift test data injects external causal evidence that compensates for the weak regularization of the prior.

The Key Insight for Econometricians

In classical econometrics, you are trained to believe that constraints reduce efficiency (you “lose information” by restricting the parameter space). In Bayesian statistics, the opposite is often true for small samples: a well-chosen prior increases efficiency by concentrating the sampler on the economically plausible region of the parameter space. It is the Bayesian equivalent of using economic theory to improve your estimator, which is exactly what structural econometricians (e.g., in IO or macro) have always done.

The prior is not a bias. It is a statement of economic theory. If you believe advertising cannot reduce sales, encoding that belief is not “cheating” — it is incorporating domain knowledge, just as a structural econometrician incorporates equilibrium conditions or rational expectations into their likelihood.


5. Can I check whether the prior is dominating the posterior?

Yes. This is a critical diagnostic step. In Abacus (and any PyMC-based workflow), you should always compare the prior distribution to the posterior distribution for each parameter.

  • If the posterior looks very similar to the prior, the data has not updated your beliefs. This means either: (a) the prior is too tight and is suppressing the data, or (b) the data genuinely contains no information about that parameter.
  • If the posterior is substantially narrower or shifted relative to the prior, the data has successfully updated your beliefs, and the prior served only as a sensible starting point.

This comparison is the Bayesian analogue of checking whether your classical constraints are binding. If they are always binding, you should question whether the constraints are appropriate.
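One rough way to quantify this comparison is a variance-shrinkage ratio. The sketch below is only an illustration: the function name `variance_shrinkage` is hypothetical, and the draws are synthetic stand-ins rather than output from a fitted Abacus model. In a PyMC workflow the two arrays would typically come from the `prior` and `posterior` groups of the InferenceData object.

```python
import numpy as np

def variance_shrinkage(prior_draws, posterior_draws):
    """Return 1 - Var(posterior) / Var(prior).

    Values near 1 mean the data narrowed the prior substantially;
    values near 0 mean the posterior is about as wide as the prior.
    """
    prior_var = np.var(prior_draws, ddof=1)
    posterior_var = np.var(posterior_draws, ddof=1)
    return 1.0 - posterior_var / prior_var

rng = np.random.default_rng(0)

# Synthetic stand-ins for draws of one parameter (not output from a real model)
prior = np.abs(rng.normal(0.0, 2.0, size=4000))        # HalfNormal(sigma=2) prior
updated = rng.normal(0.8, 0.1, size=4000)              # data clearly informed this parameter
not_updated = np.abs(rng.normal(0.0, 2.0, size=4000))  # posterior looks just like the prior

shrinkage_updated = variance_shrinkage(prior, updated)
shrinkage_not_updated = variance_shrinkage(prior, not_updated)
print(f"updated: {shrinkage_updated:.2f}, not updated: {shrinkage_not_updated:.2f}")
```

A ratio near zero does not say which of the two explanations applies (a prior that is too tight, or data that carry no information), and it ignores shifts in location, so it is best read alongside a plot of both densities.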

Adstock and Saturation

In classical econometrics, you model diminishing returns by taking the logarithm of spend: $\log(\text{spend})$ enters the regression, and the coefficient captures an elasticity. Carryover effects, if considered at all, are handled with lagged dependent variables or Koyck distributed lags. These approaches are simple and familiar. They are also, for media measurement, inadequate.

This document explains the two non-linear transformations at the heart of every modern Marketing Mix Model — adstock (carryover) and saturation (diminishing returns) — and shows why they are more flexible, more interpretable, and more economically grounded than the classical alternatives. We also address a subtle but important modelling decision: whether to apply adstock before saturation, or saturation before adstock.


1. The Problem with Log-Linear Specifications

The classical $\log(\text{spend})$ specification makes a single, rigid assumption: the marginal return to an additional pound of media spend is proportional to the reciprocal of current spend. Doubling spend from £100 to £200 therefore produces the same incremental effect as doubling from £1,000 to £2,000. The curvature is fixed by the functional form. You cannot learn it from the data.

This creates two problems in practice.

The first is that the log transform cannot capture saturation at high spend levels. If a channel is already saturated — say, you have bought every available TV slot in the UK — the log transform will still predict positive incremental returns for every additional pound. The curve never flattens. In reality, the marginal return from saturated media is effectively zero, and you need a function that can reach a ceiling.

The second is that the log transform says nothing about carryover. A TV advertisement aired in week 10 does not affect sales only in week 10. Viewers remember the ad. Brand salience persists. The effect decays over subsequent weeks. A pure $\log(\text{spend}_t)$ specification attributes the entire effect to the week the money was spent, ignoring the temporal diffusion of advertising impact. You can add lagged terms manually ($\log(\text{spend}_{t-1})$, $\log(\text{spend}_{t-2})$, and so on), but each lag consumes a degree of freedom, and you must choose the lag length arbitrarily.

Abacus replaces both of these ad hoc treatments with two purpose-built, parameterised transformations whose shapes are learned jointly from the data inside the Bayesian graph.
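The fixed curvature of the log specification is easy to check numerically. This is a minimal sketch with a hypothetical coefficient `beta = 1.0`; it shows that every doubling of spend adds the same increment, $\beta \ln 2$, no matter how high spend already is.

```python
import math

beta = 1.0  # hypothetical coefficient on log(spend)

def log_response(spend):
    """Classical log-linear media response."""
    return beta * math.log(spend)

# Incremental effect of doubling spend at three very different levels
gain_low = log_response(200) - log_response(100)
gain_mid = log_response(2_000) - log_response(1_000)
gain_huge = log_response(2_000_000) - log_response(1_000_000)

print(gain_low, gain_mid, gain_huge)  # all three equal beta * ln(2)
```

Because the increment never shrinks toward zero, the log curve cannot represent a channel that has reached its ceiling.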

2. Adstock: Modelling Carryover

Adstock captures a simple economic intuition: advertising has a lingering effect. A pound spent on TV in week 10 generates some response in week 10, a smaller response in week 11, a still smaller response in week 12, and so on until the effect has fully decayed.

The default implementation in Abacus is geometric adstock. The transformation takes the raw weekly spend series and replaces each observation with a weighted sum of current and past spend, where the weights decay geometrically:

$$x^*_t = x_t + \alpha \cdot x^*_{t-1}$$

The parameter $\alpha$ (between 0 and 1) controls the rate of decay. When $\alpha$ is close to zero, the effect is concentrated in the week of exposure — the ad is forgotten almost immediately. When $\alpha$ is close to one, the effect persists for many weeks — the brand impression lingers. The maximum lag length l_max truncates the convolution at a finite horizon for computational efficiency.
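The recursion and its truncated form can be sketched in a few lines of NumPy. This is an illustrative implementation, not the Abacus source: the function names are hypothetical, and it applies no normalisation of the weights, which a library implementation may or may not do.

```python
import numpy as np

def geometric_adstock(x, alpha, l_max):
    """Weighted sum of current and past spend with weights alpha**lag for lag < l_max."""
    x = np.asarray(x, dtype=float)
    weights = alpha ** np.arange(l_max)       # 1, alpha, alpha^2, ...
    # Full convolution, trimmed so the output lines up with the input weeks
    return np.convolve(x, weights)[: len(x)]

def geometric_adstock_recursive(x, alpha):
    """Untruncated recursion x*_t = x_t + alpha * x*_{t-1}."""
    out = np.zeros(len(x))
    carry = 0.0
    for t, value in enumerate(x):
        carry = value + alpha * carry
        out[t] = carry
    return out

# A single pulse of 100 decays to 50, then 25, and is then cut off by l_max=3
pulse = np.array([100.0, 0.0, 0.0, 0.0, 0.0])
truncated = geometric_adstock(pulse, alpha=0.5, l_max=3)

# With l_max covering the whole series, the convolution matches the recursion
spend = np.array([10.0, 0.0, 5.0, 20.0, 0.0, 0.0])
full = geometric_adstock(spend, alpha=0.5, l_max=len(spend))
recursive = geometric_adstock_recursive(spend, alpha=0.5)
print(truncated, full, recursive)
```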

For an econometrician, recognise that this is precisely a Koyck distributed lag model, but with two critical differences. First, the decay parameter $\alpha$ is not estimated from lagged dependent variables (which introduces Nickell bias in short panels). It is estimated directly as a parameter of the transformation, with its own Bayesian prior — by default a Beta(1, 3) distribution that gently favours shorter decay while allowing the data to push toward longer persistence if warranted. Second, you do not need to choose the lag length by hand. You set l_max as a generous upper bound (say, 8 or 12 weeks), and the geometric decay structure ensures that distant lags receive negligible weight automatically.

Abacus also provides alternative adstock functions, including Weibull PDF and Weibull CDF adstock. The Weibull CDF form keeps the decay monotonic but lets its shape depart from a fixed geometric rate, while the Weibull PDF form allows a non-monotonic pattern (an effect that peaks one or two weeks after exposure rather than immediately). The delayed peak captures the empirical reality that some channels, particularly upper-funnel brand advertising, take time to build mental availability before generating measurable sales response.

3. Saturation: Modelling Diminishing Returns

Saturation captures the second economic intuition: each additional pound of spend on a channel is worth less than the last. The first £10,000 of TV spend reaches new audiences and generates substantial incremental sales. The next £10,000 reaches many of the same people again and generates less. Eventually, you have saturated the available audience, and further spend generates almost nothing.

The default implementation in Abacus is logistic saturation:

$$f(x) = \beta \cdot \frac{1 - e^{-\lambda x}}{1 + e^{-\lambda x}}$$

Two parameters govern the shape. The parameter $\lambda$ controls the steepness of the curve — how quickly diminishing returns set in. A large $\lambda$ means the channel saturates rapidly (steep initial response, early flattening). A small $\lambda$ means the channel has a long runway before saturation (gradual response, late flattening). The parameter $\beta$ controls the asymptotic maximum — the ceiling of the response, representing the maximum possible contribution from this channel regardless of spend.
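The roles of $\lambda$ and $\beta$ can be seen by evaluating the formula directly. The sketch below uses hypothetical helper names; `half_saturation_point` follows from setting $f(x) = \beta/2$, which gives $x = \ln(3)/\lambda$.

```python
import math

def logistic_saturation(x, lam, beta=1.0):
    """f(x) = beta * (1 - exp(-lam * x)) / (1 + exp(-lam * x))."""
    return beta * (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))

def half_saturation_point(lam):
    """Spend level at which the response reaches half of the ceiling beta."""
    return math.log(3) / lam

# A steeper curve (larger lam) reaches half of its ceiling at lower spend
x_half_steep = half_saturation_point(lam=2.0)
x_half_gradual = half_saturation_point(lam=0.5)

response_at_half = logistic_saturation(x_half_steep, lam=2.0, beta=3.0)   # half of beta
response_far_out = logistic_saturation(100.0, lam=1.0, beta=3.0)          # close to beta
print(x_half_steep, x_half_gradual, response_at_half, response_far_out)
```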

Compare this to the classical $\log(\text{spend})$ specification. The logistic saturation curve has a genuine asymptote: beyond a certain spend level, the curve is effectively flat. The log specification has no such ceiling. The logistic curve also has a tunable half-saturation point: the response reaches half of its ceiling $\beta$ at $x = \ln(3)/\lambda$, so the data determine where diminishing returns begin to bite. The log curve always bends at the same relative rate.

The default priors in Abacus encode mild economic beliefs. The prior on $\lambda$ is Gamma(3, 1), which centres mass on moderate saturation rates while allowing the data to push toward very steep or very gradual curves. The prior on $\beta$ is HalfNormal(sigma=2), which keeps the channel contribution positive and moderately scaled.

4. Joint Estimation Inside the Bayesian Graph

Here is the critical difference between the Abacus approach and classical pre-processing. In many legacy MMM implementations (and in some textbook treatments), the adstock and saturation transformations are applied as a pre-processing step: the analyst picks fixed values for $\alpha$ and $\lambda$ (perhaps through grid search or “expert judgement”), transforms the raw spend data, and then runs a linear regression on the transformed data.

This approach severs the chain of uncertainty. The regression treats the transformed spend as a known quantity, ignoring the fact that $\alpha$ and $\lambda$ were estimated (or guessed). The standard errors on the media coefficients are conditional on the pre-selected transformation parameters being exactly correct. They are too narrow.

In Abacus, the adstock parameter $\alpha$, the saturation parameters $\lambda$ and $\beta$, and the media coefficient are all estimated simultaneously inside a single PyMC model. The MCMC sampler explores the joint posterior over all parameters at once. When the sampler draws a high value of $\alpha$ (long carryover), it simultaneously adjusts $\lambda$ and the media coefficient to maintain consistency with the observed data. The resulting posterior credible intervals for media contribution honestly reflect uncertainty about the transformation shape, the coefficient magnitude, and their interactions.

This is analogous to the distinction between two-stage least squares (where the first-stage residuals inject estimation error into the second stage, requiring corrected standard errors) and full-information maximum likelihood (where all parameters are estimated jointly). The Bayesian joint estimation in Abacus is closer in spirit to FIML, but with the added benefit of prior regularisation.

5. The Ordering Decision: Adstock First or Saturation First

When you initialise a PanelMMM in Abacus, you choose adstock_first=True (the default) or adstock_first=False. This decision controls the order in which the two transformations are composed, and it encodes a substantive economic assumption about how the media channel operates.

When adstock_first=True, the pipeline is: raw spend → adstock → saturation. The economic interpretation is that carryover accumulates first in the consumer’s memory (brand salience builds up over multiple weeks of exposure), and only then does the accumulated stock of impressions hit diminishing returns. This makes sense for brand-building channels like TV, outdoor, and sponsorship, where the advertising effect is cumulative and the saturation constraint applies to the total accumulated exposure rather than to a single week’s spend.

When adstock_first=False, the pipeline is: raw spend → saturation → adstock. The economic interpretation is that diminishing returns apply immediately to each week’s spend (this week’s audience is saturated by this week’s spend alone), and only then does the already-saturated response carry over into future weeks. This makes sense for direct-response channels like paid search or performance display, where each week’s impressions hit a ceiling independently (you can only capture so many searches in a week), but the conversion effect persists.

The distinction matters quantitatively. Under adstock-first, the model allows a sequence of moderate spend weeks to accumulate into a heavily saturated state — even if no single week was high-spend on its own. Under saturation-first, each week’s spend is capped independently, so a steady moderate spend never reaches the saturation ceiling.
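The quantitative difference shows up even on a flat spend series. The following sketch uses illustrative values ($\alpha = 0.7$, $\lambda = 1$, unit weekly spend, $\beta = 1$) and hypothetical helper functions rather than the Abacus transformers. It compares how far up the saturation curve the channel sits under each ordering.

```python
import numpy as np

def geometric_adstock(x, alpha, l_max=12):
    """Truncated geometric carryover with weights alpha**lag."""
    return np.convolve(x, alpha ** np.arange(l_max))[: len(x)]

def logistic_saturation(x, lam):
    """Logistic saturation with ceiling 1 (beta = 1)."""
    return (1 - np.exp(-lam * x)) / (1 + np.exp(-lam * x))

spend = np.full(20, 1.0)   # steady, moderate weekly spend
alpha, lam = 0.7, 1.0      # illustrative parameter values

# adstock_first=True: carryover accumulates, then the accumulated stock is saturated
adstock_first = logistic_saturation(geometric_adstock(spend, alpha), lam)

# adstock_first=False: each week is saturated on its own, then carried over
saturation_first = geometric_adstock(logistic_saturation(spend, lam), alpha)

# Position on the saturation curve (share of the ceiling) in the final week
share_adstock_first = adstock_first[-1]
share_per_week = logistic_saturation(spend, lam)[-1]
print(f"adstock-first: {share_adstock_first:.2f} of ceiling, "
      f"saturation-first (per week): {share_per_week:.2f} of ceiling")
```

Under adstock-first the accumulated stock pushes the channel above 90% of its ceiling, while under saturation-first each week sits below 50%. The saturation-first output itself is on a different scale, because the carryover is added after the cap is applied.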

In practice, most MMM practitioners default to adstock-first for all channels, which is why Abacus sets adstock_first=True as the default. But if you have strong prior knowledge that a particular channel exhibits immediate per-week saturation (because the audience pool is fixed and refreshes weekly), switching the order is a principled modelling choice.

6. Why This Matters for Econometricians

The adstock-saturation framework replaces several ad hoc classical specifications with a coherent, jointly estimated non-linear model. To summarise the mapping:

The classical Koyck lag model is replaced by geometric adstock with a Bayesian prior on the decay rate. You no longer need to choose lag lengths manually or worry about Nickell bias from lagged dependent variables.

The classical $\log(\text{spend})$ specification is replaced by logistic saturation with learnable steepness and ceiling parameters. You gain a genuine asymptote (something $\log$ cannot provide) and data-driven curvature (something $\log$ fixes by assumption).

The classical two-stage approach (transform then regress) is replaced by joint Bayesian estimation. Your credible intervals honestly propagate uncertainty from the transformation parameters through to the media contribution estimates.

The result is a media response model that is more flexible than any classical specification, more honest about uncertainty, and grounded in the same economic intuitions — carryover and diminishing returns — that econometricians have always recognised. The difference is that Abacus lets the data determine the shape of these phenomena rather than imposing it through functional form.

HSGP

This document answers common questions econometricians may have when encountering HSGP (Hilbert Space Gaussian Process) approximations in the codebase, particularly regarding model flexibility and the number of basis functions.

1. Does a Hilbert Space Gaussian Process use up degrees of freedom when modelling?

Yes, but not in the strict $N - k$ counting sense used in classical OLS econometrics. Instead, Gaussian Processes (and their HSGP approximations) use “effective degrees of freedom” (EDF) due to Bayesian regularization.

Here is how to map HSGPs to classical econometrics concepts:

The Mechanical Setup (Looks like it uses $m$ degrees of freedom)

In classical econometrics, if you want to model a non-linear time trend, you might add polynomial terms or a Fourier series (sines and cosines). If you add $m$ sine/cosine terms to your OLS model, you lose exactly $m$ degrees of freedom.

An HSGP is mathematically very similar to a Fourier series. It approximates an infinite-dimensional Gaussian Process by using $m$ basis functions (the m parameter in the code, often set to 50–200). If this were OLS, estimating those 200 basis function coefficients would cost 200 degrees of freedom, potentially breaking your model if $N < 200$.

The Bayesian Reality (Effective Degrees of Freedom)

In an HSGP, those $m$ coefficients are not freely estimated. They are bound together by a hierarchical prior structure governed by hyperparameters, specifically the lengthscale ($\ell$) and the amplitude/variance ($\eta$).

Because the coefficients share a prior that heavily shrinks most of them toward zero, we measure the flexibility using Effective Degrees of Freedom (EDF).

  • Like Ridge Regression: Think of HSGP as running a Ridge Regression (L2 regularization) on 200 Fourier terms. Even though there are 200 parameters, the L2 penalty restricts their variance. The “effective” degrees of freedom might only be 4 or 5.
  • Data-driven penalty: The amount of shrinkage is controlled by the lengthscale ($\ell$).
    • If the data shows a smooth, slowly moving trend, the model learns a large lengthscale. This imposes massive shrinkage on the high-frequency (wiggly) basis functions, meaning the HSGP uses very few effective degrees of freedom (acting almost like a simple linear trend).
    • If the data is highly volatile, the model learns a short lengthscale, relaxing the shrinkage, allowing the curve to wiggle, and consuming more effective degrees of freedom.

Summary: While you might instantiate an HSGP with 100 basis functions ($m=100$), it does not subtract 100 from your denominator. It consumes only as much “effective” flexibility as the data supports, heavily penalizing unnecessary complexity (wiggliness) via its priors. Because the prior regularizes every coefficient, you do not hit the classical $N - k < 0$ matrix inversion failure.
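The ridge analogy can be made concrete by computing effective degrees of freedom as the trace of the smoother matrix. The sketch below is a simplified stand-in for an HSGP: it uses a plain Fourier basis, fixes the noise variance at 1, and assumes prior variances that decay with frequency in a squared-exponential style. All function names are hypothetical.

```python
import numpy as np

def fourier_basis(n_obs, n_pairs):
    """Design matrix with n_pairs sine/cosine pairs over one unit interval."""
    t = np.arange(n_obs) / n_obs
    columns = []
    for k in range(1, n_pairs + 1):
        columns.append(np.sin(2 * np.pi * k * t))
        columns.append(np.cos(2 * np.pi * k * t))
    return np.column_stack(columns)

def effective_dof(X, prior_var, noise_var=1.0):
    """Trace of K (K + noise_var * I)^-1, where K = X diag(prior_var) X'."""
    K = (X * prior_var) @ X.T
    return float(np.trace(np.linalg.solve(K + noise_var * np.eye(X.shape[0]), K)))

def smooth_prior_var(n_pairs, lengthscale):
    """Illustrative prior variances that shrink high-frequency columns harder."""
    freqs = np.repeat(np.arange(1, n_pairs + 1), 2)  # one entry per sine and cosine column
    return np.exp(-0.5 * (lengthscale * 2 * np.pi * freqs) ** 2)

X = fourier_basis(n_obs=200, n_pairs=50)  # 100 columns
edf_smooth = effective_dof(X, smooth_prior_var(50, lengthscale=0.2))
edf_wiggly = effective_dof(X, smooth_prior_var(50, lengthscale=0.02))
print(f"columns: {X.shape[1]}, EDF (long lengthscale): {edf_smooth:.1f}, "
      f"EDF (short lengthscale): {edf_wiggly:.1f}")
```

With the longer lengthscale the 100 columns cost only a handful of effective degrees of freedom; shortening the lengthscale relaxes the shrinkage and the count rises, but it stays well below the number of columns.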


2. Is it up to the analyst to decide how many basis functions to set? Will this result in specification hunting?

This is a very valid concern. In standard OLS, if Analyst A uses a 5th-order Fourier series and Analyst B uses a 20th-order Fourier series, they will get wildly different results, opening the door for specification hunting.

In the Abacus HSGP implementation, this risk is mitigated in two ways: Automated Heuristics (code design) and Approximation Limits (mathematical design).

1. Automated Selection (The Code Design)

The library is specifically designed so analysts do not have to guess or manually set the number of basis functions ($m$).

In the HSGP class, the factory method parameterize_from_data calculates $m$ automatically using an algorithm (approx_hsgp_hyperparams) based on published literature (Riutort-Mayol et al., 2022).

It calculates $m$ deterministically based on two things:

  1. The span of the time-series data (e.g., 3 years of weekly data).
  2. The lower bound of the lengthscale prior (the shortest time-span over which we believe the effect could realistically change).

This guarantees that two analysts modeling the same dataset with the same assumptions will end up with the exact same $m$.
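To make the determinism concrete, here is a hand-written sketch of this kind of heuristic. It is not the Abacus or PyMC implementation: the function name `approx_m_and_c` is hypothetical, and the constants 3.2 and 1.75 are the squared-exponential values from the cited paper as best understood here, so check them against the library helper before relying on them.

```python
def approx_m_and_c(x_min, x_max, ls_min, ls_max):
    """Heuristic number of basis functions m and boundary factor c.

    Assumes a squared-exponential kernel; the constants are assumptions
    taken from the published heuristic, not read from the Abacus source.
    """
    half_range = (x_max - x_min) / 2.0            # S: half the span of the data
    c = max(3.2 * ls_max / half_range, 1.2)       # boundary factor from the longest lengthscale
    m = int(1.75 * c / (ls_min / half_range))     # resolution from the shortest lengthscale
    return m, c

# Three years of weekly data, lengthscale believed to lie between 8 and 52 weeks
m, c = approx_m_and_c(x_min=0.0, x_max=156.0, ls_min=8.0, ls_max=52.0)
m_again, c_again = approx_m_and_c(x_min=0.0, x_max=156.0, ls_min=8.0, ls_max=52.0)
print(m, round(c, 3))
```

The same inputs always give the same $m$, and shortening the lower lengthscale bound is the only way to raise it, which makes the choice an explicit and reviewable modelling assumption.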

2. $m$ dictates “Resolution”, not “Complexity” (The Mathematical Design)

Even if an analyst decided to bypass the automation and manually force a massive number of basis functions, it would not result in overfitting or specification hunting.

In an HSGP, $m$ is just the resolution limit of the approximation to the true infinite-dimensional Gaussian Process.

  • If $m$ is too small: The model lacks the resolution to capture fast-moving trends (it will artificially smooth things out).
  • If $m$ is large enough (e.g., $m=50$): The model closely approximates the true Gaussian Process.
  • If $m$ is absurdly large (e.g., $m=500$): The model will yield essentially the same curve as $m=50$.

Why? Because the extra 450 basis functions represent very high-frequency, rapid wiggles. For a given lengthscale, the GP prior assigns those high-frequency basis functions a vanishingly small prior variance, which shrinks their coefficients to practically zero.

The main penalty for setting $m$ too high is computation time. The MCMC sampler will run much slower because it has to carry many coefficients that contribute nothing, but the statistical fit will remain essentially unchanged. Therefore, an analyst cannot “p-hack” or specification-hunt by artificially inflating $m$.
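The shrinkage of the extra basis functions can be inspected directly from the spectral density that scales each coefficient. The sketch below assumes a one-dimensional squared-exponential kernel on the interval $[-L, L]$ with illustrative values for the lengthscale, amplitude, and $L$. The function name is hypothetical and this is not the Abacus code path.

```python
import numpy as np

def hsgp_prior_sd(m, lengthscale, amplitude, L):
    """Prior standard deviation of each basis-function weight in an HSGP.

    Uses the squared-exponential spectral density
    S(w) = amplitude^2 * sqrt(2*pi) * lengthscale * exp(-0.5 * (lengthscale * w)^2)
    evaluated at the square roots of the Laplacian eigenvalues, j * pi / (2 * L).
    """
    j = np.arange(1, m + 1)
    omega = j * np.pi / (2 * L)
    spectral_density = (
        amplitude**2 * np.sqrt(2 * np.pi) * lengthscale
        * np.exp(-0.5 * (lengthscale * omega) ** 2)
    )
    return np.sqrt(spectral_density)

sd_50 = hsgp_prior_sd(m=50, lengthscale=0.2, amplitude=1.0, L=1.5)
sd_500 = hsgp_prior_sd(m=500, lengthscale=0.2, amplitude=1.0, L=1.5)

# The first 50 weights are unchanged; the extra 450 are negligible by comparison
largest_extra_ratio = sd_500[50:].max() / sd_500[0]
print(f"largest extra weight relative to the first: {largest_extra_ratio:.1e}")
```

In this example the largest of the extra 450 prior standard deviations is many orders of magnitude smaller than the first, which is why adding them leaves the fitted curve essentially unchanged.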

3. We often model trend/seasonality using explicit Fourier terms (e.g., sin52_1 + cos52_1 + ...). This uses up degrees of freedom and often causes severe multicollinearity (high VIF) with our media or control variables. Does HSGP solve this?

Yes. Explicitly adding Fourier terms to a linear formula creates structural problems that HSGP elegantly sidesteps.

1. The Degrees of Freedom Problem

As discussed in Section 1, explicitly adding 10 sine/cosine terms to a regression permanently burns 10 degrees of freedom. The model is forced to independently estimate an unpenalized coefficient for every single wave, regardless of whether that specific frequency is actually present in the data.

The HSGP Solution: HSGP uses Effective Degrees of Freedom (EDF). It evaluates a large number of basis functions (which are essentially Fourier terms), but ties them all together under a single hierarchical Gaussian Process prior. If the data doesn’t exhibit a certain high-frequency wiggle, the GP lengthscale prior dynamically crushes the coefficients of those specific basis functions toward zero. You get the flexibility of 100 sine waves, but only “pay” for the effective degrees of freedom the data actually demands.

2. The Multicollinearity (High VIF) Problem

When you add explicit Fourier terms, they act as independent regressors. If one of your media channels (e.g., m_tv) happens to have a seasonal spending pattern that correlates strongly with sin52_1, the model suffers from classic multicollinearity. The VIF skyrockets, standard errors blow up, and the coefficient for m_tv becomes completely unstable (the “backdoor” bias).

The HSGP Solution: HSGP mitigates this through structured regularization.

  • Orthogonal Basis: The basis functions generated internally by the HSGP are orthogonal to each other.
  • Shared Shrinkage: More importantly, the coefficients for the HSGP basis functions are not estimated independently. They are strictly regularized by the GP’s lengthscale ($\ell$) and variance ($\eta$) hyperparameters. Because the GP is mathematically constrained to behave like a smooth, cohesive curve, it cannot arbitrarily spike a single basis function’s coefficient just to “steal” variance from a highly correlated m_tv variable. The GP prior strongly penalizes such isolated, un-smooth coefficient spikes. Consequently, the model focuses on capturing the true underlying baseline trend, leaving the media coefficients much more stable than they would be against unpenalized, explicit Fourier regressors.

4. Should we feed in holiday dummy variables instead?

No. You do not need to manually construct binary 1/0 dummy variables (e.g., is_black_friday) or step functions in your input data.

The recommendation is to pass the raw dates of the holidays directly into the model via a separate DataFrame. Abacus’s EventAdditiveEffect API will internally calculate the distance in days from your time series to the holiday, and wrap that in a continuous basis function (like a Gaussian curve). This provides a smoother, more realistic “build-up and cool-down” effect compared to the harsh structural breaks of traditional dummy variables.
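The difference between a dummy variable and a distance-based basis can be illustrated with plain NumPy. This sketch does not reproduce the Abacus internals: the width `sigma = 3.0` days is a hypothetical value (in the model it would be a parameter with a prior), and the variable names are illustrative.

```python
import numpy as np

# Daily dates around a single holiday
dates = np.arange("2023-11-10", "2023-12-09", dtype="datetime64[D]")
black_friday = np.datetime64("2023-11-24")

# Signed distance in days from each date to the holiday
days_from_event = (dates - black_friday).astype(int)

# Classical dummy: a hard 0/1 step on the day itself
dummy = (days_from_event == 0).astype(float)

# Gaussian basis: smooth build-up before and cool-down after the holiday
sigma = 3.0  # hypothetical width in days
gaussian_bump = np.exp(-0.5 * (days_from_event / sigma) ** 2)

three_days_before = gaussian_bump[days_from_event == -3][0]
print(dummy.sum(), gaussian_bump.max(), round(three_days_before, 3))
```

The dummy is non-zero on exactly one day, whereas the Gaussian basis still carries about 61% of the peak effect three days either side of the holiday.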

Example: Ingesting a Holidays DataFrame into Abacus

If you have a CSV of holidays (like data-config/holidays.csv), you load it as a standard Pandas DataFrame and inject it into the model configuration before building.

import pandas as pd
from pymc_extras.prior import Prior
from abacus.mmm.panel import PanelMMM
from abacus.mmm.events import EventEffect, GaussianBasis

# 1. Load your raw holidays
# The dataframe must contain exactly: "name", "start_date", "end_date"
df_holidays = pd.DataFrame({
    "name": ["Black Friday 2023", "Black Friday 2024", "Christmas 2023"],
    # Parse the date strings so downstream date arithmetic works on real timestamps
    "start_date": pd.to_datetime(["2023-11-24", "2024-11-29", "2023-12-25"]),
    "end_date": pd.to_datetime(["2023-11-25", "2024-11-30", "2023-12-26"]),
})

# 2. Define the mathematical shape of the holiday effect
# We use a GaussianBasis so the effect smoothly ramps up and down
holiday_effect = EventEffect(
    basis=GaussianBasis(),
    effect_size=Prior("Normal", mu=0, sigma=1),
    dims="holiday"
)

# 3. Initialize your MMM
mmm = PanelMMM(
    date_column="date",
    target_column="sales",
    channel_columns=["tv", "social"],
    dims=("country",)
)

# 4. Inject the raw dataframe into the API
# Abacus handles all the distance calculations and basis mappings internally
mmm.add_events(
    df_events=df_holidays,
    prefix="holiday",
    effect=holiday_effect
)

# 5. Build and fit as normal
# X (feature DataFrame) and y (target series) are assumed to be loaded earlier
mmm.build_model(X, y)
mmm.fit()

5. If HSGP is statistically superior for seasonality, why does the fourier.py module still exist?

This is not a contradiction. Model building requires balancing statistical elegance with computational constraints and structural assumptions. There are four reasons explicit Fourier terms are retained alongside HSGP in the library:

1. Computation Speed

HSGPs are statistically efficient but computationally more expensive. PyMC must sample $m$ basis-function coefficients together with the GP hyperparameters that scale them, and compute gradients through all of them at every sampler step. Explicit Fourier terms, by contrast, are just static columns in the design matrix. Estimating a Bayesian regression with 4 sine/cosine columns takes seconds; fitting an HSGPPeriodic can be substantially slower. For analysts iterating rapidly on a prototype or running models on large datasets, explicit Fourier terms offer a fast, “good enough” approximation.

2. Static vs. Drifting Seasonality

  • HSGPPeriodic allows the seasonal shape to drift slowly over time (e.g., consumer behaviour shifting gradually across 5 years). This is more realistic but requires learning extra GP hyperparameters.
  • Explicit Fourier forces the seasonality to be completely static: the December peak in 2021 is mathematically identical to the December peak in 2024. If the econometrician has a strong prior belief that the seasonal structure is structurally invariant, explicit Fourier terms enforce that belief more rigidly than an HSGP can.

3. The “Trend = HSGP, Seasonality = Fourier” Hybrid Pattern

A very common and practically effective architecture in Bayesian MMMs is:

  • Standard HSGP for the baseline trend, because trend is unbounded, unpredictable, and highly prone to overfitting.
  • A low-order YearlyFourier (e.g., n_order=2 or 3) for seasonality, because seasonality is bounded, predictable, and structurally repetitive.

By keeping the Fourier order very low, the degrees of freedom penalty is minimal (only 4–6 parameters), and the analyst avoids the computational overhead of running two separate HSGPs simultaneously. This hybrid is often the most practical choice for weekly marketing data.
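For reference, a low-order yearly Fourier design matrix is only a few lines of code. The sketch below is a generic construction with a hypothetical function name, not the Abacus YearlyFourier class. It shows that order 2 yields 4 columns and that the features repeat exactly every period, which is the "static seasonality" property described above.

```python
import numpy as np

def yearly_fourier_features(day_of_year, n_order, period=365.25):
    """Return an (n_obs, 2 * n_order) matrix of sine/cosine seasonal features."""
    t = np.asarray(day_of_year, dtype=float) / period
    columns = []
    for k in range(1, n_order + 1):
        columns.append(np.sin(2 * np.pi * k * t))
        columns.append(np.cos(2 * np.pi * k * t))
    return np.column_stack(columns)

weekly_days = np.arange(0, 3 * 365, 7)               # roughly three years of weekly dates
features = yearly_fourier_features(weekly_days, n_order=2)

# The same point in the seasonal cycle gives identical features one period later
this_year = yearly_fourier_features([350.0], n_order=2)
next_year = yearly_fourier_features([350.0 + 365.25], n_order=2)
print(features.shape, np.allclose(this_year, next_year))
```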

4. Backwards Compatibility and Migration

Many teams migrate to Abacus from legacy OLS frameworks or tools like Prophet, which relies heavily on explicit Fourier terms. To build trust in the new Bayesian framework, econometricians often want to first build a “baseline” model that perfectly mirrors their old model’s architecture and verify they obtain comparable results. The fourier.py module enables this 1:1 apples-to-apples comparison before upgrading the architecture to use HSGPs.

MCMC Diagnostics

If you have spent your career reading Stata output — coefficient tables, standard errors, t-statistics, p-values, and the occasional Durbin-Watson statistic — then your first encounter with MCMC output will feel disorienting. There are no p-values. There is no single “estimate.” Instead, there are thousands of draws from something called a posterior distribution, accompanied by diagnostics you have never seen: R-hat, ESS, divergences, trace plots. This document maps every one of these concepts back to something you already understand, so you can read Bayesian output with the same confidence you bring to a regression table.


1. What the Sampler Actually Does

In classical econometrics, estimation is an optimisation problem. You write down a likelihood function and find the parameter values that maximise it (MLE) or minimise a loss function (OLS, GMM). The result is a single point estimate for each parameter, and the standard errors come from the curvature of the likelihood at that point (the inverse of the information matrix).

In Bayesian estimation, we do not optimise. We integrate. The goal is to characterise the entire posterior distribution — the full landscape of parameter values that are consistent with both the data and the prior. For most models of practical interest, this integral has no closed-form solution. We cannot write down a formula for the posterior the way you can write down the OLS estimator $\hat{\beta} = (X'X)^{-1}X'y$.

MCMC (Markov Chain Monte Carlo) solves this problem by constructing a random walk through the parameter space. At each step, the sampler proposes a new set of parameter values, evaluates how well they fit the data (the likelihood) and the prior, and decides whether to accept or reject the proposal. After enough steps, the collection of accepted values — the “chain” — converges to a representative sample from the posterior distribution.

The specific algorithm used in Abacus and PyMC is called NUTS (the No-U-Turn Sampler), a variant of Hamiltonian Monte Carlo (HMC). Think of it as a physics simulation: the sampler treats the negative log-posterior as a potential energy surface and launches a particle across it with a random initial momentum. The particle is pulled downhill toward regions of high posterior density, and its momentum carries it back uphill into lower-density regions, so it visits each region roughly in proportion to its posterior probability rather than settling at the mode. NUTS automatically tunes the trajectory length so the particle explores efficiently without doubling back on itself.

The critical point for an econometrician: the output of this process is not a single number. It is a collection of, say, 4,000 parameter vectors (2 chains × 2,000 draws each). Every summary statistic you will ever compute — the mean, the median, credible intervals, the probability that a coefficient exceeds zero — derives from this collection of draws.
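As a concrete illustration, the following sketch computes the usual summaries from an array of draws shaped (chain, draw). The draws here are synthetic stand-ins generated with NumPy, not output from a fitted model, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the posterior draws of one coefficient: 2 chains x 2,000 draws
draws = rng.normal(loc=0.3, scale=0.1, size=(2, 2000))

# Pool the chains, then summarise the collection of draws
flat = draws.reshape(-1)
posterior_mean = flat.mean()
posterior_median = np.median(flat)
ci_low, ci_high = np.percentile(flat, [2.5, 97.5])   # 95% credible interval
prob_positive = (flat > 0).mean()                     # P(coefficient > 0 | data)

print(f"mean={posterior_mean:.3f}, median={posterior_median:.3f}, "
      f"95% CI=({ci_low:.3f}, {ci_high:.3f}), P(>0)={prob_positive:.3f}")
```

The last quantity has no direct classical counterpart: it is a probability statement about the parameter itself, read straight off the draws.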

2. Trace Plots: The First Thing to Check

A trace plot displays the sampled values of a single parameter across the iterations of the chain. The horizontal axis represents the iteration number. The vertical axis represents the parameter value. If everything has gone well, the trace plot looks like a “fuzzy caterpillar” — a dense, stationary band of values oscillating around a stable mean with no visible trends, steps, or sticky regions.

If you are an econometrician, think of the trace plot as the time-series plot of an MCMC residual. You want it to look like white noise. Specifically, you want three properties.

The first is stationarity. The chain should not drift upward or downward over time. If you see a clear trend, the chain has not converged: the sampler is still searching for the high-density region of the posterior, and the draws from the early part of the chain are not representative. This is analogous to estimating an AR(1) process that has not yet reached its stationary distribution.

The second is good mixing. The chain should move rapidly across the full support of the posterior. If you see long stretches where the chain gets “stuck” at a particular value before jumping to another region, the sampler is struggling to explore the parameter space. Poor mixing inflates your effective standard errors, just as strong autocorrelation in a time series reduces the effective information content of the data.

The third is agreement across chains. If you run multiple independent chains (and you always should — Abacus defaults to at least two), they should all settle into the same band. If one chain is exploring a different region of the parameter space from the others, the model has not converged, and you cannot trust any summary statistics.

3. R-hat: The Convergence Diagnostic

R-hat ($\hat{R}$) is the single most important diagnostic number in Bayesian computation. It measures whether multiple independent chains have converged to the same distribution.

The intuition is straightforward. R-hat compares the variance of a parameter within each chain to the variance of the same parameter across chains. If all chains are sampling from the same distribution, these two variances should be roughly equal, and R-hat should be close to 1.0. If the chains disagree — one chain has settled around 0.5 while another has settled around 2.3 — the between-chain variance will be large relative to the within-chain variance, and R-hat will be substantially greater than 1.0.

For an econometrician, think of R-hat as a convergence test analogous to the Gelman-Rubin statistic (because that is exactly what it is, in its modern split-chain formulation). The thresholds are conventional: R-hat below 1.01 is considered safe. Values between 1.01 and 1.05 warrant caution, values between 1.05 and 1.1 are a strong warning sign, and values above 1.1 indicate that the chains have not converged, and you should not interpret the results.

When R-hat is too high, the remedy is usually to run the sampler for more iterations (increase tune and draws), reparameterise the model (e.g., use non-centered parameterisations for hierarchical models), or simplify the model.
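The within-chain versus between-chain comparison can be written out in a few lines. This is a basic split R-hat sketch for intuition only; the function name is hypothetical, and the ArviZ implementation additionally rank-normalises the draws, so its numbers will differ slightly.

```python
import numpy as np

def split_rhat(draws):
    """Basic split R-hat for an array of shape (n_chains, n_draws)."""
    half = draws.shape[1] // 2
    # Split each chain in two so that drift within a chain also shows up as disagreement
    splits = np.concatenate([draws[:, :half], draws[:, half : 2 * half]], axis=0)
    n = splits.shape[1]
    within = splits.var(axis=1, ddof=1).mean()        # average within-chain variance
    between = n * splits.mean(axis=1).var(ddof=1)     # variance of the chain means, scaled by n
    pooled = (n - 1) / n * within + between / n       # pooled estimate of the posterior variance
    return float(np.sqrt(pooled / within))

rng = np.random.default_rng(0)

# Four chains sampling the same distribution
converged = rng.normal(size=(4, 1000))

# Two of the four chains stuck in a different region of the parameter space
stuck = converged + np.array([[0.0], [0.0], [2.0], [2.0]])

rhat_converged = split_rhat(converged)
rhat_stuck = split_rhat(stuck)
print(f"converged: {rhat_converged:.3f}, stuck: {rhat_stuck:.3f}")
```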

4. Effective Sample Size (ESS): Your True Degrees of Freedom

The sampler produces, say, 4,000 draws. But consecutive draws are autocorrelated — each draw is a small perturbation of the previous one. The effective sample size (ESS) measures how many independent draws your 4,000 autocorrelated draws are actually worth.

If you are an econometrician, you already understand this concept. It is closely analogous to the Newey-West correction for autocorrelated errors in time-series regression. When your regression residuals are positively autocorrelated, the “effective” number of independent observations is smaller than the nominal sample size $N$, and your standard errors are too small if you ignore the autocorrelation. ESS makes the same kind of adjustment for MCMC draws.
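To make the idea concrete, the sketch below estimates ESS for a single chain as $N / (1 + 2\sum_k \rho_k)$, truncating the sum at the first non-positive autocorrelation. This is a deliberately crude estimator written for illustration; the ESS-bulk and ESS-tail values reported by ArviZ use a more careful multi-chain, rank-normalised procedure. The function name and the simulated AR(1) chain are assumptions.

```python
import numpy as np

def ess_single_chain(draws):
    """Crude ESS: N / (1 + 2 * sum of leading positive autocorrelations)."""
    x = np.asarray(draws, dtype=float)
    n = x.size
    centred = x - x.mean()
    variance = centred @ centred / n
    tau = 1.0  # integrated autocorrelation time
    for lag in range(1, n):
        rho = (centred[:-lag] @ centred[lag:]) / (n * variance)
        if rho <= 0.0:  # stop at the first non-positive autocorrelation
            break
        tau += 2.0 * rho
    return n / tau

rng = np.random.default_rng(1)
n_draws, phi = 4000, 0.9

# Independent draws versus an AR(1) chain with the same stationary variance.
independent = rng.normal(size=n_draws)
shocks = rng.normal(scale=np.sqrt(1 - phi**2), size=n_draws)
autocorrelated = np.empty(n_draws)
autocorrelated[0] = rng.normal()
for t in range(1, n_draws):
    autocorrelated[t] = phi * autocorrelated[t - 1] + shocks[t]

ess_independent = ess_single_chain(independent)
ess_autocorrelated = ess_single_chain(autocorrelated)
print(f"independent draws: ESS ~ {ess_independent:.0f} of {n_draws}")
print(f"AR(1), phi = {phi}: ESS ~ {ess_autocorrelated:.0f} of {n_draws}")
```

For an AR(1) process the large-sample ESS is $N(1 - \phi)/(1 + \phi)$, so with $\phi = 0.9$ the 4,000 draws are worth roughly 210 independent ones.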

There are two flavours of ESS reported in PyMC and ArviZ output. ESS-bulk measures the effective sample size in the centre of the posterior distribution (around the mean and median). ESS-tail measures the effective sample size in the tails (relevant for credible interval estimation). Both matter.

The practical threshold is simple: you want ESS-bulk and ESS-tail both above 400 for reliable inference. Below 400, your posterior summaries are noisy — the mean might be reasonable, but the 95% credible interval endpoints could shift substantially if you re-ran the sampler. Below 100, the results are unreliable and should not be reported.

When ESS is too low, the remedies are to increase the number of draws or improve the model parameterisation. Thinning the chains is sometimes suggested, but it discards draws and so cannot raise ESS; it only saves storage. More draws is almost always preferable.

5. Divergences: The Red Flag You Must Not Ignore

A divergence is an event during sampling where the NUTS trajectory encounters a region of the posterior that changes so sharply that the numerical integration breaks down. The sampler detects that its simulated particle has deviated from the true Hamiltonian trajectory and flags the draw.

For an econometrician, think of a divergence as the Bayesian equivalent of a near-singular Hessian in MLE optimisation. When the likelihood surface has extremely steep ridges or sharp funnels, the MLE optimiser either fails to converge or converges to a local maximum. In MCMC, the analogous pathology manifests as divergences.

Divergences are not merely a computational nuisance. They indicate that the sampler has failed to explore some region of the posterior, which means the resulting draws are a biased sample from the true posterior. Even a handful of divergences can systematically exclude an important region of the parameter space, leading to overconfident and potentially wrong inference.

The practical rule is unforgiving: zero divergences is the target. A small number (fewer than 10 out of 4,000 draws) may be tolerable if they are scattered and do not cluster in a particular region of the parameter space. Divergences that occur only during warmup are discarded along with the tuning draws and are less of a concern. But if you see hundreds of divergences, the model is misspecified or poorly parameterised, and no amount of additional sampling will fix the problem.

The most common remedies are increasing target_accept (the target acceptance probability for NUTS, analogous to tightening the step size), reparameterising the model (switching from a centred to a non-centred parameterisation for hierarchical priors), or simplifying the model to remove the pathological geometry.

6. Posterior Credible Intervals Replace Confidence Intervals

In classical econometrics, a 95% confidence interval means: “If we repeated this experiment infinitely many times and constructed an interval each time, 95% of those intervals would contain the true parameter.” Crucially, it does not mean that there is a 95% probability that the true parameter lies in this particular interval. The true parameter is fixed. The interval is random.

A 95% Bayesian credible interval means exactly what you always wished the confidence interval meant: “Given the data and the model, there is a 95% probability that the parameter lies in this interval.” The parameter is treated as a random variable (with a posterior distribution), and the interval directly quantifies our uncertainty about its value.

The Highest Density Interval (HDI), which Abacus and ArviZ report by default, is a specific type of credible interval: the narrowest interval that contains 95% (or 94%, the ArviZ default) of the posterior mass. For symmetric, unimodal posteriors, the HDI coincides with the equal-tailed credible interval. For skewed posteriors (common for variance parameters or media effects bounded at zero), the HDI is narrower and more informative.
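The difference between the two interval types is easy to see on a skewed sample. The sketch below computes an HDI by sliding a window over the sorted draws and compares it with the equal-tailed interval. It assumes a unimodal posterior; the function name `hdi` and the gamma-distributed stand-in draws are illustrative, not Abacus output.

```python
import numpy as np

def hdi(draws, prob=0.94):
    """Narrowest interval containing `prob` of the draws (unimodal case)."""
    x = np.sort(np.asarray(draws, dtype=float))
    n = x.size
    k = int(np.ceil(prob * n))             # number of draws inside the interval
    widths = x[k - 1:] - x[: n - k + 1]    # width of every window of k sorted draws
    start = int(np.argmin(widths))
    return x[start], x[start + k - 1]

rng = np.random.default_rng(2)
# A right-skewed stand-in posterior, e.g. a media effect bounded at zero.
draws = rng.gamma(shape=2.0, scale=0.5, size=20_000)

hdi_low, hdi_high = hdi(draws, prob=0.94)
eti_low, eti_high = np.quantile(draws, [0.03, 0.97])
print(f"94% HDI:          [{hdi_low:.3f}, {hdi_high:.3f}]")
print(f"94% equal-tailed: [{eti_low:.3f}, {eti_high:.3f}]")
```

On a right-skewed posterior the HDI sits closer to zero and is shorter, because it trades the thin upper tail for the denser region near the mode.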

7. Mapping Bayesian Output to Classical Hypothesis Testing

Econometricians are trained to ask: “Is this coefficient statistically significant?” In Bayesian inference, the question is reframed as: “What is the probability that this coefficient exceeds (or falls below) a particular threshold?”

The mapping is direct. When a 94% HDI for a media coefficient excludes zero — meaning the entire interval lies above zero — this is the Bayesian analogue of rejecting the null hypothesis at roughly the 6% significance level. When a 90% HDI excludes zero, the analogy is rejection at the 10% level.

But Bayesian inference offers richer answers than a binary significant/not-significant verdict. You can compute the exact posterior probability that the coefficient exceeds zero: $P(\beta > 0 \mid \text{data})$. If this probability is 0.98, you have strong evidence that the media channel has a positive effect. If it is 0.62, you have weak and inconclusive evidence. The posterior probability gives you a continuous measure of evidential strength, not a binary decision forced by an arbitrary 5% threshold.

You can also compute the posterior probability that the coefficient exceeds a practically meaningful threshold. “Is there at least a 90% probability that the ROI for TV exceeds 1.0?” is a more useful question for a media planner than “Is the TV coefficient significantly different from zero?” Bayesian inference answers the first question naturally.
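Both kinds of probability statement reduce to counting posterior draws. The sketch below uses simulated stand-ins for the draws; in practice you would extract the relevant variable from the fitted model's posterior. The variable names and distributions are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
# Stand-ins for posterior draws; real draws come from the fitted model.
beta_draws = rng.normal(loc=0.40, scale=0.20, size=8_000)      # hypothetical media coefficient
roi_draws = rng.lognormal(mean=0.25, sigma=0.30, size=8_000)   # hypothetical TV ROI

# The share of draws satisfying a condition estimates its posterior probability.
p_positive = float(np.mean(beta_draws > 0.0))
p_roi_above_one = float(np.mean(roi_draws > 1.0))

print(f"P(beta > 0 | data)  = {p_positive:.3f}")
print(f"P(ROI > 1.0 | data) = {p_roi_above_one:.3f}")
print("ROI threshold met with 90% probability:", p_roi_above_one >= 0.90)
```

The same pattern works for any threshold or any derived quantity, as long as it is computed draw by draw before averaging.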

8. A Diagnostic Checklist

When you receive MCMC output from an Abacus model run, work through the following checks in order.

Start with R-hat. Examine R-hat for every parameter. If any R-hat exceeds 1.01, stop. The chains have not converged, and every downstream summary is unreliable. Increase tune and draws, or investigate the model parameterisation.

Next, check for divergences. If the sampler reports more than a handful of divergences, the posterior geometry is pathological. Increase target_accept to 0.95 or 0.99. If divergences persist, the model likely needs reparameterisation or simplification.

Then examine ESS. Verify that ESS-bulk and ESS-tail exceed 400 for every parameter of interest. If ESS is low despite good R-hat, the chains are highly autocorrelated. Increase the number of draws.

Now inspect trace plots. Visually confirm that each chain looks like stationary white noise and that multiple chains overlap. Look for any sticky regions, trends, or bimodality.

Finally, interpret the posteriors. Report the posterior mean or median as your point estimate, the HDI as your interval estimate, and the posterior probability of exceeding zero (or any substantive threshold) as your measure of evidential strength.

Only after all four computational diagnostics pass — R-hat, divergences, ESS, and trace plots — should you proceed to interpret the substantive results. A Bayesian model with poor diagnostics is no more trustworthy than an OLS regression with autocorrelated residuals and a Durbin-Watson statistic of 0.4. The numbers may look plausible, but they are not reliable.
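The numerical part of this checklist can be encoded as a small gate. The helper below is a hypothetical sketch, not an Abacus function: it takes summary numbers you have already extracted (worst R-hat, divergence count, smallest ESS values) and applies the thresholds from this page in order. Trace plots still need a visual check.

```python
def check_diagnostics(max_rhat, n_divergences, min_ess_bulk, min_ess_tail,
                      rhat_limit=1.01, ess_limit=400, divergence_limit=0):
    """Return a list of failed checks; an empty list means the numeric checks pass."""
    failures = []
    if max_rhat > rhat_limit:
        failures.append(f"R-hat {max_rhat:.3f} exceeds {rhat_limit}")
    if n_divergences > divergence_limit:
        failures.append(f"{n_divergences} divergences (limit {divergence_limit})")
    if min(min_ess_bulk, min_ess_tail) < ess_limit:
        failures.append(f"ESS below {ess_limit}")
    return failures

clean_run = check_diagnostics(max_rhat=1.003, n_divergences=0,
                              min_ess_bulk=1200, min_ess_tail=900)
bad_run = check_diagnostics(max_rhat=1.04, n_divergences=25,
                            min_ess_bulk=350, min_ess_tail=800)

print("clean run failures:", clean_run)
print("bad run failures:  ", bad_run)
```

The default `divergence_limit=0` reflects the stated target of zero divergences; relax it only deliberately.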

Prior Predictive Checks

If you come from classical econometrics, you are used to checking assumptions after estimation: residual plots, heteroskedasticity tests, outlier influence, and maybe out-of-sample fit. Bayesian workflow adds one earlier question:

Before fitting anything, do my priors imply plausible behaviour for the target variable?

That is what prior predictive checking answers.

1. Why parameter-level priors are not enough

A prior can look sensible when you inspect it in isolation and still imply absurd behaviour once it flows through the whole model.

For example:

  • an intercept prior may look “weakly informative” on paper
  • a channel coefficient prior may look “reasonably positive”
  • a likelihood sigma prior may look “safely diffuse”

But jointly, those choices might imply:

  • weekly revenue that is far above anything you could ever observe
  • negative conversions for a business where the target is always non-negative
  • far more volatility than the real series could possibly have

Classical econometrics rarely forces you to check this explicitly because you usually specify penalties or constraints directly on the coefficient space. Bayesian MMM requires one more layer of discipline: inspect the implied distribution of y, not just the configured priors on the parameters.

2. What prior predictive checking does

Prior predictive checking asks:

If the priors were true, what kinds of target series would this model generate before seeing the actual data?

The workflow is:

  1. Build the model with your chosen priors and structure.
  2. Sample from the prior predictive distribution.
  3. Compare those simulated target draws with the scale and shape of the real target series.

This is not a convergence check and it is not a causal test. It is a plausibility check on the model you are about to fit.
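The three steps can be illustrated outside Abacus with a toy linear model. The sketch below simulates target paths from the priors alone under a loose and a tighter prior configuration. Everything in it (the model, the prior scales, the scaled data) is a hypothetical stand-in chosen to show the mechanics, not a recommended prior specification.

```python
import numpy as np

rng = np.random.default_rng(4)
n_weeks, n_samples = 104, 500

# Hypothetical scaled media input and an observed target of order one.
spend = rng.uniform(0.0, 1.0, size=n_weeks)
observed = 0.5 + 0.3 * spend + rng.normal(scale=0.05, size=n_weeks)

def prior_predictive(intercept_sd, beta_sd, sigma_scale):
    """Simulate target paths from the priors alone: one row per prior draw."""
    intercept = rng.normal(0.0, intercept_sd, size=(n_samples, 1))
    beta = np.abs(rng.normal(0.0, beta_sd, size=(n_samples, 1)))        # half-normal
    sigma = np.abs(rng.normal(0.0, sigma_scale, size=(n_samples, 1)))   # half-normal
    noise = rng.normal(size=(n_samples, n_weeks)) * sigma
    return intercept + beta * spend + noise

loose = prior_predictive(intercept_sd=10.0, beta_sd=10.0, sigma_scale=10.0)
tight = prior_predictive(intercept_sd=0.5, beta_sd=0.5, sigma_scale=0.2)

for name, sims in [("loose", loose), ("tight", tight)]:
    low, high = np.quantile(sims, [0.03, 0.97])
    print(f"{name} priors: 94% of simulated y in [{low:.2f}, {high:.2f}]; "
          f"observed range [{observed.min():.2f}, {observed.max():.2f}]")
```

Each prior in the loose configuration looks harmless on its own, yet together they imply target values tens of times larger than anything in the observed series.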

3. How Abacus supports it

Abacus exposes prior predictive sampling directly on PanelMMM:

prior = mmm.sample_prior_predictive(
    X=X,
    y=y,
    samples=100,
    random_seed=42,
)

If you want a quick visual check, Abacus also exposes a retained plot surface:

figure, axes = mmm.plot.prior_predictive(var=mmm.output_var)

In the structured runner, this is Stage 10, the preflight stage. The pipeline writes:

  • 10_pre_diagnostics/prior_predictive.nc
  • 10_pre_diagnostics/prior_predictive.png

Abacus currently gives you the sampled draws and the plot. It does not apply an automatic plausibility score or a hard pass/fail gate for you.

4. What to look for

A useful prior predictive check is not about matching the data exactly. That would defeat the point of a prior. The question is whether the implied target behaviour is at least in the right universe.

Look for the following.

Level

Do the simulated draws live on roughly the same order of magnitude as the observed target?

If your historical weekly revenue is in the low millions, prior predictive draws in the billions are a red flag.

Dispersion

Is the implied volatility remotely plausible?

If the prior predictive distribution is much wider than the observed series, your likelihood sigma or contribution priors are probably too loose.

Sign and support

Does the model imply values that violate business reality?

For example:

  • negative conversions
  • implausibly negative revenue
  • large oscillations around zero for a strictly positive KPI

These are often signs that the prior scale is too permissive relative to the data scaling and likelihood choice.

Time pattern

Do the implied trajectories look structurally plausible?

You are not looking for a perfect seasonal pattern before fitting, but you should ask whether the prior predictive draws look like something that could have come from your business rather than from a random-number generator with no economic interpretation.
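The level, dispersion, and support questions can be turned into quick numbers once you have the prior predictive draws as an array. The helper below is a hypothetical sketch that assumes draws of shape (samples, observations) on the same scale as the observed target; it does not cover the time-pattern question, which is better judged from a plot.

```python
import numpy as np

def prior_predictive_summary(simulated, observed):
    """Compare prior predictive draws (n_samples, n_obs) with the observed target."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return {
        # Level: typical simulated magnitude relative to typical observed magnitude.
        "level_ratio": float(np.median(np.abs(simulated)) / np.median(np.abs(observed))),
        # Dispersion: spread of simulated values relative to the observed spread.
        "dispersion_ratio": float(simulated.std() / observed.std()),
        # Support: share of simulated values below zero.
        "share_negative": float(np.mean(simulated < 0.0)),
    }

rng = np.random.default_rng(5)
observed = rng.gamma(shape=50.0, scale=20_000.0, size=104)    # weekly revenue near one million
simulated = rng.normal(loc=0.0, scale=2e9, size=(200, 104))   # deliberately absurd prior draws

summary = prior_predictive_summary(simulated, observed)
for key, value in summary.items():
    print(f"{key}: {value:.3g}")
```

Ratios far from one, or a large negative share for a strictly positive KPI, are the numerical counterparts of the red flags described above. There is no universal cut-off; the point is to look before fitting.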

5. Common failure modes

Several practical pathologies show up repeatedly.

The intercept is too loose

A very wide intercept prior can dominate the prior predictive distribution, especially when the target has been scaled but the intercept prior is still too diffuse for the transformed space.

The likelihood sigma is too loose

If the prior predictive draws look far too noisy, the problem is often not the media priors at all. It is the observation model allowing implausibly large residual variance.

Media transformation priors are too permissive

Adstock and saturation priors that allow unrealistically persistent carryover or unrealistically steep response can imply contributions that are wildly too large before the data has had any say.

Flexible baseline terms are too unconstrained

Time-varying intercepts, seasonality, events, and other additive effects can all inject structure into the prior predictive distribution. If those priors are too loose, the target series can become implausibly volatile or pattern-heavy before fitting.

6. What to do when the prior predictive check looks bad

Do not proceed directly to posterior interpretation. Fix the model first.

Typical remedies:

  • tighten the intercept prior
  • tighten the likelihood sigma prior
  • make media priors weakly informative and concentrated on the economically plausible region, rather than completely diffuse
  • reduce unnecessary model flexibility before the data has justified it
  • check whether your scaling choices make the configured priors too wide or too narrow on the model scale

This is the Bayesian analogue of catching a broken specification before you start arguing about p-values.

7. What prior predictive checks do not tell you

Passing a prior predictive check does not mean:

  • the model is causally identified
  • the model will fit well
  • the posteriors will converge cleanly
  • the attribution decomposition will be trustworthy

It only means the configured priors do not imply obviously absurd target behaviour before seeing the data.

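You still need:

  • MCMC convergence diagnostics once the model is fitted
  • posterior predictive checks against the observed target
  • a separate assessment of causal identification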

8. Practical recommendation

Treat prior predictive checking as a standard pre-fit step, not as an optional extra for purists.

In Abacus terms, the workflow should usually be:

  1. Specify the model and priors.
  2. Run sample_prior_predictive(...).
  3. Inspect the implied target behaviour.
  4. Revise the priors if needed.
  5. Fit only once the prior predictive behaviour is broadly plausible.

That sequence is usually cheaper than fitting a badly specified Bayesian MMM and then discovering that the posterior is unstable for reasons you could have caught before sampling.

Posterior Predictive Checks

Posterior predictive checking asks a simple question:

After fitting the model, can it reproduce the main features of the observed data?

For a classically trained econometrician, this is the Bayesian analogue of residual diagnostics, fitted-versus-observed checks, and out-of-sample sanity-checking, but with one important difference: the checks are based on the full posterior distribution, not a single point estimate.

1. What the check actually is

After fitting, you sample from the posterior predictive distribution:

post = mmm.sample_posterior_predictive(
    X=X,
    progressbar=False,
    random_seed=42,
)

Conceptually, each posterior draw says:

  • here is one plausible parameter vector
  • given that parameter vector, here is one plausible target path

If the fitted model is adequate, the observed data should look like a credible member of that posterior predictive family.

2. Why this matters

A model can have:

  • clean MCMC diagnostics
  • seemingly sensible coefficient signs
  • elegant priors

and still fail to reproduce basic features of the target series.

Posterior predictive checks catch that mismatch.

This matters because a model that cannot reproduce the observed target well enough is usually not ready for:

  • decomposition narratives
  • ROI or CPA interpretation
  • budget optimisation
  • strong causal storytelling

3. How Abacus supports it

Abacus exposes posterior predictive sampling directly:

post = mmm.sample_posterior_predictive(
    X=X,
    progressbar=False,
    random_seed=42,
)

It also exposes retained plotting helpers such as:

figure, axes = mmm.plot.posterior_predictive(var=[mmm.output_var])
residual_figure, residual_axes = mmm.plot.residuals_over_time(hdi_prob=[0.94])

In the structured runner, Stage 30 assessment writes a fuller set of artefacts:

  • 30_model_assessment/posterior_predictive.nc
  • 30_model_assessment/posterior_predictive.png
  • 30_model_assessment/posterior_predictive_summary.csv
  • 30_model_assessment/observed.csv
  • 30_model_assessment/fitted.csv
  • 30_model_assessment/fit_timeseries.png
  • 30_model_assessment/fit_scatter.png
  • 30_model_assessment/residuals.csv
  • 30_model_assessment/residuals_timeseries.png
  • 30_model_assessment/residuals_hist.png
  • 30_model_assessment/residuals_vs_fitted.png

That assessment stage is the closest Abacus comes to a retained, systematically-produced posterior predictive diagnostics bundle.

4. What to inspect

Observed versus fitted over time

Start with the time-series overlay.

Ask:

  • Does the fitted mean track the major movements in the target?
  • Are the predictive intervals wide enough to cover the observed series reasonably often?
  • Does the model systematically lag turning points or seasonal peaks?

If the observed line keeps sitting outside the predictive interval in structured ways, the model is missing something systematic rather than merely being noisy.
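One way to quantify "reasonably often" is the empirical coverage of the predictive interval. The sketch below assumes you have posterior predictive draws as an array of shape (draws, weeks) and uses a central (equal-tailed) 94% interval for simplicity; the function name and the simulated data are assumptions for illustration.

```python
import numpy as np

def interval_coverage(predictive_draws, observed, prob=0.94):
    """Share of observations inside the central `prob` predictive interval."""
    tail = (1.0 - prob) / 2.0
    lower, upper = np.quantile(predictive_draws, [tail, 1.0 - tail], axis=0)
    inside = (observed >= lower) & (observed <= upper)
    return float(inside.mean()), inside

rng = np.random.default_rng(6)
n_draws, n_weeks = 2000, 156
signal = 100.0 + 10.0 * np.sin(np.arange(n_weeks) * 2 * np.pi / 52)

observed = signal + rng.normal(scale=5.0, size=n_weeks)
# Stand-in for posterior predictive draws from a well-specified model.
predictive_draws = signal + rng.normal(scale=5.0, size=(n_draws, n_weeks))

coverage, inside = interval_coverage(predictive_draws, observed)
print(f"94% interval coverage: {coverage:.2f}")
print("weeks outside the interval:", np.flatnonzero(~inside))
```

Coverage near the nominal level is reassuring, but also check where the misses fall. Misses bunched around seasonal peaks or a particular period point to missing structure rather than noise.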

Residual structure

Residuals should not show strong unresolved patterns.

In practice, look for:

  • long runs of positive residuals followed by long runs of negative residuals
  • clear seasonality left in the residuals
  • residual variance increasing with fitted values
  • one panel slice fitting much worse than the others

The presence of structure in the residuals usually means the model is still under-specified for the data.
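Two familiar statistics give a quick numerical read on residual structure: the lag-1 autocorrelation and the Durbin-Watson statistic. The sketch below applies them to a residual series, for example observed minus posterior-mean fitted values. The helper name and the simulated residuals are assumptions for illustration.

```python
import numpy as np

def residual_checks(residuals):
    """Lag-1 autocorrelation and Durbin-Watson statistic for a residual series."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()  # centre, as OLS residuals with an intercept would be
    lag1 = float((e[:-1] @ e[1:]) / (e @ e))
    durbin_watson = float(np.sum(np.diff(e) ** 2) / (e @ e))
    return lag1, durbin_watson

rng = np.random.default_rng(9)
weeks = np.arange(260)

white_noise = rng.normal(size=weeks.size)
# Residuals that still contain an annual cycle the model failed to absorb.
leftover_seasonality = 3.0 * np.sin(2 * np.pi * weeks / 52) + rng.normal(size=weeks.size)

lag1_white, dw_white = residual_checks(white_noise)
lag1_seasonal, dw_seasonal = residual_checks(leftover_seasonality)
print(f"white noise:          lag-1 = {lag1_white:.2f}, DW = {dw_white:.2f}")
print(f"leftover seasonality: lag-1 = {lag1_seasonal:.2f}, DW = {dw_seasonal:.2f}")
```

A Durbin-Watson value near 2 is consistent with unstructured residuals. Values well below 2 indicate the long positive and negative runs described above.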

Scatter of fitted versus observed

The fitted-versus-observed scatter is not a formal test, but it quickly shows:

  • compression toward the mean
  • systematic underprediction at high values
  • systematic overprediction at low values

This is the Bayesian cousin of the fitted-value plots you would inspect after a classical regression.

5. What “good” posterior predictive behaviour looks like

A good posterior predictive check does not mean the model matches every wiggle exactly.

You are looking for something more practical:

  • the main level and variation are captured
  • the observed series falls inside plausible predictive ranges often enough
  • residuals are not strongly structured
  • panel slices are not failing in obviously asymmetric ways

The question is whether the model is adequate for interpretation, not whether it is perfect.

6. What posterior predictive checks cannot prove

This is the most important warning.

A model can pass posterior predictive checks and still fail as a causal model.

Why? Because posterior predictive checks evaluate prediction of the target, not causal attribution of the components.

Two models can predict sales equally well while assigning very different shares of those sales to:

  • baseline
  • media
  • controls
  • seasonality
  • events

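That is why posterior predictive checking must be paired with:

  • substantive economic reasoning about what each component should explain
  • lift test calibration where experiments are available
  • domain expertise about the plausibility of the attributions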

7. Common failure patterns

The model is too rigid

If the fitted line misses broad movements or regime changes, the model may need more structural flexibility, for example in trend, seasonality, controls, or events.

The model is too flexible in the wrong place

You may see good in-sample fit but strange residual behaviour or unstable attribution because the model is fitting noise through components that should remain more constrained.

Media is carrying baseline structure

If media spend is strongly correlated with time patterns, the model may let media soak up baseline variation that should have been handled by intercept, seasonality, controls, or other additive structure.

Baseline is carrying media structure

The reverse can also happen: a very flexible baseline can absorb variation that you would otherwise attribute to media.

8. What to do when checks fail

If posterior predictive checks look bad, resist the temptation to jump straight to interpreting coefficients anyway.

Instead:

  1. Check convergence first.
  2. Inspect residual structure rather than only aggregate fit.
  3. Revisit baseline specification, controls, seasonality, events, and media transformation choices.
  4. Refit and compare again.

In other words, use posterior predictive checking as a model-development tool, not just as a reporting plot.

9. Practical recommendation

In Abacus, the robust sequence is:

  1. Run prior predictive checks before fitting.
  2. Fit the model and verify MCMC diagnostics.
  3. Run posterior predictive checks and inspect residuals.
  4. Only then move to contributions, optimisation, or causal interpretation.

That order mirrors how a careful econometrician would already work, except that the Bayesian workflow makes the predictive-check step much richer and more honest about uncertainty.

Model Comparison

You have spent your career comparing models with AIC, BIC, adjusted $R^2$, and the occasional likelihood ratio test. These tools are elegant, fast, and deeply embedded in econometric practice. They are also, in the Bayesian setting, either inapplicable or subtly misleading. This document explains the Bayesian model comparison toolkit — LOO-CV, ELPD, posterior predictive checks — by mapping each concept back to something you already understand. We also discuss the pitfalls that arise when comparing ELPD across models, because this is where even experienced practitioners make mistakes.


1. Why AIC and BIC Do Not Transfer Cleanly

AIC and BIC are derived from the maximised log-likelihood and a penalty term that counts the number of free parameters. The logic is intuitive: a model that fits the data well (high log-likelihood) but uses many parameters (high complexity) is penalised, preventing overfitting.

In a Bayesian model, the concept of “number of free parameters” becomes ambiguous. Consider a hierarchical prior on media coefficients: eight channel-level coefficients are partially pooled toward a shared group mean. Are there eight free parameters, or one? The answer depends on how much pooling the data induces. If the group mean dominates, the effective number of parameters is closer to one. If each channel estimate ignores the group mean, the effective number is closer to eight. The truth lies somewhere in between, and it changes depending on the data.

BIC fares no better. Its derivation assumes that the posterior concentrates on a single point (the MLE) as the sample size grows. In a fully Bayesian model with informative priors and moderate sample sizes — precisely the setting of most MMMs — this assumption fails. The posterior is a genuine distribution, not a spike, and BIC’s penalty term does not account for the regularisation imposed by the prior.

You can still compute AIC and BIC from a Bayesian model by plugging in the posterior mean and the nominal parameter count, and some software will do this for you. But the resulting numbers do not have their usual theoretical justification, and they can mislead you into selecting the wrong model.

2. LOO-CV: The Gold Standard for Predictive Model Comparison

The Bayesian replacement for information criteria is Leave-One-Out Cross-Validation (LOO-CV), computed via an efficient approximation called Pareto-Smoothed Importance Sampling (PSIS-LOO). The implementation in ArviZ (which Abacus uses) makes this computation fast enough to run routinely.

The intuition maps directly to something every econometrician understands: out-of-sample prediction. Imagine you have $N$ observations. For each observation $i$, you refit the model on the remaining $N - 1$ observations and compute the predictive density for the held-out observation $i$. Taking the log of each of these $N$ predictive densities and summing them gives you the Expected Log Pointwise Predictive Density (ELPD).

In practice, you do not actually refit the model $N$ times. PSIS-LOO uses importance sampling to approximate each leave-one-out posterior from the full-data posterior, making the computation nearly free once the model has been fitted. The Pareto-smoothing step stabilises the importance weights, and the shape parameter of the fitted Pareto distribution (the Pareto-$k$ diagnostic) tells you how reliable each approximation is.
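The importance-sampling idea can be checked on a model simple enough to refit exactly. The sketch below uses a conjugate normal-mean model (an assumption chosen because its leave-one-out posteriors have closed forms) and plain importance sampling without the Pareto-smoothing step, so it illustrates the principle rather than PSIS-LOO itself. ELPD is taken as the sum of the pointwise log predictive densities.

```python
import numpy as np

rng = np.random.default_rng(7)
n_obs, n_draws = 50, 4000
prior_sd, noise_sd = 10.0, 1.0
y = rng.normal(loc=2.0, scale=noise_sd, size=n_obs)

def posterior_mean_var(data):
    """Conjugate posterior for mu: y ~ Normal(mu, noise_sd), mu ~ Normal(0, prior_sd)."""
    precision = 1.0 / prior_sd**2 + data.size / noise_sd**2
    return data.sum() / noise_sd**2 / precision, 1.0 / precision

def normal_logpdf(x, mean, var):
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

# Full-data posterior draws and the pointwise log-likelihood matrix (draws x observations).
mean, var = posterior_mean_var(y)
mu_draws = rng.normal(mean, np.sqrt(var), size=n_draws)
log_lik = normal_logpdf(y[None, :], mu_draws[:, None], noise_sd**2)

# Importance-sampling LOO: weights proportional to 1 / p(y_i | theta_s),
# computed with a max-shift for numerical stability.
neg = -log_lik
shift = neg.max(axis=0)
log_mean_inverse = shift + np.log(np.mean(np.exp(neg - shift), axis=0))
elpd_is = float(np.sum(-log_mean_inverse))

# Exact LOO: refit without observation i, then score it under the predictive density.
elpd_exact = 0.0
for i in range(n_obs):
    m_i, v_i = posterior_mean_var(np.delete(y, i))
    elpd_exact += float(normal_logpdf(y[i], m_i, v_i + noise_sd**2))

print(f"importance-sampling ELPD: {elpd_is:.2f}")
print(f"exact leave-one-out ELPD: {elpd_exact:.2f}")
```

In this well-behaved model the raw importance weights are stable and the two numbers agree closely. Pareto smoothing matters when some observations are influential enough to make the raw weights heavy-tailed.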

For an econometrician, ELPD is the Bayesian analogue of the out-of-sample log-likelihood that motivates AIC. In fact, AIC can be interpreted as an asymptotic approximation to LOO-CV. The difference is that LOO-CV makes no asymptotic assumptions, fully accounts for the prior, and works correctly even when the effective number of parameters is ambiguous.

3. Reading the ELPD Output

When you run az.loo() in ArviZ (or access LOO diagnostics through an Abacus model), the output reports several quantities that deserve careful interpretation.

The first is elpd_loo, the estimated expected log pointwise predictive density. This is a single number that summarises the model’s out-of-sample predictive performance. Higher (less negative) values indicate better predictive accuracy. On its own, the absolute value of ELPD is not very informative — it depends on the scale of the data and the number of observations. ELPD becomes useful only when you compare it across models fitted to the same data.

The second is p_loo, the effective number of parameters. This quantity captures the complexity of the model as measured by how much each observation influences its own prediction. A model with strong regularisation (tight priors, heavy pooling) will have a small $p_\text{loo}$ relative to its nominal parameter count, because the priors constrain the flexibility. A model with weak regularisation will have $p_\text{loo}$ closer to the nominal count. If $p_\text{loo}$ exceeds the nominal number of parameters, the model is misspecified or the PSIS approximation has broken down.

The third is se_elpd_loo, the standard error of the ELPD estimate. This is crucial for model comparison and is where many practitioners make errors. We address this in detail below.

4. Comparing Models: The ELPD Difference and Its Standard Error

Suppose you have fitted two models to the same dataset and computed ELPD for each. Model A has $\text{ELPD}_A = -320$ and Model B has $\text{ELPD}_B = -315$. Model B appears to predict better. But is the difference meaningful, or is it within noise?

The function az.compare() in ArviZ computes the difference $\Delta\text{ELPD} = \text{ELPD}_B - \text{ELPD}_A$ and its standard error. The standard error of the difference is computed from the pointwise ELPD values (one per observation), accounting for the correlation between the two models’ predictions.

The interpretation is analogous to a classical hypothesis test. If $|\Delta\text{ELPD}|$ is large relative to its standard error (say, greater than 2 SE), you have reasonable evidence that one model predicts better than the other. If the difference is smaller than 2 SE, the models are indistinguishable in predictive performance, and you should prefer the simpler or more interpretable model on non-statistical grounds.

However — and this is the critical caveat — the standard error of $\Delta\text{ELPD}$ is itself an estimate, and it can be unreliable when the pointwise ELPD differences are heavy-tailed. A handful of influential observations (outliers that one model handles much better than the other) can inflate the standard error dramatically, making a genuine difference look insignificant. Conversely, if both models fail on the same outliers in the same way, the standard error can be artificially small, making a meaningless difference look significant.

The practical recommendation is to always inspect the pointwise ELPD differences alongside the aggregate comparison. If a small number of observations drive most of the difference, investigate those observations individually before concluding that one model is superior.
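The aggregate comparison and the pointwise inspection both start from the same vector of pointwise differences. The sketch below assumes you have one pointwise ELPD value per observation for each model; the helper name and the simulated values are hypothetical. The standard error uses the common $\sqrt{N}$ times the standard deviation of the differences.

```python
import numpy as np

def compare_pointwise(elpd_a, elpd_b, top=3):
    """ELPD difference (B minus A), its standard error, and the most influential points."""
    diff = np.asarray(elpd_b, dtype=float) - np.asarray(elpd_a, dtype=float)
    n = diff.size
    delta = float(diff.sum())
    se = float(np.sqrt(n * diff.var(ddof=1)))
    influential = np.argsort(-np.abs(diff))[:top]   # largest absolute differences first
    return delta, se, influential

rng = np.random.default_rng(8)
n_obs = 156
# Hypothetical pointwise ELPD values; real ones come from the LOO computation.
elpd_a = rng.normal(loc=-2.05, scale=0.4, size=n_obs)
elpd_b = elpd_a + rng.normal(loc=0.03, scale=0.1, size=n_obs)
elpd_b[17] += 3.0   # one observation that model B handles far better

delta, se, influential = compare_pointwise(elpd_a, elpd_b)
print(f"delta ELPD = {delta:.2f}, SE = {se:.2f}, ratio = {delta / se:.2f}")
print("most influential observations:", influential)
```

If removing the top one or two observations changes the verdict, the comparison is about those observations rather than about the models as a whole.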

5. Pareto-k Diagnostics: When to Trust the Approximation

The PSIS-LOO approximation relies on importance sampling, and importance sampling can fail when individual observations are highly influential — that is, when removing a single observation substantially changes the posterior. The Pareto-$k$ diagnostic measures this influence for each observation.

For an econometrician, Pareto-$k$ plays a role analogous to Cook’s distance or leverage in OLS diagnostics. A high-leverage observation in OLS disproportionately influences the coefficient estimates. A high Pareto-$k$ observation in LOO-CV disproportionately influences the ELPD estimate, and the importance sampling approximation for that observation may be unreliable.

The conventional thresholds are straightforward. Pareto-$k$ values below 0.7 indicate that the PSIS approximation is reliable for that observation. Values between 0.7 and 1.0 indicate that the estimate for that observation converges slowly and may be biased, so treat it with caution and consider the remedies below. Values above 1.0 indicate that the importance sampling approximation has broken down for that observation, and the reported ELPD is not trustworthy.

When you encounter high Pareto-$k$ values, several remedies are available. The simplest is moment matching, an option in ArviZ that improves the approximation for problematic observations. If that fails, you can refit the model with the offending observations actually held out (exact LOO-CV for those points only). More fundamentally, high Pareto-$k$ values often signal that the model is misspecified for those observations — perhaps they are genuine outliers, or the model’s functional form fails in that region of the data. Investigating why specific observations are influential is often more valuable than fixing the diagnostic.

6. Posterior Predictive Checks: The Bayesian Goodness-of-Fit Test

ELPD and LOO-CV are relative metrics: they tell you which model predicts better, but they cannot tell you whether any of your models predict well in an absolute sense. For that, you need posterior predictive checks.

The idea is simple. Once you have fitted a model, you generate simulated datasets from the posterior predictive distribution — that is, you sample parameter values from the posterior and then simulate new data from the likelihood. You then compare the distribution of these simulated datasets to the observed data. If the simulations look like the real data, the model is capturing the key features of the data-generating process. If not, the model is missing something important.

For an econometrician, posterior predictive checks are the Bayesian analogue of residual diagnostics, but more powerful. Instead of checking whether residuals are normally distributed or homoscedastic, you can check any feature of the data. Does the model reproduce the seasonal pattern? Does it capture the right degree of week-to-week volatility? Does the distribution of simulated total annual sales match the observed total? Each of these questions becomes a visual or numerical comparison between the real data and the posterior predictive distribution.

The key advantage over classical residual analysis is that posterior predictive checks incorporate parameter uncertainty. Classical residuals are computed at the point estimate, which can mask model deficiencies when the standard errors are large. Posterior predictive simulations are drawn from the full posterior, so they honestly reflect how much the model’s predictions could vary even if the model is correctly specified.

In practice, we recommend running posterior predictive checks before computing ELPD or comparing models. If the posterior predictive distribution fails to reproduce basic features of the data (the mean, the variance, the seasonal pattern), the model is misspecified at a fundamental level, and comparing its ELPD to another model’s ELPD is an exercise in choosing the least bad option rather than selecting a good model.

7. When Model Comparison Is Meaningful and When It Is Not

Not all model comparisons are informative, and econometricians should exercise the same caution here that they would when comparing nested versus non-nested classical specifications.

ELPD comparisons are meaningful when the two models are fitted to exactly the same dataset, with exactly the same observations and the same target variable. If one model drops missing values differently, or transforms the target variable (e.g., one model predicts $y$ and the other predicts $\log(y)$), the ELPD values are on different scales and cannot be compared directly. This is analogous to the well-known prohibition against comparing $R^2$ across models with different dependent variables in classical econometrics.

ELPD comparisons are also meaningful only when the Pareto-$k$ diagnostics are acceptable for both models. If one model has many observations with Pareto-$k$ above 0.7, the conventional reliability threshold for PSIS-LOO, its ELPD estimate is unreliable, and the comparison is confounded by approximation error rather than genuine predictive differences.
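
A simple gate on the diagnostics can be sketched as follows. This assumes you already have pointwise Pareto-$k$ values from a PSIS-LOO routine. The helper name `pareto_k_report` is hypothetical, the values are invented for illustration, and the default threshold of 0.7 follows the guidance in Vehtari, Gelman, and Gabry (2017) but is exposed as a parameter.

```python
def pareto_k_report(k_values, threshold=0.7):
    """Count and locate observations whose Pareto-k exceeds the threshold."""
    flagged = [i for i, k in enumerate(k_values) if k > threshold]
    return {
        "n_flagged": len(flagged),
        "share_flagged": len(flagged) / len(k_values),
        "indices": flagged,
    }


# Hypothetical pointwise Pareto-k values for two models on the same 8 observations.
k_model_a = [0.12, 0.31, 0.05, 0.44, 0.28, 0.66, 0.19, 0.40]
k_model_b = [0.15, 0.92, 0.07, 0.75, 0.30, 1.10, 0.22, 0.41]

report_a = pareto_k_report(k_model_a)
report_b = pareto_k_report(k_model_b)

# Only trust the ELPD comparison if neither model has flagged observations.
comparison_is_trustworthy = report_a["n_flagged"] == 0 and report_b["n_flagged"] == 0
print(report_a, report_b, comparison_is_trustworthy)
```

The flagged indices are worth inspecting directly: they often correspond to outliers or highly influential periods rather than to a problem with the approximation alone.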

ELPD comparisons are less informative when the models differ in ways that do not affect prediction but do affect causal interpretation. Two models might produce nearly identical ELPD values — predicting sales equally well out of sample — while attributing completely different proportions of sales to TV versus search. This is the identification problem discussed in the causal identification FAQ: predictive equivalence does not imply causal equivalence. A model that attributes 30% of sales to TV and 10% to search might predict just as well as a model that attributes 20% to each, because the total media contribution is the same. ELPD cannot distinguish between these models, because it evaluates prediction, not attribution.

For this reason, we recommend treating ELPD as a necessary but not sufficient criterion for model selection. Use it to eliminate models that predict poorly. Use posterior predictive checks to verify that the surviving models capture the essential features of the data. Then use substantive economic reasoning, lift test calibration, and domain expertise to choose among predictively equivalent models based on the plausibility of their causal attributions.

8. A Practical Mapping from Classical to Bayesian Model Selection

To consolidate the discussion, here is how each classical tool maps to its Bayesian replacement.

Adjusted $R^2$ measures in-sample fit penalised by the number of parameters. The Bayesian analogue is the posterior predictive $R^2$ proposed by Gelman, Goodrich, Gabry, and Vehtari (2019), which computes $R^2$ from the posterior predictive distribution rather than a point estimate. Unlike classical adjusted $R^2$, posterior predictive $R^2$ comes with a full distribution (one value per posterior draw), so you can report its uncertainty.
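
The per-draw calculation can be sketched with the residual-based form of the statistic, $R^2_s = \mathrm{Var}(\hat{y}_s) / (\mathrm{Var}(\hat{y}_s) + \mathrm{Var}(y - \hat{y}_s))$. The helper name `bayes_r2_draws` is hypothetical, and the data and "posterior draws" below are simulated stand-ins rather than output from a fitted Abacus model.

```python
import random
import statistics


def bayes_r2_draws(y, predicted_means):
    """One R^2 per posterior draw: var(fit) / (var(fit) + var(residual))."""
    r2 = []
    for mu in predicted_means:
        residuals = [yi - mi for yi, mi in zip(y, mu)]
        var_fit = statistics.variance(mu)
        var_res = statistics.variance(residuals)
        r2.append(var_fit / (var_fit + var_res))
    return r2


# Simulated placeholder data: a linear signal plus noise.
rng = random.Random(1)
x = [rng.uniform(0.0, 10.0) for _ in range(100)]
y = [2.0 + 1.5 * xi + rng.gauss(0.0, 2.0) for xi in x]

# Stand-in for posterior draws: one slope per draw, applied to every observation.
slopes = [rng.gauss(1.5, 0.05) for _ in range(200)]
predicted_means = [[2.0 + b * xi for xi in x] for b in slopes]

r2 = bayes_r2_draws(y, predicted_means)
cuts = statistics.quantiles(r2, n=20)  # 19 cut points: 5%, 10%, ..., 95%
print(f"median R2 = {statistics.median(r2):.3f}, 90% interval = ({cuts[0]:.3f}, {cuts[-1]:.3f})")
```

Because the result is a list with one value per draw, any summary (median, interval, histogram) can be reported instead of a single number.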

AIC measures asymptotic out-of-sample predictive performance. The Bayesian analogue is ELPD estimated via PSIS-LOO. ELPD is more general (no asymptotic assumptions), fully accounts for the prior, and handles hierarchical models correctly.

BIC targets model identification rather than prediction (it is consistent for the true model as $N \to \infty$). There is no direct Bayesian analogue that serves the same purpose, because Bayesian model comparison via ELPD is inherently predictive. If you want to identify the “true” model in a Bayesian framework, you would use Bayes factors, but Bayes factors are sensitive to the prior specification in ways that ELPD is not, and we do not generally recommend them for MMM applications.

The likelihood ratio test compares nested models by examining whether the additional parameters significantly improve the likelihood. The Bayesian replacement is the ELPD difference with its standard error. If the ELPD difference exceeds roughly 2 standard errors, the more complex model predicts meaningfully better. If not, prefer the simpler model.
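
The decision rule can be sketched from pointwise ELPD contributions. The standard error of the difference is computed from the paired pointwise differences, $\mathrm{SE} = \sqrt{n \cdot \mathrm{Var}(d_i)}$. The helper name `elpd_difference` and the six pointwise values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
import statistics


def elpd_difference(pointwise_a, pointwise_b):
    """ELPD difference (a minus b) and its standard error from paired pointwise values."""
    diffs = [a - b for a, b in zip(pointwise_a, pointwise_b)]
    n = len(diffs)
    diff = sum(diffs)
    se = math.sqrt(n * statistics.variance(diffs))
    return diff, se


# Hypothetical pointwise log predictive densities for the same six observations.
pointwise_complex = [-1.0, -1.2, -0.9, -1.1, -1.0, -1.3]
pointwise_simple = [-1.1, -1.2, -1.0, -1.3, -1.0, -1.4]

diff, se = elpd_difference(pointwise_complex, pointwise_simple)

# Prefer the more complex model only if it is ahead by more than about 2 SE.
prefer_complex = diff > 2.0 * se
print(f"diff = {diff:.3f}, se = {se:.3f}, prefer complex: {prefer_complex}")
```

A real comparison would use hundreds of observations. With so few points the standard error itself is poorly estimated, so treat the threshold as a rough guide rather than a formal test.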

Classical residual diagnostics (Durbin-Watson, Breusch-Pagan, Q-Q plots) check model assumptions after fitting. The Bayesian replacement is posterior predictive checking, which is more flexible (you can check any data feature, not just residual properties) and more honest (it incorporates parameter uncertainty).

In every case, the Bayesian tool is at least as informative as its classical counterpart and often more so. The cost is unfamiliarity. We hope this document has reduced that cost.

Causal Identification

If you are a classically trained econometrician, you have every right to be sceptical of Marketing Mix Models. The causal identification strategy underpinning MMM is weaker than the methods you were taught to trust. This document confronts that reality head-on: we explain what MMM can and cannot claim causally, where the identifying assumptions break down, and how modern calibration techniques partially rescue the framework. We also place MMM on the “causal ladder” relative to the gold-standard methods you already know.

Our goal is not to oversell MMM. It is to give you an honest accounting of the trade-offs, so you can deploy the tool where it is defensible and flag where it is not.


1. The Identification Problem, Plainly Stated

Every causal claim rests on an identification strategy — a logical argument for why the estimated relationship reflects a true causal effect rather than a statistical artefact. In classical econometrics, you learned several strategies, each with a well-understood set of assumptions. Consider three that you know well.

A randomised controlled trial (RCT) identifies a causal effect by physically randomising treatment assignment. Because randomisation breaks the link between treatment and all confounders (observed and unobserved), the simple difference in means is an unbiased estimator of the average treatment effect. The assumption is minimal: the randomisation was executed correctly.

An instrumental variables (IV/2SLS) estimator identifies a causal effect by exploiting an instrument: a variable that affects the outcome only through the endogenous treatment. The identifying assumptions are relevance (the instrument predicts the treatment) and the exclusion restriction (the instrument has no direct effect on the outcome). Relevance is directly testable from the first stage. The exclusion restriction is only partially testable, for example through overidentification tests when there are more instruments than endogenous regressors.

A difference-in-differences (DiD) estimator identifies a causal effect by comparing the change in outcomes over time between a treated and control group. The identifying assumption is parallel trends: absent treatment, the two groups would have followed the same trajectory. Again, this assumption is partially testable using pre-treatment data.

Now consider what MMM does. An MMM estimates media effects by regressing sales (or another KPI) on media spend and controls over time. The variation it exploits is temporal: weeks when TV spend was high are compared to weeks when TV spend was low, after controlling for seasonality, trend, and other observables.

The identifying assumption is strict exogeneity of the media regressors, conditional on the controls. In plain language: after we account for trend, seasonality, holidays, and any included control variables, the remaining variation in media spend is “as good as random” with respect to the error term. If an unobserved, time-varying confounder drives both media spend and sales simultaneously — and we have not controlled for it — the media coefficient is biased.

This is a strong assumption. And unlike the IV exclusion restriction or the DiD parallel trends assumption, it is essentially untestable. You cannot run a placebo check on an unobserved confounder you have not measured.

2. Where the Assumptions Break Down

The strict exogeneity assumption fails in practice more often than MMM practitioners care to admit. Consider three common violations.

The first is simultaneity. Media planners increase spend during periods when they expect sales to be high (Christmas, product launches, promotional windows). Sales are high in those periods not because of the advertising but because of the underlying demand shock. The MMM attributes the demand shock to the media channel, inflating its estimated effect. This is textbook endogeneity, identical to the problem that motivates IV estimation in labour economics or IO.

The second is omitted variable bias from time-varying confounders. Suppose a competitor launches an aggressive pricing campaign in Q3, simultaneously causing your sales to drop and your marketing team to increase defensive spend. The MMM sees high spend coinciding with low sales and may underestimate the media effect. If instead the competitor withdraws, the reverse happens. Without a “competitor activity” control, the media coefficient absorbs the confounding variation.

The third is functional form misspecification. Even if the true data-generating process satisfies strict exogeneity, specifying the wrong functional form (linear when the truth is concave, or missing an interaction between channels) introduces bias. MMM frameworks like Abacus mitigate this with flexible non-linear transforms (adstock, saturation), but no parametric family can guarantee correct specification.
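
The first violation, simultaneity, can be illustrated with a small simulation. The data-generating process below is an assumption made purely for illustration: a demand shock raises both spend and sales, the true media effect is 0.5, and all coefficients are invented. The naive regression of sales on spend overstates the effect. Controlling for the demand shock (here via residualisation) recovers it, but only because the simulation lets us observe the confounder.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualise(y, x):
    """Residuals of y after regressing it on x (with an intercept)."""
    slope = ols_slope(x, y)
    intercept = mean(y) - slope * mean(x)
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


rng = random.Random(42)
TRUE_EFFECT = 0.5
n = 5000

# Assumed data-generating process: demand drives both spend and sales.
demand = [rng.gauss(0.0, 1.0) for _ in range(n)]
spend = [1.0 + 0.8 * d + rng.gauss(0.0, 0.6) for d in demand]
sales = [2.0 + TRUE_EFFECT * s + 1.0 * d + rng.gauss(0.0, 0.5) for s, d in zip(spend, demand)]

naive = ols_slope(spend, sales)
controlled = ols_slope(residualise(spend, demand), residualise(sales, demand))
print(f"true = {TRUE_EFFECT}, naive = {naive:.2f}, controlling for demand = {controlled:.2f}")
```

In a real MMM the demand shock is usually unobserved, so the second regression is not available. That is exactly the gap that controls, calibration, and sensitivity analysis try to narrow.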

3. How Lift Test Calibration Partially Rescues MMM

Modern Bayesian MMM frameworks, including Abacus, address the endogeneity problem through calibration with incrementality experiments (lift tests or geo-experiments). The logic works as follows.

A lift test is a controlled experiment — typically a geo-randomised or matched-market design — in which media exposure is deliberately varied across treatment and control regions. Because the variation is experimentally induced, the resulting incremental estimate is causally identified in the RCT sense, at least for the specific channel, time window, and geography tested.

When you feed this lift test estimate into the MMM (via the EventAdditiveEffect or lift test calibration API in Abacus), you inject an external piece of causal evidence into the model’s likelihood. The Bayesian machinery then updates the media coefficient posterior to be consistent with both the observational time-series data and the experimental result. In effect, the lift test acts as an anchor: it constrains the media coefficient to a causally credible region, even if the observational data alone would have produced a biased estimate.

Think of the lift test as playing a role analogous to an instrumental variable. The IV provides exogenous variation that identifies the causal effect. The lift test provides exogenous variation (from the experiment) that calibrates the observational estimate. The difference is that the IV is embedded inside the estimator, whereas the lift test enters as an informative prior or likelihood penalty.
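
The anchoring intuition can be shown with a textbook normal-normal update, in which two independent estimates of the same effect are combined with precision weights. This is a simplified illustration only and does not reproduce how Abacus implements calibration. The helper name `combine_normal` and all numbers are hypothetical.

```python
def combine_normal(mean_obs, sd_obs, mean_lift, sd_lift):
    """Precision-weighted combination of two independent normal estimates."""
    w_obs = 1.0 / sd_obs**2
    w_lift = 1.0 / sd_lift**2
    mean = (w_obs * mean_obs + w_lift * mean_lift) / (w_obs + w_lift)
    sd = (w_obs + w_lift) ** -0.5
    return mean, sd


# Hypothetical inputs: a possibly inflated observational estimate with wide
# uncertainty, and a tighter experimental estimate from a lift test.
mean_obs, sd_obs = 1.3, 0.4
mean_lift, sd_lift = 0.6, 0.2

combined_mean, combined_sd = combine_normal(mean_obs, sd_obs, mean_lift, sd_lift)
print(f"combined effect = {combined_mean:.2f} +/- {combined_sd:.2f}")
```

The more precise source dominates: the combined estimate sits much closer to the lift test than to the observational estimate, and its uncertainty is smaller than either input.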

This approach does not eliminate all bias. The lift test identifies the causal effect for one channel in one time window. Extrapolating that result across all channels and all time periods requires additional assumptions (stability of the effect over time, no interaction between the calibrated and uncalibrated channels). But it is a genuine improvement over pure observational MMM, and it brings the framework closer to the causal credibility that econometricians demand.

4. MMM on the Causal Ladder

We can place MMM relative to the methods you trust by thinking about a hierarchy of identification strategies, ordered from the most credible identifying assumptions to the least.

At the top sits the RCT. Randomisation eliminates all confounding, and the only threat to validity is implementation failure (non-compliance, attrition, spillovers). For media measurement, the RCT analogue is a well-executed geo-experiment or a randomised holdout test. When you can run one, run one.

One rung below sits IV/2SLS. The instrument provides exogenous variation, but only if the exclusion restriction holds. In media measurement, genuine instruments are rare. Weather shocks that affect outdoor advertising exposure, or regulatory changes that force abrupt spend shifts, occasionally qualify. But most marketing datasets lack a credible instrument.

Below IV sit DiD and synthetic control methods. These exploit a treatment event (a campaign launch, a market entry) and compare treated versus control units under a parallel trends assumption. Geo-experiments with a staggered rollout fit naturally into this framework. The assumption is testable but not guaranteed.

Below DiD sits regression discontinuity (RD), which exploits a sharp threshold in treatment assignment. Media applications are uncommon because advertising spend rarely exhibits the kind of sharp discontinuity that RD requires.

At the bottom sits observational regression, which is where standard MMM lives. Its identification is the weakest in the hierarchy because it rests on the most demanding assumptions: conditional exogeneity given controls, correct functional form, and no unobserved time-varying confounders. Without external calibration, this is the least credible causal claim on the ladder.

However, MMM calibrated with lift tests occupies a hybrid position. The observational regression provides the structure and the time-series variation. The lift test provides a causally identified anchor point. Together, they produce an estimate that is stronger than pure observational regression but weaker than a full RCT across all channels. In practice, this hybrid is the best that most marketing organisations can achieve at scale, because running a separate RCT for every channel, every quarter, in every market, is prohibitively expensive.

5. The Role of DAGs and Structural Thinking

If you are trained in the Pearlian causal inference tradition (directed acyclic graphs, do-calculus, the structural causal model), you will recognise that MMM implicitly assumes a particular DAG. The assumed structure looks roughly like this: media spend causes sales, seasonality and trend cause sales, controls cause sales, and (critically) nothing unobserved simultaneously causes both media spend and sales after conditioning on the included controls.

Drawing this DAG explicitly is a powerful exercise. It forces you to articulate every backdoor path between media and sales, and to verify that your control set blocks them all. If you identify a backdoor path that your controls do not block — for example, “competitor pricing → our media spend” and “competitor pricing → our sales” — you have found a source of bias that the MMM cannot resolve without either adding a control for competitor pricing or calibrating with a lift test.
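
A first pass at this exercise can be sketched in code by writing the DAG as a list of edges and listing the direct common causes of media and sales that the control set does not cover. This is a deliberately simplified check: it looks only at direct common parents and is not a full implementation of the backdoor criterion. The helper name `unblocked_common_causes` and the node names are hypothetical.

```python
def unblocked_common_causes(edges, treatment, outcome, controls):
    """Direct common causes of treatment and outcome that are not controlled for."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
    common = parents.get(treatment, set()) & parents.get(outcome, set())
    return sorted(common - set(controls))


# Hypothetical DAG, written as (cause, effect) pairs.
edges = [
    ("media_spend", "sales"),
    ("seasonality", "media_spend"),
    ("seasonality", "sales"),
    ("competitor_pricing", "media_spend"),
    ("competitor_pricing", "sales"),
    ("trend", "sales"),
]
controls = ["seasonality", "trend"]

open_paths = unblocked_common_causes(edges, "media_spend", "sales", controls)
print(open_paths)
```

Any name in the result is a confounder that needs either a control variable or experimental calibration. An empty list from this check is necessary but not sufficient, because longer backdoor paths through intermediate variables are not examined.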

We strongly recommend that every MMM engagement begins with a causal DAG workshop, even an informal one. The DAG does not make the model causal. But it forces the team to be explicit about what they are assuming, and it provides a framework for discussing where the model’s causal claims are credible and where they are not.

6. Honest Counsel for Sceptical Econometricians

We close with five points of honest counsel.

First, do not treat MMM outputs as causal estimates with the same confidence you would place in a well-identified IV or DiD result. They are not. They are conditional associations, regularised by Bayesian priors and (ideally) anchored by experimental calibration.

Second, always ask: “What is the identifying variation?” If the answer is “weeks when spend was high versus weeks when spend was low,” follow up with: “Why was spend high in those weeks? Could the same factor that drove high spend also have driven high sales independently?” If the answer is “yes” or “maybe,” the estimate is potentially confounded.

Third, calibrate wherever possible. A single well-executed lift test for your largest channel does more for the credibility of the entire model than any amount of prior tuning or functional form experimentation.

Fourth, use the model for what it does well. MMM excels at relative channel comparison (channel A versus channel B), at budget allocation (given a fixed total budget, how should we distribute it?), and at scenario planning (what happens if we increase TV spend by 20%?). These tasks require correct ranking of media effects, not unbiased point estimation. Even a moderately biased MMM can rank channels correctly if the bias is roughly proportional across channels.

Fifth, be transparent with stakeholders. Present posterior credible intervals, not point estimates. Discuss the assumptions openly. Flag where calibration data exists and where it does not. The credibility of the framework depends not on pretending the model is an RCT, but on demonstrating that the team understands its limitations and has taken concrete steps to mitigate them.

Baseline vs Media Trade-offs

One of the most confusing experiences in MMM is this:

  • two specifications can fit the target series almost equally well
  • both can have acceptable diagnostics
  • yet they can assign very different amounts of the target to media versus baseline

This is not necessarily a bug in the software. It is a structural feature of the problem.

This page explains how that trade-off appears in Abacus and why you should expect it.

1. The decomposition problem

At a high level, Abacus builds the expected target from several additive components.

In the retained PanelMMM build path, the mean function can include:

  • intercept_contribution
  • channel_contribution
  • control_contribution, if you configure control_columns
  • mundlak_contribution, if use_mundlak_cre=True
  • yearly_seasonality_contribution, if yearly_seasonality is enabled
  • additional additive effects you attach before build, such as events or trend effects

The likelihood sees the sum of these pieces, not a directly observed “ground-truth baseline” and “ground-truth media” split.

That means the total fit can be easier to identify than the decomposition.

2. Why the trade-off exists

Suppose revenue rises every December and TV spend also rises every December.

Several stories can fit the same sales data reasonably well:

  • December uplift is mostly seasonality
  • December uplift is mostly TV
  • December uplift is partly both

If the model includes both a seasonal term and media terms, they will compete to explain the same observed movement.

This is the core baseline-versus-media trade-off:

the data often identify total explained variation better than they identify which component deserves the credit

Classical econometricians already know this as collinearity and omitted-variable competition. Bayesian MMM does not make that problem disappear. It makes the uncertainty around it more explicit.
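
The December example can be made concrete with an extreme, deliberately artificial case in which TV spend is perfectly aligned with the seasonal indicator. All numbers are invented. Three different attribution stories then reproduce the sales series exactly, so fit alone cannot choose between them.

```python
# 24 months of toy data: a December indicator and TV spend that only runs in December.
december = [1.0 if month % 12 == 11 else 0.0 for month in range(24)]
tv = [2.0 * d for d in december]
sales = [100.0 + 30.0 * d for d in december]  # a 30-unit December uplift


def fitted(seasonal_coef, tv_coef):
    """Additive mean: intercept + seasonal term + TV term."""
    return [100.0 + seasonal_coef * d + tv_coef * t for d, t in zip(december, tv)]


def sse(predicted):
    """Sum of squared errors against the toy sales series."""
    return sum((s - p) ** 2 for s, p in zip(sales, predicted))


# Three stories, expressed as (seasonal coefficient, TV coefficient).
stories = {
    "mostly seasonality": (30.0, 0.0),
    "mostly TV": (0.0, 15.0),
    "partly both": (10.0, 10.0),
}

errors = {name: sse(fitted(*coefs)) for name, coefs in stories.items()}
tv_share = {name: coefs[1] * 2.0 / 30.0 for name, coefs in stories.items()}
for name in stories:
    print(f"{name}: SSE = {errors[name]:.1f}, TV share of uplift = {tv_share[name]:.0%}")
```

Real data are rarely perfectly collinear, so the likelihood and the priors do carry some information about the split. The closer campaign timing tracks seasonality, though, the closer a real model gets to this degenerate case.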

3. What counts as “baseline” in Abacus

In Abacus, the baseline side comes from the terms you specify inside the PyMC graph.

Depending on configuration, that can include:

  • a static intercept
  • a time-varying intercept
  • yearly Fourier seasonality
  • controls
  • events
  • trend-like additive effects
  • Mundlak CRE adjustments in panel settings

So when people say “baseline absorbed the effect”, they usually mean one or more of those components, not a separate external decomposition engine.

4. How media can lose attribution

Media can lose attribution when the non-media side of the model is too good at explaining the same movements.

Common cases:

  • a flexible time-varying intercept captures medium-run swings that media could also explain
  • strong seasonal terms absorb repeating peaks that coincide with campaign timing
  • control variables proxy for media timing or market conditions too strongly
  • event effects explain demand spikes that were previously being picked up by channel coefficients

In each case, the model may still predict well. The question is how the variation is partitioned.

5. How media can steal attribution from baseline

The reverse failure is also common.

If the baseline side is under-specified, media channels can absorb variation that is not truly incremental media response.

Examples:

  • missing seasonality leaves recurring annual structure for media to explain
  • missing controls leave competitor, pricing, or macro effects for media to explain
  • missing events leave spikes for channels to absorb
  • insufficient baseline flexibility forces media to act as a trend proxy

This usually inflates media contribution and makes optimisation outputs look better than they should.

6. Why good fit does not settle the argument

You might hope that whichever specification predicts better must also have the more trustworthy attribution split.

Unfortunately, that does not follow.

A model can reproduce the observed target series very well while still having ambiguous attribution. Predictive adequacy is necessary, but it is not enough to identify the correct media decomposition.

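That is why the rest of this page focuses on sensitivity analysis, decomposition stability, and external evidence such as lift tests, rather than on fit alone.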

7. Signs that the trade-off is driving your result

Be cautious when you see any of the following:

  • very similar model fit with materially different channel contributions
  • large channel swings after adding or removing a seasonal or trend term
  • media ROI rankings that flip after adding controls or events
  • one highly flexible baseline term dominating decomposition while media contributions collapse
  • implausibly smooth media contributions paired with a very wiggly baseline, or vice versa

These are not proofs of misspecification, but they are strong prompts for sensitivity analysis.

8. What to do in practice

A disciplined Abacus workflow is usually better than trying to argue theoretically about the “right” split in the abstract.

Recommended approach:

  1. Start with a specification that has the minimum baseline structure you can defend.
  2. Add seasonal, control, event, or time-varying terms only when you can justify them substantively or diagnostically.
  3. Refit and compare decomposition stability, not just target fit.
  4. Report instability when attribution changes materially across defensible specifications.
  5. Where possible, bring in external evidence such as lift tests or calibration.

The important point is not to force one narrative prematurely. It is to show which attribution conclusions remain stable after reasonable specification changes.
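
Step 3 can be made routine by tabulating each channel's contribution share under every defensible specification and reporting the spread. The sketch below assumes you have already extracted those shares from each refit. The helper name `attribution_ranges`, the specification labels, the numbers, and the 0.05 tolerance are all hypothetical.

```python
def attribution_ranges(shares_by_spec):
    """Range (max minus min) of each channel's share across specifications."""
    channels = next(iter(shares_by_spec.values())).keys()
    return {
        channel: max(spec[channel] for spec in shares_by_spec.values())
        - min(spec[channel] for spec in shares_by_spec.values())
        for channel in channels
    }


# Hypothetical contribution shares from three refits of the same data.
shares_by_spec = {
    "minimal baseline": {"tv": 0.30, "search": 0.12, "social": 0.05},
    "with yearly seasonality": {"tv": 0.18, "search": 0.11, "social": 0.05},
    "with seasonality and events": {"tv": 0.16, "search": 0.12, "social": 0.04},
}

TOLERANCE = 0.05  # arbitrary illustrative threshold, not an Abacus default
ranges = attribution_ranges(shares_by_spec)
unstable = [channel for channel, spread in ranges.items() if spread > TOLERANCE]
print(ranges, unstable)
```

Channels that appear in `unstable` are the ones whose attribution should be reported with an explicit caveat or prioritised for a lift test.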

9. Abacus-specific interpretation

In Abacus, you should treat the decomposition outputs as conditional on the configured structure:

  • the chosen controls
  • whether yearly_seasonality is on
  • whether the intercept is time-varying
  • whether media effects are time-varying
  • whether you added events or other additive effects
  • whether use_mundlak_cre=True

Change the structure, and the attribution can change even when predictive fit does not move much.

That is normal. It is the software telling you where the data alone are not decisive.

10. Bottom line

Baseline-versus-media trade-offs are unavoidable in MMM because the observed target only reveals the sum of the contributing processes.

Abacus makes this explicit by fitting all configured terms inside one additive Bayesian graph. That is a strength, but it also means you need to read the decomposition as a conditional statement:

given this model structure, priors, and data, this is the most plausible attribution split

That is much more defensible than pretending the split is uniquely observed in the data.

Mundlak Specification Test

Background

Classical panel econometrics uses the Mundlak specification test (also called the Chamberlain–Mundlak test) to decide whether random effects (RE) or fixed effects (FE) should be preferred. Stata 19 implements this as estat mundlak — a Wald test on the auxiliary Mundlak γ coefficients:

  • H₀: RE is consistent (γ = 0 jointly), so the simpler RE model is adequate.
  • H₁: RE is inconsistent (γ ≠ 0), so CRE or FE is needed.

This test is the cluster-robust-compatible replacement for the classical Hausman test, which breaks under heteroskedasticity or within-cluster correlation.

Why It Does Not Apply to Abacus

Abacus is a fully Bayesian MMM framework. The Mundlak specification test is a frequentist hypothesis test and does not translate directly:

  1. No frequentist rejection framework. There is no Wald statistic or asymptotic chi-squared distribution. Bayesian inference does not produce p-values or binary accept/reject decisions.

  2. The posterior already answers the question. When use_mundlak_cre=True, the Mundlak γ coefficients receive priors and are estimated jointly with all other model parameters. If the posteriors of γ are concentrated near zero, the baseline panel specification was adequate. If they are clearly non-zero, the CRE correction is absorbing meaningful between-group confounding. You read the posterior — you do not need a separate test.

  3. Bayesian pooling is a continuum, not a binary choice. In Abacus, hierarchical shrinkage only appears when you encode it in the priors. Default PanelMMM panel priors are indexed by the panel coordinates, not automatically hierarchical. Once you choose hierarchical priors, there is no clean “pure RE” versus “pure FE” dichotomy to test between.

What to Do Instead

Inspect the γ posteriors directly

After fitting with use_mundlak_cre=True, examine the Mundlak coefficients:

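import arviz as az

# Summarise the Mundlak gamma coefficients from the fitted model.
# hdi_prob=0.94 is stated explicitly so the table matches the 94% HDI discussed below.
gamma_summary = az.summary(
    mmm.idata,
    var_names=["gamma_channel_mundlak"],
    hdi_prob=0.94,
)
print(gamma_summary)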

If the 94% HDI includes zero for all channels, the CRE correction is doing little. If the HDI excludes zero, the correction is absorbing real between-group correlation.

The diagnostics surface also reports these:

mmm.diagnostics.mcmc_summary()
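
The zero-inclusion rule described above can also be applied directly to raw posterior draws. The sketch below computes a shortest-interval HDI in plain Python. It is an illustration of the idea rather than a replacement for ArviZ, the helper names are hypothetical, and the draws are simulated stand-ins for two channels' γ posteriors.

```python
import math
import random


def hdi(draws, prob=0.94):
    """Shortest interval that contains a share `prob` of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    window = max(1, math.ceil(prob * n))
    start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[start], ordered[start + window - 1]


def excludes_zero(draws, prob=0.94):
    """True if the HDI lies entirely above or entirely below zero."""
    low, high = hdi(draws, prob)
    return low > 0.0 or high < 0.0


# Simulated stand-ins for gamma posteriors of two hypothetical channels.
rng = random.Random(7)
gamma_tv = [rng.gauss(0.02, 0.10) for _ in range(2000)]
gamma_search = [rng.gauss(0.35, 0.08) for _ in range(2000)]

print("tv excludes zero:", excludes_zero(gamma_tv))
print("search excludes zero:", excludes_zero(gamma_search))
```

Treat the result as a reading aid, not a hypothesis test: the width of the interval and the location of its mass matter as much as whether zero happens to fall inside it.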

Bayesian model comparison (optional, not currently in scope)

The formal Bayesian analogue of the specification test is model comparison via LOO-CV (leave-one-out cross-validation using Pareto-smoothed importance sampling):

  1. Fit with use_mundlak_cre=False.
  2. Fit with use_mundlak_cre=True.
  3. Compare ELPD (expected log predictive density) via az.compare().

This is currently out of scope for Abacus. LOO/WAIC were explicitly deferred in the project backlog. If formal Bayesian model comparison is needed later, it would be a separate feature.

Prior predictive checks

Verify that the prior on γ is not dominating the posterior. This is standard Bayesian workflow and is already supported via mmm.sample_prior_predictive().

Summary

| Approach | Framework | Available in Abacus |
| --- | --- | --- |
| Mundlak specification test (estat mundlak) | Frequentist | No (does not apply) |
| Posterior inspection of γ | Bayesian | Yes, via az.summary() / mmm.diagnostics |
| LOO-CV model comparison | Bayesian | Not yet (deferred in backlog) |
| Prior predictive check | Bayesian | Yes, via sample_prior_predictive() |

The recommendation is to inspect the γ posteriors rather than implement a frequentist specification test. The Bayesian posterior provides a richer and more directly interpretable answer than a binary reject/fail-to-reject decision.

References

  • Mundlak, Y. (1978). “On the Pooling of Time Series and Cross Section Data.” Econometrica, 46(1), 69–85.
  • Stata 19: Mundlak specification test
  • Vehtari, A., Gelman, A., & Gabry, J. (2017). “Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC.” Statistics and Computing, 27(5), 1413–1432.

Contributing

Use this section when you are changing Abacus itself rather than using it as a library. The contributor docs focus on three questions:

  1. How do you get a working local environment?
  2. Where should new code live?
  3. What do you need to run before you consider a change complete?

Abacus is intentionally local-first. The source of truth for development workflow is the combination of the repo Makefile, ARCHITECTURE.md, and the verification scripts in sandbox/.

Start Here

  • Development Setup explains the supported local environment, editable install, and the verification commands you are expected to use.
  • Architecture explains the module boundaries, dependency direction, and where new code should land.
  • Testing explains the test layout, recommended pytest commands, and when to run the heavier local verification scripts.

A typical change then follows these steps:

  1. Create or refresh your local environment.
  2. Read the architecture page before touching abacus/mmm/panel.py or the extracted panel modules.
  3. Make the smallest coherent code change that solves the task.
  4. Run targeted lint and tests for the touched area.
  5. For substantial work, run make verify_local.
  6. If packaging, imports, or bundled assets changed, run make verify_package.

Subsections of Contributing

Architecture

Abacus is structured so that the public MMM API stays small while the implementation can evolve behind stable seams. The most important rule is that PanelMMM is a facade, not the place where new core behaviour should accumulate.

For the complete module map, read ARCHITECTURE.md in the repository root. This page summarises the parts that matter most when you are deciding where to put new code.

Design Principles

  1. PanelMMM stays thin. Constructor normalisation, data prep, graph construction, prediction, calibration, runtime helpers, and serialisation live under abacus/mmm/models/.
  2. Compute comes before presentation. Diagnostics, summaries, and plotting should consume structured outputs from the model layer rather than embedding analytical logic in presentation code.
  3. Dependencies flow downward. Shared root infrastructure can be imported by MMM modules, but MMM-specific modules should not leak back into the shared layer.
  4. Compatibility is deliberate. If you move imports or rename internals, keep facades or compatibility shims where public usage would otherwise break.

High-Level Package Layers

| Layer | Purpose | Examples |
| --- | --- | --- |
| Public facades | Stable user-facing entry points | abacus.mmm.panel, abacus.mmm.plot, abacus.mmm.summary |
| Panel implementation seams | Core panel behaviour | abacus.mmm.models.panel_config, panel_build, panel_predict, panel_runtime, panel_serialize |
| MMM primitives | Reusable modelling building blocks | abacus.mmm.components, abacus.mmm.transforms, abacus.mmm.fourier, abacus.mmm.hsgp, abacus.mmm.events |
| Post-fit outputs | Diagnostics, summaries, optimisation, plots | abacus.mmm.diagnostics, abacus.mmm.summarization, abacus.mmm.optimization, abacus.mmm.plotting |
| Shared root | Generic infrastructure used across the package | abacus.modeling, abacus.prior, abacus.metrics, abacus.data, abacus.pipeline |

Where New Code Goes

| If you are adding… | Put it in… |
| --- | --- |
| Constructor normalisation, dims logic, transform configuration | abacus/mmm/models/panel_config.py |
| Data conversion, scaling, Mundlak support, prediction-data prep | abacus/mmm/models/panel_data.py |
| PyMC graph construction | abacus/mmm/models/panel_build.py |
| Posterior predictive or response-curve sampling | abacus/mmm/models/panel_predict.py |
| Serialisation or save/load compatibility | abacus/mmm/models/panel_serialize.py and shared helpers in abacus/modeling/io.py |
| Diagnostics compute | abacus/mmm/diagnostics/ |
| Summary tables and exported curve summaries | abacus/mmm/summarization/ |
| Static charts | abacus/mmm/plotting/ |
| Budget optimisation logic | abacus/mmm/optimization/ |
| Adstock or saturation behaviour | abacus/mmm/components/ and abacus/mmm/transforms/ |
| Shared model-builder infrastructure | abacus/modeling/ |

Dependency Rules

Allowed

  • Shared root modules can be imported by MMM modules.
  • abacus/mmm/models/ can depend on MMM primitives and shared root modules.
  • Facades such as panel.py can depend on the extracted panel modules.
  • Plotting, summaries, diagnostics, and optimisation can depend on model outputs and extracted helpers.

Avoid

  • Importing panel.py from abacus/mmm/models/*.
  • Adding plotting or summary logic to core model-building modules.
  • Adding MMM-specific behaviour to the shared abacus/modeling/ layer unless it is genuinely reusable.
  • Defaulting to panel.py for new features just because it is visible.
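
The first rule in the Avoid list lends itself to an automated check. The sketch below uses the standard-library `ast` module to detect absolute imports of the facade module in a piece of source code. It is an illustration and not part of the repository's tooling: the helper name `imports_module` is hypothetical, relative imports are not handled, and the sample sources are invented strings.

```python
import ast

FORBIDDEN_MODULE = "abacus.mmm.panel"


def imports_module(source, forbidden=FORBIDDEN_MODULE):
    """True if `source` contains an absolute import of the forbidden module."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Cover both `from abacus.mmm.panel import X` and `from abacus.mmm import panel`.
            names = [node.module] + [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        if any(name == forbidden or name.startswith(forbidden + ".") for name in names):
            return True
    return False


# Invented sample sources standing in for files under abacus/mmm/models/.
allowed_source = "import abacus.mmm.transforms\n"
violating_source = "from abacus.mmm.panel import PanelMMM\n"

print(imports_module(allowed_source), imports_module(violating_source))
```

In a real checkout the same function could be applied to the text of each file under `abacus/mmm/models/`, for example from a test that fails when any file returns `True`.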

Practical Guidance

When you touch a feature area, check whether there is already an extracted seam for it before adding a new helper. Examples:

  • Plot behaviour should usually land in abacus/mmm/plotting/, not in abacus/mmm/plot.py.
  • Serialisation changes should usually land in abacus/mmm/models/panel_serialize.py, not directly in PanelMMM.
  • Time-varying parameter behaviour should use the HSGP and TVP support modules rather than embedding new logic in plotting or builders.

Before You Merge

  • Confirm the change landed in the correct layer.
  • Keep public facades thin.
  • Preserve public API compatibility unless the change is explicitly breaking.
  • Add or update tests in the matching test area.
  • Run the local verification commands described in Testing.

Development Setup

This page describes the supported local setup for working on Abacus. The project is maintained with local verification scripts rather than a CI-first workflow, so your development environment needs to be able to run linting, pytest, and the packaging smoke checks directly.

Prerequisites

  • Python 3.12
  • A local environment manager such as Conda
  • A writable temporary directory such as /tmp for PyTensor caches and package verification artefacts

Create the Development Environment

The simplest supported path is the repository environment file:

conda env create -f environment.yml
conda activate abacus-dev
python3 -m pip install -e .

If you know you will be running linting and tests frequently, install the optional extras as well:

python3 -m pip install -e ".[lint,test]"

Local Runtime Defaults

Some parts of the stack need writable cache directories. In restricted or sandboxed environments, set the same defaults used by the repo’s local verification scripts:

export PYTENSOR_FLAGS="base_compiledir=/tmp/pytensor,linker=py"
export JAX_PLATFORMS=cpu
export XDG_CACHE_HOME=/tmp

The Makefile already applies these defaults for make test and make smoke_mmm.
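
If you start Python directly rather than through make, the same defaults can be applied from code. This is a minimal sketch, assuming you call it before importing the libraries that read these variables. The helper name is hypothetical and is not part of Abacus.

```python
import os


def apply_runtime_defaults(env=None):
    """Set the cache-related defaults only where a variable is not already set."""
    env = os.environ if env is None else env
    defaults = {
        "PYTENSOR_FLAGS": "base_compiledir=/tmp/pytensor,linker=py",
        "JAX_PLATFORMS": "cpu",
        "XDG_CACHE_HOME": "/tmp",
    }
    for key, value in defaults.items():
        # setdefault keeps any value the caller has already exported.
        env.setdefault(key, value)
    return env


# Demonstration on a plain dict so the real environment is left untouched.
demo_env = apply_runtime_defaults({"JAX_PLATFORMS": "cuda"})
print(demo_env)
```

Using setdefault rather than plain assignment means an explicit export in your shell still wins over these defaults.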

Common Commands

Lint and format

make check_lint
make lint
make check_format
make format

These targets cover the package, tests, scripts, and the local verification entry points in sandbox/.

Tests

make test
pytest tests/<path>/test_*.py -v

Use targeted pytest first when you are working on a narrow area. Run the wider verification commands before closing substantial changes.

Local verification

make smoke_mmm
make verify_local
make verify_package
make verify_local_all

What these commands do:

  • make smoke_mmm runs a short end-to-end MMM smoke path against the demo config and demo data.
  • make verify_local runs the retained local verification matrix from sandbox/run_local_verification.py.
  • make verify_package builds package artefacts and validates an installed package smoke path.
  • make verify_local_all runs the local verification matrix and includes the packaging smoke step.

Important Working Files

File Why it matters
Makefile Primary local entry point for lint, test, smoke, and package verification
environment.yml Supported dev environment definition
pyproject.toml Packaging metadata, extras, Ruff, MyPy, and pytest configuration
sandbox/run_local_verification.py Authoritative local verification matrix
sandbox/run_package_verification.py Package build and installed-wheel smoke verification
ARCHITECTURE.md Contributor-facing module map and dependency rules

Local-Only Areas

The repo contains some directories that are useful locally but are not part of the shipped library surface:

  • .archive/ for archived planning and reference material
  • assets/engineering/standards/ for local documentation and writing standards
  • sandbox/ for local scripts and verification entry points

Keep temporary scripts in sandbox/ rather than mixing them into the package.

Troubleshooting

PyTensor cache permission errors

If you see errors related to .pytensor lock files or compiledir creation, export the runtime defaults shown above and rerun the command.

Package verification fails because build is missing

Run:

python3 -m pip install build

The make verify_package target installs build automatically, so you only need this step when you run the package verification script directly.

You are not sure which command to run

As a rule:

  • run targeted pytest and ruff while iterating
  • run make verify_local before finishing non-trivial code changes
  • run make verify_package when packaging, imports, or bundled assets changed

Testing

Abacus uses pytest for automated tests, plus local verification scripts that combine linting, smoke paths, and packaging checks for broader confidence. The expected workflow is to run targeted tests while you iterate, then run the wider local verification commands before you finish substantial work.

Test Layout

Path What it covers
tests/test_*.py Shared infrastructure such as model IO, paths, package identity, and root-level helpers
tests/mmm/ MMM behaviour at the public surface
tests/mmm/models/ Extracted panel implementation seams
tests/mmm/components/ Adstock and saturation component behaviour
tests/mmm/plotting/ Static plotting helpers and theme/layout behaviour
tests/mmm/optimization/ Budget optimisation logic and wrappers
tests/mmm/diagnostics/ Structured diagnostics compute
tests/mmm/summarization/ Summary/export helpers

When you change a specific module seam, add or update tests in the matching test area instead of only asserting through a broad end-to-end test.
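
The layout table suggests that test directories mirror the package directories. Assuming that mirroring holds, the matching test area for a source file can be derived as sketched below. The helper and the example file names are hypothetical.

```python
from pathlib import PurePosixPath


def matching_test_dir(source_path):
    """Map abacus/<area>/<module>.py to tests/<area>/, assuming a mirrored layout."""
    parts = PurePosixPath(source_path).parts
    if not parts or parts[0] != "abacus":
        raise ValueError(f"not a package path: {source_path}")
    # Drop the package root and the file name, keep the directory structure.
    sub_dirs = [p for p in parts[1:] if not p.endswith(".py")]
    return str(PurePosixPath("tests", *sub_dirs)) + "/"


print(matching_test_dir("abacus/mmm/plotting/theme.py"))
print(matching_test_dir("abacus/mmm/models/panel_serialize.py"))
```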

Core Commands

Fast targeted runs

pytest tests/<path>/test_*.py -v
pytest tests/mmm/plotting/test_theme.py --no-cov -q
pytest tests/mmm/models/test_panel_serialize.py --no-cov -q

Use targeted runs first. They are faster to interpret and make regressions easier to localise.

Whole-suite pytest

make test

This installs the test extras and runs pytest with the local runtime defaults from the Makefile.

Local verification

make verify_local
make verify_local_all

make verify_local is the preferred baseline for substantial changes because it runs:

  • Ruff on retained source, tests, and local runner scripts
  • byte-compilation checks
  • targeted pytest batches across the shared and MMM surfaces
  • the local MMM smoke pipeline

make verify_local_all adds the package build/install smoke path.

Packaging smoke

make verify_package

Run this when any of the following changed:

  • packaging metadata in pyproject.toml
  • import surfaces or compatibility facades
  • bundled assets under abacus/
  • install-time behaviour or README/package artefacts

Runtime Environment

Some test paths need writable cache directories. The recommended defaults are:

export PYTENSOR_FLAGS="base_compiledir=/tmp/pytensor,linker=py"
export JAX_PLATFORMS=cpu
export XDG_CACHE_HOME=/tmp

The local verification scripts apply these defaults automatically.

Special Cases

Plotting tests

Prefer tests that inspect stable properties such as axes, labels, colours, sizes, rcParams, and return types. Avoid brittle pixel-perfect assertions.

When needed, you can run the top-level plot suite the same way the local verification script does:

NUMBA_DISABLE_JIT=1 pytest tests/mmm/test_plot.py --no-cov -q

Save/load and compatibility work

If you change model serialisation, identity strings, or import compatibility, add tests that prove older saved data or old import paths still work where that compatibility is expected.

Packaging and bundled assets

If you add or move package data, use make verify_package so the change is checked against an installed wheel rather than only the editable repo checkout.

What to Run Before You Finish

Small, localised change

  • Targeted pytest
  • Targeted ruff check

Moderate code change

  • Targeted pytest
  • make check_lint
  • make smoke_mmm

Broad or risky change

  • make verify_local
  • make verify_package if packaging or bundled assets changed

Writing Good Tests

  • Test observable behaviour, not implementation noise.
  • Keep fixtures close to the layer you are testing.
  • Prefer additive compatibility tests when preserving old behaviour.
  • Use small synthetic data where possible.
  • For plotting and serialisation, assert the stable contract rather than fragile internals.

API Reference

This section is a hand-curated reference for the retained public Abacus API.

It focuses on stable entry points that users are expected to import directly. It does not try to document every internal module under abacus.mmm.models, abacus.mmm.summarization, or abacus.pipeline.stages.

For task-oriented workflows, use the main documentation sections first. Use this reference when you need the exact import path, object name, or the scope of a public surface.

Main module groups

Module Primary public surface
abacus.mmm.panel PanelMMM
abacus.mmm Adstock, saturation, Fourier, HSGP, and trend classes
abacus.mmm.optimization PanelBudgetOptimizerWrapper and advanced optimisation helpers
abacus.mmm.builders.yaml build_mmm_from_yaml(...)
abacus.pipeline PipelineRunConfig, run_pipeline(...), PipelineRunResult
abacus.scenario_planner Planner specs, ScenarioPlanner, ScenarioComparison
abacus.data.idata MMMIDataWrapper, schema helpers, and idata utilities

Subsections of API Reference

PanelMMM

PanelMMM is the single retained public MMM model API in Abacus.

Import it from:

from abacus.mmm.panel import PanelMMM

For conceptual guidance, see Model Overview. For data contracts, see Data Preparation.

Constructor

PanelMMM(...) is keyword-only.

The main constructor arguments are:

Argument Meaning
date_column Name of the date column in X
channel_columns Required media columns
target_column Semantic target column name
target_type "revenue" or "conversion"
adstock An AdstockTransformation instance
saturation A SaturationTransformation instance
dims Optional panel dimensions such as ("geo",)
control_columns Optional non-media regressors
control_impacts Optional directional expectations for controls
control_sign_policy "soft" or "strict"
yearly_seasonality Number of yearly Fourier modes
time_varying_intercept bool or an HSGPBase instance
time_varying_media bool or an HSGPBase instance
use_mundlak_cre Add Mundlak / correlated random effects terms
scaling A Scaling instance, a dict, or None
model_config Prior and likelihood configuration
sampler_config Default sampler settings
adstock_first Whether adstock runs before saturation

Core lifecycle methods

The most commonly used methods are:

Method Purpose
build_model(X, y) Build the PyMC graph for the current configuration
fit(X, y, **kwargs) Sample the posterior and store idata
approximate_fit(X, y, ...) Fit with variational inference instead of NUTS
sample_prior_predictive(X, y, ...) Sample prior and prior predictive draws
sample_posterior_predictive(X, ...) Sample posterior predictive draws
predict(X, ...) Return posterior mean predictions
predict_posterior(X, ...) Return posterior predictive samples for output_var
save(path, **kwargs) Save idata to NetCDF
load(path, check=True) Load a saved model from NetCDF
load_from_idata(idata, check=True) Rebuild from an in-memory InferenceData

fit(...), sample_prior_predictive(...), predict(...), save(...), and the load helpers come from the shared model-builder base classes but are part of the user-facing PanelMMM surface.

Post-fit model methods

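PanelMMM also exposes model-specific methods. Most are used after fitting; the table notes the ones that must be called before fitting or before the model is built: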

Method Purpose
add_original_scale_contribution_variable(var=[...]) Add original-scale deterministics before fitting
sample_saturation_curve(...) Sample posterior saturation curves
sample_adstock_curve(...) Sample posterior adstock curves
sample_channel_contribution_forward_pass(...) Sample channel contributions in scaled target space
channel_contribution_forward_pass(...) Evaluate channel contributions in original target units
get_channel_contribution_forward_pass_grid(...) Build a contribution grid over shared spend multipliers
new_spend_contributions(...) Simulate forward contribution paths for a spend scenario
add_lift_test_measurements(...) Add lift-test calibration measurements
add_cost_per_target_calibration(...) Add cost-per-target calibration penalties
add_events(df_events, prefix, effect) Add dated event effects before build

Bound properties

Once the model exists, these bound properties expose the retained post-fit surface:

Property Returns
plot MMMPlotSuite
data MMMIDataWrapper
summary MMMSummaryFactory
diagnostics MMMDiagnosticsFactory
efficiency_metric Default efficiency metric key for target_type
efficiency_metric_label Display label such as ROAS or CPA

See Post-Fit Facades.

Other useful attributes

Common model attributes include:

Attribute Meaning
idata The fitted arviz.InferenceData
output_var Output variable name used in predictive sampling ("y")
channel_columns Configured channel names
control_columns Configured control names
dims Configured panel dimensions
mu_effects Additive effects attached before build

Minimal example

from abacus.mmm import GeometricAdstock, LogisticSaturation
from abacus.mmm.panel import PanelMMM

mmm = PanelMMM(
    date_column="date",
    target_column="revenue",
    channel_columns=["tv", "search", "social"],
    dims=("geo",),
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
)

# X and y are assumed to be prepared already; see Data Preparation for the expected layout.
mmm.fit(X, y, draws=500, tune=500, chains=2, progressbar=False)
mmm.sample_posterior_predictive(X=X, progressbar=False)

Transforms and Supporting Types

Abacus keeps most reusable modelling primitives under abacus.mmm.

This page lists the main import groups for transformations, seasonality and trend components, HSGP helpers, and scaling types.

Top-level abacus.mmm re-exports

Import these directly from abacus.mmm:

from abacus.mmm import GeometricAdstock, LogisticSaturation, YearlyFourier

Adstock transformations

Top-level import path:

from abacus.mmm import (
    AdstockTransformation,
    BinomialAdstock,
    DelayedAdstock,
    GeometricAdstock,
    NoAdstock,
    WeibullCDFAdstock,
    WeibullPDFAdstock,
    adstock_from_dict,
)

Main public types:

Type Purpose
AdstockTransformation Base adstock interface
NoAdstock No carryover
GeometricAdstock Geometric decay
DelayedAdstock Delayed peak with decay
BinomialAdstock Binomial-style lag weights
WeibullCDFAdstock Weibull CDF carryover
WeibullPDFAdstock Weibull PDF carryover
adstock_from_dict(...) Rebuild an adstock from serialised config
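
To make "geometric decay" concrete, the following is a conceptual sketch in plain Python of the textbook geometric adstock with a finite lag window. It is not the Abacus implementation, and the parameter names alpha and normalize are assumptions. Only l_max appears in the examples on this site.

```python
def geometric_adstock(spend, alpha, l_max, normalize=False):
    """Carry over each period's spend with weights alpha**lag for lag < l_max."""
    weights = [alpha**lag for lag in range(l_max)]
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    out = []
    for t in range(len(spend)):
        # Sum current and lagged spend; periods before the series start are skipped.
        out.append(sum(w * spend[t - lag] for lag, w in enumerate(weights) if t - lag >= 0))
    return out


# A single burst of spend decays over the following periods and stops after l_max lags.
adstocked = geometric_adstock([100.0, 0.0, 0.0, 0.0], alpha=0.5, l_max=3)
print(adstocked)
```

With alpha=0.5 and l_max=3, the burst contributes in full in its own period, then a half, then a quarter, and nothing after the window ends.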

Saturation transformations

Top-level import path:

from abacus.mmm import (
    HillSaturation,
    HillSaturationSigmoid,
    InverseScaledLogisticSaturation,
    LogisticSaturation,
    MichaelisMentenSaturation,
    NoSaturation,
    RootSaturation,
    SaturationTransformation,
    TanhSaturation,
    TanhSaturationBaselined,
    saturation_from_dict,
)

Main public types:

Type Purpose
SaturationTransformation Base saturation interface
NoSaturation No diminishing returns
LogisticSaturation Logistic response curve
InverseScaledLogisticSaturation Inverse-scaled logistic curve
HillSaturation Hill response curve
HillSaturationSigmoid Hill-style sigmoid curve
MichaelisMentenSaturation Michaelis-Menten curve
RootSaturation Root response curve
TanhSaturation Hyperbolic tangent curve
TanhSaturationBaselined Tanh curve with baseline handling
saturation_from_dict(...) Rebuild a saturation from serialised config
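
As a rough illustration of what these curves do, the sketch below implements two common textbook forms in plain Python: a logistic saturation and a Michaelis-Menten curve. The exact parameterisation used by the Abacus classes may differ, so treat the parameter names lam and alpha as assumptions.

```python
import math


def logistic_saturation(x, lam):
    """Map non-negative spend to [0, 1) with diminishing returns."""
    return (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))


def michaelis_menten(x, alpha, lam):
    """Approach the ceiling alpha; lam is the spend at which half of alpha is reached."""
    return alpha * x / (lam + x)


# Equal spend increments yield progressively smaller response increments.
for spend in (0.0, 0.5, 1.0, 2.0):
    print(spend, round(logistic_saturation(spend, lam=2.0), 4))
```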

Fourier and trend components

Top-level import path:

from abacus.mmm import MonthlyFourier, WeeklyFourier, YearlyFourier, LinearTrend

These classes are building blocks for built-in or custom additive effects.

Type Purpose
YearlyFourier Yearly Fourier basis
MonthlyFourier Monthly Fourier basis
WeeklyFourier Weekly Fourier basis
LinearTrend Piecewise linear trend component
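
For readers new to Fourier seasonality, the sketch below shows the idea behind a yearly basis: each order contributes one sine and one cosine column with a period of one year. It is a plain-Python illustration under assumed conventions (a period of 365.25 days, sines listed before cosines), not the Abacus code.

```python
import math


def yearly_fourier_features(day_of_year, n_order, period=365.25):
    """Return [sin_1, ..., sin_n, cos_1, ..., cos_n] for one date."""
    angles = [2.0 * math.pi * k * day_of_year / period for k in range(1, n_order + 1)]
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


features = yearly_fourier_features(day_of_year=91, n_order=2)
print(len(features), [round(f, 3) for f in features])
```

The model then learns one coefficient per column, so n_order controls how flexible the seasonal shape can be.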

HSGP and time-varying parameter helpers

Top-level import path:

from abacus.mmm import (
    HSGP,
    CovFunc,
    HSGPPeriodic,
    PeriodicCovFunc,
    SoftPlusHSGP,
    approx_hsgp_hyperparams,
    create_complexity_penalizing_prior,
    create_constrained_inverse_gamma_prior,
    create_eta_prior,
    create_m_and_L_recommendations,
)

Main public types and helpers:

Name Purpose
HSGP General HSGP configuration
SoftPlusHSGP HSGP variant used by time-varying parameter surfaces
HSGPPeriodic Periodic HSGP configuration
CovFunc Covariance-function enum for HSGP
PeriodicCovFunc Periodic covariance-function enum
approx_hsgp_hyperparams(...) Approximate HSGP hyperparameter helper
create_eta_prior(...) Eta prior helper
create_m_and_L_recommendations(...) Basis-size and domain recommendations
create_complexity_penalizing_prior(...) Complexity-penalising prior helper
create_constrained_inverse_gamma_prior(...) Inverse-gamma prior helper

Scaling types

Scaling is not re-exported from abacus.mmm. Import it from abacus.mmm.scaling:

from abacus.mmm.scaling import Scaling, VariableScaling

The scaling types are:

Type Purpose
VariableScaling Method and non-date dims for one variable group
Scaling Combined target and channel scaling configuration

Supported VariableScaling.method values are:

  • "max"
  • "mean"

VariableScaling.dims must not include date, because Abacus already assumes the date dimension for scaling.
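
One plausible reading of the two methods is that each series is divided by its maximum or its mean taken over the date axis. The sketch below illustrates that reading for a single series in plain Python. The function name and the zero-factor check are assumptions, and Abacus may handle edge cases differently.

```python
def scale_over_date(values, method):
    """Divide a date-indexed series by its max or mean and return (scaled, factor)."""
    if method == "max":
        factor = max(values)
    elif method == "mean":
        factor = sum(values) / len(values)
    else:
        raise ValueError(f"unsupported method: {method}")
    if factor == 0:
        raise ValueError("cannot scale a series whose factor is zero")
    return [v / factor for v in values], factor


scaled_max, factor_max = scale_over_date([2.0, 4.0, 8.0], "max")
scaled_mean, factor_mean = scale_over_date([2.0, 4.0, 6.0], "mean")
print(scaled_max, factor_max)
print(scaled_mean, factor_mean)
```

Because the reduction already runs over dates, listing date again in VariableScaling.dims would have no meaning. The dims only say which other dimensions share one factor.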

Notes on import paths

  • PanelMMM is not re-exported from abacus.mmm. Import it from abacus.mmm.panel.
  • Scaling is not re-exported from abacus.mmm. Import it from abacus.mmm.scaling.

Post-Fit Facades

After fitting, PanelMMM exposes most read and reporting operations through bound properties:

  • mmm.data
  • mmm.summary
  • mmm.diagnostics
  • mmm.plot

These are the preferred entry points when you already have a fitted model.

mmm.data

mmm.data returns MMMIDataWrapper.

Direct import path:

from abacus.data.idata import MMMIDataWrapper

You can also create it explicitly with:

wrapper = MMMIDataWrapper.from_mmm(mmm)

Main methods:

Method Purpose
get_target(original_scale=True) Return observed target data
get_channel_spend() Return observed channel spend
get_posterior_predictive(original_scale=True) Return posterior predictive samples
get_errors(original_scale=True) Return residual samples
get_channel_contributions(original_scale=True) Return media contribution samples
get_contributions(...) Return channels, baseline, controls, seasonality, and events
get_elementwise_roas(original_scale=True) Contribution-over-spend ratios
get_elementwise_cost_per_target(original_scale=True) Spend-over-contribution ratios
get_channel_scale() Return stored channel scaling factors
get_target_scale() Return stored target scaling factors
to_original_scale(...) Convert a posterior variable or array to original scale
to_scaled(...) Convert an original-scale array back to model scale
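
The two elementwise ratios in the table are reciprocals of each other in form. The plain-Python sketch below shows the arithmetic on toy numbers. Returning NaN for a zero denominator is an assumption made here for illustration, not a statement about how MMMIDataWrapper treats that case.

```python
import math


def elementwise_ratio(numerator, denominator):
    """Divide element by element, returning NaN where the denominator is zero."""
    return [n / d if d != 0 else math.nan for n, d in zip(numerator, denominator)]


contribution = [200.0, 0.0, 90.0]  # toy contribution per period
spend = [100.0, 50.0, 0.0]         # toy spend per period

roas = elementwise_ratio(contribution, spend)             # contribution over spend
cost_per_target = elementwise_ratio(spend, contribution)  # spend over contribution
print(roas)
print(cost_per_target)
```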

mmm.summary

mmm.summary returns MMMSummaryFactory.

Direct import path:

from abacus.mmm.summary import MMMSummaryFactory

If you instantiate it manually, pass model=mmm when you need transform-backed curve summaries:

summary = MMMSummaryFactory(mmm.data, model=mmm)

Main methods:

Method Purpose
posterior_predictive(...) Predictive summary table with observed target
contributions(...) Tidy contribution summaries
mean_contributions_over_time(...) Wide decomposition table
roas(...) ROAS summary
cost_per_target(...) Cost-per-target summary
efficiency(...) Target-type-aware efficiency summary
channel_spend(...) Raw spend table
saturation_curves(...) Saturation curve summary table
adstock_curves(...) Adstock curve summary table
total_contribution(...) Totals by component type
change_over_time(...) Percentage change in channel contributions

MMMSummaryFactory also exposes:

  • hdi_probs
  • output_format
  • efficiency_metric
  • efficiency_metric_label

mmm.diagnostics

mmm.diagnostics returns MMMDiagnosticsFactory.

Direct import path:

from abacus.mmm.diagnostics.factory import MMMDiagnosticsFactory

Main methods:

Method Purpose
design_summary(X, ...) Per-variable design checks
design_report(X, ...) Machine-readable design report
mcmc_summary(...) Parameter-level MCMC diagnostics
mcmc_report(...) Machine-readable MCMC report
predictive_summary(...) Aggregate predictive metrics
predictive_report(...) Machine-readable predictive report

The report methods return typed dataclass objects with to_dict().

mmm.plot

mmm.plot returns MMMPlotSuite.

Direct import path:

from abacus.mmm.plot import MMMPlotSuite

PanelMMM binds this automatically to the model’s idata, but the class also supports compatible custom InferenceData objects.

Main methods:

Method Purpose
posterior_predictive(...) Plot fitted or sampled predictive series
prior_predictive(...) Plot prior predictive series
residuals_over_time(...) Plot residual trajectories
residuals_posterior_distribution(...) Plot residual posterior distributions
contributions_over_time(...) Plot time-series contributions
posterior_distribution(...) Plot posterior violin distributions
channel_parameter(...) Plot channel-level parameter posteriors
prior_vs_posterior(...) Compare prior and posterior distributions
saturation_scatterplot(...) Plot spend-versus-contribution scatter views
saturation_curves(...) Plot sampled saturation curves
waterfall_components_decomposition(...) Plot waterfall decompositions
media_contribution_over_time(...) Plot stacked media contributions
channel_contribution_share_hdi(...) Plot contribution share intervals
budget_allocation(...) Plot optimisation allocation outputs
allocated_contribution_by_channel_over_time(...) Plot simulated allocation contributions

Direct idata utilities

The abacus.data.idata package also exports schema and utility helpers:

Import Purpose
MMMIdataSchema Expected structure for retained MMM InferenceData
VariableSchema Variable-level schema helper
InferenceDataGroupSchema Group-level schema helper
filter_idata_by_dates(...) Filter idata on a date window
filter_idata_by_dims(...) Filter idata on non-date dimensions
aggregate_idata_time(...) Aggregate idata over time
aggregate_idata_dims(...) Aggregate idata over non-time dims
subsample_draws(...) Subsample posterior draws

Optimisation API

Abacus exposes the retained optimisation surface through abacus.mmm.optimization.

For workflow guidance and interpretation, see Optimisation.

Primary wrapper

Recommended import path:

from abacus.mmm.optimization import PanelBudgetOptimizerWrapper

PanelBudgetOptimizerWrapper adapts a fitted PanelMMM to the generic budget optimiser.

Constructor:

wrapper = PanelBudgetOptimizerWrapper(
    model=mmm,
    start_date="2025-03-03",
    end_date="2025-03-31",
)

Main constructor arguments:

Argument Meaning
model Fitted PanelMMM
start_date Optimisation window start date
end_date Optimisation window end date
compile_kwargs Optional compilation settings

Useful attributes:

Attribute Meaning
start_date Requested window start
end_date Requested window end
num_periods Number of periods in the optimisation window
zero_data Synthetic zero-spend future dataset
channel_columns Modelled channels
dims Budget dims beyond date

Main methods

PanelBudgetOptimizerWrapper exposes two user-facing methods:

Method Purpose
optimize_budget(...) Optimise allocation over the future window
sample_response_distribution(...) Simulate spend and contribution outcomes for an allocation

optimize_budget(...)

Key arguments:

Argument Meaning
budget Total spend across all optimised cells for one model period
budget_bounds Optional per-cell lower and upper bounds
response_variable Objective variable to optimise
utility_function Utility function applied to the response distribution
constraints Extra custom constraints
default_constraints Whether to add the default sum constraint
budgets_to_optimize Optional boolean mask over budget cells
budget_distribution_over_period Optional date flighting weights
callback Whether to return iteration diagnostics
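
To illustrate how budget, budget_bounds, and the default sum constraint relate, the sketch below checks a candidate allocation in plain Python. It assumes the default constraint means the allocation sums to the budget, and it uses dictionaries instead of the xarray objects the real API works with. The helper is hypothetical.

```python
def check_allocation(allocation, budget, bounds=None, tol=1e-9):
    """Return a list of violations; an empty list means the allocation is feasible."""
    problems = []
    total = sum(allocation.values())
    if abs(total - budget) > tol:
        problems.append(f"total {total} != budget {budget}")
    for cell, (low, high) in (bounds or {}).items():
        value = allocation[cell]
        if not low <= value <= high:
            problems.append(f"{cell}: {value} outside [{low}, {high}]")
    return problems


bounds = {"tv": (0.0, 70.0)}
feasible = check_allocation({"tv": 60.0, "search": 40.0}, 100.0, bounds)
infeasible = check_allocation({"tv": 80.0, "search": 20.0}, 100.0, bounds)
print(feasible)
print(infeasible)
```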

Return values:

  • allocation, result
  • allocation, result, callback_info when callback=True

allocation is an xarray.DataArray over the non-date budget dimensions. result is a SciPy OptimizeResult.

sample_response_distribution(...)

Key arguments:

Argument Meaning
allocation_strategy Optimised or manually supplied allocation
noise_level Relative noise added to the synthetic future spend
additional_var_names Extra posterior predictive variables to include
include_last_observations Pass lag context into posterior predictive sampling
include_carryover Extend the window with a zero-spend tail to capture carryover
budget_distribution_over_period Optional date flighting weights

It returns an xarray.Dataset containing:

  • allocation
  • one variable per channel for realised spend
  • the model output variable
  • channel_contribution
  • total_media_contribution_original_scale
  • any extra requested variables

Advanced exported helpers

The same module also exports:

from abacus.mmm.optimization import (
    CustomModelWrapper,
    MinimizeException,
    OptimizerCompatibleModelWrapper,
    optimizer_xarray_builder,
)

These are advanced surfaces for custom optimiser integrations.

Name Purpose
CustomModelWrapper Wrap a custom PyMC model for optimisation
OptimizerCompatibleModelWrapper Protocol for compatible wrappers
optimizer_xarray_builder(...) Build shaped xarray inputs for optimisation
MinimizeException Exception raised when optimisation fails

Import-path note

abacus.mmm.panel also re-exports PanelBudgetOptimizerWrapper, but the recommended reference import path is abacus.mmm.optimization.

Scenario Planner API

The scenario planner API lives under abacus.scenario_planner.

Use it when you want to compare current, manual, and fixed-budget optimised plans in total horizon spend units.

For workflow guidance, see Scenario Planning.

Main import path

from abacus.scenario_planner import (
    CurrentScenarioSpec,
    DataArraySpec,
    FixedBudgetOptimizedScenarioSpec,
    ManualAllocationScenarioSpec,
    ScenarioComparison,
    ScenarioPlanner,
    ScenarioResult,
)

The package also exports shared base types:

  • BaseScenarioSpec
  • HistoricalReferenceScenarioSpec
  • SimulatedScenarioSpec
  • ScenarioSpec

Scenario spec classes

Main concrete spec types:

Type Purpose
CurrentScenarioSpec Historical reference scenario
ManualAllocationScenarioSpec User-defined future allocation
FixedBudgetOptimizedScenarioSpec Fixed-budget optimised future allocation
DataArraySpec JSON-friendly or YAML-friendly xarray representation

Shared fields across the concrete specs include:

  • name
  • start_date
  • end_date
  • scenario_id

Planner service objects

Main service types:

Type Purpose
ScenarioPlanner Evaluate and compare scenarios for a fitted PanelMMM
ScenarioResult Output object from evaluate(...)
ScenarioComparison Combined output object from compare(...)

ScenarioPlanner

planner = ScenarioPlanner(mmm)

Main methods:

Method Purpose
evaluate(spec) Evaluate one scenario and return ScenarioResult
compare(specs) Evaluate several scenarios and return ScenarioComparison

Useful property:

Property Meaning
channels Modelled channel names

ScenarioResult

ScenarioResult exposes:

  • spec
  • totals
  • channels
  • contributions_over_time
  • allocation
  • metadata

ScenarioComparison

ScenarioComparison exposes:

  • totals
  • channels
  • contributions_over_time
  • allocations
  • metadata

It also provides:

payload = comparison.to_store_payload()

to_store_payload() returns a JSON-friendly payload for client-side UIs.

Dash app entry point

The optional UI lives in a separate module:

from abacus.scenario_planner.dash_app import create_scenario_planner_dash_app

Use it like this:

app = create_scenario_planner_dash_app(comparison)
app.run(debug=True)

The Dash app visualises a precomputed ScenarioComparison. It does not run the planner itself.

Builders and Pipeline

Abacus exposes one public YAML builder and one structured pipeline runner.

Use these surfaces when you want configuration-driven model construction or a staged run directory with machine-readable artefacts.

YAML builder

Import path:

from abacus.mmm.builders.yaml import build_mmm_from_yaml

Signature:

model = build_mmm_from_yaml(
    config_path,
    X=X,
    y=y,
    model_kwargs=None,
    holidays_path=None,
)

Main inputs:

Argument Meaning
config_path YAML file path
X Optional pre-loaded feature data
y Optional pre-loaded target data
model_kwargs Model init overrides
holidays_path Optional holiday CSV override

It returns a built PanelMMM.

The builder orchestrates:

  • model construction
  • optional additive effects
  • holiday augmentation
  • build_model(X, y)
  • optional original_scale_vars
  • optional calibration steps
  • optional inference-data attachment

Structured pipeline runner

Top-level import path:

from abacus.pipeline import PipelineRunConfig, PipelineRunResult, run_pipeline

PipelineRunConfig

PipelineRunConfig is the user-facing run configuration dataclass.

Key fields:

Field Meaning
config_path YAML config file
output_dir Output root for run directories
run_name Optional logical run name
dataset_path Optional combined dataset CSV
x_path / y_path Optional separate feature and target CSVs
holidays_path Optional holiday CSV override
target_column Optional target-column override
prior_samples Prior predictive sample count
draws, tune, chains, cores Sampler overrides
random_seed Global random seed override
curve_samples Curve summary sample count
curve_points Curve summary x-axis resolution

It also exposes:

  • effective_run_name()

run_pipeline(...)

Use run_pipeline(...) to execute the structured runner:

from abacus.pipeline import PipelineRunConfig, run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path="config.yml",
        dataset_path="data.csv",
    )
)

run_pipeline(...):

  • loads the YAML config
  • loads data from the configured or overridden paths
  • resolves sampler overrides
  • creates the run directory and manifest
  • runs the retained stage sequence

The stage sequence is:

  1. metadata
  2. preflight
  3. fit
  4. assessment
  5. decomposition
  6. diagnostics
  7. curves
  8. optimisation
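
The general pattern behind a staged runner can be sketched in a few lines. The example below is not the Abacus implementation. It runs stand-in stage callables in the order listed above and writes a small JSON manifest. The manifest file name, its fields, and the stop-at-first-failure behaviour are all assumptions.

```python
import json
import tempfile
from pathlib import Path

STAGES = [
    "metadata", "preflight", "fit", "assessment",
    "decomposition", "diagnostics", "curves", "optimisation",
]


def run_stages(run_dir, stage_fns):
    """Run stages in order, record each status, and stop at the first failure."""
    run_dir.mkdir(parents=True, exist_ok=True)
    manifest = {"stages": []}
    for name in STAGES:
        stage = stage_fns.get(name, lambda: None)  # missing stages are no-ops here
        try:
            stage()
        except Exception as exc:
            manifest["stages"].append({"name": name, "status": "failed", "error": str(exc)})
            break
        manifest["stages"].append({"name": name, "status": "ok"})
    manifest_path = run_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path


def failing_fit():
    raise RuntimeError("stand-in failure")


with tempfile.TemporaryDirectory() as tmp:
    ok_manifest = json.loads(run_stages(Path(tmp) / "ok", {}).read_text())
    bad_manifest = json.loads(run_stages(Path(tmp) / "bad", {"fit": failing_fit}).read_text())

print([s["name"] for s in ok_manifest["stages"]])
print(bad_manifest["stages"][-1])
```

Writing the manifest even when a stage fails keeps a machine-readable record of how far the run got.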

PipelineRunResult

PipelineRunResult is a small dataclass with:

Field Meaning
run_dir Concrete run directory path
manifest_path Manifest JSON path

CLI entry point

The CLI entry point lives in abacus.pipeline.runner:

python3 -m abacus.pipeline.runner --config config.yml --dataset-path data.csv

For full CLI usage, see CLI Reference.

Additive Effects and Events

Abacus supports advanced additive components through mu_effects and dated event surfaces.

These are extension points rather than the default modelling path, but they are part of the retained public API.

MuEffect protocol surface

Import path:

from abacus.mmm.additive_effect import MuEffect

MuEffect is the abstract base class for additive components appended to mmm.mu_effects.

Required methods:

Method Purpose
create_data(mmm) Register any required pm.Data inputs
create_effect(mmm) Return the additive contribution tensor
set_data(mmm, model, X) Update the effect for new prediction data

Custom effects should inherit from MuEffect so they can participate in model serialisation logic.

Built-in additive effect classes

Import path:

from abacus.mmm.additive_effect import (
    EventAdditiveEffect,
    FourierEffect,
    LinearTrendEffect,
)

Built-in types:

Type Purpose
FourierEffect Wrap a FourierBase component as a MuEffect
LinearTrendEffect Wrap a LinearTrend component as a MuEffect
EventAdditiveEffect Turn dated events into additive model effects

Typical usage:

from abacus.mmm import WeeklyFourier
from abacus.mmm.additive_effect import FourierEffect

mmm.mu_effects.append(
    FourierEffect(fourier=WeeklyFourier(n_order=2, prefix="weekly"))
)

Event surfaces

Import path:

from abacus.mmm.events import (
    AsymmetricGaussianBasis,
    EventEffect,
    GaussianBasis,
    HalfGaussianBasis,
)

Main public event types:

Type Purpose
EventEffect Event effect specification combining a basis and effect size prior
GaussianBasis Symmetric Gaussian event basis
HalfGaussianBasis One-sided Gaussian event basis
AsymmetricGaussianBasis Gaussian basis with different pre and post widths
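
The shapes behind the three bases can be pictured with simple Gaussian bumps centred on the event date. The sketch below is a plain-Python illustration with fixed widths. The function and parameter names are hypothetical, and the Abacus classes may parameterise the widths differently.

```python
import math


def gaussian_basis(days_from_event, sigma):
    """Symmetric bump that peaks at 1.0 on the event date."""
    return math.exp(-0.5 * (days_from_event / sigma) ** 2)


def half_gaussian_basis(days_from_event, sigma, after=True):
    """One-sided bump: zero on the other side of the event date."""
    on_side = days_from_event >= 0 if after else days_from_event <= 0
    return gaussian_basis(days_from_event, sigma) if on_side else 0.0


def asymmetric_gaussian_basis(days_from_event, sigma_before, sigma_after):
    """Bump with a different width before and after the event date."""
    sigma = sigma_after if days_from_event >= 0 else sigma_before
    return gaussian_basis(days_from_event, sigma)


for day in (-2, 0, 2):
    print(day, round(gaussian_basis(day, 3.0), 3), round(asymmetric_gaussian_basis(day, 1.0, 4.0), 3))
```

An event effect then multiplies a basis like this by an effect size, so the basis controls the timing and the effect size controls the magnitude.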

You can use EventEffect either:

  • directly with PanelMMM.add_events(...), or
  • indirectly through EventAdditiveEffect

Example: direct event attachment

from pymc_extras.prior import Prior

from abacus.mmm.events import EventEffect, GaussianBasis

effect = EventEffect(
    basis=GaussianBasis(),
    effect_size=Prior("Normal", mu=0, sigma=1, dims="promo"),
    dims=("promo",),
)

mmm.add_events(df_events=df_events, prefix="promo", effect=effect)

Serialisation note

FourierEffect and LinearTrendEffect participate in the PanelMMM round-trip path.

EventAdditiveEffect does not currently round-trip through PanelMMM.load(...), because the original event DataFrame is not serialised.