Posterior Predictive Checks

Use posterior predictive draws to check in-sample fit and to generate predictions for new rows that follow the fitted panel layout.

For diagnostic metrics after sampling, see Diagnostics. For table export, see Summary and Export.

Sample posterior predictive draws

Use PanelMMM.sample_posterior_predictive(...) on a fitted model:

posterior_predictive = mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

sample_posterior_predictive(...):

requires X
uses the fitted posterior stored on mmm.idata
reshapes X into the model’s panel xarray layout
runs pymc.sample_posterior_predictive(...)
returns an extracted xarray.Dataset

By default, combined=True, so the returned dataset uses a sample dimension. If you want separate chain and draw dimensions, set combined=False.

Store or return only

By default, Abacus also writes the predictive samples back to mmm.idata:

posterior_predictive = mmm.sample_posterior_predictive(
    X=X,
    extend_idata=True,
    random_seed=42,
    progressbar=False,
)

With extend_idata=True, Abacus adds:

idata.posterior_predictive
idata.posterior_predictive_constant_data

If you only want the returned samples and do not want to update mmm.idata, set extend_idata=False.

Check training-fit values against observed data

For an in-sample check, pass the same design matrix you used for fitting. This is the same pattern used by the pipeline’s Stage 30 training-fit assessment:

mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

fit_table = mmm.summary.posterior_predictive(hdi_probs=[0.94])
figure, axes = mmm.plot.posterior_predictive(
    var=[mmm.output_var],
    hdi_prob=0.94,
)

Example posterior predictive output:

mmm.summary.posterior_predictive() returns a table with:

observed target values
posterior predictive mean and median
HDI bound columns such as abs_error_94_lower and abs_error_94_upper

You can also access the predictive draws directly:

predictive = mmm.data.get_posterior_predictive(original_scale=True)
errors = mmm.data.get_errors(original_scale=True)

Blocked holdout validation

For the structured pipeline’s Stage 35 validation, Abacus fits a fresh model on the training window and then scores only the holdout dates:

holdout_predictive = validation_mmm.sample_posterior_predictive(
    X=X_holdout,
    include_last_observations=True,
    random_seed=42,
    progressbar=False,
)

That holdout path is different from the in-sample check above:

the model is fit on X_train and y_train only
the holdout X contains only future dates
include_last_observations=True keeps lag history for adstock carryover
the returned samples are used to compute holdout metrics such as RMSE, MAE, NRMSE, NMAE, CRPS, bias, and coverage at 50%, 80%, and 94%

The holdout stage is more expensive than the in-sample check because it adds a second fit.

Predict on new dates

For future prediction, pass a new X with the same structural columns as the training data:

future_predictive = mmm.sample_posterior_predictive(
    X=X_future,
    include_last_observations=True,
    random_seed=42,
    progressbar=False,
)

sample_posterior_predictive(...) does not take y. For a holdout or future window, keep the actual target outside the model and align it yourself if you want external evaluation.

Use `include_last_observations` correctly

Set include_last_observations=True when the forecast window needs lag history for adstock carryover.

When enabled, Abacus:

prepends the last adstock.l_max training observations internally
samples posterior predictive values on the padded data
removes the prepended rows from the returned result

This only works when the input dates do not overlap with the training dates. If they do overlap, Abacus raises a ValueError.

Practical guidance

Use the training X for fitted-versus-observed checks.
Use future-only dates for forward prediction.
Use the training-window refit pattern for blocked holdout validation.
Keep combined=True if you want a simpler sample dimension.
Use combined=False if you need explicit chain and draw dimensions.
Call sample_posterior_predictive(...) before using mmm.diagnostics.predictive_summary() or mmm.summary.posterior_predictive().

Common pitfalls

Calling sample_posterior_predictive(...) without X
Expecting y to be passed into the predictive method
Using include_last_observations=True on dates that overlap with training data
Forgetting that the returned object is extracted samples, while the stored idata.posterior_predictive group keeps the native posterior predictive structure