Quickstart: Pipeline Runner

Use the pipeline runner when you want a full staged run instead of only an in-memory model fit.

The runner writes:

a run manifest
copied and resolved config files
fitted model artefacts
posterior predictive assessment outputs
decomposition, diagnostics, and response-curve artefacts

Fastest first run: bundled demo

From the repository root, the quickest way to see a real structured run is the demo launcher:

python3 runme.py --demo timeseries

Other bundled demos are:

geo_panel
geo_brand_panel

List them explicitly with:

python3 runme.py --list-demos

runme.py is a convenience wrapper around the structured pipeline. It resolves the demo config under data/demo/<demo_name>/config.yml and runs the pipeline for you.
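
The path lookup described above can be sketched as follows. This is only an illustration of the documented layout, not the actual code in runme.py; the helper name demo_config_path and its repo_root parameter are hypothetical.

```python
from pathlib import Path


def demo_config_path(demo_name: str, repo_root: Path = Path(".")) -> Path:
    """Build the config path for a bundled demo (hypothetical helper)."""
    # Layout taken from the text above: data/demo/<demo_name>/config.yml
    return repo_root / "data" / "demo" / demo_name / "config.yml"


# Relative to the repository root, the timeseries demo config would be:
print(demo_config_path("timeseries").as_posix())
```

The same pattern gives the config paths for geo_panel and geo_brand_panel, which is useful if you want to pass a demo config to the Python API or the CLI yourself.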

Run the pipeline from Python

The direct Python API is:

from pathlib import Path

from abacus.pipeline import PipelineRunConfig, run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("data/demo/geo_panel/config.yml"),
        output_dir=Path("results"),
        run_name="geo_panel_quickstart",
        prior_samples=10,
        draws=200,
        tune=200,
        chains=2,
        cores=2,
        random_seed=42,
        curve_samples=50,
        curve_points=50,
    )
)

print(result.run_dir)
print(result.manifest_path)

If the YAML config already contains data.dataset_path, you do not need to pass dataset_path separately.

Run the thin CLI directly

The pipeline also exposes a thin CLI in abacus.pipeline.runner:

python3 -m abacus.pipeline.runner \
  --config data/demo/geo_panel/config.yml \
  --output-dir results \
  --run-name geo_panel_quickstart \
  --prior-samples 10 \
  --draws 200 \
  --tune 200 \
  --chains 2 \
  --cores 2 \
  --random-seed 42 \
  --curve-samples 50 \
  --curve-points 50

The CLI prints the final run directory when the pipeline completes.

Override data paths

Use one of these patterns:

Pattern	Arguments
Combined dataset override	`dataset_path=` in Python or `--dataset-path` in the CLI
Separate feature and target files	`x_path=` and `y_path=` in Python or `--x-path` and `--y-path` in the CLI
Target column override	`target_column=` in Python or `--target-column` in the CLI

Configured relative paths are resolved relative to the YAML config directory.
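
The resolution rule above can be illustrated with a short sketch. It is not the runner's own implementation: the function name resolve_configured_path and the file name dataset.csv are hypothetical, and leaving absolute paths unchanged is an assumption.

```python
from pathlib import Path


def resolve_configured_path(config_path: Path, configured: str) -> Path:
    """Resolve a path from the YAML against the config file's directory."""
    candidate = Path(configured)
    if candidate.is_absolute():
        # Assumption: absolute paths are used as given.
        return candidate
    # Relative paths are anchored at the directory that holds the YAML config.
    return config_path.parent / candidate


config = Path("data/demo/geo_panel/config.yml")
print(resolve_configured_path(config, "dataset.csv").as_posix())
```

In practice this means a relative path in the YAML keeps working when you launch the pipeline from a different working directory, as long as the data file stays in the same place relative to the config.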

If you want Stage 50 (50_diagnostics) to use different warn/fail cutoffs, add a runner-only diagnostics.thresholds block to the YAML. See YAML Configuration.

What you get back

run_pipeline(...) returns a PipelineRunResult with:

run_dir
manifest_path

The output directory contains stage folders such as:

00_run_metadata
20_model_fit
30_model_assessment
50_diagnostics
60_response_curves
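
To check which stage folders a run actually produced, a small helper along these lines may be handy. It is a sketch that assumes stage folders are named with a two-digit prefix followed by an underscore, as in the list above; list_stage_dirs is a hypothetical name and not part of the abacus API.

```python
from pathlib import Path


def list_stage_dirs(run_dir: Path) -> list[str]:
    """Return the names of subfolders with a two-digit stage prefix, sorted."""
    return sorted(
        p.name
        for p in run_dir.iterdir()
        # Keep only directories named like "NN_<stage_name>".
        if p.is_dir() and p.name[:2].isdigit() and p.name[2:3] == "_"
    )
```

After a run, you could call list_stage_dirs(Path(result.run_dir)) to see the stages in order.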

60_response_curves includes three complementary curve families:

saturation-only transformation artefacts
forward-pass direct contribution artefacts built from scaled observed history
adstock carryover artefacts

When to use the runner

Choose the runner when you want:

a reproducible run directory on disk
structured metadata and manifest files
staged artefacts for diagnostics and reporting
a config-driven workflow for repeated runs

If you only need to fit a model interactively in a notebook or script, start with Quickstart: Python API or Quickstart: YAML Builder.