Quickstart: Pipeline Runner

Use the pipeline runner when you want a full staged run instead of only an in-memory model fit.

The runner writes:

  • a run manifest
  • copied and resolved config files
  • fitted model artefacts
  • posterior predictive assessment outputs
  • decomposition, diagnostics, and response-curve artefacts

Fastest first run: bundled demo

From the repository root, the quickest way to see a real structured run is the demo launcher:

python3 runme.py --demo timeseries

Other bundled demos are:

  • geo_panel
  • geo_brand_panel

List them explicitly with:

python3 runme.py --list-demos

runme.py is a convenience wrapper around the structured pipeline. It resolves the demo config under data/demo/<demo_name>/config.yml and runs the pipeline for you.

Run the pipeline from Python

The direct Python API is:

from pathlib import Path

from abacus.pipeline import PipelineRunConfig, run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("data/demo/geo_panel/config.yml"),
        output_dir=Path("results"),
        run_name="geo_panel_quickstart",
        prior_samples=10,
        draws=200,
        tune=200,
        chains=2,
        cores=2,
        random_seed=42,
        curve_samples=50,
        curve_points=50,
    )
)

print(result.run_dir)
print(result.manifest_path)

If the YAML config already contains data.dataset_path, you do not need to pass dataset_path again.

Run the thin CLI directly

The pipeline also exposes a thin CLI in abacus.pipeline.runner:

python3 -m abacus.pipeline.runner \
  --config data/demo/geo_panel/config.yml \
  --output-dir results \
  --run-name geo_panel_quickstart \
  --prior-samples 10 \
  --draws 200 \
  --tune 200 \
  --chains 2 \
  --cores 2 \
  --random-seed 42 \
  --curve-samples 50 \
  --curve-points 50

The CLI prints the final run directory when the pipeline completes.

Override data paths

Use one of these patterns:

Pattern Arguments
Combined dataset override dataset_path= in Python or --dataset-path in the CLI
Separate feature and target files x_path= and y_path= in Python or --x-path and --y-path in the CLI
Target column override target_column= in Python or --target-column in the CLI

Configured relative paths are resolved relative to the YAML config directory.

If you want Stage 50 to use different warn/fail cutoffs, add a runner-only diagnostics.thresholds block to the YAML. See YAML Configuration.

What you get back

run_pipeline(...) returns a PipelineRunResult with:

  • run_dir
  • manifest_path

The output directory contains stage folders such as:

  • 00_run_metadata
  • 20_model_fit
  • 30_model_assessment
  • 50_diagnostics
  • 60_response_curves

60_response_curves now includes three complementary curve families:

  • saturation-only transformation artefacts
  • forward-pass direct contribution artefacts built from scaled observed history
  • adstock carryover artefacts

When to use the runner

Choose the runner when you want:

  • a reproducible run directory on disk
  • structured metadata and manifest files
  • staged artefacts for diagnostics and reporting
  • a config-driven workflow for repeated runs

If you only need to fit a model interactively in a notebook or script, start with Quickstart: Python API or Quickstart: YAML Builder.