Allow naming matrix stage expansions
Summary
When a stage uses the matrix fan-out, DVC currently auto-generates suffixes such as stage@set0_set1. I would like to let users supply an optional name template so each matrix entry can be named in a predictable way.
Motivation
- Large pipelines combine many datasets and models. Auto-generated suffixes are opaque, which makes dvc stage list, dvc repro, and log monitoring harder to follow.
- My automation relies on stable stage identifiers that reflect dataset/model keys; today I have to reverse-map dataset0_model0 back to the real combinations with dvc stage list. Allowing, for example, name: "${item.model.key}_${item.dataset.key}" would remove that indirection.
Proposed change
- Accept an optional name field inside matrix stage definitions. The field is validated to ensure the resolved value is non-empty, doesn’t contain the @ separator, and stays unique across the fan-out.

link to PR
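The three validation rules above can be sketched roughly as follows. This is only an illustration of the checks, not the code from the PR: the function name validate_matrix_names and the SEPARATOR constant are hypothetical, and it assumes the templates have already been resolved to plain strings.

```python
from collections import Counter

# Assumption: "@" is the separator between the stage name and the matrix suffix,
# as in inference@dataset0_model0.
SEPARATOR = "@"


def validate_matrix_names(names):
    """Validate the resolved names of one matrix stage (hypothetical helper).

    Raises ValueError if a name is empty, contains the separator,
    or appears more than once across the fan-out.
    """
    names = list(names)
    for name in names:
        if not name:
            raise ValueError("resolved matrix name must not be empty")
        if SEPARATOR in name:
            raise ValueError(
                f"resolved matrix name {name!r} must not contain {SEPARATOR!r}"
            )
    # Uniqueness is checked over the whole fan-out, not per entry.
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicates:
        raise ValueError(f"resolved matrix names are not unique: {duplicates}")
    return names
```

Checking uniqueness after resolution matters because two different matrix combinations can still render to the same string if the template omits one of the axes.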
Example
stages:
  inference:
    matrix:
      dataset: ${datasets_list}
      model: ${models_list}
    name: "${item.model.key}_${item.dataset.key}"
    cmd: >
      inference
dvc stage list would then show entries such as inference@model_alpha_dataset_a instead of inference@dataset0_model0.
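To make the expected naming concrete, here is a small standalone sketch of how a name template could be rendered for each matrix combination. It does not use DVC's real interpolation engine: render_name and expand_matrix are hypothetical helpers, and the shape of the list entries (dicts with a key field) is an assumption, since the contents of datasets_list and models_list are not shown above.

```python
import itertools
import re

# Matches placeholders such as ${item.model.key}; only the item.* form is handled.
_PLACEHOLDER = re.compile(r"\$\{item\.([A-Za-z0-9_.]+)\}")


def _lookup(item, dotted_path):
    """Follow a dotted path such as 'model.key' through nested dicts."""
    value = item
    for part in dotted_path.split("."):
        value = value[part]
    return value


def render_name(template, item):
    """Substitute every ${item....} placeholder in the template."""
    return _PLACEHOLDER.sub(lambda m: str(_lookup(item, m.group(1))), template)


def expand_matrix(stage, matrix, template):
    """Return one 'stage@name' identifier per combination of matrix values."""
    axes = list(matrix)
    names = []
    for combo in itertools.product(*(matrix[axis] for axis in axes)):
        item = dict(zip(axes, combo))
        names.append(f"{stage}@{render_name(template, item)}")
    return names


# Assumed example data; the real lists would come from params.
matrix = {
    "dataset": [{"key": "dataset_a"}, {"key": "dataset_b"}],
    "model": [{"key": "model_alpha"}],
}
print(expand_matrix("inference", matrix, "${item.model.key}_${item.dataset.key}"))
# ['inference@model_alpha_dataset_a', 'inference@model_alpha_dataset_b']
```

The resulting identifiers depend only on the keys in the data, so reordering the lists would not change which name maps to which combination.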