plots: allow headerless top-level plots

Open dberenbaum opened this issue 3 years ago • 3 comments

Follow-up to https://github.com/iterative/dvc/issues/7754. It's common to have what is essentially a one-column CSV with predicted or actual values for your data. See https://dvc.org/doc/command-reference/plots/show#sourcing-x-and-y-from-different-files for an example. However, such files are unlikely to have a header row like the ones in that example. It would therefore be useful to support a no-header option in top-level plots, along with numeric column indexes.

dberenbaum, Nov 21 '22 20:11

Another use case here: CSV files without column names.

0.000,0.000
2.000,0.017
4.000,0.031
6.000,0.046
8.000,0.060

Can we just use column indexes to specify data for x and y?

plots:
  - data.csv:
      x: 0
      y: 1
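
To make the intended meaning of `x: 0` / `y: 1` concrete, here is a minimal Python sketch of index-based column selection on a headerless CSV. This is only an illustration of the requested behavior, not how DVC reads plot data; `select_columns` is a hypothetical helper name, and the 0-based indexing is an assumption taken from the config above.

```python
import csv
import io


def select_columns(csv_text, x=0, y=1):
    """Return (x, y) float pairs from headerless CSV text.

    Hypothetical helper: x and y are 0-based column indexes,
    mirroring the `x: 0` / `y: 1` keys in the proposed config.
    """
    rows = csv.reader(io.StringIO(csv_text))
    # Skip blank lines; every remaining row is treated as data (no header row).
    return [(float(row[x]), float(row[y])) for row in rows if row]


sample = "0.000,0.000\n2.000,0.017\n4.000,0.031\n"
points = select_columns(sample, x=0, y=1)
print(points)
```

Swapping the indexes (`x=1, y=0`) would plot the second column against the first, which is the kind of control the column-index syntax is meant to give.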

mnrozhkov, Feb 13 '24 12:02

@mnrozhkov Does it come from a customer?

dberenbaum, Feb 13 '24 12:02

@dberenbaum Yes, this is specific to simulation pipelines where the data is generated by a third-party tool and has no column names:

  • the first column is a step, time or timestamp
  • the second column is a value of the target metric (usually defined by the file name)

As a workaround, we currently have to post-process these files to add column names.
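
For anyone needing the same workaround, a rough sketch of that post-processing step in Python could look like this. The `add_header` helper and the `step` / `value` column names are assumptions made for illustration; in practice the names would depend on the tool and the file.

```python
import csv
import io


def add_header(src, dst, names=("step", "value")):
    """Copy headerless CSV rows from src to dst, prepending a header row.

    Hypothetical helper: the default column names are placeholders.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(names)
    for row in reader:
        if row:  # skip blank lines
            writer.writerow(row)


# In-memory example; with real files, pass handles opened with newline="".
src = io.StringIO("0.000,0.000\n2.000,0.017\n")
dst = io.StringIO()
add_header(src, dst)
print(dst.getvalue())
```

With a header in place, the columns can be referenced by name (`x: step`, `y: value`) in the plots definition, at the cost of an extra pipeline step per generated file.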

mnrozhkov, Feb 13 '24 13:02