Support strings in metrics
`dvc metrics` currently represents only scalar numbers.
This is nice for finding the difference in a metric between two models; however, a couple of the metrics I'm interested in would benefit from being made more human-readable by adding units. Specifically:
- model file size (kilobytes, megabytes, and gigabytes)
- inference latency (milliseconds, seconds)
There are other reasons beyond units to support strings. For example, we use Vertex AI's training service, which can and does change without warning, so storing the date of training would be useful. I would also be interested in storing the model's sha, so that given a model binary I can quickly verify which row corresponds to the binary I have.
I have a couple more thoughts on this.
If I commit a file to the git repo with git, then in a PR/MR I can see the text differences between branches under GitHub/GitLab's changes view. Maybe text data should just be left up to git.
It would be helpful to be able to diff files from different branches stored in DVC - is there a way to do that? If I had the two files locally I could run `diff file1 file2`; if they were stored in git I could run `git diff branch1:file1 branch2:file2`. Is there an equivalent way to run a diff in DVC?
I thought that `dvc diff branch1:file1 branch2:file2` might do what I want; instead it just reports that the files are different.
You can already use git to version text-based metrics files: set `cache: false` to tell DVC not to handle the file as a DVC-tracked output, and then track the file in git yourself. (But you would still be affected by the existing DVC limitation that metrics files must only contain numeric values.)
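For reference, a minimal `dvc.yaml` sketch of that setup (the stage name and command here are placeholders, not from your repo):

```yaml
stages:
  evaluate:
    cmd: python evaluate.py   # placeholder stage that writes metrics.json
    metrics:
      - metrics.json:
          cache: false        # git versions the file; DVC only parses it for metrics commands
```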
Regarding diffing: DVC does not know anything about the type of the files it tracks - everything is treated as arbitrary binary data - so we don't provide any kind of contextual diffing (which depends on handling specific file types).
There are some existing feature requests regarding diff behavior (like https://github.com/iterative/dvc/issues/7657), but essentially you would need to implement something that wraps the `dvc diff [--json]` output yourself and then passes the relevant cache paths into a separate diff tool (as described in #7657).
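For text files specifically, one way to approximate this without resolving cache paths by hand is to wrap `dvc.api.read` (a documented Python API call) with `difflib` - a rough sketch, using the file and branch names from this thread:

```python
# Sketch: textual diff of a DVC-tracked text file across two git revisions.
import difflib
import dvc.api

def dvc_text_diff(path, rev_a, rev_b, repo=None):
    """Return a unified diff of a text file between two revisions of a DVC repo."""
    old = dvc.api.read(path, repo=repo, rev=rev_a).splitlines(keepends=True)
    new = dvc.api.read(path, repo=repo, rev=rev_b).splitlines(keepends=True)
    return "".join(difflib.unified_diff(
        old, new,
        fromfile=f"{rev_a}:{path}",
        tofile=f"{rev_b}:{path}",
    ))

if __name__ == "__main__":
    # Compare metrics.json between the two branches from this thread
    print(dvc_text_diff("metrics.json", "main", "tiny"))
```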
Hi @pmrowla @shortcipher3 @daavoo @codito @tizoc, I would like to join the Iterative team and contribute to the development of the project. Please let me know where to start with open-source contributions until then. I am a Python developer with 4 years of experience.
Hi @shortcipher3, could you provide an example of what the current version does and what the requirements are, so that I can make the changes accordingly?
I created an example repo here.
Essentially I have two branches with a metrics file for a tiny model:
```json
{
  "size": "1 MB",
  "latency": "20 ms",
  "mAP": 0.7,
  "precision": 0.6,
  "recall": 0.8,
  "model": "tiny"
}
```
and a large model:
```json
{
  "size": "1 GB",
  "latency": "2 s",
  "mAP": 0.8,
  "precision": 0.7,
  "recall": 0.9,
  "model": "Gigantamax"
}
```
When I run a diff I get the following:
```
# dvc metrics diff --target metrics.json -- main tiny
Path          Metric     main  tiny  Change
metrics.json  mAP        0.8   0.7   -0.1
metrics.json  precision  0.7   0.6   -0.1
metrics.json  recall     0.9   0.8   -0.1
```
I would love to get something more like:
```
# dvc metrics diff --target metrics.json -- main tiny
Path          Metric     main        tiny   Change
metrics.json  mAP        0.8         0.7    -0.1
metrics.json  precision  0.7         0.6    -0.1
metrics.json  recall     0.9         0.8    -0.1
metrics.json  size       1 GB        1 MB   ---
metrics.json  latency    2 s         20 ms  ---
metrics.json  model      Gigantamax  tiny   ---
```
That way I get a nice table of results and can easily compare metrics that are on completely different scales (GB/MB and seconds/milliseconds). It would be hard to read if I converted the GB and MB to bytes; I would be slowed down counting digits.
I can also add meaningful data to help the reader understand the differences.
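To make the request concrete, here is a rough sketch of the comparison behavior I'm imagining (illustrative Python only, not a proposal for DVC internals): numeric metrics get a delta, and string metrics are displayed with a placeholder change.

```python
# Rough sketch of the requested diff behavior: numeric metrics get a delta,
# string metrics (units, shas, dates) are shown with a "---" change column.
import json

def metrics_diff(old, new):
    rows = []
    for key in sorted(set(old) | set(new)):
        a, b = old.get(key), new.get(key)
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            change = round(b - a, 6)   # numeric: report the difference
        else:
            change = "---"             # strings: display both values, no subtraction
        rows.append((key, a, b, change))
    return rows

if __name__ == "__main__":
    # Abridged versions of the two metrics files from this thread
    main = json.loads('{"size": "1 GB", "latency": "2 s", "mAP": 0.8, "model": "Gigantamax"}')
    tiny = json.loads('{"size": "1 MB", "latency": "20 ms", "mAP": 0.7, "model": "tiny"}')
    for row in metrics_diff(main, tiny):
        print("{:<10} {:>12} {:>8} {:>8}".format(*(str(c) for c in row)))
```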
As for being able to do a local diff: a lot of state-of-the-art research produces a family of models rather than a single model. I would love to have a metrics file for each model and be able to diff them. An example is DINOv2.
They actually have a table comparing the models on a few metrics, one of which uses a string for units.
Some other models with multiple sizes are:
- EfficientDet, which shows units on latency, throughput, #params, and FLOPs
- EfficientNet
- ChatGPT
I would think we could automatically generate some useful tables for understanding these parameters, making it easier for data scientists to make decisions.
Hello @shortcipher3 and @daavoo,
I had a look into this issue and I might have a suitable solution.
I am new to the community, so I am not sure of the best way to proceed. Should the issue be assigned to me before I open a pull request?
Thanks.
Hi @paulourbano! Feel free to open the PR.