Add `train, val, test` folder paths in `data.yaml` at `save_data_yaml()`

Open xaristeidou opened this issue 6 months ago • 8 comments

Description

While developing a notebook as scheduled in #1388, I used the sv.DetectionDataset().as_yolo() method, which calls the save_data_yaml() function in the background to create the data.yaml file needed for the dataset.

According to how the YOLO model is set up for training, and to the documentation, the data.yaml file must contain train and val entries. A test entry is not required, but can be included if a test dataset exists. Running model.train() without them raises the following error: SyntaxError: /content/dataset/data.yaml 'train:' key missing ❌. 'train' and 'val' are required in all data YAMLs.
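To make the failure easier to reproduce without starting a training run, the check can be approximated on a plain dictionary. This is only a sketch: find_missing_split_keys is a hypothetical helper, not the trainer's actual validation code, and it assumes the data.yaml contents have already been loaded into a dict.

```python
# Hypothetical helper, not part of supervision or the YOLO trainer.
# It mirrors the rule quoted in the error message: 'train' and 'val'
# are required, 'test' is optional.
REQUIRED_SPLIT_KEYS = ("train", "val")


def find_missing_split_keys(data):
    """Return the required split keys that are absent from a data.yaml mapping."""
    return [key for key in REQUIRED_SPLIT_KEYS if key not in data]


# A mapping with only nc and names, as described in this issue.
exported = {"nc": 2, "names": ["cat", "dog"]}
missing = find_missing_split_keys(exported)
if missing:
    print(f"data.yaml is missing required keys: {missing}")
```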

Therefore, besides the nc and names arguments, save_data_yaml() should also export the train, val, and test paths so that the file is ready for model.train(). I think we should export the default paths as follows:

  • train: train/images
  • val: valid/images
  • test: test/images

Anyone whose working directory differs from the root of the folder containing data.yaml would need to change these paths manually. Even so, debugging and adjusting an existing path is easier than adding the missing arguments to the YAML file by hand.
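A minimal sketch of what the proposed export could look like, assuming the default paths listed above. build_data_yaml and render_data_yaml are hypothetical names used for illustration and are not the current save_data_yaml() implementation. To stay dependency-free, the sketch renders values with json.dumps, whose output for strings, integers, and lists is also valid YAML; a real implementation would more likely use a YAML library.

```python
import json


def build_data_yaml(classes, train="train/images", val="valid/images",
                    test="test/images"):
    """Hypothetical helper: build the mapping a YOLO data.yaml would hold."""
    data = {"train": train, "val": val}
    if test is not None:
        # 'test' is optional, so it is only written when a path is given.
        data["test"] = test
    data["nc"] = len(classes)
    data["names"] = list(classes)
    return data


def render_data_yaml(data):
    """Render the mapping as YAML text, one 'key: value' entry per line."""
    lines = [f"{key}: {json.dumps(value)}" for key, value in data.items()]
    return "\n".join(lines) + "\n"


print(render_data_yaml(build_data_yaml(["cat", "dog"])))
```

Keeping the paths as keyword arguments with defaults would let users with a different working directory pass their own values instead of editing the file afterwards.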

List any dependencies that are required for this change.

None

Please delete options that are not relevant.

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)

How has this change been tested, please provide a testcase or example of how you tested the change?

By calling sv.DetectionDataset.as_yolo() and checking that the exported data.yaml file contains the required train and val paths along with the test path.

Any specific deployment considerations

None

Docs

  • [ ] Docs updated? No need to

xaristeidou, Aug 01 '24 12:08