notebooks
Missing dataset name in test split path in data.yaml
Search before asking
- [X] I have searched the Roboflow Notebooks issues and found no similar bug report.
Notebook name
YOLOv5 PyTorch Object Detection
Bug
When using the Roboflow data import, the dataset name is not written to the path of the dataset test split in the data.yaml file.
Example:
test: ../test/images
train: dataset_name/train/images
val: dataset_name/valid/images
This can lead to problems, e.g. when trying to validate on the dataset test split (with --task test):
FileNotFoundError: test: /content/yolov5/test/images does not exist
Adding the dataset name to the test path in the data.yaml file by hand is easy, but an inexperienced user might not know how to do this.
I don't know if there is a reason behind not adding the dataset name to the test path. If there isn't, this might be a bug.
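As a stopgap, the manual fix described above can be scripted. The following is a minimal sketch, not part of the roboflow package: the helper names `fix_test_path` and `fix_data_yaml` are hypothetical, and it assumes the test entry is a top-level `test:` line whose value starts with `../`, as in the example above.

```python
import re
from pathlib import Path


def fix_test_path(yaml_text: str, dataset_dir: str) -> str:
    """Return yaml_text with a '../'-relative test path moved under dataset_dir.

    Hypothetical helper: only a top-level 'test:' line starting with '../'
    is rewritten; every other line is returned unchanged.
    """
    def repl(match: re.Match) -> str:
        # match.group(1) is the path without the leading '../'
        return f"test: {dataset_dir.rstrip('/')}/{match.group(1)}"

    return re.sub(r"^test:[ \t]*\.\./(\S+)[ \t]*$", repl, yaml_text, flags=re.MULTILINE)


def fix_data_yaml(yaml_path: str, dataset_dir: str) -> None:
    """Apply fix_test_path to a data.yaml file in place."""
    path = Path(yaml_path)
    path.write_text(fix_test_path(path.read_text(), dataset_dir))


sample = (
    "test: ../test/images\n"
    "train: dataset_name/train/images\n"
    "val: dataset_name/valid/images\n"
)
print(fix_test_path(sample, "dataset_name"))
```

In a notebook, `fix_data_yaml` would be pointed at the downloaded data.yaml with the dataset folder name as the second argument. Running it twice is harmless, because a path that no longer starts with `../` is left alone.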
Environment
- Google Colab
- OS: Ubuntu 20.04.5 LTS (Focal Fossa)
- Python: 3.9.16
- roboflow 0.2.34
- YOLOv5 v7.0-120-g3e55763
Minimal Reproducible Example
No response
Additional
No response
Are you willing to submit a PR?
- [ ] Yes I'd like to help by submitting a PR!
Hi, @maxsitt! Could you send me the link to your dataset on Roboflow?
Hi, @maxsitt! I just tested, and the download works as expected. Training works as well.
I used this code to download your dataset:
%cd /content/yolov5
from roboflow import Roboflow
rf = Roboflow(api_key="API_KEY")
project = rf.workspace("maximilian-sittinger").project("insect_detect_detection")
dataset = project.version(7).download("yolov5")
When I take a look at data.yaml, it looks like this:
names:
- insect
nc: 1
roboflow:
  license: CC BY 4.0
  project: insect_detect_detection
  url: https://universe.roboflow.com/maximilian-sittinger/insect_detect_detection/dataset/7
  version: 7
  workspace: maximilian-sittinger
test: ../test/images
train: Insect_Detect_detection-7/train/images
val: Insect_Detect_detection-7/valid/images
Yes, as described, the only problem occurs when you need to get the path to the dataset test split from the data.yaml, e.g. for validating on it.
If you try to run:
%cd /content/yolov5
!python val.py --weights runs/train/exp/weights/best.pt --data {dataset.location}/data.yaml --img 320 --task test
The dataset test split is not found because of the wrong path (missing dataset name) in the data.yaml file:
FileNotFoundError: test: /content/yolov5/test/images does not exist
Hi @SkalskiP ,
I think this is happening because the "__reformat_yaml" method of version.py reformats only the train and val locations when the model_format is "yolov5" or "yolov5pytorch".
I think adding
content["test"] = location + content["test"].lstrip(".")
after line 728 will solve the issue.
Shall I proceed with making the changes in the roboflow-python repository and raise a PR?
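To make the effect of the proposed line concrete, here is a small standalone sketch. It is not the code from version.py: the function name `apply_proposed_test_fix` is hypothetical, and the `location` value is an assumed download directory built from the working directory and dataset name seen earlier in this thread.

```python
def apply_proposed_test_fix(content: dict, location: str) -> dict:
    """Apply the one-line change proposed above to a parsed data.yaml dict.

    Hypothetical wrapper for illustration only; the library's handling of
    the train and val entries is not reproduced here.
    """
    updated = dict(content)  # leave the caller's dict untouched
    if "test" in updated:
        # '../test/images'.lstrip('.') -> '/test/images', then prefixed with location
        updated["test"] = location + updated["test"].lstrip(".")
    return updated


# Assumed example values, not taken from the library
content = {"test": "../test/images"}
location = "/content/yolov5/Insect_Detect_detection-7"
print(apply_proposed_test_fix(content, location)["test"])
# prints /content/yolov5/Insect_Detect_detection-7/test/images
```

Note that `lstrip(".")` removes every leading dot, so the expression relies on the test value starting with `../` (or `./`) followed by the split folder.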
Hi, @arijitde92! 👋🏻 Let me ask around internally on Slack first.
Hi @xabierr , did you use your own custom dataset or is "Noosa_2-1" available in the roboflow datasets?
@Jacobsolawetz / @yeldarby / @mo-traor3-ai, is that intentional behavior?
Hi @SkalskiP @arijitde92,
any updates on this? Why is the path to the test folder only reformatted for YOLOv6?
Same problem in this issue.
Thanks!
Hi @SkalskiP , did you get any information about whether it is an intentional behavior?
Hi @arijitde92 👋🏻 Unfortunately, I didn't. Let me try to ping the dev team once again.
I have the same problem FileNotFoundError: Dataset 'data.yaml' not found ⚠️, missing paths ['/content/gdrive/MyDrive/yolov8/datasets/valid/images']
Did you make sure that all three necessary folders, their subfolders and their data/files are available in your runtime session or file location? Make sure this is the case.
-- /content
   -- datasets
      -- NAME_OF_YOUR_DATASET
         -- train
         -- valid
         -- test
It works for me even with a missing 'test' folder, but it does not work when you only have one of these three (e.g. only 'train').
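The folder check suggested above can be done in a few lines. This is a minimal sketch under stated assumptions: the helper name `missing_splits` is hypothetical, and it assumes each split folder contains an `images` subfolder, as the paths in the error messages in this thread suggest.

```python
from pathlib import Path


def missing_splits(dataset_root: str, splits=("train", "valid", "test")) -> list:
    """Return the split names whose '<split>/images' folder is absent.

    Hypothetical helper; the 'images' subfolder layout is an assumption
    based on the paths quoted in this thread.
    """
    root = Path(dataset_root)
    return [name for name in splits if not (root / name / "images").is_dir()]


# Example call with the placeholder path from the tree above
print(missing_splits("/content/datasets/NAME_OF_YOUR_DATASET"))
```

An empty list means all three split folders were found; any names returned are the ones to check before starting training or validation.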