validation set
Search before asking
- [X] I have searched the Ultralytics YOLO issues and discussions and found no similar questions.
Question
Hello author, why do you say that the validation set has no effect on training? I trained with the same training set but different validation sets, and the final test results were different, which suggests that different validation sets do have an impact on training.
Additional
No response
Is it possible to train without a validation set in YOLOv8?
Yes
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
result = model.train(data="coco8.yaml", val=False)  # val=False skips validation during training
@Burhan-Q Thank you for your reply! Can you answer my doubts about the first question? Does the validation set not participate in the loss update during the training process? Does it have no effect on the training process?
@Burhan-Q

from ultralytics import YOLO
model = YOLO("yolov8n.pt")
result = model.train(data="coco8.yaml", val=False)

Dear author, is there anything else I need to set besides this one? Training error: FileNotFoundError: val: Error loading data from None. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
The validation set is used to select best.pt.
Can you test the last.pt from each of your training runs?
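For illustration, a minimal sketch of validating both checkpoints from a finished run (the paths below are the typical default output locations and the dataset YAML is a placeholder; adjust them to your run):

from ultralytics import YOLO

# best.pt = checkpoint with the best validation fitness seen during training
# last.pt = weights from the final epoch, so it is not influenced by which validation set was used
best = YOLO("runs/detect/train/weights/best.pt")
last = YOLO("runs/detect/train/weights/last.pt")

best.val(data="coco8.yaml")
last.val(data="coco8.yaml")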
@Y-T-G Thanks for your reply. I have tested it. The accuracy of last.pt trained on the same training set, with the validation set processed with LetterBox versus not processed with LetterBox, is still different. Since the validation set does not affect the generation of last.pt, why does the LetterBox operation affect the accuracy results?
@Y-T-G Also, I set val=False in the code, but a validation set is still required; otherwise training cannot run properly.
In general, there's randomness in training even if you use the same dataset. Ultralytics manually sets the seed to prevent that, but I am not sure if it is working correctly.
@Y-T-G This has nothing to do with the random seed. What I am wondering is why the test accuracy of the trained model with the same training set and different validation sets is different? I guarantee that I completed the training in the same environment.
As I just said, you can train a model on the same training dataset and get different results, because model training isn't deterministic on its own. You have to fix the seed to make it deterministic, which is done by default, but it might not be working.
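For what it's worth, a minimal sketch of pinning these explicitly (seed and deterministic are existing train arguments; the values shown are the defaults):

from ultralytics import YOLO

model = YOLO("yolov8n.pt")
# seed=0 and deterministic=True are already the defaults; setting them explicitly
# makes comparisons between runs easier to reason about
model.train(data="coco8.yaml", epochs=10, seed=0, deterministic=True)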
Also how are you verifying whether it's the same or not?
The model performance on different validation sets will of course be different.
But on the same validation/test set, a last.pt that's trained on the same training set but a different validation set should show the same performance (provided deterministic training is working).
I just tested and the results are the same for last.pt from two different trainings with two different validation sets but the same training set.
I ran model.val() on the same validation set for the two models.
I am not sure how you're training or testing because you didn't post any code.
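Something along these lines (the checkpoint paths and dataset YAML are placeholders for my local files):

from ultralytics import YOLO

# last.pt from two runs that used the same training set but different validation sets
last_1 = YOLO("runs/detect/train/weights/last.pt")
last_2 = YOLO("runs/detect/train2/weights/last.pt")

# Validate both on the same validation set (a YAML pointing at val_set_1)
m1 = last_1.val(data="data_val_set_1.yaml")
m2 = last_2.val(data="data_val_set_1.yaml")
print(m1.box.map50, m2.box.map50)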
@Y-T-G “The accuracy of last.pt trained on the same training set, with the validation set processed with LetterBox versus not processed with LetterBox, is still different.” This is the way I tested it. Of course, as I mentioned earlier, I ensured that the training was carried out in a unified environment (the random seed was fixed).
Provide the code for the test. I don't understand what you're saying about letterbox or why it is relevant.
Here's how I tested it:
- Train model_1 with train_set_1 and val_set_1.
- Train model_2 with train_set_1 and val_set_2.
- Run model.val() with model_1's last.pt on val_set_1.
- Run model.val() with model_2's last.pt on val_set_1.
- Results are the same.
Provide how you trained and tested it step by step.
I disabled the LetterBox operation at the location shown above in the "ultralytics-main\ultralytics\data\dataset.py" file. This only affects the validation set.
My operation process is:
- Train model_1 with train_set_1 and val_set_1 (with LetterBox).
- Train model_2 with train_set_1 and val_set_2 (no LetterBox).
- Run model.val() with model_1's last.pt on val_set_1 (with LetterBox).
- Run model.val() with model_2's last.pt on val_set_1 (no LetterBox).
- Results are different. @Y-T-G
If you're using different preprocessing, of course the results will be different.
Even the model_1 result will be different with no letterbox vs. with letterbox.
The results would be the same if you test model_2 with letterbox, just like you tested model_1 with letterbox.
Or if you test both without letterbox.
Either run model.val() for both with letterbox, or both without letterbox. Not one with letterbox and one without.
@Y-T-G Why does this preprocessing affect the final detection accuracy? Isn't this preprocessing applied only to the validation set data?
model.val() is going to use that preprocessing. It is only skipped for the training set during training.
@Y-T-G Sorry, my test steps are like this. During testing, LetterBox processing is not used; it only differs during training.
- Train model_1 with train_set_1 and val_set_1 (with LetterBox).
- Train model_2 with train_set_1 and val_set_2 (no LetterBox).
- Run model.val() with model_1's last.pt on val_set_1 (no LetterBox).
- Run model.val() with model_2's last.pt on val_set_1 (no LetterBox).
Unless your images are all the same size, that will give an error.
And if they're the same size, the results will be the same. I didn't find any difference in the results.
Run:
import torch
from ultralytics import YOLO

# Load both trained checkpoints
model_1 = YOLO("model_1.pt")
sd1 = model_1.model.model.state_dict()
model_2 = YOLO("model_2.pt")
sd2 = model_2.model.model.state_dict()

# Compare every parameter tensor element-wise
same = True
for k1, k2 in zip(sd1, sd2):
    same &= torch.all(torch.eq(sd1[k1], sd2[k2]).view(-1))
print(same)
If it says True, then they're the same.
@Y-T-G Did you test the model with and without LetterBox operation?
@Y-T-G After my tests, with the same training set and different validation sets, the test results of the last.pt model are the same. But the LetterBox operation is an exception. My operation process:
- Train model_1 with train_set_1 and val_set_1 (with LetterBox).
- Train model_2 with train_set_1 and val_set_2 (no LetterBox).
- Run model.val() with model_1's last.pt on val_set_1 (no LetterBox).
- Run model.val() with model_2's last.pt on val_set_1 (no LetterBox).

I don't understand why using the LetterBox operation during training will affect the final test accuracy.
Besides updating Letterbox, you also need to set rect=False during validation to prevent minimum rectangle validation.
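For example, a minimal sketch of passing that override to validation (the checkpoint path and dataset YAML are placeholders):

from ultralytics import YOLO

model = YOLO("last.pt")  # path to a trained checkpoint

# rect=True (the validation default) pads each batch to its minimum rectangle;
# rect=False forces fixed-size letterboxed input instead
metrics = model.val(data="coco8.yaml", rect=False)
print(metrics.box.map50)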