Parametrized OKS sigma values
As described by COCO, Object Keypoint Similarity (OKS) plays a crucial role when training a keypoint detection model; it is a concept very similar to IoU. Currently, however, the OKS sigma values are hardcoded in the following way:
- in the case of 17 keypoints (i.e. human pose estimation), the 17 default sigmas defined by COCO are used;
- otherwise, the sigmas are automatically set to 1/nkpts.

This implementation does not allow users to tune these important values for their custom applications. I propose a slight change in the code that loads the OKS_SIGMA values from the YOLO parameters. The default human-pose sigmas that were hardcoded now live in default.yaml and can be changed there or passed as a parameter. The 1/nkpts default is adopted only when there is a mismatch between the number of keypoints in the dataset being used and the length of the oks_sigma values passed.
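The fallback behavior described above can be sketched as follows; `resolve_oks_sigma` is a hypothetical helper written for illustration, not the actual code in this PR:

```python
import numpy as np

def resolve_oks_sigma(cfg_sigmas, nkpt):
    """Hypothetical helper illustrating the proposed fallback: use the
    configured sigmas when their length matches the dataset's keypoint
    count, otherwise default to a uniform 1/nkpt per keypoint."""
    if cfg_sigmas is not None and len(cfg_sigmas) == nkpt:
        return np.asarray(cfg_sigmas, dtype=np.float64)
    # Length mismatch (or nothing configured): fall back to 1/nkpt.
    return np.ones(nkpt) / nkpt

# A 5-keypoint dataset paired with the 17 COCO sigmas falls back to 1/5.
sigmas = resolve_oks_sigma([0.026] * 17, nkpt=5)
print(sigmas)  # [0.2 0.2 0.2 0.2 0.2]
```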
🤖 Generated by Copilot at f84efd8
Summary
This pull request refactors the pose estimation model and its evaluation to use `OKS_SIGMA` values from the configuration file `ultralytics/cfg/default.yaml`. This improves the customization and consistency of the pose metrics across different datasets and keypoint shapes.
Oh we're the crew of the ultralytics ship
And we work on the pose estimation
We pull and we push on the `OKS_SIGMA`
To adjust it for every situation
Walkthrough
- Add `OKS_SIGMA` option to the configuration file to specify sigma values for the keypoints OKS metric (link)
- Read `OKS_SIGMA` values from the configuration file in the `PoseValidator` and `v8PoseLoss` classes, and check for length consistency with the number of keypoints (link, link)
- Remove import and definition of `OKS_SIGMA` from the `metrics.py`, `val.py`, and `loss.py` modules, as they are no longer needed (link, link, link)
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
Key Changes
- The default Object Keypoint Similarity (OKS) sigma values have been defined in the `ultralytics/cfg/default.yaml` configuration file.
- Code related to the computation of OKS in `val.py` and `loss.py` has been updated to read the sigma values from the configuration rather than using a hardcoded array.
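For reference, the 17 COCO human-pose sigmas that were previously hardcoded could look like this in `default.yaml`; the key name below is assumed for illustration and may differ from the one the PR actually adds:

```yaml
# Assumed key name for illustration; values are the standard COCO
# per-keypoint sigmas (nose, eyes, ears, shoulders, elbows, wrists,
# hips, knees, ankles).
oks_sigma: [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072,
            0.072, 0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]
```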
🎯 Purpose & Impact
- The customization of OKS sigma values allows developers to tune pose estimation models more precisely for different datasets beyond COCO's human pose estimation.
- This change could improve model accuracy for users working with datasets that require different keypoint configurations, and improves the flexibility of the software.
Summary
Introduce parametrized OKS sigma values to enhance pose estimation customization and accuracy.
CLA Assistant Lite bot: All Contributors have signed the CLA. ✅
I have read the CLA Document and I sign the CLA
@Laughing-q what do you think of this idea?
Maybe I would put a warning message in case the number of keypoints you want to detect and the number of sigmas do not match, informing the user that YOLO will default to 1/n.
@PallottaEnrico @glenn-jocher yeah, allowing users to modify this arg could be very helpful. But I just don't know how much effect modifying this arg will have on users' training. I used to train a facial landmark (5 keypoints) model with the default 1/nkpt as oks-sigma and I got visually good results already.
If users get much better results after customizing this arg, then we should go with this idea; if there is no improvement, or only a slight one, then I probably wouldn't recommend adding this arg.
@PallottaEnrico I saw your comment here https://github.com/ultralytics/ultralytics/issues/3061#issuecomment-1661745896 — could you show us the visually better predictions you got in the M_0.05 experiment? And please tell me more about the unstable M_0.025 training you mentioned. Thanks!
Hi @Laughing-q, unfortunately I'm working with data that I'm strictly prohibited from sharing, so I can't show you the visual improvement. However, in my case I'm using keypoint detection to estimate the vertices of an object, and I need very high precision in localizing them, since this information will be used for 3D geometry purposes. Using the default 1/nkpt in my case led to very high metrics (99.2% mAP50-95) but, visually, the results were still not too good, and I wasn't able to reach higher metric scores by tuning other hyperparameters. As I said in #3061, testing this trained model with different sigmas gave these results (I'll write them again to explain better):
- 0.1 -> mAP 98.4%
- 0.05 -> mAP 93.5%
- 0.025 -> mAP 75.5%
Training M_0.05 led to 96.3% mAP which, evaluated at OKS 0.25, results in 99.4%, showing that using OKS 0.05 not only allows better precision but also makes changes in performance easier to appreciate.
A +2.8% (in the OKS 0.05 metric) is reflected in much more accurate keypoint localization.
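To make the sigma sensitivity discussed above concrete, here is a minimal sketch of COCO's per-object OKS (a rough illustration, not the Ultralytics implementation): each keypoint contributes exp(-d²/(2·s²·k²)), where d is the pixel distance, s² the object area, and k = 2σ, so halving σ sharply increases the penalty for the same pixel error.

```python
import numpy as np

def oks(pred, gt, sigmas, area):
    """Per-object OKS following the COCO definition (all keypoints assumed
    visible): mean over keypoints of exp(-d_i^2 / (2 * area * k_i^2)),
    with k_i = 2 * sigma_i. Smaller sigmas punish the same pixel error
    more heavily, which is why mAP drops as sigma shrinks."""
    d2 = np.sum((np.asarray(pred) - np.asarray(gt)) ** 2, axis=1)
    k = 2.0 * np.asarray(sigmas)
    return float(np.mean(np.exp(-d2 / (2.0 * area * k**2 + np.spacing(1)))))

gt = np.array([[10.0, 10.0], [20.0, 20.0]])
pred = gt + 1.0  # a 1-pixel error on both x and y for each keypoint
print(oks(pred, gt, sigmas=[0.05, 0.05], area=100.0))    # ~0.368
print(oks(pred, gt, sigmas=[0.025, 0.025], area=100.0))  # ~0.018, much stricter
```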
Regarding the unstable training of M_0.025, here is the training report:
I've tried training with different learning rates and scheduler settings, but it never goes above ~45%.
So I think it's important not to use an OKS sigma that is too small; it acts as a sort of label smoothing, as explained by Zhou et al., allowing for more stable training.
For me, the OKS values were THE game-changing parameter to tune :laughing:; whether to put them in the args or not depends on how much easy customization you want to provide.
I understand that this will probably be used only by more experienced users, but if you provide short, good documentation about it, more people will start tuning it depending on their task requirements.
@PallottaEnrico ok, so you did get much better results after modifying this arg; then I think it's worth adding :). @glenn-jocher
@PallottaEnrico thanks for the updates and @Laughing-q thanks for the review.
This all looks good, except that we've only had single variables as cfg values up until now, so adding a list may complicate other parts of the codebase like hyperparameter tuning.
Would it make sense to add a single OKS gain as a float hyperparameter, the way we have with, for example, the loss gains?
When you tune this array, do you tune each value independently or apply a single gain/bias to the entire vector?
@glenn-jocher in my case, since I'm detecting vertices of a "perfectly symmetric" object, I set a single value for all sigmas. I think that in general it's better to tune each sigma independently, as each keypoint can have different characteristics (indeed, COCO tuned the 17 keypoints independently).
But yes, that's a fair point; I didn't think about automated hyperparameter tuning procedures.
@PallottaEnrico in that case, perhaps one possible solution could be to allow both types of input. If a float value is provided in the yaml, then this value can be treated as a global adjustment/gain applied to all keypoints. On the other hand, if a list is provided, then we can assume that it is defining the individual sigma values for each keypoint. This way, we can maintain compatibility with hyperparameter tuning systems that expect a single value, while also providing the capability for more advanced users to tune each keypoint independently. This of course would need to be carefully coded and documented to ensure the functionality is clear to all users.
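A sketch of this dual-input idea (hypothetical helper name; assuming uniform 1/nkpt base sigmas when only a scalar gain is given):

```python
import numpy as np

def parse_kpt_sigmas(value, nkpt):
    """Hypothetical parser for the proposal above: a scalar acts as a global
    gain applied to uniform 1/nkpt base sigmas, while a list is taken as
    explicit per-keypoint sigma values."""
    if isinstance(value, (int, float)):
        return (np.ones(nkpt) / nkpt) * float(value)  # global gain
    sigmas = np.asarray(value, dtype=np.float64)
    if sigmas.size != nkpt:
        raise ValueError(f'expected {nkpt} sigmas, got {sigmas.size}')
    return sigmas  # per-keypoint values

print(parse_kpt_sigmas(0.5, nkpt=4))                    # [0.125 0.125 0.125 0.125]
print(parse_kpt_sigmas([0.1, 0.2, 0.3, 0.4], nkpt=4))   # [0.1 0.2 0.3 0.4]
```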
@glenn-jocher I thought about it, but there's a problem with that solution. The OKS values are used by both the loss function and the metrics computation, and doing automated hyperparameter tuning of a parameter that changes the reference metric doesn't make sense. The result would be that higher OKS values would be chosen, since they yield higher metrics. I think it would be better to exclude OKS from the automatically tunable parameters.
@PallottaEnrico Your point is well taken. Automated hyperparameter tuning would be biased towards higher OKS values as they correspond to higher metrics, which could lead to inflated performance measures. Excluding OKS from the automatically tunable parameters could be one solution. This would allow users to manually tune this parameter based on their specific use case and understanding of their detection tasks. We appreciate your thoughtful insights on this matter.
@glenn-jocher let me know if something else needs to be done
Codecov Report
Merging #4135 (37df1e4) into main (87ce15d) will decrease coverage by 0.01%. The diff coverage is 100.00%.
```diff
@@            Coverage Diff             @@
##             main    #4135      +/-   ##
==========================================
- Coverage   79.43%   79.42%   -0.01%
==========================================
  Files         107      107
  Lines       12368    12366       -2
==========================================
- Hits         9824     9822       -2
  Misses       2544     2544
```
| Flag | Coverage Δ | |
|---|---|---|
| Benchmarks | 39.19% <60.00%> (-0.01%) | :arrow_down: |
| Tests | 76.88% <100.00%> (-0.01%) | :arrow_down: |
Flags with carried forward coverage won't be shown.
| Files Changed | Coverage Δ | |
|---|---|---|
| ultralytics/utils/metrics.py | 91.23% <ø> (-0.03%) | :arrow_down: |
| ultralytics/models/yolo/pose/val.py | 75.22% <100.00%> (ø) | |
| ultralytics/utils/loss.py | 92.85% <100.00%> (-0.03%) | :arrow_down: |