Don't run on_validation_epoch_end() during sanity checking if val-accuracy-interval is greater than max epochs.
In https://deepforest.readthedocs.io/en/v1.4.1/user_guide/09_configuration_file.html#val-accuracy-interval, we tell users that to turn off the validation callback, they should set the interval to a value greater than the number of epochs.
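For example, a user following that advice might do something like the snippet below (the config keys are assumed from the linked page, not an excerpt from the docs):

```python
# Hypothetical user-side snippet; config keys assumed from the linked doc page.
from deepforest import main

m = main.deepforest()
m.config["train"]["epochs"] = 5
# Setting the interval above the epoch count is the documented way to
# "turn off" the extra validation metrics.
m.config["validation"]["val_accuracy_interval"] = 100
```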
https://github.com/weecology/DeepForest/blob/e5f3301f6a1c931bd41fad7ab06d17b3b191df3f/src/deepforest/main.py#L718
But that's not quite true: during sanity checking `self.current_epoch` is 0, so the check on the line above still passes. There is no reason, and it is probably dangerous, to touch the validation dataloader if we never intend to call it during training.
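To illustrate, here is a hypothetical reconstruction of the interval check, not the exact source at that line:

```python
# During the sanity check Lightning reports current_epoch == 0, and
# 0 % interval == 0 for any interval, so a modulo-based check still fires
# even when val_accuracy_interval is set far above max_epochs.
current_epoch = 0
val_accuracy_interval = 1000  # "disabled" per the docs, since it exceeds max_epochs

if current_epoch % val_accuracy_interval == 0:
    print("validation accuracy would still be computed here")
```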
More on sanity checking
https://lightning.ai/forums/t/validation-sanity-check/174
https://pytorch-lightning.readthedocs.io/en/0.9.0/api/pytorch_lightning.trainer.evaluation_loop.html
- We need to test, check, and document the current behavior.
- Update the docs with what is actually happening.
This issue should wait on the refactor of #858
Hi @bw4sz, I have an approach to this issue. Can you let me know if I've got it right?
Approach:
- Modify `on_validation_epoch_end()`:
  - Add a condition to skip validation when `val_accuracy_interval > max_epochs`.
  - This ensures that validation is disabled during both the sanity check and training when it's not needed (see the sketch after this list).
- Create tests to verify the behavior:
  - `test_validation_disabled()`: tests when validation should be disabled
  - `test_validation_enabled()`: tests when validation should be enabled
  - `test_sanity_check_validation()`: specifically tests sanity check behavior
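Roughly, the condition could look like the sketch below. This is a minimal illustration only; `SketchModule`, its attribute names, and the placeholder interval check are assumptions, not DeepForest's actual implementation.

```python
import pytorch_lightning as pl


class SketchModule(pl.LightningModule):
    """Illustrative only: shows the proposed early return, not DeepForest's code."""

    def __init__(self, val_accuracy_interval: int):
        super().__init__()
        self.val_accuracy_interval = val_accuracy_interval

    def on_validation_epoch_end(self):
        # An interval larger than the planned number of epochs means the user
        # never wants the extra validation metrics, so return before touching
        # the validation data. Because the sanity check runs before any
        # training epoch completes, this early return covers it as well.
        if self.val_accuracy_interval > self.trainer.max_epochs:
            return
        # Placeholder interval check (the real condition lives in main.py).
        if self.current_epoch % self.val_accuracy_interval == 0:
            pass  # compute validation accuracy here
```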
Does this approach address the issue correctly?
Hi @reddykkeerthi (ref: #1025) and @Abhishek-Dimri, Thank you both for your valuable contributions to the project! Since you’ve each brought useful ideas to this issue, I’d like to ask you to collaborate and submit a single, combined PR that reflects both of your approaches. Please take some time to review each other’s designs and work together toward a unified solution. Also, please do not close PRs #1025 or #1036. You can update their titles to include [Duplicate] or something similar, but keep them open for reference. Looking forward to your joint effort!
Hi @henrykironde, I think you might have meant to tag @reddykkeerthi instead of rhydham-nith.
@Abhishek-Dimri thanks
Hi @reddykkeerthi 👋
I used your validation condition logic as a base and added a tweak to skip validation during PyTorch Lightning’s sanity check phase, which was still triggering on_validation_epoch_end() unexpectedly.
Key Changes:
- Added a guard for `self.trainer.sanity_checking` (backed by the Lightning docs); a rough sketch is below this list
- Expanded test coverage:
  - `test_validation_skip_during_sanity_check`
  - `test_validation_runs_in_standalone_mode`
- Enhanced `test_validation_skipped_when_interval_exceeds_epochs`
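To make the intent concrete, here is a self-contained pytest-style sketch of that gating. `_GatedHook` and the trainer stand-in are hypothetical, used only so the example runs without a real Trainer; they are not DeepForest's actual module or tests.

```python
from types import SimpleNamespace


class _GatedHook:
    """Stand-in carrying only the gating logic under discussion."""

    def __init__(self, trainer, current_epoch, val_accuracy_interval):
        self.trainer = trainer
        self.current_epoch = current_epoch
        self.val_accuracy_interval = val_accuracy_interval

    def on_validation_epoch_end(self):
        # Guard discussed above: do nothing during Lightning's sanity check.
        if self.trainer.sanity_checking:
            return "skipped"
        # Guard from the original approach: interval beyond max_epochs means
        # the user never wants these metrics.
        if self.val_accuracy_interval > self.trainer.max_epochs:
            return "skipped"
        return "evaluated"


def test_validation_skip_during_sanity_check():
    trainer = SimpleNamespace(sanity_checking=True, max_epochs=10)
    hook = _GatedHook(trainer, current_epoch=0, val_accuracy_interval=1)
    assert hook.on_validation_epoch_end() == "skipped"


def test_validation_skipped_when_interval_exceeds_epochs():
    trainer = SimpleNamespace(sanity_checking=False, max_epochs=5)
    hook = _GatedHook(trainer, current_epoch=4, val_accuracy_interval=10)
    assert hook.on_validation_epoch_end() == "skipped"
```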
Your original structure was super helpful. Let me know if you'd like to collaborate further on the combined PR! Happy to iterate together.