transformers
transformers copied to clipboard
Fix bug with passing capture_* args to neptune callback
What does this PR do?
Fixes # (issue)
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
What does this PR do
This PR aims to fix a bug that appears when using NeptuneCallback with run=None and at least one of the capture_* params set.
Cause of the problem
This is due to the fact, that in one of the methods those params have hardcoded values, but the kwargs passed to the constructor are also passed to that method. Therefore a TypeError like this occurs during evaluation:
TypeError: neptune.metadata_containers.run.Run() got multiple values for keyword argument 'capture_stdout'
Solution
Before invoking the problematic method we want to make sure that the capture_* params are not present in the callback's state by temporarily removing them. Once the method returns, the original kwargs state is restored.
Thanks for opening this PR @AleksanderWWW! Let us know when you want a review 🤗
Thank you @amyeroberts !
Let me wait for my colleagues at Neptune to share their thoughts on my proposal :smiley:
Hey @amyeroberts
It seems that the integration_utils.py module has become very long and rather hard to navigate.
It's difficult to find what you're looking for and making changes in more than one place in one PR is a nightmare.
Would it be possible to split it into multiple modules separating different integrations?
Hi @AleksanderWWW, what kind of structure would you propose for the files?
@amyeroberts
I thought of something similar to what pytorch_lightning has: https://github.com/Lightning-AI/pytorch-lightning/tree/master/src/lightning/pytorch/loggers
The root could be the already existing integrations package, but each provider (e.g. Neptune, CometML, etc.) has its own module.
Does it make sense?
And btw. the PR is ready for review from the code owners' side :rocket:
@AleksanderWWW OK, sounds like a good idea. We'd have to make all objects importable from the integrations module for backwards compatibility. If you'd like to open a PR I'd be very happy to review :)
If you'd like to open a PR I'd be very happy to review :)
Will do :)
The failing CI test ModelUtilsTest.test_use_safetensors doesn't seem to have anything to do with my changes. Is it possible to maybe rerun it?
@amyeroberts will you merge it once the checks pass? :rocket:
@AleksanderWWW A fix has been merged into main which should resolve this. Could you try rebasing to include this and trigger an new CI run?
@amyeroberts Done :smiley: