xbatcher
xbatcher copied to clipboard
Refactor Pytorch dataset
Description of proposed changes
This PR introduces a number of changes to xbatcher's pytorch MapDataset class. These changes came out of the learnings while preparing my AMS talk (slides and blog post coming soon). The changes here allow for more control over how w
- includes concurrent loading of x/y batches with dask.compute()
- transforms now are expected to take an xarray object and return a torch.Tensor
Codecov Report
Attention: Patch coverage is 90.62500% with 3 lines in your changes missing coverage. Please review.
Project coverage is 97.46%. Comparing base (
43c9135) to head (e74de9e). Report is 17 commits behind head on main.
| Files | Patch % | Lines |
|---|---|---|
| xbatcher/loaders/torch.py | 90.62% | 1 Missing and 2 partials :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #202 +/- ##
==========================================
+ Coverage 96.25% 97.46% +1.21%
==========================================
Files 6 6
Lines 347 395 +48
Branches 82 92 +10
==========================================
+ Hits 334 385 +51
+ Misses 8 5 -3
Partials 5 5
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thanks for contributing these improvements @jhamman! It would be good to hit the test coverage target, so do you mind if I push additional tests as part of the review?
Sorry that I did not have time to review this yet! I am about to head out of office but would be glad to work on this with you after returning on March 5, or would welcome other reviewers in the meantime.
pre-commit.ci autofix
@jhamman, i pushed additional tests/fixes to your branch.. i hope you don';t mind :)
@maxrjones, curious to hear your feedback on this before I merge it into main when you get a chance
@weiji14 would you have any interest in reviewing this PR? I can if you don't have time or interest, I just expect that your expertise would be particularly beneficial on the PyTorch dataset revisions. It'd also be helpful to know if we need to expect downstream implications for zen3geo.
I'll be travelling most of the next two weeks (about to head to the airport actually), so probably won't get to this until September. If you can do the review @maxrjones, that would be great!
I'll be travelling most of the next two weeks (about to head to the airport actually), so probably won't get to this until September. If you can do the review @maxrjones, that would be great!
Sounds good, safe travels! @andersy005 I'll block some time to look at this on Friday.
@andersy005 I'll block some time to look at this on Friday.
I've had some other work come up today so unfortunately won't get a chance to review. I'll send an updated timeline next week, but feel welcome to proceed without me if you prefer.
this can definitely wait. so, no rush/worries @maxrjones