xarray Do not transpose 1d arrays during interpolation

Open Illviljan opened this issue 3 years ago • 8 comments

Seems a waste of time to transpose 1d arrays.

[ ] Closes #xxxx
[ ] Tests added
[ ] Passes pre-commit run --all-files
[ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
[ ] New functions/methods are listed in api.rst

Jun 27 '21 20:06 Illviljan

It's just a simple copy/paste job at the moment. I welcome suggestions for a more elegant solution.

Jun 27 '21 21:06 Illviljan

Unit Test Results

6 files, 6 suites, 52m 1s. 16 204 tests: 14 485 passed, 1 719 skipped, 0 failed. 90 420 runs: 82 260 passed, 8 160 skipped, 0 failed.

Results for commit fb68d59b.

:recycle: This comment has been updated with latest results.

Jun 27 '21 21:06 github-actions[bot]

Sorry this didn't get a response sooner, @Illviljan.

It looks good, without me having that much context. Do you have any info on whether this has any performance impact?

I imagine this isn't as easy as it sounds, but do you have a view on whether we could apply this concept more broadly, and make transposing 1D arrays a no-op in the transpose method, rather than writing that logic for each method that calls .transpose?

Jul 25 '21 18:07 max-sixty

Yes, this improves the interp performance a bit. The .copy in the transpose is rather slow, so it seems better to just not do it. An alternative would be removing the .copy in the transpose, or making it optional?
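To illustrate the idea of skipping the transpose for 1D input, here is a minimal sketch using plain NumPy. The helper name `transpose_if_needed` is hypothetical and is not taken from this PR or from xarray; it only shows the shape of the guard.

```python
import numpy as np


def transpose_if_needed(arr, axes):
    """Hypothetical helper: skip the transpose when it cannot change anything."""
    if arr.ndim < 2:
        # A 0-d or 1-d array has only one possible axis order,
        # so return the input untouched (no copy, no new object).
        return arr
    return np.transpose(arr, axes)


one_d = np.arange(5.0)
two_d = np.arange(6.0).reshape(2, 3)

same_object = transpose_if_needed(one_d, (0,)) is one_d
transposed_shape = transpose_if_needed(two_d, (1, 0)).shape
```

The 1D call returns the very same object, which is where the saving comes from; the 2D call still goes through the regular transpose.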

Jul 25 '21 19:07 Illviljan

~Yeah, I guess if someone does x = y.transpose() and then x[0] = 42, then y would be inconsistently updated. Don't think we're likely to get copy-on-write semantics soon!~

~So maybe this is the best we can hope for.~

Edit: actually is this already implemented? https://github.com/pydata/xarray/blob/main/xarray/core/variable.py#L1441-L1444. Does interpolate not hit this code path?

Is it worth adding an ASV benchmark? I've found it fairly quick to set up a new one, though it takes some lift to set up the environment etc. I think in general we should try to have them for performance work, so we can track whether it regresses.

Jul 25 '21 20:07 max-sixty

> Edit: actually is this already implemented? https://github.com/pydata/xarray/blob/main/xarray/core/variable.py#L1441-L1444. Does interpolate not hit this code path?

Yes, that's the copy that's slow and there's no real need to create a new copy in the 1D case. My (bad) idea was to just return the original array there instead, but as you noted that might not be fully intuitive.
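The aliasing concern behind "not fully intuitive" can be sketched with plain NumPy, where the transpose of a 1D array is a view on the same memory. This is only an analogy; it makes no claim about what xarray's own transpose returns.

```python
import numpy as np

y = np.arange(3)
x = y.T  # 1-D transpose in NumPy: same values, returned as a view

x[0] = 42  # writing through the view...
aliased = np.shares_memory(x, y)
value_seen_through_y = int(y[0])  # ...is visible through the original

z = y.copy()  # an explicit copy owns its own buffer
z[1] = -1
copy_is_independent = int(y[1]) == 1
```

Returning the original object (or a view) is the cheapest option, but a later in-place write then changes both names, which is the surprise discussed above.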

An ASV benchmark for 1D and ND interpolation, running in the CI, would be nice. I'm not really interested in implementing that, though. It would be just another thing for me to document, because I use other profilers anyway and, as you said, getting it set up requires some work that I don't enjoy.

Jul 25 '21 21:07 Illviljan

It's curious that's slow — it's not a deep copy and so should be fast (in python terms!), since it's just copying the class instance.
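As a rough illustration of the shallow versus deep distinction, using only the standard library: the `Wrapper` class below is a hypothetical stand-in, not xarray's Variable, and the sketch says nothing about how expensive the real copy is.

```python
import copy


class Wrapper:
    """Hypothetical stand-in for an array wrapper; not xarray's Variable."""

    def __init__(self, dims, data):
        self.dims = dims
        self.data = data


original = Wrapper(("x",), [1.0, 2.0, 3.0])
shallow = copy.copy(original)      # new instance, attributes shared
deep = copy.deepcopy(original)     # new instance, attributes duplicated

shallow_is_new_object = shallow is not original
shallow_shares_data = shallow.data is original.data
deep_shares_data = deep.data is original.data
```

A shallow copy still has to build a new instance and populate its attributes, so it is cheap but not free, whereas skipping the copy costs nothing.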

Totally understand re ASV — and more generally you should choose the most meaningful work for you. I hope you continue to become more involved with the project, and there'll be plenty of time to expand into other areas.

(though down a level, we should have some way of justifying the merge — let me know if you have any profiles to hand, no rush)

Thanks as ever @Illviljan !

Jul 26 '21 06:07 max-sixty

It probably is decently fast but it can't compare to just not running the .copy() code if a copy isn't necessary. :) But this isn't a major bottleneck either:

[image] Before: the transpose is one of the largest bottlenecks in missing.interp, where the .copy() is what takes time. missing.interp took 1.33s to run.

[image] After: the transpose obviously isn't a factor anymore since it isn't triggered. missing.interp took 1.24s to run.

Jul 26 '21 07:07 Illviljan
