earthaccess icon indicating copy to clipboard operation
earthaccess copied to clipboard

Opening virtual datasets (dmr-adapter)

Open ayushnag opened this issue 1 year ago • 2 comments

  • [x] Closes #605
  • [x] Fix scale_factor, add_offset bug
  • [x] Find precise permissions URL (not DAAC general)
  • [x] Add docs
  • [x] Unit tests. Update current test to test specific portions of the virtual dataset and also test the numpy/dask loaded dataset
  • [x] Check indirect access support
  • [ ] Update earthaccess documentation
  • [x] Use updated virtualizarr version (with numpy 2.0 manifest)

📚 Documentation preview 📚: https://earthaccess--606.org.readthedocs.build/en/606/

ayushnag avatar Jun 18 '24 23:06 ayushnag

This PR looks good to me (we need to fix some minor formatting issues with Ruff). Maybe the only missing thing would be a notebook demonstrating how to use this feature? @ayushnag

betolink avatar Jun 20 '24 14:06 betolink

https://gist.github.com/ayushnag/bcf946a71122f5e7a54bc72b581bd31b

Better viewing experience: https://nbviewer.org/gist/ayushnag/bcf946a71122f5e7a54bc72b581bd31b

If there's more you want me to add or if any step is unclear I can update the notebook

ayushnag avatar Jun 20 '24 17:06 ayushnag

Binder :point_left: Launch a binder notebook on this branch for commit b09c3f034ed0d4adc4c99554c26c4d763d38823f

I will automatically update this comment whenever this PR is modified

Binder :point_left: Launch a binder notebook on this branch for commit ed4a98b656ac7ae5392cde8700c8de2cf81aa22d

Binder :point_left: Launch a binder notebook on this branch for commit abe2c28d58a939d567261c17d30f914a8d0aa054

Binder :point_left: Launch a binder notebook on this branch for commit dde9c579a8c58d1a7abf6a60f27d982f3d5090f0

Binder :point_left: Launch a binder notebook on this branch for commit a9a9234386bc6b85badd9e7452e40bf9a8f828c4

Binder :point_left: Launch a binder notebook on this branch for commit 8fb01404fd24760ce7f078a84972bf0e405f02f3

Binder :point_left: Launch a binder notebook on this branch for commit 2c28278a22bca68c97cc2bf7a71f26ebcdcb04cf

Binder :point_left: Launch a binder notebook on this branch for commit c3e43ac5a2035b2fa2160d6c8aa3a1da7449eab5

Binder :point_left: Launch a binder notebook on this branch for commit d3d6e7c9abd280638971aac6b69408dc57312663

Binder :point_left: Launch a binder notebook on this branch for commit 512e89c48c479141d5a7c43ce023831b2c56108c

Binder :point_left: Launch a binder notebook on this branch for commit 67dbbe7b357efc3a32d2415fb0aafe634557d653

Binder :point_left: Launch a binder notebook on this branch for commit 61afb957e25980631ebe17cc98b35f22335bd936

github-actions[bot] avatar Nov 13 '24 00:11 github-actions[bot]

@ayushnag, perhaps I'm missing something here, but why have you copied code from the virtualizarr library into earthaccess? I don't see anything gained by this. We can simply use virtualizarr directly. Can you clarify?

chuckwondo avatar Nov 13 '24 13:11 chuckwondo

Yes I can clarify. At some point the dmrpp parser was going to be a part of earthaccess instead of virtualizarr. However that is not the case anymore and this PR will be updated to just call virtualizarr directly

ayushnag avatar Nov 13 '24 16:11 ayushnag

This PR is ready to review now. I have updated the pyproject.toml with an added dependency but let me know if the uv.lock or other files need to modified.

ayushnag avatar Nov 14 '24 19:11 ayushnag

I can take a look this evening, great work @ayushnag !!

betolink avatar Nov 14 '24 19:11 betolink

I think this PR is ready to be merged, there are many considerations to the future of virtual datasets with NASA data but this PR get us closer to a workflow where we won't have to generate metadata when there is a format already available. We need to produce a few examples and put it in the documentation, maybe using SWOT, ICESat-2 and TEMPO datasets cc @danielfromearth @DeanHenze

Is there anything we are missing @TomNicholas @ayushnag?

betolink avatar Dec 10 '24 17:12 betolink

@betolink One quick comment before we merge. This function doesn't appear to show up in the API reference with the docs build. I think if the new functions are imported with .api that should fix it?

ayushnag avatar Dec 10 '24 18:12 ayushnag

Although then the function has to be part of api.py. Maybe not in that case

ayushnag avatar Dec 10 '24 18:12 ayushnag

@betolink One quick comment before we merge. This function doesn't appear to show up in the API reference with the docs build. I think if the new functions are imported with .api that should fix it?

i think you're correct, I don't think the function needs to be part of API but it definitely needs to be imported, if we want the docs to show there. I think since we are including it also in the __init__.py it's already part of earthaccess., good catch! do you want to push one commit and see how the docs render?

betolink avatar Dec 10 '24 19:12 betolink

I think this PR is ready to be merged, there are many considerations to the future of virtual datasets with NASA data but this PR get us closer to a workflow where we won't have to generate metadata when there is a format already available. We need to produce a few examples and put it in the documentation, maybe using SWOT, ICESat-2 and TEMPO datasets cc @danielfromearth @DeanHenze

Is there anything we are missing @TomNicholas @ayushnag?

@betolink, do you think we should create a new issue for examples with open virtual dataset?

danielfromearth avatar Dec 12 '24 20:12 danielfromearth

@TomNicholas do you think after changing the virtualizarr version in pyproject.toml we can merge? The other changes seem more like long term updates to me. Or which changes do you think are required now?

ayushnag avatar Dec 12 '24 22:12 ayushnag

Just catching up here. Awesome work @ayushnag 🥇 !

Mikejmnez avatar Dec 16 '24 16:12 Mikejmnez