earthaccess Adding earthaccess catalog in Intake 2

I have written a little code that enables calling the earthaccess functions from within Intake. The point is that certain queries and dataset results could then be persisted in catalogs without having to keep code snippets around. Users still need to register and understand what the query parameters mean.

Do people here think this is a useful thing to do, and does the implementation look OK? Am I right in assuming that the DOI is the best unique identifier of a data product?

Nov 10 '23 21:11 martindurant

Nice! I haven't used Intake before, but excited to see more integrations :) What would using this look like?

Am I right in assuming that the DOI is the best unique identifier of a data product?

I think collection_concept_id is going to be the "best" unique identifier (as intended by the CMR API, not necessarily easiest-to-use). Under the hood, earthaccess is translating the doi query to a concept_id query by doing a collection search to get the concept_id.

https://github.com/nsidc/earthaccess/blob/7db2e59fb76d9eea87a343bbf2af505a57c43e10/earthaccess/search.py#L699-L702
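As a rough sketch of that DOI-to-concept-ID lookup (not the earthaccess internals linked above): the helper name `resolve_concept_id` is made up for illustration, and the search function is passed in as an argument so the snippet does not need a network connection or Earthdata login to run. It assumes the search function accepts a `doi=` keyword and returns objects with a `concept_id()` method, as `earthaccess.search_datasets` does.

```python
from typing import Any, Callable, Sequence


def resolve_concept_id(doi: str, search_datasets: Callable[..., Sequence[Any]]) -> str:
    """Return the collection concept ID for a DOI (hypothetical helper).

    `search_datasets` is expected to behave like earthaccess.search_datasets:
    called with doi=..., it returns a sequence of collection results, each
    exposing a concept_id() method.
    """
    collections = search_datasets(doi=doi)
    if not collections:
        raise LookupError(f"no collection found for DOI {doi!r}")
    # Use the first match; a DOI should identify a single collection.
    return collections[0].concept_id()
```

With earthaccess installed, this would be called as `resolve_concept_id(some_doi, earthaccess.search_datasets)`, and the returned ID could then be stored in a catalog instead of the DOI.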

Nov 10 '23 21:11 MattF-NSIDC

collection_concept_id is going to be the "best" unique identifier

Thanks, I'll use that.

The use pattern would be like

import intake.readers.catalogs
spec = intake.readers.catalogs.EarthdataCatalogReader(temporal=("2002-01-01", "2002-01-02"), ....)
cat = spec.read()
list(cat) # shows available identifiers, which all have metadata
reader = cat[<identifier>]
ds = reader.read() # outputs an xr.Dataset

Of course, the flow is almost exactly the same as what you already have, but the point is that spec and reader, with their parameters, can be saved in catalogs.
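To illustrate the "persist the parameters, not the code" idea in isolation, here is a plain-Python sketch. It is not Intake's catalog format or API: the file layout and the `save_query` / `load_query` helpers are invented for this example, and it only uses the standard library.

```python
import json
from pathlib import Path


def save_query(path, name, params):
    """Add or replace one named query in a JSON file of saved queries."""
    path = Path(path)
    saved = json.loads(path.read_text()) if path.exists() else {}
    saved[name] = params
    path.write_text(json.dumps(saved, indent=2))


def load_query(path, name):
    """Return the stored parameters for a named query.

    Note: JSON has no tuple type, so a tuple such as the temporal range
    comes back as a list.
    """
    return json.loads(Path(path).read_text())[name]
```

The loaded dictionary could then be passed back as keyword arguments to whatever performs the search, which is roughly what a saved catalog entry does for you.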

Nov 10 '23 21:11 martindurant

I am working with provisional ATL07/10 data, and would like to set up some access to our local repositories. These are pre-decisional data, and cannot be added for general access. I have been looking for instructions and/or tutorials on how to set up intake/earthaccess to access local files/repositories, but have not figured it out yet, so I thought I would ask here.

As a note, it has been 5+ years since I worked on setting up any intake catalogs, so pointers to instructions on setting this up would be helpful. I will be glad to post tutorials and instructions once I get this worked out, but I will first have to get permission for the public release.

Dec 06 '23 07:12 ebo

The general Earth catalog maker for Intake 2 is here: https://github.com/intake/intake/blob/745ebd42db371aa7d0f5d7d2ca8744103532819d/intake/readers/catalogs.py#L623

This calls earthaccess.search_datasets - so I don't know how you would change that to point to local resources.

Dec 06 '23 14:12 martindurant

Thanks! This gives me a place to start. I'll post something here if I find a workable solution.

Dec 06 '23 15:12 ebo
