earthaccess
Adding earthaccess catalog in Intake 2
I have written a little code that enables calling the earthaccess functions from within Intake. The point is that certain queries and dataset results can then be persisted in catalogs, without having to keep code snippets around. Users still need to register and to understand what the query parameters mean.
Do people here think this is a useful thing to do, and does the implementation look OK? Am I right in assuming that the DOI is the best unique identifier of a data product?
Nice! I haven't used Intake before, but excited to see more integrations :) What would using this look like?
Am I right in assuming that the DOI is the best unique identifier of a data product?
I think collection_concept_id is going to be the "best" unique identifier (as intended by the CMR API, not necessarily easiest-to-use). Under the hood, earthaccess is translating the doi query to a concept_id query by doing a collection search to get the concept_id.
https://github.com/nsidc/earthaccess/blob/7db2e59fb76d9eea87a343bbf2af505a57c43e10/earthaccess/search.py#L699-L702
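The DOI-to-concept-ID translation described above could be sketched roughly as follows. This is a minimal illustration, not the earthaccess implementation: `doi_to_concept_id` is a hypothetical helper, and the search function is passed in as an argument so the sketch runs without network access or credentials. It assumes the search function accepts a `doi` keyword and returns objects with a `concept_id()` method, which is how `earthaccess.search_datasets` and its results are understood to behave.

```python
from typing import Any, Callable, Sequence


def doi_to_concept_id(
    doi: str, search_datasets: Callable[..., Sequence[Any]]
) -> str:
    """Resolve a DOI to a collection concept ID via a collection search.

    `search_datasets` is injected by the caller; in real use it is
    assumed to be `earthaccess.search_datasets`.
    """
    # Run a collection-level search restricted to the given DOI.
    collections = search_datasets(doi=doi)
    if not collections:
        raise LookupError(f"no collection found for DOI {doi!r}")
    # Assumption: each result exposes concept_id(), as earthaccess
    # collection results are understood to do.
    return collections[0].concept_id()
```

In practice one would presumably call `doi_to_concept_id(some_doi, earthaccess.search_datasets)` and use the returned ID for later granule queries, so that the catalog stores the concept ID rather than the DOI.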
collection_concept_id is going to be the "best" unique identifier
Thanks, I'll use that.
The use pattern would be like
import intake.readers.catalogs
spec = intake.readers.catalogs.EarthdataCatalogReader(temporal=("2002-01-01", "2002-01-02"), ....)
cat = spec.read()
list(cat) # shows available identifiers, which all have metadata
reader = cat[<identifier>]
ds = reader.read() # outputs an xr.Dataset
Of course, the flow is nearly the same as the one you already have, but the point is that spec and reader, together with their parameters, can be saved in catalogs.
I am working with provisional ATL07/10 data, and would like to set up some access to our local repositories. These are pre-decisional data, and cannot be added for general access. I have been looking for instructions and/or tutorials on how to set up intake/earthaccess to access local files/repositories, but have not figured it out yet, so I thought I would ask here.
As a note, it has been 5+ years since I worked on setting up any intake catalogs, so pointers to instructions on setting this up would be helpful. I will be glad to post tutorials and instructions once I get this worked out, but I will first have to get permission for the public release.
The general Earth catalog maker for Intake 2 is here: https://github.com/intake/intake/blob/745ebd42db371aa7d0f5d7d2ca8744103532819d/intake/readers/catalogs.py#L623
This calls earthaccess.search_datasets, so I don't know how you would change that to point to local resources.
Thanks! This gives me a place to start. I'll post something here if I find a workable solution.