
LandCoverNet NA - ValidationError

bchewy15 opened this issue 1 year ago

Hello, I am having some issues when I try to download the LandCoverNet North America dataset. I have been following along with the LandCoverNet tutorial but keep hitting the same issue.

Environment:
- Ubuntu 18.04.6
- Python 3.8.0
- mlhub 0.5.2

My Code:

import os
from radiant_mlhub import Dataset

os.environ['MLHUB_API_KEY'] = 'apikey'
dataset = Dataset.fetch('ref_landcovernet_na_v1')

print(f'Title: {dataset.title}')
print(f'DOI: {dataset.doi}')
print(f'Citation: {dataset.citation}')
print('\nCollection IDs and License:')
for collection in dataset.collections:
    print(f'    {collection.id} - {collection.license}')
dataset.download()
ref_landcovernet_na_v1: fetch stac catalog: 89932KB [00:13, 6556.94KB/s]        
unarchive ref_landcovernet_na_v1.tar.gz: 100%|█| 562974/562974 [00:50<00:00, 112

---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In [3], line 1
----> 1 dataset.download()

File ~/.local/lib/python3.8/site-packages/radiant_mlhub/models/dataset.py:361, in Dataset.download(self, output_dir, catalog_only, if_exists, api_key, profile, bbox, intersects, datetime, collection_filter)
    347 config = CatalogDownloaderConfig(
    348     catalog_only=catalog_only,
    349     api_key=api_key,
   (...)
    358     temporal_query=datetime,
    359 )
    360 dl = CatalogDownloader(config=config)
--> 361 dl()

File ~/.local/lib/python3.8/site-packages/radiant_mlhub/client/catalog_downloader.py:740, in CatalogDownloader.__call__(self)
    738 # call each step
    739 for step in steps:
--> 740     step()
    742 # inspect the error report
    743 self.err_report.flush()

File ~/.local/lib/python3.8/site-packages/radiant_mlhub/client/catalog_downloader.py:282, in CatalogDownloader._create_asset_list_step(self)
    280             _handle_collection(stac_item)
    281         else:
--> 282             _handle_item(stac_item)
    283 log.info(f'{self._fetch_unfiltered_count()} unique assets in stac catalog.')

File ~/.local/lib/python3.8/site-packages/radiant_mlhub/client/catalog_downloader.py:233, in CatalogDownloader._create_asset_list_step.<locals>._handle_item(stac_item)
    231 n = 0
    232 for k, v in assets.items():
--> 233     rec = AssetRecord(
    234         collection_id=stac_item['collection'],
    235         item_id=item_id,
    236         asset_key=k,
    237         common_asset=k in COMMON_ASSET_NAMES,
    238         asset_url=v['href'],
    239         bbox_json=json.dumps(bbox) if bbox else None,
    240         geometry_json=json.dumps(geometry) if geometry else None,
    241         single_datetime=props.get('datetime', None),
    242         start_datetime=common_meta.get('start_datetime', None),
    243         end_datetime=common_meta.get('end_datetime', None),
    244     )
    245     asset_save_path = _asset_save_path(rec).relative_to(self.work_dir)
    246     rec.asset_save_path = str(asset_save_path)

File ~/.local/lib/python3.8/site-packages/pydantic/main.py:341, in pydantic.main.BaseModel.__init__()

ValidationError: 1 validation error for AssetRecord
single_datetime
  invalid type; expected datetime, string, bytes, int or float (type=type_error)

Any help would be appreciated, thank you.
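
For readers hitting the same traceback: the failure mode can be reproduced in isolation. The sketch below loosely mimics pydantic v1's datetime coercion (it is not the real pydantic code): a STAC item whose `datetime` property is missing or null passes `None` into `AssetRecord.single_datetime`, and the field only accepts a datetime, string, bytes, int, or float.

```python
from datetime import datetime, timezone

def coerce_datetime(value):
    # Loose sketch of pydantic v1's datetime coercion (not the real code):
    # datetimes pass through, numbers are treated as unix timestamps,
    # strings/bytes are parsed as ISO 8601; everything else is rejected.
    if isinstance(value, datetime):
        return value
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    if isinstance(value, (str, bytes)):
        text = value.decode() if isinstance(value, bytes) else value
        return datetime.fromisoformat(text.replace('Z', '+00:00'))
    raise TypeError('invalid type; expected datetime, string, bytes, int or float')

# A STAC item with a missing or null 'datetime' property triggers the failure:
props = {}  # hypothetical item properties lacking 'datetime'
try:
    coerce_datetime(props.get('datetime'))
except TypeError as exc:
    print(exc)
```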

bchewy15 avatar Sep 30 '22 13:09 bchewy15

Hi @bchewy15, thank you for bringing this to our attention. This is a known issue with the metadata in the STAC catalog for this dataset; I'm sorry for the inconvenience.

We are currently working as a team to update the metadata across all of our catalogs. I will prioritize fixing this one, so you should be able to download it within the next few days. I'll update you when it's ready.
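
In the meantime, since the traceback shows the tarball was already unarchived before validation failed, the extracted STAC items can be scanned locally to see how widespread the bad metadata is. This is a rough sketch, assuming the items are JSON `Feature` documents somewhere under the extracted directory (the directory name below is hypothetical):

```python
import json
from datetime import datetime
from pathlib import Path

def items_with_bad_datetime(catalog_dir):
    """Yield (path, value) for STAC items whose 'datetime' property would
    fail validation (i.e. is not a datetime, string, bytes, int or float)."""
    for item_file in Path(catalog_dir).rglob('*.json'):
        try:
            doc = json.loads(item_file.read_text())
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue
        if not isinstance(doc, dict) or doc.get('type') != 'Feature':
            continue  # catalogs and collections don't carry item datetimes
        value = doc.get('properties', {}).get('datetime')
        if not isinstance(value, (datetime, str, bytes, int, float)):
            yield item_file, value

catalog_dir = Path('ref_landcovernet_na_v1')  # hypothetical extracted location
if catalog_dir.is_dir():
    for path, value in items_with_bad_datetime(catalog_dir):
        print(path, repr(value))
```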

KennSmithDS avatar Oct 03 '22 17:10 KennSmithDS

Hello again,

Sorry to be a bother, but I was just wondering if there was any timeline for this? I'm not in any hurry as I've been practising by using another one of your datasets, but just figured I would touch base to see if there was an estimate for when this set will be available.

Thanks for all your hard work, Ben

bchewy15 avatar Oct 18 '22 19:10 bchewy15