Mirror So2Sat and implement download functionality

Open calebrob6 opened this issue 3 years ago • 4 comments

Currently, the So2Sat dataset can't be downloaded automatically because torchvision's download_url doesn't seem to support FTP URLs like "ftp://m1483140:[email protected]/training.h5".
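For illustration, here is a minimal sketch of what an FTP fallback could look like using only Python's standard library. It is not torchgeo or torchvision code: the helper names parse_ftp_url and download_ftp are hypothetical, and the sketch assumes the server accepts a plain login and serves the file in binary mode.

```python
from ftplib import FTP
from pathlib import Path
from urllib.parse import unquote, urlsplit


def parse_ftp_url(url):
    """Split an ftp:// URL into (host, port, user, password, remote_path)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError(f"expected an ftp:// URL, got {parts.scheme!r}")
    # Fall back to an anonymous login when the URL carries no credentials.
    user = unquote(parts.username) if parts.username else "anonymous"
    password = unquote(parts.password) if parts.password else ""
    return parts.hostname, parts.port or 21, user, password, unquote(parts.path)


def download_ftp(url, root):
    """Download one file over FTP into `root` and return the local path."""
    host, port, user, password, remote_path = parse_ftp_url(url)
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    local_path = root / Path(remote_path).name
    with FTP() as ftp:
        ftp.connect(host, port)
        ftp.login(user, password)
        with open(local_path, "wb") as f:
            # Stream the remote file to disk block by block.
            ftp.retrbinary(f"RETR {remote_path}", f.write)
    return local_path
```

A call would look like download_ftp("ftp://user:secret@ftp.example.org/data/training.h5", "data"), where the URL is a made-up placeholder. A sketch like this leaves a partial file behind if the transfer fails, which a real implementation would need to handle.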

The So2Sat authors are okay with having the dataset mirrored on a website like https://zenodo.org/. Once the dataset has been mirrored we can implement the download functionality as normal.
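As a rough sketch of what "download as normal" could mean once an HTTP(S) mirror exists: fetch the file, then verify it against a known MD5 checksum. This uses only the standard library rather than torchvision's helper, and the function names, the mirror URL, and the checksum are all placeholders, not real values for So2Sat.

```python
import hashlib
import urllib.request
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_from_mirror(url, root, filename, md5):
    """Download `url` to `root/filename` unless a verified copy already exists."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    path = root / filename
    # Skip the download when a file with the expected checksum is already there.
    if path.exists() and md5_of(path) == md5:
        return path
    urllib.request.urlretrieve(url, str(path))
    if md5_of(path) != md5:
        raise RuntimeError(f"checksum mismatch for {path}")
    return path
```

For example, download_from_mirror("https://mirror.example.org/training.h5", "data", "training.h5", "<expected md5>") would fetch the file on the first call and return immediately on later calls.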

(@mehmetgunturkun this would be a great contribution if you have the time)

calebrob6 avatar Feb 02 '22 16:02 calebrob6

Note that this also applies to SEN12MS. @calebrob6 did the SEN12MS authors give us permission for SEN12MS too?

adamjstewart avatar Feb 02 '22 17:02 adamjstewart

I haven't asked. However, SEN12MS is 510 GB, so it wouldn't be trivial to upload to Zenodo (we'd have to split it up), and it would take a long time to download over HTTP. Maybe it is better to make users understand what they are getting into in that case?
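To give a sense of what "splitting it up" might involve, here is a small sketch that cuts one large file into fixed-size parts. The split_file helper, the part naming scheme, and the part size are illustrative assumptions, and this says nothing about how SEN12MS is actually packaged or what size limits a host imposes.

```python
from pathlib import Path


def split_file(path, part_size, buffer_size=1024 * 1024):
    """Split `path` into numbered parts of at most `part_size` bytes each."""
    path = Path(path)
    parts = []
    index = 0
    with open(path, "rb") as src:
        while True:
            part_path = path.with_name(f"{path.name}.part{index:04d}")
            written = 0
            with open(part_path, "wb") as dst:
                # Copy in small buffers so a part never has to fit in memory.
                while written < part_size:
                    chunk = src.read(min(buffer_size, part_size - written))
                    if not chunk:
                        break
                    dst.write(chunk)
                    written += len(chunk)
            if written == 0:
                # The source is exhausted; drop the empty trailing part.
                part_path.unlink()
                break
            parts.append(part_path)
            index += 1
    return parts
```

Concatenating the parts in order reproduces the original file, so a downloader would fetch every part and join them before extracting.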

For reference:

  • The official So2Sat repo is here https://github.com/zhu-xlab/So2Sat-LCZ42
  • The official SEN12MS repo is here https://github.com/schmitt-muc/SEN12MS

calebrob6 avatar Feb 02 '22 17:02 calebrob6

Yes of course, it was the main reason why I asked! I'll take a look

mehmetgunturkun avatar Feb 02 '22 19:02 mehmetgunturkun

@wangyi111 and I talked about uploading these datasets to HF.

adamjstewart avatar Oct 17 '23 11:10 adamjstewart