FAIR1M url invalid

Open BarryTang22 opened this issue 10 months ago • 7 comments

Description

Hi Team,

The download links of the FAIR1M dataset are invalid, except for the first URL of the train split.

Steps to reproduce

Running the following code to download FAIR1M produces a 404: Not Found error:

from torchgeo.datasets import FAIR1M

dataset = FAIR1M(
    root=r"./FAIR1M",
    split="test",
    download=True
)

Version

0.6.2

BarryTang22 avatar Feb 16 '25 20:02 BarryTang22

Looks like the http://gaofen-challenge.com/ website is down too. Someone seems to have redistributed the dataset at https://huggingface.co/datasets/blanchon/FAIR1M. You can use that source for now, we'll likely redistribute on HF as well.

adamjstewart avatar Feb 16 '25 23:02 adamjstewart

I just emailed the FAIR1M authors, let's see if we get a response.

adamjstewart avatar Feb 16 '25 23:02 adamjstewart

> I just emailed the FAIR1M authors, let's see if we get a response.

I think the HF one is not a complete dataset. Thank you very much for contacting them!

BarryTang22 avatar Feb 16 '25 23:02 BarryTang22

Hi Team @adamjstewart, just checking whether the FAIR1M dataset redistribution you mentioned has happened. If so, could you share the link? Thanks!

cs-an avatar Apr 22 '25 14:04 cs-an

Still no response from the authors...

adamjstewart avatar Apr 22 '25 15:04 adamjstewart

Website is back up. Can you check if this issue still occurs?

adamjstewart avatar Oct 29 '25 19:10 adamjstewart

The FAIR1M dataset class is not trying to download from the Gaofen website but from Google Drive.

Torchgeo code:

    urls: ClassVar[dict[str, tuple[str, ...]]] = {
        'train': (
            'https://drive.google.com/file/d/1LWT_ybL-s88Lzg9A9wHpj0h2rJHrqrVf',
            'https://drive.google.com/file/d/1CnOuS8oX6T9JMqQnfFsbmf7U38G6Vc8u',
            'https://drive.google.com/file/d/1cx4MRfpmh68SnGAYetNlDy68w0NgKucJ',
            'https://drive.google.com/file/d/1RFVjadTHA_bsB7BJwSZoQbiyM7KIDEUI',
        ),
        'val': (
            'https://drive.google.com/file/d/1lSSHOD02B6_sUmr2b-R1iqhgWRQRw-S9',
            'https://drive.google.com/file/d/1sTTna1C5n3Senpfo-73PdiNilnja1AV4',
        ),
        'test': (
            'https://drive.google.com/file/d/1HtOOVfK9qetDBjE7MM0dK_u5u7n4gdw3',
            'https://drive.google.com/file/d/1iXKCPmmJtRYcyuWCQC35bk97NmyAsasq',
            'https://drive.google.com/file/d/1oUc25FVf8Zcp4pzJ31A1j1sOLNHu63P0',
        ),
    }

Of those, only the first train link is still alive.

I also checked the dataset website, https://gaofen-challenge.com/benchmark: the two Google Drive links there are dead, but the Baidu one still seems to be up.
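
For anyone who wants to re-check the links later, here is a minimal sketch of how the status of each URL could be probed using only the standard library. The helper names (`drive_file_id`, `http_status`, `check_links`) are hypothetical and not part of torchgeo. It also assumes that a dead Drive file answers with an HTTP error such as 404; a 200 response only means the page exists, not that the file can actually be downloaded (quota limits and similar cases are not detected).

```python
from typing import Callable, Dict, Iterable, Optional
from urllib.error import HTTPError
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def drive_file_id(url: str) -> str:
    """Return the file ID from a 'https://drive.google.com/file/d/<id>' URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path layout: file/d/<id>, optionally followed by /view
    if len(parts) < 3 or parts[:2] != ["file", "d"]:
        raise ValueError(f"not a Google Drive file URL: {url}")
    return parts[2]


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Fetch `url` and return its HTTP status code, or None if unreachable."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # e.g. 404 for a file that no longer exists
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def check_links(
    urls: Iterable[str],
    probe: Callable[[str], Optional[int]] = http_status,
) -> Dict[str, Optional[int]]:
    """Map each Drive file ID to the status reported by `probe`."""
    return {drive_file_id(url): probe(url) for url in urls}


if __name__ == "__main__":
    # The three 'test' split URLs quoted above.
    test_urls = (
        "https://drive.google.com/file/d/1HtOOVfK9qetDBjE7MM0dK_u5u7n4gdw3",
        "https://drive.google.com/file/d/1iXKCPmmJtRYcyuWCQC35bk97NmyAsasq",
        "https://drive.google.com/file/d/1oUc25FVf8Zcp4pzJ31A1j1sOLNHu63P0",
    )
    for file_id, status in check_links(test_urls).items():
        print(file_id, status)
```

The `probe` argument is only there so the check can be swapped out (for example with a stub when no network is available); by default it performs a real HTTP request.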

gatienc avatar Dec 08 '25 22:12 gatienc