Enhancing download_single_file to Handle Non-Existent Files

Open eggry opened this issue 9 months ago • 0 comments

https://github.com/chenyifanthu/THU-Cloud-Downloader/blob/8b610c4269ec0a27d335353bd42e5bd4c47b461d/thu_cloud_download.py#L89-L95

When the download_single_file function is given a URL that points to a non-existent file, it currently does not raise an error and proceeds as if it were downloading a regular file. As a result, the downloaded file is merely a webpage saying “文件不存在” ("File does not exist").

A notable challenge is that the server still responds with a status code of 200 instead of 404 in this case. An ugly-but-effective solution is to check for redirection in resp: a request for an existing file is redirected from files/?dl=1 to seafhttp/files/. In contrast, a Not Found response is served directly from files/?dl=1 without any redirection.

def download_single_file(url: str, fname: str, pbar: tqdm):
    global sess
    resp = sess.get(url, stream=True)
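    # An existing file is redirected to seafhttp/files/, so an empty
    # redirect history suggests the "file does not exist" page.
    if not resp.history:
        raise ValueError("File may not exist!")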
    with open(fname, 'wb') as file:
        for data in resp.iter_content(chunk_size=1024):
            size = file.write(data)
            pbar.update(size)
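
The same check could be pulled into a small helper so it can be tried without touching the network. This is only a rough sketch: looks_like_missing_file is a made-up name, the example.com URLs are placeholders, and the second check (the final URL containing /seafhttp/files/) is an assumption based on the redirect described above, not on documented server behaviour.

```python
from types import SimpleNamespace
from urllib.parse import urlparse


def looks_like_missing_file(resp) -> bool:
    """Heuristic: True if resp looks like the "file does not exist" page.

    resp is any object with .history and .url, such as a requests.Response.
    """
    # Observed in this issue: a missing file is served directly,
    # so there is no redirect history at all.
    if not resp.history:
        return True
    # Assumption: a real download always ends up under /seafhttp/files/.
    return "/seafhttp/files/" not in urlparse(resp.url).path


# Stand-in objects instead of real responses; the URLs are placeholders.
redirect = SimpleNamespace(url="https://example.com/d/abc/files/?p=/a.txt&dl=1")
found = SimpleNamespace(
    history=[redirect],
    url="https://example.com/seafhttp/files/token/a.txt",
)
missing = SimpleNamespace(
    history=[],
    url="https://example.com/d/abc/files/?p=/a.txt&dl=1",
)

print(looks_like_missing_file(found))
print(looks_like_missing_file(missing))
```

Inside download_single_file, the existing `if not resp.history:` test would then become `if looks_like_missing_file(resp):`, raised before the output file is opened so that no error page is written to disk.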

Wonder if there are more elegant solutions...

eggry avatar May 16 '24 08:05 eggry