weather-tools
Retry downloading a shard when file sizes are zero.
Sometimes, files downloaded by weather-dl are created in GCS, but contain zero bytes. To address this, we could:
- After a fetch completes, check the file size in the bucket. If it is zero, retry the download.
- When checking whether a file already exists and can be skipped, also check its size.
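The two checks above could look roughly like the sketch below. This is only an illustration, not weather-dl code: `should_skip`, `fetch_with_size_check`, and the `download`, `exists`, and `get_size` callables are hypothetical names standing in for whatever the tool uses to fetch a shard and to query the bucket.

```python
import time
from typing import Callable


def should_skip(exists: Callable[[str], bool],
                get_size: Callable[[str], int],
                target: str) -> bool:
    """Skip a shard only if the target exists AND is non-empty."""
    return exists(target) and get_size(target) > 0


def fetch_with_size_check(download: Callable[[str], None],
                          get_size: Callable[[str], int],
                          target: str,
                          max_attempts: int = 3,
                          backoff_seconds: float = 1.0) -> int:
    """Download `target`, verify its size, and retry on zero bytes.

    Returns the number of attempts used. Raises RuntimeError if the
    file is still empty after `max_attempts` downloads.
    """
    for attempt in range(1, max_attempts + 1):
        download(target)
        if get_size(target) > 0:
            return attempt
        # Wait a little longer before each retry (linear backoff),
        # but do not sleep after the final failed attempt.
        if attempt < max_attempts:
            time.sleep(backoff_seconds * attempt)
    raise RuntimeError(
        f"{target} is still zero bytes after {max_attempts} attempts")
```

Passing the size lookup in as a callable keeps the retry logic independent of the storage backend, so the same check could be exercised against a local file or an in-memory fake in tests.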
Workarounds: Users can use gsutil or equivalent tools to find all empty files. Then, they can delete these and re-download (with a separate invocation).
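For anyone who prefers to script the workaround, here is a minimal sketch of the "find all empty files" step. `find_empty_objects` is a hypothetical helper; it assumes only that each listed object exposes `name` and `size` attributes, as the `Blob` objects returned by `list_blobs` in the google-cloud-storage client do.

```python
from typing import Iterable, List


def find_empty_objects(blobs: Iterable) -> List[str]:
    """Return the names of all listed objects whose size is zero bytes."""
    return [blob.name for blob in blobs if blob.size == 0]


# Possible usage with google-cloud-storage (bucket name and prefix are
# placeholders, not values from this issue):
#
#   from google.cloud import storage
#   blobs = storage.Client().list_blobs("my-bucket", prefix="era5/")
#   for name in find_empty_objects(blobs):
#       print(name)
```

The printed names can then be deleted and the download re-run, as described above.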
@alxmrs I would like to work on this. Could you please assign it to me?
I think @fredzyda can help you with the assignment. I don't think anyone else is working on it right now, though, so feel free to take a crack at the implementation. :)
Before you do, check with @mahrsee to see if this is still an issue; it may have been fixed in wdl2.
We haven't observed this issue recently, but the check is not implemented in either WDL or WDLv2. Implementing it would serve as a great safety check, so we should proceed with it.
@nagavenkateshgavini,
Pointers for WDL:
- We can incorporate validation logic in fetch_data().
Pointers for WDLv2:
- We can incorporate validation logic in main().