gpt-2-output-dataset
Download script improvements
This PR:
- makes sure the download script doesn't clobber existing files that already look correct (same size as the remote file)
- ensures that 404 and other HTTP errors don't get written out as file contents (by raising an error when they occur)
- uses a larger read chunk size for a modest speed boost; there's no need to read and write 1000 bytes at a time
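
For illustration, the three changes can be sketched roughly as below. This is a minimal standard-library sketch, not the code in this PR: the names `needs_download`, `download`, and `CHUNK_SIZE`, and the 1 MiB chunk value, are hypothetical, and it assumes the server reports a `Content-Length` header.

```python
import os
import urllib.request

# Hypothetical chunk size (1 MiB); the value chosen in the PR may differ.
CHUNK_SIZE = 1 << 20


def needs_download(local_path, remote_size):
    """Return False only if a local file exists with the same size as the remote."""
    return not (
        os.path.exists(local_path) and os.path.getsize(local_path) == remote_size
    )


def download(url, local_path, chunk_size=CHUNK_SIZE):
    """Fetch url into local_path; return True if data was written, False if skipped."""
    # urlopen raises urllib.error.HTTPError for 404 and similar responses,
    # so an error page is never written to local_path.
    with urllib.request.urlopen(url) as resp:
        length = resp.headers.get("Content-Length")
        remote_size = int(length) if length is not None else None
        # Skip the transfer when the existing file already matches the remote size.
        if remote_size is not None and not needs_download(local_path, remote_size):
            return False
        with open(local_path, "wb") as f:
            while True:
                chunk = resp.read(chunk_size)
                if not chunk:
                    break
                f.write(chunk)
    return True
```

A size match is only a heuristic: it catches truncated downloads but not corrupted files of the right length, which is the "correct enough" trade-off described above.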
Sorry for getting to this so late. Could you resolve the conflicts? I'll take another look.
@WuTheFWasThat Rebased!