algorithmic-efficiency
Publish md5 hashes of datasets
Description
Is it possible to publish file hashes and directory layouts for all datasets after processing? I would like to run some checks to ensure that there are no discrepancies between the published datasets and the data my team has downloaded and processed.
The dataset layouts and final sizes are documented in datasets/README.md in the dropdown items saying "The final directory structure should look like this:".
Thanks, that's useful. Would it be possible to publish hashes of the files as well?
@chandramouli-sastry could you help close out this request? I have all the data from the setup scripts downloaded on kasimbeg-8 in /home/kasimbeg/data. The remaining work is to:
- Check the README for data setup to make sure the file structure matches and there are no additional files left from the download.
- Get the hashes for all of the datasets and add them to the data setup README.
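For the two steps above, here is a rough sketch of how the hashes could be generated and compared. It is not taken from the repo: the function names (`md5_of_file`, `build_manifest`, `compare_manifests`, `format_manifest`) are made up for illustration, and it assumes each dataset is a plain directory tree of regular files. It only uses `hashlib` and `pathlib` from the Python standard library.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of one file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root, POSIX separators) to its MD5 digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = md5_of_file(path)
    return manifest


def compare_manifests(expected, actual):
    """Return (missing, extra, mismatched) relative paths, each sorted.

    missing:    listed in expected but absent from actual
    extra:      present in actual but not listed in expected
                (e.g. leftover files from the download)
    mismatched: present in both, but with different digests
    """
    missing = sorted(set(expected) - set(actual))
    extra = sorted(set(actual) - set(expected))
    mismatched = sorted(
        rel for rel in set(expected) & set(actual) if expected[rel] != actual[rel]
    )
    return missing, extra, mismatched


def format_manifest(manifest):
    """Render the manifest as '<digest>  <relative path>' lines, like md5sum output."""
    return "\n".join(f"{digest}  {rel}" for rel, digest in sorted(manifest.items()))
```

Something like `print(format_manifest(build_manifest("/home/kasimbeg/data")))` would produce text that could be pasted into the data setup README, and anyone with their own copy could run `build_manifest` locally and pass both manifests to `compare_manifests`. The `extra` list should also help with the first item, since it flags files that were left over from the download.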