md5 hash displayed to user is wrong
What steps does it take to reproduce the issue?
See the dataset file page here.
- This page says the "Original File MD5" begins with `9e9be...`. But this is not true.
- The "Stata Binary (Original File Format)" file has an md5 hash beginning with `20ddc4...`.
- The "Tab-Delimited" file has an md5 hash beginning with `1f75c2...`.
- However, the "Tab-Delimited" file without the header row (`cat file.tab | tail --lines=+2 | md5sum`) has an md5 hash of `9e9be...`.
This is a bug because it will lead users to believe that they downloaded a corrupted file.
There are two parts to this bug: the incorrect label, and the header row being cut off before hashing. The label should be "Tab-Delimited File MD5", not "Original File MD5". Cutting off the header row is more interesting. Why does Dataverse send one file to the user, but hash a transformed version of that file?
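For anyone who wants to repeat the header-row check without `tail` and `md5sum`, here is a minimal Python sketch of the same comparison. The function name `md5_with_and_without_header` and the file name `file.tab` are placeholders of mine, not anything Dataverse provides, and the sketch assumes the header is exactly the first line of the downloaded tab-delimited file.

```python
import hashlib
import sys


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def md5_with_and_without_header(path: str) -> tuple[str, str]:
    """Hash a file as downloaded, and again with its first line removed.

    Dropping everything up to and including the first newline mirrors
    `tail --lines=+2` (assumption: the header is exactly one line).
    """
    with open(path, "rb") as f:
        data = f.read()
    _, _, body = data.partition(b"\n")
    return md5_hex(data), md5_hex(body)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_md5.py file.tab
    full, headerless = md5_with_and_without_header(sys.argv[1])
    print("as downloaded:     ", full)
    print("without header row:", headerless)
```

If the second digest matches the value shown on the file page and the first does not, that reproduces the mismatch described above.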
- When does this issue occur?
Unknown
- Which page(s) does it occur on?
The Stata files of this dataset that I checked by hand.
Which version of Dataverse are you using?
The one hosted at https://dataverse.harvard.edu/, 5.13 build 1244-79d6e57