ignore shapefiles if they are under a hidden directory in the zip file
What this PR does / why we need it: Zip files containing shapefiles under hidden directories should be labelled as a "ZIP Archive". Labelling them as a "Shapefile as ZIP Archive" might confuse anyone looking to download the data.
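The rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function names, the treatment of any dot-prefixed directory as "hidden", and the set of extensions used to recognise a shapefile are all assumptions made for the example.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Assumption for this sketch: a shapefile is recognised by these three sibling files.
SHAPEFILE_EXTENSIONS = {".shp", ".shx", ".dbf"}


def under_hidden_directory(entry_name):
    """True if any directory component of the zip entry starts with a dot."""
    directories = PurePosixPath(entry_name).parts[:-1]
    return any(part.startswith(".") for part in directories)


def label_for_zip(zip_bytes):
    """Return the label a zip would get if hidden directories are ignored."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        visible = [
            name for name in zf.namelist()
            if not name.endswith("/") and not under_hidden_directory(name)
        ]
    # Group the extensions found for each base path, e.g. "data/roads".
    extensions_by_base = {}
    for name in visible:
        path = PurePosixPath(name)
        base = str(path.with_suffix(""))
        extensions_by_base.setdefault(base, set()).add(path.suffix.lower())
    has_shapefile = any(
        SHAPEFILE_EXTENSIONS <= found for found in extensions_by_base.values()
    )
    return "Shapefile as ZIP Archive" if has_shapefile else "ZIP Archive"


def make_zip(entry_names):
    """Hypothetical helper: build an in-memory zip with empty entries."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name in entry_names:
            zf.writestr(name, b"")
    return buffer.getvalue()


hidden = make_zip([".cache/roads.shp", ".cache/roads.shx", ".cache/roads.dbf", "notes.txt"])
visible = make_zip(["data/roads.shp", "data/roads.shx", "data/roads.dbf"])
print(label_for_zip(hidden))
print(label_for_zip(visible))
```

In this sketch only directory components are checked, so a shapefile in a visible directory is still detected even when the same archive also contains hidden directories.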
Which issue(s) this PR closes: SPIKE: Improve how Dataverse labels shapefiles to prevent mislabelling of zip files that aren't shapefiles #8945
Closes #8945
Special notes for your reviewer:
Suggestions on how to test this: Run zip test.zip src/test/resources/hiddenShapefiles.zip, then upload the resulting file, which contains shapefile data under a hidden directory. It was previously shown as 'Shapefile as ZIP Archive' and now shows as 'ZIP Archive'. Then upload a double zip file with shapefiles in a visible directory and check that it shows as 'Shapefile as ZIP Archive'.
Does this PR introduce a user interface change? If mockups are available, please link/include them here: No
Is there a release notes update needed for this change?: Included
Additional documentation: None
coverage: 20.594% (+0.02%) from 20.574% when pulling 7b9319eea4f26ad7d2b6986cfdc1976b7f433e9a on 8945-prevent-mislabelling-non-shapefiles-in-zip into 5bf6b6defb1c22971951233f30b679d762496832 on develop.
:package: Pushed preview images as
ghcr.io/gdcc/dataverse:8945-prevent-mislabelling-non-shapefiles-in-zip
ghcr.io/gdcc/configbaker:8945-prevent-mislabelling-non-shapefiles-in-zip
:ship: See on GHCR. Use by referencing with full name as printed above, mind the registry name.
coverage: 20.594% (+0.02%) from 20.574% when pulling 2eea8e6a46cf22273ea2f272ac87c4517ed444e1 on 8945-prevent-mislabelling-non-shapefiles-in-zip into 5bf6b6defb1c22971951233f30b679d762496832 on develop.
Hi @stevenwinship. After this improvement makes its way to Harvard Dataverse, would I be able to change the label of a file that was labelled as "Shapefile as ZIP Archive", like the file in the dataset at https://doi.org/10.7910/DVN/HWVUER?
Maybe with the redetect file type API endpoint?
coverage: 20.594% (+0.02%) from 20.574% when pulling 1b3e312572ef922572fc4df20e121169de2b929e on 8945-prevent-mislabelling-non-shapefiles-in-zip into 5bf6b6defb1c22971951233f30b679d762496832 on develop.
Yes. I just tested the redetect endpoint, and after exiting the UI and going back in, the label changed to 'ZIP Archive'. I'm not sure why I had to exit and come back in, but at least the label is now correct.
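For anyone wanting to try the same thing, here is a minimal sketch of building the redetect call. The server URL, file id and API token are placeholders, and the helper name is made up for this example. The request shape (POST to /api/files/{id}/redetect with a dryRun query parameter and an X-Dataverse-key header) follows the native API guide as I understand it, so please check the guide for your Dataverse version before relying on it.

```python
import urllib.request


def build_redetect_request(server_url, file_id, api_token, dry_run=True):
    """Build (but do not send) a request asking Dataverse to redetect a file's type.

    With dry_run=True the server is asked to report the detected type
    without saving it.
    """
    flag = "true" if dry_run else "false"
    url = f"{server_url.rstrip('/')}/api/files/{file_id}/redetect?dryRun={flag}"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"X-Dataverse-key": api_token},
    )


# Placeholder values; replace with a real server, file id and token.
request = build_redetect_request("https://demo.dataverse.example", 42, "xxxxxxxx")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Running with dryRun=true first shows what the new label would be before anything is changed.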