datahub
Clean up possibly redundant files in biology hub home directories
I think a semester or so ago, people might have had to upload individual copies of a large dataset to the biology hub, as we didn't have shared directory functionality at that point. We do now! We should investigate whether this is the case and, if so, clean up the old files somehow.
TODO
- [ ] Determine whether there are many duplicated large files in the biology hub
- [ ] Figure out if they can be deleted
- [ ] Determine if we need to give notice to users
- [ ] Delete those files
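For the first item in the list above, here is a minimal sketch of one way to look for duplicated large files. It assumes we can walk the home directories from a single root on the storage server. The function names and the 100 MiB threshold are hypothetical, not anything the hub already provides. It groups files by size first and only hashes files that share a size, so most files are never read.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_large_files(root, min_size=100 * 1024 * 1024):
    """Map (size, digest) -> sorted list of paths with identical content.

    Only regular files of at least `min_size` bytes are considered.
    The default threshold is an assumption; tune it to the datasets in question.
    """
    # Pass 1: group candidate files by size (cheap, no file reads).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so shared data is not counted twice
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file
            if size >= min_size:
                by_size[size].append(path)

    # Pass 2: hash only the files that share a size with another file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_digest[(size, file_digest(path))].append(path)
            except OSError:
                continue

    return {key: sorted(paths) for key, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicate_large_files(root)` on whichever directory holds the biology hub's user homes (the actual path is not stated here) returns the groups of identical files. It only reports; nothing is deleted.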
@petersudmant do you remember if this duplication happened before?
-
I don't remember, but if you point me to the data I'd be happy to look. FWIW, all the datasets we are currently using are in the shared folder.
-
The 2nd half of the class, which starts in 2 weeks, will involve more data, but we will be starting from scratch, so everything that's not in those shared folders can be deleted.
-
The students will be starting their projects too, which will involve more data. I think it would be lovely to budget at least 250 GB for the whole class for our use (though 2-4x that would be phenomenal). What I think we can "plan" on, though, is that all data students download for projects this semester will be deleted, perhaps at the start of next semester.
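To keep an eye on that budget, something like the following could work. It is a rough sketch: the function names are hypothetical, 250 GB is taken as 250 × 10^9 bytes, and it sums apparent file sizes rather than blocks actually used on disk.

```python
import os


def directory_usage_bytes(root):
    """Sum the apparent sizes of all regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count symlinked shared data
            try:
                total += os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file
    return total


def budget_report(used_bytes, budget_gb=250):
    """Compare usage with a class-wide budget given in decimal gigabytes."""
    budget_bytes = budget_gb * 10**9
    return {
        "used_gb": used_bytes / 10**9,
        "budget_gb": budget_gb,
        "over_budget": used_bytes > budget_bytes,
    }
```

Running `budget_report(directory_usage_bytes(root))` over the class's directories would show how close we are to the limit before the end-of-semester cleanup.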