Limit the amount of data depositors may upload every day
Greetings,
There are already several topics discussing the possibility of limiting data input, whether on a dataset or user basis:
- #938 – Storage metering: Allow tracking and limiting total and individual file upload space per user/group
- #3939 – File and dataset limits: Add a programmatic way to limit file size and dataset size
- #4339 – Storage allocation quota
We were also wondering whether it would be worthwhile to cap the amount of data an individual depositor may deposit in a single day. This would counter the risk of mischievous users trying to fill up a Dataverse installation's storage capacity.
Hey @BPeuch - this is a challenging problem to solve, and as you linked there have been a few different proposals for how to handle it.
I've thought that the best option would be to address this from a technology optimization and monitoring perspective instead of building in limits at the Dataverse Collection, Dataset, or User level. I think building in these limits could lead people to work around them and not follow best practices for data deposit, whereas better monitoring/alerting could notify curation teams and let them intervene during the deposit process.
Your mention of daily limits is interesting, though. This could allow the block to happen while also notifying the curation/admin team so they can get involved and discuss with the depositor. A person could of course deposit slowly over time to get around the limit, but the combination of a daily limit (a temporary stop) and a notification may be a good option here.
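To make the "daily limit plus notification" idea concrete, here is a minimal sketch in Python. It is not Dataverse code and does not reflect any existing Dataverse API: the class `DailyUploadLimiter`, its `try_upload` method, and the `notify` callback are all hypothetical names chosen for illustration, and usage is kept in memory rather than in a database.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Dict, Tuple


@dataclass
class DailyUploadLimiter:
    """Hypothetical per-user daily upload cap with an admin notification hook."""

    daily_limit_bytes: int
    # Called as notify(user_id, bytes_already_used_today, bytes_requested)
    # whenever an upload is refused, so curators can follow up.
    notify: Callable[[str, int, int], None]
    # Bytes accepted so far, keyed by (user, calendar day).
    _usage: Dict[Tuple[str, date], int] = field(default_factory=dict)

    def try_upload(self, user_id: str, size_bytes: int, today: date) -> bool:
        """Return True and record the upload if it fits today's allowance."""
        key = (user_id, today)
        used = self._usage.get(key, 0)
        if used + size_bytes > self.daily_limit_bytes:
            # Temporary stop: refuse the upload and alert the admin team.
            self.notify(user_id, used, size_bytes)
            return False
        self._usage[key] = used + size_bytes
        return True
```

Because usage is keyed by calendar day, the block lifts on its own the next day, which matches the "temporary stop" described above. A real implementation would also need persistent storage and a decision on which time zone defines a day.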
Hey @djbrooke thanks for your quick reaction!
I am not as skilled yet in general computer science as I wish I were, so I certainly trust your judgment 😶 I understand things can (or even should) be done either at the software or at the higher/broader infrastructure level 🔧