flatdata
Add clean/delete function to ArchiveBuilder(s)
Often the user wants to overwrite an existing archive, and we should provide safe methods to do so:
- check that only flatdata files are in the directory
- only then delete it
This could either be another parameter to `open`, or another function like `remove`, etc.
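The two steps above could be sketched roughly as follows. This is only an illustration using the Rust standard library, not an existing flatdata API: `looks_like_flatdata_archive` and `remove_archive_if_safe` are hypothetical names, and the check assumes that every resource file `name` is stored next to a `name.schema` file, which should be verified against the actual on-disk layout.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns true if every file under `dir` looks like a flatdata file.
/// ASSUMPTION: each resource `name` has a sibling `name.schema` (and vice
/// versa). Subdirectories are treated as sub-archives and checked recursively.
fn looks_like_flatdata_archive(dir: &Path) -> io::Result<bool> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        if path.is_dir() {
            if !looks_like_flatdata_archive(&path)? {
                return Ok(false);
            }
            continue;
        }
        let name = entry.file_name().to_string_lossy().into_owned();
        // A schema file needs its resource, a resource needs its schema file.
        let partner = match name.strip_suffix(".schema") {
            Some(resource) => dir.join(resource),
            None => dir.join(format!("{name}.schema")),
        };
        if !partner.is_file() {
            return Ok(false);
        }
    }
    Ok(true)
}

/// Deletes `dir` only if it passes the check above.
/// Returns Ok(false) if there was nothing to delete.
fn remove_archive_if_safe(dir: &Path) -> io::Result<bool> {
    if !dir.exists() {
        return Ok(false);
    }
    if !looks_like_flatdata_archive(dir)? {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "directory contains files that do not look like flatdata resources",
        ));
    }
    fs::remove_dir_all(dir)?;
    Ok(true)
}
```

A builder could call something like `remove_archive_if_safe` before creating the archive, so that a directory containing unrelated files is never deleted by accident.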
Hi, we use flatdata in Rust and are also interested in this particular feature. We create in-memory resource storages and subdirectories, and given a flatdata storage we would like to be able to replace the existing one with a new one (at the same path).
Currently I can't find a way either to edit/change the in-memory storage (the BTreeMap keys) or to create a new archive builder on an already existing storage without calling the builder's `new`, which internally calls `create_archive`, which in turn raises an error when the storage already exists at the given path.
Even just being able to overwrite an existing storage with empty data without raising an error would be a nice addition. Otherwise the only workaround that comes to mind is to not use the `subdir` feature and create/drop a new in-memory storage every time, that is, to handle the archive storage structure outside of this library.
Please let me know if there are better workarounds, thanks!
One thing to consider with regard to flatdata and Rust's memory safety guarantees is that Rust requires that the contents of memory-mapped flatdata resources do not change as long as they are open (since they are marked as `const`, not `mut`).
This means that if you want to "release" new data for a sub-archive you need to delete the old files (which will be kept alive by the OS as long as they are still memory mapped), and create new files in their place, and then open a new flatdata archive instance in the same location (which would load the new files). Any existing instance of the flatdata archive already loaded would still see the old deleted files (since they are kept alive by the OS).
The way I have seen this usually handled is to move the data that is regularly updated into a separate (top-level) archive, and handle it in a transactional/copy-on-write filesystem/DB layer, e.g. have a folder with multiple "in use" versions of an archive, regularly publish new data, and delete unused versions, and have the "consumer" of the data regularly check for updates, re-opening archives when needed.
A more concrete example:
You might have map data in a main flatdata archive `Map`, and then a folder with an archive `TrafficOverlay`. A `traffic-producer` could regularly publish new `TrafficOverlays` into that folder (and clean up unused versions), while a `map-rendering-service` could regularly check the folder for new data and replace the `TrafficOverlay` archive it has loaded in memory with a new one, while all the other threads of the `map-rendering-service` continue processing, until they fetch a new version for the next request.
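A minimal sketch of such a versioned folder, using only the Rust standard library. All names here (`publish_version`, `latest_version`, `remove_versions_older_than`, the numeric folder names and the `.staging-` prefix) are made up for illustration and are not part of flatdata. In a real setup the `build` closure would presumably write the archive with flatdata's file storage and the generated builder, and the consumer would open a new archive instance from the returned path.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Producer side: build the new version in a staging folder, then rename it
/// into place so consumers never see a half-written archive.
fn publish_version(
    root: &Path,
    version: u64,
    build: impl FnOnce(&Path) -> io::Result<()>,
) -> io::Result<PathBuf> {
    let staging = root.join(format!(".staging-{version}"));
    let target = root.join(version.to_string());
    fs::create_dir_all(&staging)?;
    build(&staging)?;
    // Fails if `target` already exists and is not empty.
    fs::rename(&staging, &target)?;
    Ok(target)
}

/// Consumer side: find the highest published version, ignoring staging folders.
fn latest_version(root: &Path) -> io::Result<Option<(u64, PathBuf)>> {
    let mut latest: Option<(u64, PathBuf)> = None;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        if let Ok(version) = entry.file_name().to_string_lossy().parse::<u64>() {
            if latest.as_ref().map_or(true, |(best, _)| version > *best) {
                latest = Some((version, entry.path()));
            }
        }
    }
    Ok(latest)
}

/// Cleanup: delete every published version below `oldest_kept`.
/// Files that are still memory mapped stay alive until they are unmapped.
fn remove_versions_older_than(root: &Path, oldest_kept: u64) -> io::Result<()> {
    let mut stale = Vec::new();
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        if let Ok(version) = entry.file_name().to_string_lossy().parse::<u64>() {
            if version < oldest_kept {
                stale.push(entry.path());
            }
        }
    }
    for path in stale {
        fs::remove_dir_all(path)?;
    }
    Ok(())
}
```

The consumer would call `latest_version` periodically and re-open the archive only when the returned version number changes, so already opened instances keep reading the old files until they are dropped.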
That's why I suspect that this feature might not help you much in achieving what you want to achieve. It is more useful for local development: You are building one flatdata archive after another and do not want to `rm -r my_archive` all the time.
Thanks for your reply.
Yes, in my scenario of "local development" I cannot `rm -r my_archive` because there is no such file: we use in-memory storages. To give a more concrete example, let's say that
- I create a new in memory storage
let storage = MemoryResourceStorage::new("/in-memory");
- I then create a subdirectory
let storage_sub = storage.subdir("subdir_name");
- I create an archive builder
let builder = MyArchiveBuilder::new(storage_sub).unwrap();
Given `storage` as input, what is now the best way to create a new archive builder for the same in-memory subdirectory of the original archive (with all the data that was previously written to that subdirectory cleaned)?
As mentioned above, constructing the archive builder again raises the error `Io(Custom { kind: AlreadyExists })`