Add clean/delete function to ArchiveBuilder(s)

Open VeaaC opened this issue 4 years ago • 3 comments

Often the user wants to overwrite an existing archive, and we should provide a safe way to do so:

  • check that only flatdata files are in the directory
  • only then delete it

This could be either another parameter to open or a separate function such as remove.
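
A minimal sketch of the "check, then delete" idea as a standalone helper, using only the Rust standard library. It is not part of flatdata: the name remove_archive_dir and the is_archive_file predicate are hypothetical, and which file names actually belong to an archive is left to the caller.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Removes `dir` only if every entry in it is a regular file accepted by
/// `is_archive_file`. The predicate is a placeholder (an assumption of this
/// sketch): the caller decides which file names belong to an archive.
pub fn remove_archive_dir<F>(dir: &Path, is_archive_file: F) -> io::Result<()>
where
    F: Fn(&Path) -> bool,
{
    // First pass: verify everything before deleting anything.
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        if !entry.file_type()?.is_file() || !is_archive_file(&path) {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!(
                    "refusing to delete {}: unexpected entry {}",
                    dir.display(),
                    path.display()
                ),
            ));
        }
    }
    // Second pass: only accepted files were found, so remove the directory.
    fs::remove_dir_all(dir)
}
```

Subdirectories are rejected rather than checked recursively, so nested archives would need a recursive variant of the same check.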

VeaaC avatar May 15 '20 09:05 VeaaC

Hi, we use flatdata in Rust and are also interested in this particular feature. We create in-memory resource storages and subdirectories, and given a flatdata storage we would like to be able to replace the existing one with a new one (at the same path).

Currently I can't find a way to either edit the in-memory storage (its BTreeMap keys) or create a new archive builder on top of an already existing storage. The only option is to call the builder's new, which internally calls create_archive, and that raises an error when the storage already exists at the given path.

Even just being able to overwrite an existing storage with empty data without raising an error would be a nice addition. Otherwise, the only workaround that comes to mind is to avoid the subdir feature and create/drop a new in-memory storage every time, that is, to handle the archive storage structure outside of this library.

Please let me know if there are better workarounds, thanks!

gliderkite avatar Jun 27 '23 12:06 gliderkite

One thing to consider with regard to flatdata and Rust's memory safety guarantees is that Rust requires that the contents of memory-mapped flatdata resources do not change as long as they are open (since they are marked as const, not mut).

This means that if you want to "release" new data for a sub-archive you need to delete the old files (which will be kept alive by the OS as long as they are still memory mapped), and create new files in their place, and then open a new flatdata archive instance in the same location (which would load the new files). Any existing instance of the flatdata archive already loaded would still see the old deleted files (since they are kept alive by the OS).

The way I have seen this usually handled is to move the data that is regularly updated into a separate (top-level) archive, and handle it in a transactional/copy-on-write filesystem/DB layer, e.g. have a folder with multiple "in use" versions of an archive, regularly publish new data, and delete unused versions, and have the "consumer" of the data regularly check for updates, re-opening archives when needed.
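
A rough sketch of such a versioned folder, using only the Rust standard library. The v<N> directory naming, the staging directory, and the function names are assumptions made for illustration; writing the archive itself is left to the write_archive callback.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Parses `v<N>` directory names; anything else is ignored.
fn parse_version(path: &Path) -> Option<u64> {
    path.file_name()?.to_str()?.strip_prefix('v')?.parse().ok()
}

/// Publishes a new version: `write_archive` fills a staging directory, which
/// is then renamed to `<root>/v<version>` so consumers never see a partial one.
pub fn publish_version<F>(root: &Path, version: u64, write_archive: F) -> io::Result<PathBuf>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let staging = root.join(format!(".staging-{}", version));
    let target = root.join(format!("v{}", version));
    fs::create_dir_all(&staging)?;
    write_archive(&staging)?;
    // A rename within one filesystem is atomic on POSIX systems.
    fs::rename(&staging, &target)?;
    Ok(target)
}

/// Returns the highest published version in `root`, if any.
pub fn latest_version(root: &Path) -> io::Result<Option<(u64, PathBuf)>> {
    let mut latest: Option<(u64, PathBuf)> = None;
    for entry in fs::read_dir(root)? {
        let path = entry?.path();
        if let Some(v) = parse_version(&path) {
            if latest.as_ref().map_or(true, |(best, _)| v > *best) {
                latest = Some((v, path));
            }
        }
    }
    Ok(latest)
}

/// Deletes every published version older than `keep_from`.
pub fn remove_older_than(root: &Path, keep_from: u64) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let path = entry?.path();
        if parse_version(&path).map_or(false, |v| v < keep_from) {
            fs::remove_dir_all(&path)?;
        }
    }
    Ok(())
}
```

A consumer would poll latest_version and re-open the archive whenever the returned version number increases.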

A more concrete example: You might have map data in a main flatdata archive Map, and then a folder with an archive TrafficOverlay. A traffic-producer could regularly publish new TrafficOverlays into that folder (and clean up unused versions), while a map-rendering-service could regularly check the folder for new data and replace the TrafficOverlay archive it has loaded in memory with a new one, while all the other threads of the map-rendering-service continue processing, until they fetch a new version for the next request.
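
On the consumer side, the swap could look roughly like the sketch below. TrafficOverlay here is just a stand-in struct, not a generated flatdata type, and OverlaySlot is a hypothetical name; the point is only that in-flight requests keep their Arc while new requests pick up the replacement.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in for an opened archive (an assumption of this sketch); a real
/// service would hold its generated archive type here instead.
pub struct TrafficOverlay {
    pub version: u64,
}

/// Shared slot holding the overlay that new requests should use.
pub struct OverlaySlot {
    current: RwLock<Arc<TrafficOverlay>>,
}

impl OverlaySlot {
    pub fn new(initial: TrafficOverlay) -> Self {
        Self {
            current: RwLock::new(Arc::new(initial)),
        }
    }

    /// Called at the start of a request: clones the Arc, so the request keeps
    /// its overlay alive even if a newer one is installed in the meantime.
    pub fn get(&self) -> Arc<TrafficOverlay> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Called by the update checker after it has opened a newer archive.
    pub fn replace(&self, next: TrafficOverlay) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}
```

The old overlay is dropped once the last request holding its Arc finishes, which is the point at which its files are no longer needed by this process.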

That's why I suspect that this feature might not help you much in achieving what you want to achieve. It is more useful for local development: You are building one flatdata archive after another and do not want to rm -r my_archive all the time.

VeaaC avatar Jun 27 '23 13:06 VeaaC

Thanks for your reply. Yes, but in my "local development" scenario I cannot rm -r my_archive, because there is no such directory on disk: we use in-memory storages. To make a more concrete example, let's say that

  1. I create a new in-memory storage
     let storage = MemoryResourceStorage::new("/in-memory");
  2. I then create a subdirectory
     let storage_sub = storage.subdir("subdir_name");
  3. I create an archive builder
     let builder = MyArchiveBuilder::new(storage_sub).unwrap();

What is now, given storage as input, the best way to create a new archive builder for the same in-memory subdirectory of the original archive, with all the data previously written to that subdirectory cleaned?

As mentioned above, constructing the archive builder again raises the error Io(Custom { kind: AlreadyExists }).

gliderkite avatar Jun 27 '23 14:06 gliderkite