Feature Suggestions: Additional local data stores, e.g. `MongitaStore`, `FerretDB`

Open Andrew-S-Rosen opened this issue 2 years ago • 3 comments

https://github.com/scottrogowski/mongita

It would be great to have more file-based DB options in maggma. One such option (which I think @rkingsbury first told me about) is mongita. This is mostly a note to myself that it would be nice to implement this one day. Perhaps I'll come back to this.

One general concern I have is whether the existing MontyStore or a potential MongitaStore would be suitable for high-throughput calculations. It'd be worth knowing if either package prevents a race condition where multiple jobs try to read/write to the database at the same time. The README for mongita at least mentions that it has (experimental) lock support for multithreading behavior, which is promising.

Andrew-S-Rosen, Jul 25 '23 01:07

Edit: Looks like we are out of luck there for Mongita.

> Mongita is an embedded database. It is not process-safe. When you have multiple clients, a traditional server/client database is the correct choice.

Same for MontyDB, as noted here.
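
For reference, here is a minimal sketch of what serializing access to a file-based store across jobs could look like, using only the Python standard library. It is not part of maggma, Mongita, or MontyDB; `exclusive_lock` and `append_record` are hypothetical names, and a plain JSON file stands in for the store. Whether this kind of advisory lock would be enough for an embedded database is not established here: a client that caches data in memory could still serve stale reads, and a crashed job would leave the lock file behind.

```python
import json
import os
import time
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path, timeout=10.0, poll=0.05):
    """Hold an advisory lock by atomically creating a lock file (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process at a time can create it.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        # Release the lock so the next waiting job can proceed.
        os.close(fd)
        os.remove(lock_path)


def append_record(db_path, record):
    """Read-modify-write a JSON file (stand-in for the store) while holding the lock."""
    with exclusive_lock(db_path + ".lock"):
        docs = []
        if os.path.exists(db_path):
            with open(db_path) as f:
                docs = json.load(f)
        docs.append(record)
        with open(db_path, "w") as f:
            json.dump(docs, f)
```

Each job would call `append_record("store.json", {...})`, so the read-modify-write cycle never interleaves with another job's. For real high-throughput use, the README's advice to use a server/client database still applies.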

There's nothing in maggma specifically to resolve this, right @munrojm?

Andrew-S-Rosen, Jul 25 '23 02:07

Yeah, thanks for the suggestion @arosen93. Definitely no objections to implementing a MongitaStore someday. I looked briefly at mongita as a substitute for mongomock to power MemoryStore, but I noted the following limitations:

`mongita` does not support many query operations, including `$regex` or `$exists`. It also doesn't support `bulk_write` or `estimated_document_count`, although those can be worked around.
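
A rough sketch of what those two workarounds might look like, assuming the collection exposes pymongo-style `count_documents` and `replace_one` (both exist in pymongo; I have not verified their behavior in mongita). The helper names and the `ToyCollection` stand-in are hypothetical and only there to make the example runnable. Which exception an unsupported method raises is also an assumption.

```python
class ToyCollection:
    """Hypothetical in-memory stand-in exposing only the methods used below."""

    def __init__(self):
        self.docs = []

    def count_documents(self, filter):
        # Exact count of documents matching simple equality criteria.
        return sum(all(d.get(k) == v for k, v in filter.items()) for d in self.docs)

    def replace_one(self, filter, replacement, upsert=False):
        for i, d in enumerate(self.docs):
            if all(d.get(k) == v for k, v in filter.items()):
                self.docs[i] = dict(replacement)
                return
        if upsert:
            self.docs.append(dict(replacement))


def estimated_document_count(collection):
    """Fall back to an exact count when the fast estimate is unavailable."""
    try:
        return collection.estimated_document_count()
    except (AttributeError, NotImplementedError):  # assumed failure modes
        return collection.count_documents({})


def bulk_upsert(collection, docs, key="task_id"):
    """Emulate a bulk_write of upserts with one replace_one call per document."""
    for doc in docs:
        collection.replace_one({key: doc[key]}, doc, upsert=True)
    return len(docs)


coll = ToyCollection()
bulk_upsert(coll, [{"task_id": 1, "energy": 0.1}, {"task_id": 2, "energy": 0.2}])
bulk_upsert(coll, [{"task_id": 1, "energy": 0.5}])  # replaces the first document
print(estimated_document_count(coll))
```

The per-document loop gives up the batching that `bulk_write` provides, so it trades speed for compatibility.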

rkingsbury, Jul 26 '23 15:07

https://github.com/openjournals/joss-reviews/issues/5995#issuecomment-1810759481

Worth jotting down FerretDB here, as highlighted by @utf above. This one seems particularly promising!

Andrew-S-Rosen, Nov 14 '23 17:11