Support file based DB backend (alternative to mongodb)
Having to depend on mongodb in order to have a working setup is a pain in the ass in lots of scenarios:
- Running unit tests (meson test -v)
- Setting up a development/testing env in one's personal system
- Creating test environments in docker or other tools.
It complicates the setup considerably, especially since mongodb changed its license and several mainstream Linux distros decided to stop providing prebuilt packages for it in their package repositories [1]. Compiling mongodb took me more than an hour on an 8-core i7 machine.
Hence, I'd propose supporting (while keeping mongodb as the default main DB) an alternative DB backend which requires a lot less hassle and which is much easier to set up, populate and modify. Some possibilities would include an SQLite database, or a directory tree with CSV files.
I haven't checked the code yet to figure out how difficult it would be, but we could have some code-layer API which is then implemented on top of mongodb or sqlite. One or the other could be selected through the yaml config.
[1] https://lists.archlinux.org/pipermail/arch-dev-public/2019-January/029430.html
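To make the idea of a selectable backend concrete, here is a minimal sketch in Python. All names (DbBackend, MongoBackend, SqliteBackend, the db_backend config key, the sample key material) are invented for illustration and do not exist in the open5gs codebase; the real thing would live in C behind lib/dbi.

```python
# Hypothetical sketch of a backend-selection layer; every name here is
# invented for illustration and does not exist in open5gs.
from abc import ABC, abstractmethod


class DbBackend(ABC):
    """Minimal subscriber-lookup interface each backend implements."""

    @abstractmethod
    def auth_info(self, supi: str) -> dict: ...


class MongoBackend(DbBackend):
    def auth_info(self, supi: str) -> dict:
        raise NotImplementedError("would query mongodb here")


class SqliteBackend(DbBackend):
    def __init__(self):
        # Dummy in-memory rows standing in for an sqlite file.
        self.rows = {"imsi-001010000000001": {"k": "465B...", "opc": "E8ED..."}}

    def auth_info(self, supi: str) -> dict:
        return self.rows[supi]


def backend_from_config(config: dict) -> DbBackend:
    # In practice this key would come from the parsed yaml config file.
    kind = config.get("db_backend", "mongodb")
    return {"mongodb": MongoBackend, "sqlite": SqliteBackend}[kind]()


backend = backend_from_config({"db_backend": "sqlite"})
print(backend.auth_info("imsi-001010000000001")["k"])  # → 465B...
```

The point is that the daemons only ever see the abstract interface; which concrete backend gets constructed is decided once, at config-parsing time.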
Admittedly, this has also always been one of my biggest pain points of open5gs. The fact that there's a node.js UI can be worked around by inserting json directly into mongodb (or by using the python utilities, since they exist). But the mongodb dependency, at least for the hss, is difficult. I remember several times where I locally patched the code to disable e.g. the PCRF access to mongodb, as it is really not needed if all your subscribers have the same profile.
I understand this may not be the highest priority for the project overall, but it would certainly make it much easier to use in a number of different situations, as @pespin has pointed out.
I think there are several dimensions to the problem:
- is there really a need for an object / non-SQL database? IMHO the subscriber records, APNs, QoS, etc. are all very well-structured data that fits the classic model of a relational database.
- the license problem of mongodb, meaning it is not universally packaged on all distributions, and also not something that all organizations/users will want to use
- the technical dependency of more programs on mongodb than strictly necessary. At least at some point in the past, [almost?] all open5gs executables depended on mongodb, as the initialization was done somewhere in shared ogslib. This meant that programs that had no data in mongodb still depended on it. Not sure if this is still the case?
FYI, I've also read people who reported: "yeah it's a pain ... I ended up using a cloud instance to run it just because it's easier."
The easiest from a development point of view would probably be to keep everything in json (no change to the data representation) but swap mongodb for some simpler system capable of storing JSON objects e.g. in local files.
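To illustrate how little such a "JSON objects in local files" system needs to do for a development setup, here is a toy Python sketch (the JsonDirStore class and its layout of one JSON file per collection are my invention, not anything existing):

```python
import json
import os
import tempfile


class JsonDirStore:
    """Toy document store: one JSON file per collection, keyed by _id.
    Illustration only; not crash-safe, so not suitable beyond dev/testing."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, collection):
        return os.path.join(self.root, collection + ".json")

    def _load(self, collection):
        try:
            with open(self._path(collection)) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def put(self, collection, doc):
        # Read-modify-write of the whole collection file.
        data = self._load(collection)
        data[doc["_id"]] = doc
        with open(self._path(collection), "w") as f:
            json.dump(data, f, indent=2)

    def get(self, collection, _id):
        return self._load(collection).get(_id)


store = JsonDirStore(tempfile.mkdtemp())
store.put("subscribers", {"_id": "imsi-001010000000001", "msisdn": "1234"})
print(store.get("subscribers", "imsi-001010000000001")["msisdn"])  # → 1234
```

A nice side effect of this layout is that test fixtures become trivially editable: the subscriber database is just a JSON file you can open in any editor.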
This is not my area of expertise. All the examples I could quickly locate were implemented in PHP, node.js or Go (e.g. buntdb). But maybe there is something simple capable of storing JSON objects for small setups (development/testing, few subscribers)...
A quick look at the open5gs source shows that lib/dbi is actually fairly self-contained. There are only a few database-related API functions that would have to be provided by an alternative implementation, such as:
- ogs_dbi_msisdn_data (lookup by msisdn, return ogs_msisdn_data_t)
- ogs_dbi_ims_data (lookup by supi, return ogs_ims_data_t)
- ogs_dbi_session_data (lookup by supi + nssai + dnn, return ogs_session_data_t)
- ogs_dbi_auth_info (lookup by supi, return ogs_dbi_auth_info_t)
- ogs_dbi_update_sqn (lookup by supi, write sqn)
- ogs_dbi_increment_sqn (lookup by supi, increment sqn)
- ogs_dbi_subscription_data (lookup by supi, return ogs_subscription_data_t)
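The ogs_dbi_* names above are real C functions in open5gs's lib/dbi; to show how small the surface an alternative backend would have to cover is, here is a Python model of two of them, working against an in-memory dict (the dict layout and the simplified SQN handling are my own assumptions, not open5gs's actual schema):

```python
# Toy in-memory "subscriber database"; the layout is invented for this sketch.
SUBSCRIBERS = {
    "imsi-001010000000001": {
        "security": {"k": "465B5CE8B199B49FAA5F0A2EE238A6BC",
                     "opc": "E8ED289DEBA952E4283B54E88E6183CA",
                     "sqn": 97},
    },
}


def dbi_auth_info(supi):
    """Model of ogs_dbi_auth_info: look up K/OPc/SQN by supi."""
    sec = SUBSCRIBERS[supi]["security"]
    return {"k": sec["k"], "opc": sec["opc"], "sqn": sec["sqn"]}


def dbi_update_sqn(supi, sqn):
    """Model of ogs_dbi_update_sqn: write back a new SQN."""
    SUBSCRIBERS[supi]["security"]["sqn"] = sqn


def dbi_increment_sqn(supi):
    """Model of ogs_dbi_increment_sqn. The real function has to respect
    the SEQ/IND structure of the SQN; this sketch just adds 1."""
    SUBSCRIBERS[supi]["security"]["sqn"] += 1


dbi_increment_sqn("imsi-001010000000001")
print(dbi_auth_info("imsi-001010000000001")["sqn"])  # → 98
```

Everything else in the list above follows the same read-mostly, lookup-by-supi pattern, which is why almost any key-value-capable storage would do.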
I would support this feature, and would be glad to help with the implementation, although this is not my area of expertise either. The lack of mongodb support in major distros definitely causes me a deployment and maintenance headache that I would love to help solve.
I would suggest avoiding the raw-file solution though: while it would be sufficient for testing, it would be difficult to use in any "real" deployment without extensive logic to make sure the files cannot get irrecoverably corrupted on disk in the presence of crashing processes, power failures, etc. Even just storing JSON as binary blobs within sqlite would probably be more robust, and logically very similar.
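A minimal sketch of that "JSON blobs inside sqlite" idea, using Python's stdlib sqlite3 module (the table and column names are invented for illustration): sqlite's transactional writes mean a crash mid-update leaves either the old or the new document, never a half-written file.

```python
import json
import sqlite3

# In-memory DB for the sketch; a real deployment would pass a file path so
# sqlite's journaling provides crash-safe, atomic updates on disk.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE IF NOT EXISTS subscribers (supi TEXT PRIMARY KEY, doc TEXT)"
)

doc = {"_id": "imsi-001010000000001", "slice": [{"sst": 1}]}
with db:  # transaction: committed atomically, rolled back on error
    db.execute(
        "INSERT OR REPLACE INTO subscribers VALUES (?, ?)",
        (doc["_id"], json.dumps(doc)),
    )

row = db.execute(
    "SELECT doc FROM subscribers WHERE supi = ?", (doc["_id"],)
).fetchone()
print(json.loads(row[0])["slice"][0]["sst"])  # → 1
```

Logically this is still a document store keyed by supi, so the data representation the daemons see would not have to change at all.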
Yes, to an extent setting up mongodb and getting it working is a pain since the license change and the lack of pre-built binaries on all major OSes. As an alternative, at least for test setups, you can use https://www.ferretdb.io/, which uses a PostgreSQL backend with a FerretDB proxy that translates all mongodb calls into PostgreSQL calls. There is a Docker image available at https://github.com/FerretDB/FerretDB which can be used as a drop-in replacement for MongoDB for open5gs.
But at the same time I would say that open5gs should provide a DB API layer which can talk to any database of choice, rather than a raw json file solution.
PS: I am not in any way attached to ferretdb, but I have used it in other projects since MongoDB changed its license model from open source to SSPL.
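For reference, a minimal docker-compose sketch of that drop-in setup, based on FerretDB's own README at the time of writing; the image name, credentials and the FERRETDB_POSTGRESQL_URL variable are assumptions that should be checked against the current FerretDB docs before use:

```yaml
# Hypothetical compose fragment: FerretDB in front of PostgreSQL, speaking
# the mongodb wire protocol on the usual port 27017.
services:
  postgres:
    image: postgres
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=ferretdb
  ferretdb:
    image: ghcr.io/ferretdb/ferretdb
    ports:
      - "27017:27017"
    environment:
      - FERRETDB_POSTGRESQL_URL=postgres://user:pass@postgres:5432/ferretdb
```

Open5GS should then be able to keep its usual mongodb://localhost:27017 connection string unchanged, since only the server side is swapped.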
A combination of mysql and Redis?
Dear Community People,
I didn't know that using MongoDB would be so difficult. Because I use the mongo image in a docker environment, I had not been affected by the license or distribution issues.
https://github.com/open5gs/open5gs/blob/main/docker/docker-compose.yml
For 5G, the interfaces have already been changed to HTTP/2 (JSON). As a result, I think it is a little more convenient to store the data in the DB in JSON format. So I kept using it like this.
Of course, this does not mean that the 5G Core must be implemented in a NoSQL manner. It would be great if it could additionally support SQLite, PostgreSQL, Redis, etc. However, my resources are limited, so I can't afford to spend time here.
Let me know if there is anything else I need to know.
Thank you so much for raising this issue.
Best Regards, Sukchan
Just some more random thoughts about different ways to approach this:
1. one could implement alternative versions of the ogs_dbi_* API functions I listed in an earlier comment on this ticket. This would give full flexibility to any implementation, but of course also introduce a lot of new code, which would need to be maintained whenever the data types get extended by @acetcom in the main mongodb backend.
2. alternatively, one could introduce some kind of intermediate API operating at "BSON" level. This means that all the calls to libbson (bson_) to serialize/deserialize the data could be shared code; only the layer around it (the mongoc_ APIs) would then need to move into a database-specific backend. More shared code.
3. one could provide a libmongoc-compatible API for those parts that open5gs uses. At this point, no changes to the open5gs code would be required; one would simply link against another library which stores the data e.g. in local files.
4. one could use something implementing the mongodb wire protocol, so that the original libmongoc is used and only the database server is replaced. Some people have done this in golang and nodejs, if I remember correctly. But it means having to run another database server in another programming language/environment, also not the best solution, IMHO.
I think that realistically, only options 2, 3 or 4 from this list have a chance of working in the long term.
Hi Everyone,
FerretDB maintainer here. We're working on achieving compatibility with real use cases, and Open5GS is on our list: https://github.com/FerretDB/FerretDB/issues/2449
Feel free to add any comments or information about missing features we need to provide. If you can test FerretDB and provide feedback, it will significantly help our project!