libelektra icon indicating copy to clipboard operation
libelektra copied to clipboard

Make TOML work as default storage plugin

Open kodebach opened this issue 4 years ago • 13 comments
trafficstars

Ensure that the toml plugin can be used as the default storage plugin

Known Issues

  • [x] #3925
  • [x] #3911
  • [x] #3912
  • [x] #4466
  • ~Whitespace is not fully preserved (probably acceptable, as long as newlines and comment positions are preserved)~
  • [ ] Some invalid TOML files are parsed as valid (Happens on invalid simple table/tablearray combinations or redefintions)
  • [ ] Fail on unknown metadata.

Other open tasks

  • [ ] Allow to make it default:
    • Remove the libelektra-storage symlink
    • Add symlinks libelektra-storage-bootstrap -> libelektra-dump, libelektra-storage-spec -> libelektra-ni and libelektra-storage-config -> libelektra-toml
    • (A) Change the hardcoded setup for the default mountpoints in kdbOpen to use the new symlinks
    • (B) Update the code in kdb mount to use the separate symlinks
  • [ ] We need a build job that successfully uses TOML as default storage (For all namespaces except of spec:, ni can be used there)
  • [ ] Fully support (reading) TOML v1.0.0 spec-compliant files, or specify in README what isn't supported
  • [ ] Describe in README what guarantees for file preservation there are
  • [x] Use (f)lex "Start Conditions" for strings, comments, etc. (combined with #3925)

kodebach avatar Jun 30 '21 22:06 kodebach

Thank you for summarizing the current issues!

I think the quoting issue is the most important problem, I put it to the top.

markus2330 avatar Jul 01 '21 08:07 markus2330

I think currently there is no mechanism to load additional plugins, if KDB_DEFAULT_STORAGE has a dependency (e.g. base64 or directoryvalue). I think this is not a problem right now, but ideally the default plugin should work without dependencies. I.e. if a dependency is missing the plugin should fail safely.

kodebach avatar Jul 02 '21 12:07 kodebach

You are right, there is no automatic mechanism. But it is quite easy to build one, e.g., see the dini plugin.

markus2330 avatar Jul 03 '21 13:07 markus2330

I think the TOML plugin should be sufficiently complete to try using it as the default again.

kodebach avatar Sep 13 '21 11:09 kodebach

@markus2330 Do we really want to try making TOML the default before the new backend system (#3693) is done? AFAICT the only way to do this is to create a weird pseudo-plugin (similar to what dini was) that calls either toml or ni depending on the parent key. I would call this very much a hack and not a solution.

The problem here is that we currently only have a single default backend that is used, if a key doesn't match any mountpoint. This backend (like any other backend) only has one storage plugin.

With #3693 in place, we could create a special backend plugin that is hard-coded to:

  • Use a fixed resolver
  • Use a fixed storage plugin based on the namespace of the key Alternatively, it would also be a lot easier in the system to have separate default backends for every namespace.

PS. I stumbled upon KDL. It would map to a KeySet much better than TOML, an also seems easier to parse. But it is again pretty niche, so maybe just an additional format for the future. Maybe as a replacement for dump.

kodebach avatar Oct 16 '21 15:10 kodebach

Do we really want to try making TOML the default before the new backend system (#3693) is done?

It is not absolutely needed but I would tie as little as possible to #3693. TOML is much more popular than dump, so if we can avoid presenting dump to newcomers would be a big gain.

Maybe as a replacement for dump.

dump will probably never be something standardized as the #1 goal of dump is to resemble the KeySet as closely as possible. Once dump is not the default format anymore, it will be just yet-another-unimportant plugin. quickdump/mmapstorage are much better choices if you want a binary format which can represent any KeySet.

markus2330 avatar Nov 08 '21 17:11 markus2330

TOML is much more popular than dump, so if we can avoid presenting dump to newcomers would be a big gain.

Totally agree, but the metadata issue in toml makes it quite hard to use as a general plugin. It simply doesn't work for spec:/. Since we currently can't easily separate the default for spec:/ from the other namespaces, I don't think there is anything we can do right now.

If you think it is really that important, we could simply adapt the kdb mount tool. Currently it simply uses the default storage plugin (I think via the storage symlink), when the user doesn't specify one. But changing the C++ code so that depending on the namespace of the mountpoint toml/ni is used as the default shouldn't be too hard.

(Note: I don't really have time to do this right now. But it might be simply enough to let one of the FOSS people do it.)

But as stated already, changing libelektra-kdb so that toml/ni is used even for the hardcoded default mountpoints is very hard in the current code. A delegating plugin like the old dini might do the trick, but again, I don't really want to invest the time myself. You are of course free to try yourself ;)

Once dump is not the default format anymore, it will be just yet-another-unimportant plugin.

I don't think so. Because the format is designed to be both human-readable (i.e. text-only as long as key values are text-only) and a 1:1 mapping of a KeySet, it is the best format for debugging. For example, if a user reports a problem with their mountpoint config, I would request a dump export from them, because I know that this will tell me exactly what is going on.

The bootstrap files (i.e. /elektra keys) should also remain in dump format. The format for the bootstrap files must be as simple as possible to eliminate possible errors. The only alternative IMO would be quickdump, but that is not human-readable so it makes debugging harder.

In short dump is a useful format, but it should never have been the default for user facing files.

kodebach avatar Nov 08 '21 19:11 kodebach

Yes, I agree: having TOML for mounting non-spec and ni for spec by default would already be one big step. If we also recommend that people do a / and spec mountpoint when they setup Elektra (let us create a kdb setup which does basic mountpoints including kdb mount-info), we basically have what you described:

  1. dump for /elektra with the name given as in CMakeLists.txt (we probably should hard-code this to reduce CMakeLists.txt complexity)
  2. ni for spec:/
  3. toml for /

The storage symlink then stops making sense, we probably should either drop it or make 3 symlinks:

  1. bootstrap-storage
  2. spec-storage
  3. config-storage

But it might be simply enough to let one of the FOSS people do it.

They already have their issues. What we write here is quite trivial to do but first we exactly need to know what we want :wink:

markus2330 avatar Nov 09 '21 06:11 markus2330

@kodebach are you still working on this issue? I think this could be a FLOSS project. Can you please update the top-post with what is to be done that we can use TOML as default storage plugin?

markus2330 avatar Sep 27 '22 17:09 markus2330

No, I'm not working on this anymore. I unassigned myself from this and any other issue where I think somebody else could help out.

I think this could be a FLOSS project.

Would be a pretty big project with lots of things to work out, I think.

what is to be done that we can use TOML as default storage plugin?

That depends on what a "default storage plugin" must be able to do. Going by https://github.com/ElektraInitiative/libelektra/issues/3910#issuecomment-963849792 there is quite a few things outside the toml plugin to solve.

  • Remove the libelektra-storage symlink
  • Add symlinks libelektra-storage-bootstrap -> libelektra-dump, libelektra-storage-spec -> libelektra-ni and libelektra-storage-config -> libelektra-toml
  • (A) Change the hardcoded setup for the default mountpoints in kdbOpen to use the new symlinks
  • (B) Update the code in kdb mount to use the separate symlinks

The (A) and (B) points above depend on what "default storage plugin" means. Is it (A) the one use for the root mountpoints (system:/, user:/, etc.) and/or (B) the one that is selected by kdb mount if no storage plugin is provided.

As for the toml plugin itself: It still doesn't really support metadata and I don't think we have a good solution for that. To make TOML the default (outside spec:/) there are 3 options AFAICT:

  1. Storing metadata outside spec:/ is not allowed. This should be enforced and needs changes in kdbSet, i.e. kdbSet should fail immediately before the storage phase, if a key outside spec:/ has metadata.
  2. Find a solution to store metadata with toml.
  3. Only choose option (B) above. Then e.g. kdb meta-set user:/foo type long without a previous kdb mount works and if you need to store metadata in a mountpoint created with kdb mount, you just choose a different storage plugin.

IMO option 3 is the best. Option 1 might also be possible, but I don't like the fact that we have this complicated metadata system and then only supported in such a limited fashion. Option 2 seems very unlikely considering the long discussion in #3911.

kodebach avatar Sep 27 '22 19:09 kodebach

Would be a pretty big project with lots of things to work out, I think.

Projects are 40% of FLOSS, so they can be big. This topic is obviously only for someone with prior experience with parsing. Making the build job can be teamwork T2.

That depends on what a "default storage plugin" must be able to do. Going by https://github.com/ElektraInitiative/libelektra/issues/3910#issuecomment-963849792 there is quite a few things outside the toml plugin to solve.

Thank you, I added that task. Furthermore, I added one more task above:

We need a build job that successfully uses TOML as default storage (For all namespaces except of spec:, ni can be used there)

Storing metadata outside spec:/ is not allowed. This should be enforced and needs changes in kdbSet, i.e. kdbSet should fail immediately before the storage phase, if a key outside spec:/ has metadata.

I wouldn't totally prohibit it. Furthermore, elektraKdbSet currently cannot really know which metadata a storage plugin is capable to work with. Instead storage plugins should simply fail on unknown metadata. (This actually needs to be done in most of our storage plugins. They shouldn't silently eat what they don't understand.) I created #4496.

markus2330 avatar Sep 28 '22 14:09 markus2330

I don't think the parsing part is the hard part of this issue. In some cases it might be tricky, but flex/bison is well documented and there is a lot of info online. The hard part IMO is figuring out all the Elektra related details: What must a default storage plugin support, etc.?

kodebach avatar Sep 28 '22 22:09 kodebach

Some parts of it are described in doc/tutorials/storage-plugins.md (to be extended as part of the project).

markus2330 avatar Oct 06 '22 16:10 markus2330

I mark this stale as it did not have any activity for one year. I'll close it in two weeks if no further activity occurs. If you want it to be alive again, ping by writing a message here or create a new issue with the remainder of this issue. Thank you for your contributions :sparkling_heart:

github-actions[bot] avatar Mar 26 '24 01:03 github-actions[bot]

I closed this now because it has been inactive for more than one year. If I closed it by mistake, please do not hesitate to reopen it or create a new issue with the remainder of this issue. Thank you for your contributions :sparkling_heart:

github-actions[bot] avatar Apr 09 '24 01:04 github-actions[bot]