Re: TOML performance
I'm not sure where else to report this, but re: https://github.com/madmurphy/libconfini/wiki/An-INI-critique-of-TOML#18-performance
This is really a tomlc99 issue, not a TOML issue. The problem is that it allocates memory in 1,000 byte chucks, which takes quite a while (see start of toml_parse_file()), and after this it spends most of its time in check_key(), which loops over an array of keys, and is called every time it parses a new key, so ... yeah,
But these are not issues with TOML the file format. Other parsers are much faster; for example:
% ls -lh ~/50M.toml
-rw-r--r-- 1 martin martin 49M Sep 30 20:23 /home/martin/50M.toml
With C++:
% time ./tomlv++ ~/50M.toml
./tomlv++ ~/50M.toml 2.19s user 0.15s system 99% cpu 2.348 total
Go:
% time tomlv-go ~/50M.toml
tomlv-go ~/50M.toml 6.31s user 0.51s system 154% cpu 4.421 total
And Python:
% time ./tomlv.py ~/50M.toml
./tomlv.py ~/50M.toml 11.95s user 0.42s system 99% cpu 12.413 total
None are spectacularly fast as such, but orders of magnitudes faster than tomlc99, which is the outlier here. I also don't know how this compares to the 50M test-file you used (the TOML file lacks comments and is more "dense" in syntax than the big_file.ini that your script generates, so it's probably not a fair comparison).
Most of the other points in that article are more subjective, but I felt this one was clearly wrong.
@arp242
Thank you for pointing that out. I can write a paragraph about it in the next days (but I am not sure I will have time before mid October).
I also don't know how this compares to the 50M test-file you used (the TOML file lacks comments and is more "dense" in syntax than the big_file.ini that your script generates, so it's probably not a fair comparison).
If I don't remember wrong, I used the INI file generated by dev/tests/performance/prepare.sh (you will need to clone the repository before launching that script). Maybe you could re-post the timings you get using this file?
--madmurphy
I can write a paragraph about it in the next days (but I am not sure I will have time before mid October).
No hurries; I came across this almost 2 weeks ago. I also hope to fix the worst of the issue in tomlc99 when I have the time (not my code, so not entirely sure how hard it is).
Maybe you could re-post the timings you get using this file?
That file doesn't produce valid TOML. A few weeks ago I looked in to it and manually fixed some obvious things, but then you run in to:
% tomlv ./big_file.ini
Error in './big_file.ini': toml: line 35: Key 'section' has already been defined.
So this needs more work.
I wrote a little script to frob with TOML table and key names and generated my 50M TOML file with that, but you need a "real" parser for this (can't just s/../../ it), and this won't output comments, so hard to really reproduce that big_file.ini in a way that's valid TOML at the moment.
tomlc99 also can't parse it, which is why you got the error. I wouldn't be surprised if all of those 13 minutes you spent waiting was just for the allocation, and it didn't actually parse anything.