Drop YAJL dependency
yajl library has been unmaintained upstream[1] since 2015. Last published release contians multiple CVEs (CVE-2023-33460, CVE-2022-24795, CVE-2017-16516) and fixes have had to be carried by downstream distributions and third parties. This is not an ideal situation. While a fork exists[2], it is unclear how widely adopted or blessed it is downstream.
As the maintainer of libmodsecurity for Fedora[3] and EPEL, I recently found that the yajl library will not be shipped with RHEL 10. This means a new maintainer would be required to add it to EPEL. The previous maintainer for RHEL & EPEL has recommended moving away from yajl[4].
Currently, libmodsecurity can be built without yajl, but I understand that making it mandatory is considered desirable, as per #3144 and #3151.
Given the security concerns and the upstream status of yajl, I recommend opening a discussion on dropping yajl as a dependency and exploring alternative JSON libraries, such as JSON-C, which is actively maintained and more widely adopted.
[1] https://github.com/lloyd/yajl [2] https://github.com/robohack/yajl/ [3] https://src.fedoraproject.org/rpms/libmodsecurity [4] https://lists.fedoraproject.org/archives/list/[email protected]/message/YPFHPOKAND3RZR7ZKWTDHUQEESG6IUJ3/
Note, I added label [2.x] too, because mod_security2 is also affected.
This does not sound good. Thank you for bringing this to the attention @mikelolasagasti.
What do you think about the "official" YAML library?
I'm the Fedora yajl maintainer quoted in the link above. While YAML may be a superset of JSON, I'd not recommend using a YAML parser instead of a native JSON parser. It'll perform worse, and potentially open the code up to attacks that are impossible with pure JSON (aka the "Billion laughs attack" in YAML context). I can vouch for using json-c as an alternative to yajl, as that's what was successfully adopted in projects I work on. Be slightly cautious about jansson - when we previously tried it, we found it was not able to cope with the full numeric range of the uint64 type - if that's important to modsecurity's needs.
Hi @berrange,
While YAML may be a superset of JSON, I'd not recommend using a YAML parser instead of a native JSON parser.
You're right, that was a mistake from me - sorry, and thank you.
Thank you for this very valuable input @berrange. That puts the replacement discussion on a far better base.
These are the libraries that are reported (by their authors and third-parties) as the fastest:
- https://github.com/stephenberry/glaze
- https://github.com/simdjson/simdjson
- https://ibireme.github.io/yyjson/
It would be interesting to check their functionalities for our intended use:
- validation
- callback to store ARGS (or should we avoid that and use only the resulting sJSON onjects?)
- memory footprint
There are a few subtle behavior quirks of yajl that should be examined for alternative libraries too. And then documented in case.
The behavior with empty request body for example. From the back of my head I am no longer sure if yajl rejects that and other parsers are OK with it or if it's the other way around. But I've seen this problem in production before.
This would definitely need extensive test cases
Just a quick remark for this issue:
- I started to make a small tool which can help to compare different JSON implementations; still work in progress
- I've contacted with YAJL developer but after (almost) two weeks still no reply
- Perhaps we should contact other contributors and fork YAJL repository, then fix the available issues...?
Other ideas?
I've contacted with YAJL developer but after (almost) two weeks still no reply
AFAICT they haven't replied to anyone asking about YAJL on github in many years. I see no notable github activity on their account in 10 years.
* Perhaps we should contact other contributors and fork YAJL repository, then fix the available issues...?
The thing about yajl forks is that there are already many to choose from, each with a random selection of changes, but none seem to gain critical mass/attention.
Personally I'd just recommend letting yajl fade out. There are already two actively maintained alternative plain C JSON libraries - jansson & json-c - json-c slightly more active, as well as glib-json for apps which happen to use GLib. Between them they ought to be able to satisfy JSON needs for C applications, unless there's an unavoidable need for historical bug compatibility with YAJL. Even bug compatibility is questionable since JSON ought to be used/consumed in a way that ensures it is interoperable across impls, so it doesn't feel like further investment in YAJL is a long term benefit.
AFAICT they haven't replied to anyone asking about YAJL on github in many years. I see no notable github activity on their account in 10 years.
I saw that in some place but gave a try...
* Perhaps we should contact other contributors and fork YAJL repository, then fix the available issues...?The thing about yajl forks is that there are already many to choose from, each with a random selection of changes, but none seem to gain critical mass/attention.
that's said.
Personally I'd just recommend letting yajl fade out. There are already two actively maintained alternative plain C JSON libraries - jansson & json-c - json-c slightly more active, as well as glib-json for apps which happen to use GLib. Between them they ought to be able to satisfy JSON needs for C applications, unless there's an unavoidable need for historical bug compatibility with YAJL. Even bug compatibility is questionable since JSON ought to be used/consumed in a way that ensures it is interoperable across impls, so it doesn't feel like further investment in YAJL is a long term benefit.
If we don't have other chance we must switch, but honestly I'd like to avoid that. If we will switch, I do not see any reason to keep any compatibility with old bugs, I agree with you.
And thanks for following this issue.
Just a side note: it seems like Debian package maintainer made a fork and he tries to contribute the code. This is the upstream of official Debian package.
I made some researches in this topic, and now I'm completely disappointed. Honestly, I think it's harder to make a good decision than I thought.
Here are the libraries what I checked.
| Libray | Has GH organization | Language | Has SAX parser | Speed | Last commit | Last release | Has Debian package |
|---|---|---|---|---|---|---|---|
| json-c | ✅ | C | ❌ | slow (it uses traditional lexer and parser) | 2025.01.09 | 2024.09.15 | ✅ |
| simdjson | ✅ | C++ (no C extension, need to write a wrapper) | ❌ | extreme fast | 2025.03.06 | 2025.02.14 | ✅ |
| Jansson | ❌ | C | ❌ | slower than json-c | 2024.03.31 | 2021.09.09 | ✅ |
| cJSON | ❌ | C | ❌ | fast | 2024.09.23 | 2024.05.13 | ✅ |
| RapidJSON | ✅ | C++ (no C extension) | ✅ | fast | 2025.02.05 | 2016.08.25 | ✅ |
| yyjson | ❌ | C | ❌ | fast | 2025.03.12 | 2024.07.09 | ✅ (After Trixie) |
| Ultrajson | ✅ | C | ❌ | fast | 2025.03.01 | 2024.05.14 | ❌ |
| JSMN | ❌ | C | ✅ | fast | 2021.10.14 | 2019.06.23 | ✅ |
| Centijson | ❌ | C | ✅ | ? | 2024.01.20 | - | ❌ |
| cJSON | ❌ | C | ❌ | ? | 2024.09.23 | 2024.05.13 | ✅ |
I know the list is far not complete, but I haven't more time. If anyone can make similar research, please send me the results and I will add it to this sheet.
Here is a good collection of available parsers, but I'm not sure that's a complete list.
Aspects that I took into account:
- has a GH organization: perhaps this is a bit weird aspect but learning from the current situation (YAJL developer does not response to anyone, does not merge PR's, ...) I'd be happy if I saw that the project belongs to an organization
- language: it would be nice if the library also supported the C language; if the library's language is C++ but not native C support, we can make a wrapper, but then we bring a C++ compiler as a dependency
- has SAX parser: YAJL uses SAX parser, I think we should stay on this method, especially in case of huge payloads
- speed: this is a bit of a vague point; I didn't find any exact information, just asked chatGPT... but I think we should keep in mind
- last commit and last release: I think it's obviously; these parameters shows how active the project is
- has Debian package: yes I'm a Debian user, but I hadn't enough time to check other distributions that they have package or not; but I think it's an important indicator if the user can install dependencies from a package (especially the reason of this issue is that YAJL library no longer exists in the mentioned distribution).
If anyone has other aspects that we should consider, please leave a comment.
And any help are welcome to continue this research.
Update on 2025.03.27:
ȷson-cdoes not support SAX parsing- fixed
UltrajsonURL - added
JSMN - added
Centijson - added
cJSON
Update on 2025.03.28:
Jansondoes not support SAX parsing
Update on 2025.01.01:
yyjsonhas Debian package (after Trixie)
To all: I've reviewed the available JSON parsers and organized them by my criteria - see this comment.
There are some important requirements that the candidate must meet:
- must be use in native C - it's necessary for mod_security2; I don't want to use any own wrapper, because then the code will have a new dependency, a C++ compiler
- it must have a SAX parser - in case of DOM parser, the processor builds the whole tree in memory; I think that would be a waste
- be available as an operating system package - I think this does not need any special preparation: the original reason of this issue was that some distribution removed the YAJL packages.
Based on the sheet above (inside of the comment) there is only ONE JSON library which fit these requirements: the JSMN, but unfortunately that's - we can say - also an unmaintained codebase. The last commit was on 2021.10.14, and the last release was on 2019.06.23. This is very sad.
Actually I don't see the solution. I reviewed all listed library here (see the C libraries, you should scroll down), but I couldn't find any usable one.
Meanwhile I created a small tool which emulates the engines' behavior (especially it's closer to mod_security2's behavior). If anyone wants to try to add a new library, feel free to do that. Or anyone has an idea what library is missing from the list above, and thinks that would be a good candidate, please share that and I will add that.
The tool is here: https://github.com/digitalwave/jsonbench
Also please if you think I forget to add a usable library above, please notice me.
yyjson seems to have an option read data using YYJSON_READ_INSITU without having to parse the whole DOM document. It's not SAX, but could be close enough for modsecurity needs?
Also, yyjson is packaged for Debian Testing and many other distributions:
- https://tracker.debian.org/pkg/yyjson
- https://repology.org/project/yyjson/versions
Looks like the i3 project is also exploring alternatives to YAJL: https://github.com/i3/i3/issues/6257
yyjsonseems to have an option read data usingYYJSON_READ_INSITUwithout having to parse the whole DOM document. It's not SAX, but could be close enough for modsecurity needs?
I have to check this, thanks. Btw if you want to take a look and/or try that, feel free to add yyjson to jsonbench tool.
Also,
yyjsonis packaged for Debian Testing and many other distributions:* https://tracker.debian.org/pkg/yyjson * https://repology.org/project/yyjson/versions
Ah, thanks - now I realized that it has been added in May of last year, so it's part of SID and testing (which is the Trixie now). But it's a good sign.
Thanks again for your help!
Looks like the
i3project is also exploring alternatives to YAJL: i3/i3#6257
Yes, and it seems like they try yyjson too.
Thanks, now I think it worth a try.
The tool is here: https://github.com/digitalwave/jsonbench
It seems it's not public.
The tool is here: https://github.com/digitalwave/jsonbench
It seems it's not public.
Ah, sorry - now it's public.
Hello! Has anyone upgraded to RHEL10 and managed to install yajl? I have been stuck for a number of days. The package is not even in EPEL
@bujikun check this COPR I created with the builds for EPEL10: https://copr.fedorainfracloud.org/coprs/mikelo2/modsecurity-el10/
@bujikun check this COPR I created with the builds for EPEL10: https://copr.fedorainfracloud.org/coprs/mikelo2/modsecurity-el10/
Thanks @mikelolasagasti. This issue is still on my list with priority (unfortunately I was very busy in last few months).
Hello, If not afraid of "not recommended" repository but on official one "yajl-devel" is still in the rhel10 devel 10 repostitory : dnf install almalinux-release-devel (https://wiki.almalinux.org/repos/AlmaLinux.html)
Here is my output :
dnf search yajl
AlmaLinux 10 - Devel 1.0 MB/s | 463 kB 00:00
================================ Name & Summary Matched: yajl ================================
yajl.x86_64 : Yet Another JSON Library (YAJL)
yajl-devel.x86_64 : Libraries, includes, etc to develop with YAJL
I was able to create my own rpm and even nginx + nginx connector (+/- with mock too)
Good luck.
"yajl-devel" is still in the rhel10 devel 10 repostitory
That repository is an AlmaLinux feature and it's not available for RHEL/EPEL. It can't be used to create official EPEL builds.
Another data point from a prominent C library:
Xen 4.21.0-rc1 was released today. yajl is no longer a hard dependency. Xen will prefer json-c to yajl, if ./configure can find it.
Please test and feed back any issues you find.
Hi @gruenich,
Another data point from a prominent C library:
Xen 4.21.0-rc1 was released today. yajl is no longer a hard dependency. Xen will prefer json-c to yajl, if ./configure can find it. Please test and feed back any issues you find.
thanks for sharing this.
Unfortunately there is a very important missing feature from json-c: it does not have a SAX style parser (see my comparative sheet). Without that, the used JSON parser parses the whole request (or response) body, without limitation (number of arguments - the payload size still can be a limit).
I think based on the comparison there aren't too much option, I think the best choice now is the RapidJSON. It's presented in most distribution (this is the biggest disadvantage of YAJL now), and has a SAX parser. The problem is the C++ dependency, but I think all distribution has a C++ compiler, so this is less problematic than there is no required dependency at all.
I really hope that I can integrate this into the source trees this year and make releases with the new parser (actually I fight with two other serious issues (beside my regular work)).
RapidJSON had its last release in 2016, lots of open issues, and numerous unmerged pull requests. Similar to yajl, there are forks but non really took over, see https://github.com/Tencent/rapidjson/issues/2321. This will not improve the situation, it is out of the frying pan into the fire.
RapidJSON had its last release in 2016, lots of open issues, and numerous unmerged pull requests. Similar to yajl, there are forks but non really took over, see Tencent/rapidjson#2321. This will not improve the situation, it is out of the frying pan into the fire.
Thanks for showed this thread under rapidjson's repository.
Yes, I saw the last release date when I started collecting available libraries, but then I checked the last commits too. In case of RapidJSON that was a few months before that date. I read the last comment under your mentioned issue, and I don't think it's a big problem if there is no release, but the code was constantly maintained.
Anyway, also in your mentioned issue I saw a new library was showed, that I haven't seen yet, that's the Nlohmann-JSON. Unfortunately it does not belong to any organization (like Lloyd's YAJL), but it seems it's maintained and many distribution supports it. And it seems like that also has a SAX-parser, which is very important for us.
What do you think about this, guys?