rizin
rizin copied to clipboard
Update `librz/magic` to the latest from OpenBSD
Current code in librz/magic
is from 2009.
There were tons of the changes in OpenBSD since, see https://github.com/openbsd/src/tree/master/usr.bin/file
Note, that there are changes that were done in this code after the import, so they should be isolated and reapplied as appropriate. Also, probably it could be made as subproject, see https://github.com/rizinorg/rizin/issues/209
Initially it was imported in this commit: https://github.com/radareorg/radare2/commit/323441c122e0eb093c7aaafde1eed7352430466a
You can see the following changes by browsing the history of libr/magic
and librz/magic
correspondingly.
It makes me wonder whether we really need rz_magic
module at all...
From my point of view, the magic search feature is important. But we could also consider the possibility of using libmagic as a dependency, one of our subprojects. But it depends on how much it was customized, maybe the mainstream version is incompatible out of the box.
I agree it is an important feature indeed! I'm just not sure if it makes sense to have a module which essentially just mirrors either BSD implementation or GNU implementation, in case sys_magic is used...
I guess we need to research a bit more and see if there's any relevant change in rz_magic.
I also think the magic search feature is important. Moreover, it would be better to use libmagic, though after diffing with magic.c 1.6 (I can't find magic.c 1.8 anywhere) I see there are some changes that basically only make use of Rizin types:
-private int
-info_from_stat(struct magic_set *ms, mode_t md)
-{
+static int info_from_stat(RzMagic *ms, unsigned short md) {
/* We cannot open it, but we were able to stat it. */
- if (md & 0222)
- if (file_printf(ms, "writable, ") == -1)
+ if (md & 0222) {
+ if (file_printf(ms, "writable, ") == -1) {
return -1;
- if (md & 0111)
- if (file_printf(ms, "executable, ") == -1)
+ }
+ }
+ if (md & 0111) {
+ if (file_printf(ms, "executable, ") == -1) {
return -1;
- if (S_ISREG(md))
- if (file_printf(ms, "regular file, ") == -1)
+ }
+ }
+ if (S_ISREG(md)) {
+ if (file_printf(ms, "regular file, ") == -1) {
return -1;
- if (file_printf(ms, "no read permission") == -1)
+ }
+ }
+ if (file_printf(ms, "no read permission") == -1) {
return -1;
But then, there are some Rizin specific functions defined:
+RZ_API void rz_magic_free(RzMagic *ms) {
+ if (ms) {
+ free_mlist(ms->mlist);
+ free(ms->o.pbuf);
+ free(ms->o.buf);
+ free(ms->c.li);
+ free(ms);
+ }
+}
+
+RZ_API bool rz_magic_load_buffer(RzMagic *ms, const char *magicdata) {
+ if (*magicdata == '#') {
+ struct mlist *ml = file_apprentice(ms, magicdata, FILE_LOAD);
+ if (ml) {
+ free_mlist(ms->mlist);
+ ms->mlist = ml;
+ return true;
+ }
+ } else {
+ eprintf("Magic buffers should start with #\n");
+ }
+ return false;
+}
There is some content removed in the Rizin version, but I cannot tell if they are rizin specific changes or version changes (1.6 -> 1.8, as magic.c disappears from the OpenBSD history in the next version)
It shall be noted that all the GNU, BSD, etc versions come from https://www.darwinsys.com/file/
It shall be noted that all the GNU, BSD, etc versions come from https://www.darwinsys.com/file/
Yes, except the version we ship is from OpenBSD, that as said in the link wrote its own code to implement libmagic :(
@caribpa @ret2libc if the patching cannot be avoided we could employ semantic patching with Coccinelle suite:
- https://coccinelle.gitlabpages.inria.fr/website/sp.html
- https://dewaka.com/blog/2020/01/25/semantic-patching/
@caribpa @ret2libc if the patching cannot be avoided we could employ semantic patching with Coccinelle suite:
* https://coccinelle.gitlabpages.inria.fr/website/sp.html * https://dewaka.com/blog/2020/01/25/semantic-patching/
I'm not sure what coccinnelle has to do with libmagiz :) Could you elaborate?
From what I understand, @XVilka is suggesting to use the latest libmagiz (still needs to be discussed which one), and smart-patch it (with the Rizin specific changes) using Semantic patching (coccinelle).
I think we should see first if it is possible to simply use the system's libmagic and replace the unwanted behavior.
@ret2libc why the thumbs down? 😭
Btw, what I was suggesting when I said simply use the system's libmagic and replace the unwanted behavior
was using the weak
attribute (though I realized it is not portable), or dlsym (which is POSIX but apparently not on Windows).
Windows seems to have a pragma
for weak
-like, see this answer
According to this answer, the dlsym
equivalent in Windows is GetProcAddress
Well, apparently there's already rz_lib_dl_sym for solving the portability issues 🙂
@ret2libc why the thumbs down? sob
Just because I don't like the approach. And, again, I don't see why coccinelle would really help to do any non-trivial adjustments.
@caribpa @ret2libc I extracted all changes for files in librz/magic
since the initial import into the separate repository, also combined some commits for clarity, see https://github.com/XVilka/rizin-magic
Please note that it's made through git replace --graft
so GitHub does't show the right history, locally it should be fine.
git replace --graft f448ee2682c48d8cc3d2592147f96984ecbc6ced a8e896b7551107dd924e5ce8dc3b2be0ac8fe7ae
And here is the combined patch of all changes in both radare2 and rizin since the initial import:
magic_radare_rizin_changes.patch.zip
The same difference but without changes by clang-format
:
Excellent work @XVilka 👏
I think we should first research if we could make libmagic a dependency (like capstone), compile it, and then use rz_lib_dl_sym for getting the libmagic functions we don't want to modify (and implement our own version/wrap of the things we want to work differently) for future-proofing.
This issue has been automatically marked as stale because it has not had recent activity. Considering a lot has probably changed since its creation, we kindly ask you to check again if the issue you reported is still relevant in the current version of rizin. If it is, update this issue with a comment, otherwise it will be automatically closed if no further activity occurs. Thank you for your contributions.