libc-database
Name collision
While debugging #25, I identified an issue.
To verify that the regex was matching everything, I compared the extracted symbols with an old DB. For the same file (same name), I got two different addresses for the same symbol.
$ ls -la ./db/musl_1.2.0-1_amd64.so ./db.1/musl_1.2.0-1_amd64.so
-rwx------. 1 user group 723440 16 août 00:23 ./db.1/musl_1.2.0-1_amd64.so
-rwx------. 1 user group 719344 8 sept. 22:51 ./db/musl_1.2.0-1_amd64.so
$ readelf -Ws ./db/musl_1.2.0-1_amd64.so | grep '\bprintf\b'
578: 000000000005d990 195 FUNC GLOBAL DEFAULT 9 printf
$ readelf -Ws ./db.1/musl_1.2.0-1_amd64.so | grep '\bprintf\b'
578: 000000000005e980 199 FUNC GLOBAL DEFAULT 9 printf
This happens only for musl_1.2.0-1_amd64.so. The file was previously provided by http://security.ubuntu.com/ubuntu/pool/universe/m/musl//musl_1.2.0-1_amd64.deb and is now provided by https://http.kali.org/pool/main/m/musl//musl_1.2.0-1_amd64.deb , so two different builds end up under the same file name.
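The same-name, different-content check can be sketched as a small shell function. This is only an illustration: find_name_collisions is a hypothetical helper, not part of libc-database, and it assumes both directories hold the extracted .so files side by side, as ./db.1 and ./db do here.

```sh
# Hypothetical helper (not part of libc-database): print the names of
# libraries that exist in both directories but differ in content.
find_name_collisions() {
  old_dir=$1   # e.g. ./db.1
  new_dir=$2   # e.g. ./db
  for new_file in "$new_dir"/*.so; do
    # Skip the unexpanded glob when the directory has no .so files.
    [ -e "$new_file" ] || continue
    name=$(basename "$new_file")
    old_file="$old_dir/$name"
    # Only names present in both directories can collide.
    [ -e "$old_file" ] || continue
    # cmp -s is silent and exits non-zero when the files differ.
    if ! cmp -s "$old_file" "$new_file"; then
      printf '%s\n' "$name"
    fi
  done
}
```

Calling find_name_collisions ./db.1 ./db would list every file name whose content changed between the two databases.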
We should have something to avoid this.
The solutions I can see at first glance (without having considered feasibility or which one would be best) are:
- We can add the distribution name to the name of the downloaded file.
- We rename libs using a checksum with a low collision risk (such as SHA-256) and keep an index (a <HASH>.index file) that records which distributions provide the file, as is done with the .url file. Alternatively, we only rename the file and use the .url file to list the distributions providing it (when several distributions provide the same file).
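As a rough sketch of the checksum option, the function below stores a library under its SHA-256 and appends the provider to a <HASH>.index file. Everything here is an assumption made for illustration: store_by_checksum is a hypothetical name, the "distribution URL" line format of the index is invented, and sha256sum from GNU coreutils is assumed to be available.

```sh
# Hypothetical sketch of the checksum option (not existing libc-database
# code): store a library under its SHA-256 and record each provider.
store_by_checksum() {
  src=$1      # path to the extracted library
  distro=$2   # e.g. "ubuntu" or "kali"
  url=$3      # package URL the library came from
  db_dir=$4   # destination directory

  # Assumes GNU coreutils sha256sum; the first field is the hex digest.
  hash=$(sha256sum "$src" | cut -d ' ' -f 1)

  # Copy the file only if this exact content is not stored yet.
  [ -e "$db_dir/$hash.so" ] || cp "$src" "$db_dir/$hash.so"

  # Assumed index format: one "distribution URL" pair per line.
  entry="$distro $url"
  # Append the provider unless the exact line is already present.
  grep -qxF -- "$entry" "$db_dir/$hash.index" 2>/dev/null ||
    printf '%s\n' "$entry" >> "$db_dir/$hash.index"

  printf '%s\n' "$hash"
}
```

With this layout the two musl_1.2.0-1_amd64.so builds would get two different names, while a file shipped unchanged by several distributions would be stored once with several index lines. The cost is that the names no longer say which package they came from.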
I think that to keep the IDs meaningful, we should go with option 1 and prefix the IDs with a category identifier, in this case ubuntu-* and kali-*. For backwards compatibility we might want to drop the ubuntu- and debian- prefixes, but I'll have to think about this.
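One possible reading of the prefix idea, sketched in shell. The function name make_id is hypothetical, and the rule that ubuntu and debian keep their bare IDs is only the tentative backwards-compatibility idea mentioned above, not a settled decision.

```sh
# Hypothetical sketch of option 1: derive an ID from a category and the
# package-based ID that is used today.
make_id() {
  category=$1   # e.g. "ubuntu", "debian", "kali"
  base_id=$2    # e.g. "musl_1.2.0-1_amd64"
  case $category in
    # Assumed backwards-compat rule: legacy categories keep bare IDs.
    ubuntu|debian) printf '%s\n' "$base_id" ;;
    # Every other category gets a "<category>-" prefix.
    *) printf '%s-%s\n' "$category" "$base_id" ;;
  esac
}
```

Under this rule the Kali build would become kali-musl_1.2.0-1_amd64 while the Ubuntu build keeps musl_1.2.0-1_amd64, so the two no longer collide. An Ubuntu and a Debian package sharing a name would still collide, which is one reason the compatibility rule needs more thought.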