Version check makes potentially invalid assumptions about ELF layout
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
I have been trying to debug why pystack thinks I am using Python version 12.41, and it turns out that my Python binary has a different layout. This problem seems to occur only (or at least I've only noticed it so far) when pystack attempts to find the value of Py_Version. The version of Python supplied by Ubuntu has an ELF section for rodata that looks like this (obtained with readelf -S):
[18] .rodata PROGBITS 0000000000312000 00312000
000000000008806d 0000000000000000 A 0 0 32
Mine looks like this:
[16] .rodata PROGBITS 00000000008741c0 004741c0
000000000035f330 0000000000000000 A 0 0 64
Expected Behavior
The proper way to look up the address would be something like
<addr of Py_Version> - 0x00000000008741c0 + 0x004741c0
Since these values are the same in most python binaries, the issue would go unnoticed. I am not sure if this is some guarantee that the normal python build process makes or not, so this could also bite regular python versions later.
I was able to validate that the calculation above would work for my binary, where 0x0000000000ba2c68 is the address that pystack is attempting to look up.
dd if=(path to python) bs=1 skip=$((0x0000000000ba2c68-0x00000000008741c0+0x004741c0)) count=8 | hexdump
16+0 records in
16+0 records out
16 bytes copied, 8.1777e-05 s, 196 kB/s
0000000 04f0 030b 0000 0000
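For reference, the same arithmetic can be sketched in Python. This is only an illustration of the calculation above, not pystack code: the helper names are made up for this example, the section values are the ones from my readelf -S output, and the decoding assumes a little-endian host (so the hexdump words 04f0 030b correspond to the bytes f0 04 0b 03) and that Py_Version uses the same bit layout as PY_VERSION_HEX.

```python
import struct


def section_vaddr_to_offset(vaddr, sh_addr, sh_offset, sh_size):
    """Translate a virtual address inside a section to a file offset.

    Hypothetical helper: uses the section's address and file offset,
    which is the calculation proposed in this report.
    """
    if not sh_addr <= vaddr < sh_addr + sh_size:
        raise ValueError("address is outside the section")
    return vaddr - sh_addr + sh_offset


def decode_py_version(raw):
    """Decode the low 32 bits of Py_Version (PY_VERSION_HEX layout)."""
    (value,) = struct.unpack("<I", raw[:4])  # assumes little-endian
    major = value >> 24
    minor = (value >> 16) & 0xFF
    micro = (value >> 8) & 0xFF
    level = (value >> 4) & 0xF   # 0xF means "final"
    serial = value & 0xF
    return major, minor, micro, level, serial


# .rodata values from my binary: address, file offset, size.
offset = section_vaddr_to_offset(0xBA2C68, 0x8741C0, 0x4741C0, 0x35F330)
print(hex(offset))

# Bytes reconstructed from the hexdump output above.
print(decode_py_version(bytes.fromhex("f0040b03")))
```

If the assumptions hold, the decoded tuple reads as a 3.11 release, which matches the Python version I am actually running.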
Steps To Reproduce
I am not sure how you would easily reproduce this issue as you'd need to produce a python binary that has the rodata addresses like the one in my example. If you are able to do that the issue reproduces very easily and all functionality of pystack will fail.
Pystack Version
1.3.0
Python Version
3.11
Linux distribution
Ubuntu
Anything else?
No response
Hi @jhance and thanks for opening the issue. We will take a look soon. Meanwhile, could you tell us what version of Ubuntu you are using and how you are obtaining Python (deadsnakes, main repo, pyenv...)?
Also is this when analysing a live process or a core file?
Also, just to ensure we use what you already debugged: what's making your rodata section different from a regular binary? (It's not immediately clear from your comment.)
I am compiling Python from source myself with a crosstool targeting a version of glibc that is not provided by the distribution. As such, I don't expect many people to be following the same process. I am not sure what is causing this difference though; maybe it is because I use gold instead of ld?
I was analyzing a core file while pointing to the same python binary that the core file was extracted from.
I recently found that a workaround is to essentially disable the Py_Version check and rely on the RSS check, which seems to work and correctly detects 3.11.
My suspicion here is that what's going on is that you have a first PT_LOAD segment that doesn't map to the start of the file.
When the linker loads the file in memory, the sections (such as .rodata) don't matter anymore and the only thing the linker sees is its LOAD segments. For this reason, we just need to find where the first LOAD segment (the mount point) is in the file and correct by that (which we aren't doing at the moment).
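A minimal sketch of that segment-based translation, in Python for illustration only. The LoadSegment type and vaddr_to_file_offset function are hypothetical names, not pystack's API; the fields mirror the p_offset, p_vaddr and p_filesz members of an ELF program header, and the example values are the LOAD segment and Py_Version address that appear elsewhere in this thread.

```python
from typing import NamedTuple


class LoadSegment(NamedTuple):
    """The PT_LOAD fields needed for address translation."""
    p_offset: int  # where the segment starts in the file
    p_vaddr: int   # where the segment is mapped in memory
    p_filesz: int  # how many bytes of the segment are file-backed


def vaddr_to_file_offset(vaddr, segments):
    """Map a virtual address to a file offset via its PT_LOAD segment."""
    for seg in segments:
        if seg.p_vaddr <= vaddr < seg.p_vaddr + seg.p_filesz:
            return vaddr - seg.p_vaddr + seg.p_offset
    raise ValueError(f"{vaddr:#x} is not backed by any LOAD segment")


# Example values taken from this thread (non-PIE binary, so the
# addresses in the file are used without adding a runtime load base).
segments = [LoadSegment(p_offset=0x0, p_vaddr=0x400000, p_filesz=0x8A8860)]
print(hex(vaddr_to_file_offset(0xBA2C68, segments)))
```

Looking up the segment that contains the address, rather than always using the first one, also covers binaries where the symbol lives in a later LOAD segment with a different offset delta.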
To corroborate this, do you mind sending the output of readelf -a over the binary?
https://pastebin.com/4h7wPER6
(I cut out some sections that contain literally all of the Python symbols).
We pass -Wl,-I to the linker to set a custom ld.so as the interpreter, maybe that is the cause for having an offset before the first LOAD segment?
Ah there you go:
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000008a8860 0x00000000008a8860 R E
I will try to make a patch this week
We pass -Wl,-I to the linker to set a custom ld.so as the interpreter, maybe that is the cause for having an offset before the first LOAD segment?
Yeah, that's also my guess, but I have seen this in the wild as well, so I don't think it's unique to this situation.
Thanks for the quick help, I am hardly an ELF expert so ended up reading a lot of docs today to figure out why this was not working...
After playing with this for a while, I still cannot reproduce it in a variety of situations, so making a patch is going to be very difficult. I still think we are handling this correctly:
The calculation works if you do:
<addr of Py_Version> - 0x00000000008741c0 + 0x004741c0
(which is wrong in general, as it only works when the LOAD segment is at file offset 0). But that works for the case in this issue, so the resulting offset is correct. The offset was calculated later:
(0x0000000000ba2c68-0x00000000008741c0+0x004741c0)=0x7a2c68
But according to
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000008a8860 0x00000000008a8860 R E 0x1000
If you just do what’s in main now:
0x0000000000ba2c68 - 0x0000000000400000 = 0x7a2c68
Which should be correct. That’s also the first map I get:
DEBUG /src/src/pystack/_pystack.cpython-311-aarch64-linux-gnu.so:test_local_variables.py:18 VirtualMap(start=0x0000000000400000, end=0x0000000000915000, filesize=0x515000, offset=0x0, device='fe:01', flags='r-xp', inode=180507, path='/usr/bin/python3.11')
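As a quick sanity check of the numbers in this thread, here is a small Python sketch comparing the two calculations. It is only illustrative arithmetic with values copied from the readelf output above: both approaches subtract a constant "address minus file offset" delta, so they give the same answer whenever the section delta and the segment delta are equal.

```python
PY_VERSION_ADDR = 0xBA2C68

# Section-based delta, from the .rodata header in the report.
rodata_addr, rodata_offset = 0x8741C0, 0x4741C0
section_delta = rodata_addr - rodata_offset

# Segment-based delta, from the first LOAD program header.
load_vaddr, load_offset = 0x400000, 0x0
segment_delta = load_vaddr - load_offset

print(hex(section_delta), hex(segment_delta))
print(hex(PY_VERSION_ADDR - section_delta))
print(hex(PY_VERSION_ADDR - segment_delta))
```

For this binary the two deltas are identical, which is why both formulas land on the same file offset.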
So I don’t understand the problem as it's described above. I will close this issue until we have a better reproducer that we can investigate.