pydistcheck
[bug] "compiled" file count in `--inspect` misses static libraries
What did you expect to happen?
The output of pydistcheck --inspect prints the number of compiled files within a distribution.
https://github.com/jameslamb/pydistcheck/blob/471660ea780c9e65ab1e90b3e9f694e6cbea8bbe/src/pydistcheck/inspect.py#L26
Static libraries should be included in that count.
What actually happened?
That count appears to miss static libraries.
How can someone else reproduce this problem?
Consider the following.
docker run --rm -it python:3.12 bash
pip download \
    --no-deps \
    --extra-index-url https://pypi.nvidia.com \
    'librmm-cu12==24.10'
That project has some static library files.
unzip -l ./librmm_cu12*.whl | grep -E '\.a$'
# 248882 2024-10-09 14:39 librmm/lib64/libfmt.a
# 1802910 2024-10-09 14:39 librmm/lib64/libspdlog.a
But those are not reported as "compiled" by pydistcheck --inspect.
pip install 'pydistcheck>=0.8.0'
pydistcheck --inspect ./librmm_cu12*.whl
You'll see output that begins like this:
checking './librmm_cu12-24.10.0-py3-none-any.whl'
----- package inspection summary -----
file size
* compressed size: 3.7M
* uncompressed size: 15.3M
* compression space saving: 76.1%
contents
* directories: 0
* files: 1770 (0 compiled)
I'd expected that (0 compiled) to actually say (2 compiled).
What version of pydistcheck are you using?
0.8.0
Notes
No response
Some helpful links:
- https://stackoverflow.com/a/41902135/3986677
- https://stackoverflow.com/a/60909689/3986677
- https://en.wikipedia.org/wiki/Ar_(Unix)
I was able to trace the bug to the fact that _FileFormat does not support the Unix archive format yet, and I added support for it – I might have a fix handy soon. Would you be interested in a PR? :D
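For anyone following along, here is a minimal sketch of what recognizing the Unix archive format can look like. It assumes detection by magic bytes: every `ar` archive starts with the 8-byte global header `!<arch>\n`. The function names and the in-memory "wheel" are hypothetical and only for illustration; this is not pydistcheck's `_FileFormat` code or the fix itself.

```python
import io
import zipfile

# Global header that begins every Unix `ar` archive, the container
# format used by `.a` static libraries.
AR_MAGIC = b"!<arch>\n"


def looks_like_ar_archive(first_bytes: bytes) -> bool:
    """Return True if the leading bytes match the `ar` global header."""
    return first_bytes.startswith(AR_MAGIC)


def count_ar_archives(wheel: zipfile.ZipFile) -> int:
    """Count members of a zip/wheel whose content starts with the `ar` magic."""
    count = 0
    for info in wheel.infolist():
        if info.is_dir():
            continue
        with wheel.open(info) as f:
            # Only the first 8 bytes are needed to check the header.
            if looks_like_ar_archive(f.read(len(AR_MAGIC))):
                count += 1
    return count


# Demo with an in-memory stand-in for a wheel (file names are made up).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pkg/lib64/libexample.a", AR_MAGIC + b"placeholder")
    zf.writestr("pkg/__init__.py", "")
with zipfile.ZipFile(buf) as zf:
    print(count_ar_archives(zf))
```

Checking content rather than the `.a` extension avoids miscounting files that merely share the suffix. GNU "thin" archives use a different header (`!<thin>\n`) and would need their own check.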
BTW, I tested the fix on librmm_cu12-24.10.0-py3-none-any.whl from the reproducer above, and I noticed that pydistcheck reports that librmm/lib64/libfmt.a and librmm/lib64/libspdlog.a have debug symbols. However, I don't see anything with grep "debug" when I run llvm-objdump or llvm-nm over them.
Perhaps pydistcheck is incorrectly reporting that they have debug symbols when they don't, or vice versa? I don't have a Linux machine or Docker installed at the moment to check this particular wheel out. I do notice that the _nm_reports_debug_symbols function only checks whether exported_symbols != all_symbols, and that condition alone may not be enough to say with confidence that debug symbols are present in the binary. Either way, that should be a separate issue.
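To illustrate the concern, here is a rough sketch of how an inequality check on two symbol listings can fire even when nothing resembling debug info is present. It rests on several assumptions: the two listings are nm-style text, the sample listings are hand-written and not real output from the wheel above, and a symbol named after a `.debug_*` section is a stronger signal than any difference at all. The helper names are hypothetical and not part of pydistcheck.

```python
def parse_nm_lines(nm_output: str) -> set[tuple[str, str]]:
    """Parse nm-style text into a set of (type_letter, name) pairs."""
    symbols = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Lines look like "<address> <type> <name>", or "<type> <name>"
        # for undefined symbols. Single-token lines (such as archive
        # member headers) are skipped.
        if len(parts) >= 2:
            symbols.add((parts[-2], parts[-1]))
    return symbols


def extra_symbols(exported: str, everything: str) -> set[tuple[str, str]]:
    """Symbols present in the fuller listing but not in the exported one."""
    return parse_nm_lines(everything) - parse_nm_lines(exported)


def has_debug_section_symbol(symbols: set[tuple[str, str]]) -> bool:
    """Stricter heuristic (an assumption): look for `.debug_*` names."""
    return any(name.startswith(".debug_") for _, name in symbols)


# Hand-written sample listings, for illustration only.
exported = (
    "0000000000000000 T example_print\n"
    "                 U malloc\n"
)
everything = exported + "0000000000000000 a example.cc\n"

extra = extra_symbols(exported, everything)
print(exported != everything)           # the simple inequality check
print(has_debug_section_symbol(extra))  # the stricter heuristic
```

In this made-up case the listings differ only by a source-file symbol, so the inequality check reports a difference while the stricter heuristic does not. Whether that is what happens for libfmt.a and libspdlog.a would still need to be confirmed against their real symbol tables.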
> I was able to trace the bug to the fact that _FileFormat does not support the Unix archive format yet, and I added support for it – I might have a fix handy soon. Would you be interested in a PR?
Sure, that'd be great! I'd welcome a PR adding that support.
> I noticed that pydistcheck reports that librmm/lib64/libfmt.a and librmm/lib64/libspdlog.a have debug symbols
Very possible that the nm check you're highlighting is not a great check, and is giving us a false positive here.
I'd want to dump the entire symbol tables for those objects with readelf or similar and check if I agree with the findings from it. Agree that it should be a separate issue.