
Summary

I've tried to build Python on both of the Ubuntu platforms listed below, and the process fails on cpython whenever optimizations are enabled. Over several iterations on the Ubuntu server VM:

Tried to compile with flags --python=cpython-3.8 and --optimizations=pgo+lto, but failed (gist).
Tried to compile with flags --python=cpython-3.7 and --optimizations=pgo+lto, but failed (gist).
Tried to compile with flags --python=cpython-3.7 and --optimizations=pgo, but failed (gist).
Tried to compile with the --python=cpython-3.7 flag but no optimizations and succeeded! (Download and strip the trailing .zip from the filename; sha256 is f994d12161adcf3ced5ad2daf964cab6e8f70913cc90534ee3ac1feb41420404.)
Tried to compile with the --python=cpython-3.8 flag but no optimizations and succeeded! (Download and strip the trailing .zip from the filename; sha256 is 9743d8927e82503bd464bf645014ade886a01f29a8d20999081d4acb95050f41.)

With this evidence, I'd have to say that @indygreg is probably right about musl not liking optimization.

Host Information

Platform	Laptop
Notes	Default development machine
System Model	`Purism Librem 15v3 w/ TPM`
Operating System	`Ubuntu 18.04 LTS x86_64`
Kernel	`5.3.0-51-generic #44~18.04.2-Ubuntu SMP Thu Apr 23 14:27:18 UTC 2020`
CPU	`Intel Core i7-6500U @ 2.5GHz (x4)`
RAM	`8GB`

Platform	Tower PC
Notes	Used only to host the Ubuntu server VM, via `VirtualBox v6.1.6 r137129 (Qt5.6.2)`; not used for building itself!
System Model	`Beefy (custom) build`
Operating System	`Windows 10 Pro 64-bit, v1909 (build 18363.836)`
CPU	`AMD Ryzen 9 3900X (x12)`
RAM	`32GB`

Platform	Ubuntu server (virtualized)
Notes	AMD-V hypervisor, Nested paging, PAE/NX, and KVM paravirtualization are all enabled
Operating System	`Ubuntu Server 20.04 LTS x86_64`
Kernel	`5.4.0-29-generic #33-Ubuntu SMP Wed Apr 29 14:32:27 UTC 2020`
CPU	`AMD Ryzen 9 3900X (x12)`
RAM	`16GB`

May 16 '20 04:05 jwarner112

The underlying failure seems to be:

cpython-3.7> /tools/host/bin/ld: /tools/clang-linux64/lib/clang/10.0.0/lib/linux/libclang_rt.profile-x86_64.a(InstrProfilingFile.c.o): in function `parseAndSetFilename':
cpython-3.7> InstrProfilingFile.c:(.text.parseAndSetFilename+0x/tools/host/bin/ld: fe): undefined reference to `__strdup'
cpython-3.7> /tools/clang-linux64/lib/clang/10.0.0/lib/linux/libclang_rt.profile-x86_64.a(InstrProfilingFile.c.o): in function `parseAndSetFilename':
cpython-3.7> InstrProfilingFile.c:(.text.parseAndSetFilename+0xfe): undefined reference to `__strdup'

What I think is happening here is that Clang needs to link some profiling code into the instrumented binary so that it can emit profile metrics when executed. Because Clang was built with glibc (instead of musl), that profiling code expects symbols such as `__strdup`, which the link against musl cannot resolve.
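When a build log runs to thousands of lines, it can help to pull out just the symbols the linker could not resolve. Here is a minimal sketch in Python, assuming GNU ld's message format as it appears in the excerpt above. The names `undefined_symbols` and `sample_log` are illustrative only and are not part of python-build-standalone.

```python
import re

# GNU ld reports missing symbols as: undefined reference to `name'
# (opening backtick, closing apostrophe). Other linkers may format this differently.
UNDEFINED_REF = re.compile(r"undefined reference to `([^']+)'")


def undefined_symbols(log_text):
    """Return the distinct symbols named in 'undefined reference' lines, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = UNDEFINED_REF.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# The two clean lines from the log excerpt in this thread.
sample_log = """\
cpython-3.7> /tools/clang-linux64/lib/clang/10.0.0/lib/linux/libclang_rt.profile-x86_64.a(InstrProfilingFile.c.o): in function `parseAndSetFilename':
cpython-3.7> InstrProfilingFile.c:(.text.parseAndSetFilename+0xfe): undefined reference to `__strdup'
"""

print(undefined_symbols(sample_log))  # ['__strdup']
```

Running the same function over each failing gist would show whether `__strdup` is the only unresolved symbol or just the first of several.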

A possible workaround would be to build a version of Clang (potentially just libclang_rt.profile-x86_64) against musl so the static library can be linked into binaries built against musl.

If you want some optimizations for musl, I'm optimistic the lto optimizations would work: it's just pgo that needs to inject code into an instrumented binary.
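For anyone retrying this, the combinations discussed in the thread can be enumerated as follows. This is only a sketch. The `--python` and `--optimizations` flags are the ones quoted in the summary, `lto` as a standalone value is an assumption based on the suggestion above, and the helper names `build_args` and `uses_pgo` are hypothetical. It does not invoke the build. It only prints each argument list and whether that configuration would pull in the profiling runtime.

```python
from itertools import product

PYTHON_TARGETS = ["cpython-3.7", "cpython-3.8"]
# None means "pass no --optimizations flag at all".
# "lto" on its own is an assumed value; the thread only shows "pgo" and "pgo+lto".
OPTIMIZATIONS = [None, "lto", "pgo", "pgo+lto"]


def build_args(python, optimizations=None):
    """Assemble the flag list for one build attempt (flags as quoted in the summary)."""
    args = [f"--python={python}"]
    if optimizations is not None:
        args.append(f"--optimizations={optimizations}")
    return args


def uses_pgo(optimizations):
    """True when the configuration needs an instrumented (profiling) binary."""
    return optimizations is not None and "pgo" in optimizations.split("+")


for python, opt in product(PYTHON_TARGETS, OPTIMIZATIONS):
    note = "links the profiling runtime" if uses_pgo(opt) else "no profiling runtime"
    print(" ".join(build_args(python, opt)), "#", note)
```

Only the rows marked as linking the profiling runtime should hit the `__strdup` error described above, if that diagnosis is right.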

May 17 '20 02:05 indygreg

I'm no expert on this, believe me -- but wouldn't link-time optimizations for a static binary be superfluous? I'm assuming I'm wrong on that now, but not sure how.

May 17 '20 03:05 jwarner112

No, LTO does more.

When you compile something, you first produce individual object files (from individual sources). Then once you have all of those, you link them together.

The compiler applies optimizations at compile time. This is typically the full extent of optimization. Traditionally when you link, the linker effectively assembles a bunch of already compiled/optimized code: it doesn't modify the generated machine code except to remove unused symbols, adjust memory addresses, etc.

Link-time optimization applies an additional round of optimizations at link time. For example, it can see function calls across object files and optimize accordingly. https://llvm.org/docs/LinkTimeOptimization.html has a very high-level overview. It might be best to think of LTO as whole-program optimization and regular compiler optimization as single-file (technically, compilation unit) optimization.

May 17 '20 04:05 indygreg

Oh alright, rad. Thanks for the link! I'll be sure to read up on the subject. In my tests I'd been disregarding LTO early on based on my misunderstanding.

May 17 '20 07:05 jwarner112

python-build-standalone

Unable to compile cpython-3.x with optimizations on Ubuntu
