Comprehensive Feedback & Suggestions for Improving Build Robustness, Caching, Maintenance, and Packaging Efficiency
Introduction
Hi P4A Maintainers and Contributors,
First, I want to acknowledge the immense complexity of the task python-for-android undertakes: enabling Python on Android, especially with compiled extensions, is a significant challenge. P4A has been a vital tool for many developers. This feedback aims to consolidate observations from various GitHub issues, recipe analyses, and code reviews into a constructive discussion about potential areas for improvement, focusing on enhancing the toolchain's robustness, maintainability, caching, packaging efficiency, and overall developer experience.
1. Challenges with Pip Fallback Mechanism
When p4a encounters a requirement without a dedicated recipe, the current pip fallback mechanism faces several recurring challenges, often leading to build failures or runtime errors:
- Incompatible Binaries from Host Wheels: Issues like #3110, #2755 (charset_normalizer), #2964 (curl-cffi), #2662 (pyodbc), and #2628 (fitz/PyMuPDF) consistently show runtime `dlopen` errors due to architecture mismatches (e.g., `EM_X86_64` binary in an `EM_AARCH64` environment, or 64-bit vs 32-bit). This strongly suggests that `pip` (running on the host build machine) downloads pre-built wheels for the host architecture, and a subsequent step in p4a inadvertently allows these incompatible `.so` files into the final Android package.
- Source Build Failures: When pip attempts a source build for packages like `pycairo` (#2851), `readline` (#2787), or `pyaudio` (#2265), failures often occur during cross-compilation, even though p4a correctly sets NDK environment variables (`CC`, `CFLAGS`, etc.). Common causes appear to be:
  - Host Path Contamination: Build scripts referencing host system headers/libraries (`/usr/include`, `/usr/lib`) instead of relying solely on the NDK/target paths (e.g., `pycairo` #2851).
  - Autotools Misconfiguration: `./configure` scripts failing because they aren't invoked with the necessary `--host=<target_triplet>` flag (e.g., `readline` #2787).
  - Missing C Dependencies: Required C libraries lacking corresponding p4a recipes (e.g., `portaudio` for `pyaudio` #2265).
- Transitive Dependency Exclusion: The command used in `build.py:run_pymodules_install` explicitly includes `--no-deps`. This prevents pip from installing transitive Python dependencies, forcing users to manually list the entire dependency tree for non-recipe packages to avoid runtime `ImportError`s.
These points indicate the pip fallback needs careful handling for compiled dependencies and Python dependency resolution, often requiring a dedicated recipe for reliability.
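To make the architecture mismatch concrete, here is a minimal sketch of how a packaging step could read the ELF header of a `.so` file and compare it against the target ABI. It is not p4a code: the names `ANDROID_ABI_TO_ELF`, `read_elf_machine`, and `so_matches_abi` are hypothetical, and only the header layout and `e_machine` constants come from the ELF specification.

```python
import struct

# e_machine values from the ELF specification.
EM_386, EM_ARM, EM_X86_64, EM_AARCH64 = 3, 40, 62, 183

# Expected (e_machine, EI_CLASS) per Android ABI; EI_CLASS 1 = 32-bit, 2 = 64-bit.
# Hypothetical lookup table, not something p4a defines.
ANDROID_ABI_TO_ELF = {
    "armeabi-v7a": (EM_ARM, 1),
    "arm64-v8a": (EM_AARCH64, 2),
    "x86": (EM_386, 1),
    "x86_64": (EM_X86_64, 2),
}


def read_elf_machine(header: bytes) -> tuple[int, int]:
    """Return (e_machine, EI_CLASS) from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    byte_order = "<" if ei_data == 1 else ">"  # EI_DATA: 1 = little-endian
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return e_machine, ei_class


def so_matches_abi(path: str, abi: str) -> bool:
    """True if the shared object at `path` was built for the given Android ABI."""
    with open(path, "rb") as f:
        header = f.read(20)
    return read_elf_machine(header) == ANDROID_ABI_TO_ELF[abi]
```

A check like this, run over every `.so` in the staging directory before packaging, would turn the runtime `dlopen` failures into build-time errors that name the offending file.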
2. C Extension Filename Tagging (.so Naming)
The runtime errors involving incompatible binaries appear linked to .so filename handling during cross-compilation:
- `setuptools` Naming Behavior: Build backends like `setuptools`, when run under `hostpython`, often seem to name the output `.so` file using host platform tags, even when the code inside is compiled for the target.
- The `reduce_object_file_names` Workaround: This function in `TargetPythonRecipe` strips these incorrect host tags, allowing the target-compiled `.so` to be loaded generically.

  ```python
  # In TargetPythonRecipe (abridged)
  def reduce_object_file_names(self, dirn):
      # Strips tags like cpython-XYZ-x86_64-linux-gnu
      move(filen, join(file_dirname, parts[0] + '.so'))
  ```
- Problematic Side Effect: This tag stripping also applies to incompatible host-architecture `.so` files from pip wheels, allowing them into the package and causing the runtime crashes mentioned above.
- Deviation from Standards: Stripping standard platform tags removes metadata Python typically uses. Addressing the root cause (ensuring correct target tag generation by the build backend) would be ideal.
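The side effect can be illustrated with a simplified stand-in for the renaming step. This is a sketch under my own assumptions, not the actual p4a implementation: `strip_platform_tag` is a hypothetical name and the file names are made up.

```python
def strip_platform_tag(filename: str) -> str:
    """Keep only the module name and '.so', dropping any tag in between."""
    parts = filename.split(".")
    if len(parts) <= 2 or parts[-1] != "so":
        return filename  # already untagged, or not a shared object
    return parts[0] + ".so"


# A cross-compiled extension that the backend mis-tagged with the host platform...
target_built = "md.cpython-311-x86_64-linux-gnu.so"
# ...and a genuine host binary unpacked from a pip wheel carry the same tag,
# so the renaming step cannot tell them apart and both become "md.so".
host_wheel = "md.cpython-311-x86_64-linux-gnu.so"
renamed = {strip_platform_tag(target_built), strip_platform_tag(host_wheel)}
```

Because the file name carries no reliable architecture information either before or after the rename, any safeguard would have to inspect the file contents rather than the name.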
3. Build Caching and Invalidation Challenges
The current caching requires frequent manual cleaning, indicating limitations:
- Recipe Changes: Modifying recipe files or patches often doesn't trigger a rebuild, as invalidation seems to be based on whether output artifacts exist, not on whether input sources changed. This forces manual deletion of `build/other_builds/<recipe>`.
- Distribution Reuse: Stale distributions are reused even if the underlying recipes have changed. This forces manual deletion of `dists/`.
- Site-packages Updates: Updating a Python recipe version doesn't trigger a clean reinstall into `build/python-installs/`, because the check only tests whether the package already exists in site-packages. This requires manual deletion there as well.
Implementing more robust invalidation (e.g., hashing inputs, tracking dependencies accurately, ensuring clean installs) would improve developer workflow.
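As a rough illustration of input hashing, the sketch below fingerprints a recipe directory together with its version and target arch, and compares the result with a stamp file left by the previous build. Everything here is hypothetical (the function names and the `.p4a_fingerprint` stamp file are my own); it only shows the general shape such a check could take.

```python
import hashlib
from pathlib import Path

STAMP_NAME = ".p4a_fingerprint"  # hypothetical stamp file name


def recipe_fingerprint(recipe_dir: str, version: str, arch: str) -> str:
    """Hash every file under the recipe directory plus the version and arch."""
    digest = hashlib.sha256()
    digest.update(f"{version}\0{arch}\0".encode())
    root = Path(recipe_dir)
    # Sort so the fingerprint does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode() + b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_rebuild(build_dir: str, fingerprint: str) -> bool:
    """True when the stamp in build_dir is missing or differs from fingerprint."""
    stamp = Path(build_dir) / STAMP_NAME
    return not stamp.exists() or stamp.read_text() != fingerprint


def mark_built(build_dir: str, fingerprint: str) -> None:
    """Record the fingerprint of the inputs that produced this build."""
    (Path(build_dir) / STAMP_NAME).write_text(fingerprint)
```

With this approach, editing a patch or bumping the recipe version changes the fingerprint, so the stale build directory is detected without manual cleaning. A real implementation would also need to fold in the fingerprints of dependency recipes.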
4. Build and Packaging Inefficiency
Several areas contribute to larger build directories and final package sizes:
- Source Code Duplication: The build process unpacks or copies the entire source code for each recipe into separate, architecture-specific directories within `build/other_builds/`. This duplicates the (usually architecture-independent) source code for every target architecture, increasing disk usage during builds.
- Redundant Pure-Python Installs: Pure-Python packages appear to be installed redundantly into each architecture's staging directory (`build/python-installs/<dist>/<arch>`), repeating the install process.
- `libpybundle.so` Bloat: Bundling stdlib, pure-Python site-packages (`*.pyc`), and all extensions (`*.so`) into a per-architecture gzipped archive named `libpybundle.so` (to leverage OS extraction) duplicates all architecture-independent bytecode.
- Bundling Unused Standard Library: The entire standard library (minus a small blacklist) is included in `stdlib.zip`, rather than performing import analysis to include only necessary modules, further increasing size compared to tools like PyInstaller.
5. Linker Workarounds (LibPthread/LibRt)
- Recipes like `LibPthread` and `LibRt` exist solely to create fake `libpthread.so`/`librt.so` symlinks pointing to `libc.so`.
- This works around build systems (like `uvloop`'s) that incorrectly try to link `-lpthread` or `-lrt` on Android, where these symbols are part of `libc`.
- This is a hack that pollutes the global linker path. The ideal fix is patching the dependent recipes to remove the unnecessary linker flags when targeting Android.
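The kind of patch suggested above could be as small as filtering the link flags before they reach the compiler. The sketch below is illustrative only: `strip_libc_link_flags` is a hypothetical helper, and where it would hook into a given package's build script is package-specific.

```python
# Link flags that are unnecessary on Android because libc provides the symbols.
ANDROID_REDUNDANT_FLAGS = {"-lpthread", "-lrt"}


def strip_libc_link_flags(ldflags: list[str], target_is_android: bool) -> list[str]:
    """Drop -lpthread/-lrt when targeting Android; leave other targets untouched."""
    if not target_is_android:
        return list(ldflags)
    return [flag for flag in ldflags if flag not in ANDROID_REDUNDANT_FLAGS]
```

Applied inside each affected recipe, a filter like this would remove the need for the fake `libpthread.so`/`librt.so` symlinks on the linker path.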
6. Recipe Maintenance and Ecosystem Support
- Outdated Core Recipes: Many foundational recipes (NumPy, OpenSSL, SciPy, Flask, OpenCV, Pandas, Cython, ICU, etc.) are significantly behind current stable/secure versions, limiting usability and posing security risks.
Summary & Potential Path Forward
A potential path forward could involve a focused effort on:
- Fix Cross-Compile Naming: Resolve the `.so` filename tagging issue at the build backend level.
- Implement Robust Caching: Improve build/dist invalidation based on actual input changes.
- Optimize Build/Packaging: Avoid source/bytecode duplication; perform import analysis for stdlib.
- Fix Pip Fallback: Ensure correct transitive dependency bundling and improve build environment isolation.
- Add Build-Time Architecture Validation: Check that `.so` files match the target arch before final packaging.
- Remove Hacks: Eliminate `reduce_object_file_names`, `LibPthread`/`LibRt`, etc., as underlying issues are fixed.
- Prioritize Core Recipe Updates: Focus on updating critical outdated recipes once the build system is more robust. Communicate challenges transparently.
Addressing these foundational areas seems key to making p4a more robust, maintainable, efficient, and capable of supporting the modern Python ecosystem on Android.
Thank you again for maintaining this important project and for considering this comprehensive feedback.