str(module) generates invalid LLVM IR with empty PHI instructions in complex loop cases
I had an AI find it and write a report.
Summary
str(module) generates invalid LLVM IR when multiple PHI instructions are present in loop structures. The PHI
instructions are printed without their incoming value pairs (e.g., %phi_17 = phi i64 instead of %phi_17 = phi i64 [%val1, %bb1], [%val2, %bb2]).
Environment
- llvmlite version: 0.43.0 (also reproduced on 0.44.0)
- Python version: 3.12.3
- LLVM triple: x86_64-unknown-linux-gnu
- OS: Ubuntu 24.04.2 LTS (WSL2)
Expected Behavior
All PHI instructions should include their incoming value pairs when printed via str(module).
Actual Behavior
Some PHI instructions are printed without incoming pairs:
bb7:
%"phi_17" = phi i64 ; ← Missing incoming pairs
%"phi_18" = phi i64 ; ← Missing incoming pairs
%"add_19" = add i64 %"phi_18", %"phi_17"
br label %"bb9"
Minimal Reproduction
A minimal test case with 2 PHIs in a simple if-else structure works correctly:
# See attached: repro_phi_str_print_min.py
# Output: PHIs correctly include [%val1, %bb1], [%val2, %bb2]
However, the issue appears in more complex cases with loops and multiple PHI nodes.
Reproduction Steps
1. Create a complex loop structure with multiple PHI nodes.
2. Call str(module) to generate the IR string.
3. Observe that some PHI instructions lack their incoming value pairs.
Attached Files
- Full IR showing the issue (lines 36, 37, 42, 45, and 46):
  - See nyash_harness.ll (attached)
- Minimal reproduction script (works correctly):
  - See repro_phi_str_print_min.py (attached)
- Environment details:
  - See env.txt (attached)
Impact
The generated IR is invalid and cannot be verified or used by LLVM tools.
Workaround
We've implemented a post-processing step to detect and handle empty PHI lines, but this is not a proper solution.
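The actual post-processing code is not shown in this report. As a rough sketch of what the detection half of such a step could look like, the snippet below scans IR text for PHI lines that have no incoming pairs. The function name `find_empty_phis` and the regular expression are illustrative assumptions, not part of llvmlite, and the pattern only covers simple cases (scalar types, value names without spaces).

```python
import re

# Assumption: a PHI with incoming pairs has a "[" after its type, so a
# "phi <type>" line that ends (optionally with a ";" comment) before any
# "[" is treated as empty. Array-typed PHIs are not handled by this sketch.
EMPTY_PHI = re.compile(r'^\s*%\S+\s*=\s*phi\s+[^\[;]+?\s*(;.*)?$')


def find_empty_phis(ir_text):
    """Return (line_number, stripped_line) for PHI lines without incoming pairs."""
    hits = []
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        if EMPTY_PHI.match(line):
            hits.append((lineno, line.strip()))
    return hits


sample = '''bb7:
  %"phi_17" = phi i64
  %"phi_18" = phi i64 [0, %"bb1"], [%"x", %"bb2"]
  %"add_19" = add i64 %"phi_18", %"phi_17"
'''
print(find_empty_phis(sample))  # [(2, '%"phi_17" = phi i64')]
```

A check like this only reports the symptom; it cannot reconstruct the missing pairs from the text alone.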
[make_bundle.sh](https://github.com/user-attachments/files/23111829/make_bundle.sh)
[README.md](https://github.com/user-attachments/files/23111830/README.md)
[repro_from_hako_builder.sh](https://github.com/user-attachments/files/23111828/repro_from_hako_builder.sh)
[repro_phi_str_print_min.py](https://github.com/user-attachments/files/23111832/repro_phi_str_print_min.py)
[gather_env.sh](https://github.com/user-attachments/files/23111831/gather_env.sh)
@moe-charm Have you been able to reproduce this yourself?
Thanks for your reply. I'm enjoying using llvmlite.
I can reproduce this bug. The root cause appears to be that add_incoming() does not invalidate the PHI string cache.
Here is a minimal reproduction script that demonstrates the issue:
Minimal Reproduction (50 lines)
#!/usr/bin/env python3
"""
Minimal reproduction for llvmlite issue #1337
Bug: add_incoming() does not invalidate PHI string cache
pip install llvmlite
python3 llvmlite_issue1337_simple.py
"""
import llvmlite.ir as ir
# Create module
module = ir.Module()
i64 = ir.IntType(64)
func = ir.Function(module, ir.FunctionType(i64, []), name="test")
# Create blocks
entry = func.append_basic_block("entry")
loop = func.append_basic_block("loop")
# Entry block
builder = ir.IRBuilder(entry)
zero = ir.Constant(i64, 0)
builder.branch(loop)
# Loop block with PHI
builder = ir.IRBuilder(loop)
builder.position_at_start(loop)
phi = builder.phi(i64, name="counter")
builder.position_at_end(loop)
one = ir.Constant(i64, 1)
next_val = builder.add(phi, one)
builder.branch(loop)
# BUG: Call str() BEFORE add_incoming() - this caches the string without incoming pairs
print("Before add_incoming():")
print(f" str(phi) = {str(phi)}")
# Add incoming edges
phi.add_incoming(zero, entry)
phi.add_incoming(next_val, loop)
# BUG: str() returns the OLD cached value (no incoming pairs)
print("\nAfter add_incoming():")
print(f" str(phi) = {str(phi)}")
print(" Expected: phi i64 [ 0, %entry ], [ %next_val, %loop ]")
# PROOF: Clear cache fixes it
phi._clear_string_cache()
print("\nAfter _clear_string_cache():")
print(f" str(phi) = {str(phi)}")
Output
Before add_incoming():
str(phi) = %"counter" = phi i64
After add_incoming():
str(phi) = %"counter" = phi i64
Expected: phi i64 [ 0, %entry ], [ %next_val, %loop ]
After _clear_string_cache():
str(phi) = %"counter" = phi i64 [0, %"entry"], [%".3", %"loop"]
Explanation
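1. The PHI is created with no incoming pairs.
2. str(phi) is called, and the rendered string (which has no incoming pairs) is cached.
3. add_incoming() is called, but the cache is not invalidated.
4. str(phi) returns the stale cached string, which still lacks the incoming pairs.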
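To make the sequence above concrete without depending on llvmlite, here is a toy model of a string-caching object. The classes `CachedStr` and `ToyPhi` are hypothetical stand-ins written for illustration; they are not llvmlite's implementation and only mimic the behaviour observed in the output above.

```python
class CachedStr:
    """Toy stand-in for a string-caching mixin (NOT llvmlite's actual code)."""

    def __str__(self):
        # Render once, then keep returning the stored result.
        if not hasattr(self, "_cached"):
            self._cached = self._to_string()
        return self._cached

    def _clear_string_cache(self):
        # Drop the stored string so the next str() call re-renders.
        self.__dict__.pop("_cached", None)


class ToyPhi(CachedStr):
    def __init__(self, name, typ):
        self.name, self.typ, self.incomings = name, typ, []

    def _to_string(self):
        pairs = ", ".join(f"[{v}, %{b}]" for v, b in self.incomings)
        return f"%{self.name} = phi {self.typ} {pairs}".rstrip()

    def add_incoming(self, value, block):
        # Mutates state but leaves the cached string untouched.
        self.incomings.append((value, block))


phi = ToyPhi("counter", "i64")
first = str(phi)                  # rendered with no incomings, now cached
phi.add_incoming("0", "entry")
phi.add_incoming("%next", "loop")
stale = str(phi)                  # still the cached string without pairs
phi._clear_string_cache()
fresh = str(phi)                  # re-rendered with both pairs
print(first, stale, fresh, sep="\n")
```

In this model the second `str()` call never reaches `_to_string()`, which is the same pattern the reproduction script shows for the real PHI instruction.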
Suggested Fix
The add_incoming() method should call self._clear_string_cache() to invalidate the cache.
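Until a fix along those lines is available, callers could wrap the two calls themselves. The helper below is a minimal sketch under that assumption: `add_incoming_fresh` is a hypothetical name, and it relies on the private `_clear_string_cache()` method used in the reproduction script, which may change between releases.

```python
def add_incoming_fresh(phi, value, block):
    """Add an incoming pair, then drop any cached string for the PHI.

    `phi` is expected to be an object with add_incoming() and the private
    _clear_string_cache() method, as used in the reproduction script.
    """
    phi.add_incoming(value, block)
    phi._clear_string_cache()

# Usage with the names from the reproduction script:
#   add_incoming_fresh(phi, zero, entry)
#   add_incoming_fresh(phi, next_val, loop)
```

This clears only the PHI's own cached string, which is what the reproduction script shows to be sufficient for str(phi).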