Stack overflow on join on high CPU count machine
Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of Polars.
Reproducible example
```python
import polars as pl

NIDS = 500
NAXISA = 1000
NVALS = 100
NAXISB = 5000

ids = pl.Series("id", range(NIDS), pl.Int64)
vals = pl.Series("val", range(NVALS), pl.Int64)
extraa = pl.Series("axis", range(NAXISA), pl.Int64)
extrab = pl.Series("axis", range(NAXISB), pl.Int64)

a = pl.DataFrame(ids).join(pl.DataFrame(extraa), how='cross').join(pl.DataFrame(vals), how='cross')
b = pl.DataFrame(ids).join(pl.DataFrame(extrab), how='cross')
print("a={:,} b={:,}".format(len(a), len(b)))

for i in range(30):
    print(i)
    a.join(b, on=['id', 'axis'])
```
Log output
```
join parallel: true
CROSS join dataframes finished
join parallel: true
CROSS join dataframes finished
join parallel: true
CROSS join dataframes finished
'a=50,000,000 b=2,500,000'
0
join parallel: true
INNER join dataframes finished
1
join parallel: true
INNER join dataframes finished
2
join parallel: true
INNER join dataframes finished
3
join parallel: true
INNER join dataframes finished
4
join parallel: true
INNER join dataframes finished
5
join parallel: true
INNER join dataframes finished
6
join parallel: true
INNER join dataframes finished
7
join parallel: true
INNER join dataframes finished
8
join parallel: true
Segmentation fault (core dumped)
```
Issue description
Joining two biggish DataFrames (50 million and 2.5 million rows) occasionally stack overflows somewhere deep in rayon.
Note that this only seems to happen on my machine with 128 logical cores, not on my smaller machine.
Expected behavior
The join completes without crashing.
Installed versions
```
--------Version info---------
Polars: 0.20.10
Index type: UInt32
Platform: Linux-6.5.0-14-generic-x86_64-with-glibc2.35
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
----Optional dependencies----
adbc_driver_manager: <not installed>
cloudpickle: <not installed>
connectorx: <not installed>
deltalake: <not installed>
fsspec: <not installed>
gevent: <not installed>
hvplot: <not installed>
matplotlib: 3.8.1
numpy: 1.26.1
openpyxl: <not installed>
pandas: 2.1.2
pyarrow: 14.0.0
pydantic: <not installed>
pyiceberg: <not installed>
pyxlsb: <not installed>
sqlalchemy: <not installed>
xlsx2csv: <not installed>
xlsxwriter: <not installed>
```
Hmm.. Could you see in GDB where you are at that point? Maybe make a debug compilation. (I am on vacation and have poor internet, let alone a 128 core machine. :))
Here is a full GDB thread backtrace, compressed since it's 15 MB uncompressed.
FWIW, you can replicate the behaviour by setting RUST_MIN_STACK. On my 16-core machine I need to use 100000.