dask-sql
[BUG] Segfault when running with pytest
What happened:
Running tests with pytest causes a segfault in the JVM when calling c.sql("SELECT * FROM sometable") on a Context.
What you expected to happen: No segfault :)
Minimal Complete Verifiable Example: https://github.com/ksonj/dask-sql-segfault
Basically, adding import dask_sql.java to ./tests/conftest.py is enough to trigger the error when running pytest.
See the above repository for a complete reproduction. On Ubuntu, the following is enough to reproduce it:
poetry install
poetry run py.test
Alternatively, on the off chance you're familiar with Nix:
nix develop .#dev -c py.test
Otherwise all the nix stuff can be ignored.
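To tell a genuine segfault apart from an ordinary test failure, it can help to run the test session in a child process and inspect its exit code. The following is only a minimal sketch, assuming a POSIX system; run_isolated and describe are hypothetical helper names for illustration and are not part of dask-sql or pytest.

```python
import signal
import subprocess
import sys


def run_isolated(args: list[str]) -> int:
    """Run a fresh Python interpreter with `args` and return its exit code."""
    result = subprocess.run([sys.executable, *args], capture_output=True)
    return result.returncode


def describe(returncode: int) -> str:
    """Translate a subprocess exit code into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"
```

For example, describe(run_isolated(["-m", "pytest"])) run from the reproduction repo would report the signal name if the interpreter is killed by one, and an ordinary exit status otherwise.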
Anything else we need to know?:
Environment: See pyproject.toml and poetry.lock in the reproduction repo for detailed environment info.
- dask-sql version: 2022.06.0
- Python version: 3.10
- Operating System: Ubuntu 18.04 and NixOS
- Install method (conda, pip, source): pip (via poetry, on Ubuntu) and source (on NixOS)
I have already read through a few similar issues here on GitHub, namely
- https://github.com/dask-contrib/dask-sql/issues/297
- https://github.com/dask-contrib/dask-sql/issues/415
- https://github.com/dask-contrib/dask-sql/issues/540
- https://github.com/dask-contrib/dask-sql/pull/294
but it's still not completely clear how I might want to go about this. Any guidance would be much appreciated.
Thanks for catching this @ksonj 🙂 You are correct that segfaults have been a recurring issue, and they have been somewhat difficult to diagnose and resolve. This is currently one of our larger motivations for exploring Apache Arrow DataFusion as an alternative SQL parser, which seems to offer greater stability in this respect. Progress on this can be tracked at:
- https://github.com/dask-contrib/dask-sql/tree/datafusion-sql-planner
For now, it could be good to know the context around how you encountered the segfault - were you adding this import to the testing configuration to achieve something in testing related to a feature request or bug fix?
Thanks for the quick response. I actually imported one of my own modules in conftest to set up a fixture for unit tests in a different project. That module in turn imported dask_sql.Context, which imports dask_sql.java. I've just reduced it down to dask_sql.java for the sake of demonstrating the issue here.
@ksonj are you ok with us closing this now that we have completely moved away from Java? Don't want to close it if this is something still affecting you on an older version.
Yes, absolutely