[BUG] Segfault when running with pytest

Open ksonj opened this issue 3 years ago • 2 comments

What happened: Running tests with pytest causes a segfault in the JVM when calling c.sql("SELECT * FROM sometable") on a Context.

What you expected to happen: No segfault :)

Minimal Complete Verifiable Example: https://github.com/ksonj/dask-sql-segfault

Basically, adding the line import dask_sql.java to ./tests/conftest.py is enough to trigger the error when running pytest.

See the above repository for a complete reproduction. With it checked out, the following is enough to reproduce on Ubuntu:

poetry install
poetry run py.test

Alternatively on the off chance you're familiar with Nix:

nix develop .#dev -c py.test

Otherwise all the nix stuff can be ignored.
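
When triaging a crash like this, it can help to confirm that the test process is really dying from a signal rather than from an ordinary Python exception. The following is a minimal sketch using only the standard library; the helper names run_isolated and describe_exit are made up for illustration and are not part of dask-sql or pytest. It runs a command in a child process and interprets the return code using POSIX semantics, where a negative value means the child was killed by that signal.

```python
import signal
import subprocess
import sys


def run_isolated(argv, timeout=300.0):
    """Run a command in a child process so a native crash cannot take down the caller.

    Hypothetical helper for illustration only.
    """
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)


def describe_exit(returncode):
    """Turn a child's return code into a readable outcome (POSIX semantics)."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Harmless demonstration: a child interpreter that does nothing.
    result = run_isolated([sys.executable, "-c", "pass"])
    print(describe_exit(result.returncode))
```

Assuming pytest is installed in the same environment, calling run_isolated([sys.executable, "-m", "pytest"]) from the root of the reproduction repository and passing the return code to describe_exit should distinguish a segfault from a regular test failure. This only classifies the crash; it does not address the cause.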

Anything else we need to know?:

Environment: See pyproject.toml and poetry.lock in the reproduction repo for detailed environment info.

  • dask-sql version: 2022.06.0
  • Python version: 3.10
  • Operating System: Ubuntu 18.04 and NixOS
  • Install method (conda, pip, source): pip (via poetry, on Ubuntu) and source (on NixOS)

I have already read through a few similar issues here on GitHub, namely

  • https://github.com/dask-contrib/dask-sql/issues/297
  • https://github.com/dask-contrib/dask-sql/issues/415
  • https://github.com/dask-contrib/dask-sql/issues/540
  • https://github.com/dask-contrib/dask-sql/pull/294

but it's still not completely clear how I might want to go about this. Any guidance would be much appreciated.

ksonj, Jun 17 '22 09:06

Thanks for catching this @ksonj 🙂 You are correct that segfaults have been a recurring issue, and they have been somewhat difficult to diagnose and resolve. This is currently one of our larger motivations for exploring Apache Arrow DataFusion as an alternative SQL parser, which seems to offer greater stability in this respect. Progress on this can be tracked at:

  • https://github.com/dask-contrib/dask-sql/tree/datafusion-sql-planner

For now, it would be good to know how you encountered the segfault. Were you adding this import to the testing configuration as part of a feature request or bug fix?

charlesbluca, Jun 17 '22 14:06

Thanks for the quick response. I actually imported one of my own modules in conftest to set up a fixture for unit tests in a different project. That module in turn imported dask_sql.Context, which imports dask_sql.java. I reduced it to just dask_sql.java for the sake of demonstrating the issue here.

ksonj, Jun 17 '22 15:06

@ksonj are you OK with us closing this now that we have completely moved away from Java? We don't want to close it if this is still affecting you on an older version.

jdye64, Mar 13 '23 23:03

Yes, absolutely

ksonj, Mar 14 '23 08:03