
[query] read_movie_lens can sometimes fail

Open · daniel-goldstein opened this issue 7 months ago · 1 comment

What happened?

A user running RHEL 9 reported an error importing the MovieLens data on Hail 0.2.126; the same import succeeded on 0.2.120. The error did not reproduce on macOS. We should verify in which version of Hail the error was introduced, and whether the Hail installation is fully broken or only the MovieLens import (or some other subset of functionality) is affected.
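One way to narrow down the offending release is to bisect the patch versions between the last known-good and first known-bad release. The sketch below is only an illustration: `patch_versions` and `first_broken` are hypothetical helpers, `is_broken` stands in for whatever check is used (for example, installing that version into a fresh environment on RHEL 9 and attempting the MovieLens import), and it assumes every patch number in the range was actually published and that the failure appears at a single point and persists afterwards.

```python
from typing import Callable, List


def patch_versions(last_good: str, first_bad: str) -> List[str]:
    """List versions from last_good through first_bad, varying only the patch number."""
    prefix_good, lo = last_good.rsplit(".", 1)
    prefix_bad, hi = first_bad.rsplit(".", 1)
    if prefix_good != prefix_bad:
        raise ValueError("versions must share the same major.minor prefix")
    return [f"{prefix_good}.{n}" for n in range(int(lo), int(hi) + 1)]


def first_broken(versions: List[str], is_broken: Callable[[str], bool]) -> str:
    """Binary search for the first broken version.

    Assumes versions[0] is good, versions[-1] is broken, and that once a
    version is broken every later one is too.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] good, versions[hi] broken
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_broken(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


if __name__ == "__main__":
    candidates = patch_versions("0.2.120", "0.2.126")
    # Hypothetical check: replace with a real install-and-import test per version.
    print(first_broken(candidates, lambda v: v == "0.2.126"))
```

With seven candidate versions this needs at most three install-and-test rounds instead of five.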

Version

0.2.126

Relevant log output

2023-11-20 18:25:51.813 Hail: WARN: This Hail JAR was compiled for Spark 3.3.0, running with Spark 3.3.3.
  Compatibility is not guaranteed.
2023-11-20 18:25:53.340 Hail: INFO: SparkUI: http://xxxxx:4040
2023-11-20 18:25:54.037 Hail: INFO: Running Hail version 0.2.126-ee77707f4fab
2023-11-20 18:27:48.120 Hail: INFO: downloading MovieLens-100k data ...
  Source: https://files.grouplens.org/datasets/movielens/ml-100k.zip
2023-11-20 18:27:50.320 Hail: INFO: importing users table and writing to data/users.ht ...

daniel-goldstein · Nov 29 '23 15:11

Hi, not sure if this is the right avenue, but I'd also like to report a similar bug, orjson.JSONDecodeError: unexpected character: line 1 column 1 (char 0), first reported at https://discuss.hail.is/t/hail-fails-after-installing-it-on-a-single-computer/3653

Hail was installed from https://anaconda.org/sfe1ed40/hail. EDIT: the same error occurs after running pip install hail in a fresh conda env, which produced Hail version 0.2.130-bea04d9c79b5.

Terminal output:

Python 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail as hl
hl.init()
>>> hl.init()
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Running on Apache Spark version 3.4.1
SparkUI available at http://xxxx:xxxx
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.127-d18228b9bc5b
LOGGING: writing to xxxx.log
>>> hl.utils.range_table(10).collect()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<decorator-gen-1234>", line 2, in collect
  File "/xxxx/lib/python3.10/site-packages/hail/typecheck/check.py", line 584, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/xxxx/lib/python3.10/site-packages/hail/table.py", line 2213, in collect
    return Env.backend().execute(e._ir, timed=_timed)
  File "/xxxx/lib/python3.10/site-packages/hail/backend/backend.py", line 188, in execute
    result, timings = self._rpc(ActionTag.EXECUTE, payload)
  File "/xxxx/lib/python3.10/site-packages/hail/backend/py4j_backend.py", line 219, in _rpc
    error_json = orjson.loads(resp.content)
orjson.JSONDecodeError: unexpected character: line 1 column 1 (char 0)

Log file:

2024-04-25 16:07:16.773 Hail: INFO: SparkUI: http://xxxx:xxxx
2024-04-25 16:07:21.589 Hail: INFO: Running Hail version 0.2.127-d18228b9bc5b
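For context on the traceback: `_rpc` passes `resp.content` to `orjson.loads`, and a failure at line 1 column 1 (char 0) suggests the response body did not start with a valid JSON value at all, so the real error text is probably in that body. The sketch below is not Hail code. `describe_body` is a hypothetical helper, and it uses the standard-library `json` module as a stand-in for `orjson`, so the wording of the error message differs. It only shows how previewing the raw bytes makes this kind of failure easier to diagnose.

```python
import json


def describe_body(body: bytes, limit: int = 80) -> str:
    """Report whether body parses as JSON; if not, include a preview of its first bytes."""
    try:
        json.loads(body)
        return "valid JSON"
    except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
        preview = body[:limit].decode("utf-8", errors="replace")
        return f"not JSON ({err}); body starts with: {preview!r}"


if __name__ == "__main__":
    print(describe_body(b'{"ok": true}'))
    print(describe_body(b"java.lang.IllegalStateException: example"))
```

Capturing the equivalent preview of `resp.content` at the point of failure would show whether the backend returned a plain-text error, an empty body, or something else.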

digitase · Apr 25 '24 15:04