Pipeline operations fail with "Expected positive long value, got -24"

Open cybersam opened this issue 2 years ago • 10 comments

Describe the bug Link prediction pipeline operations (e.g., .create, .addNodeProperty) fail with GDS 2.5.1 and 2.5.3.

To Reproduce A. Execute either of these using the Python GDS client:

  • pipe = gds.lp_pipe("foo"), or
  • gds.run_cypher("""CALL gds.beta.pipeline.linkPrediction.create("foo")""")

B. Execute this in the Neo4j Browser:

  • CALL gds.beta.pipeline.linkPrediction.create("foo")

GDS version: 2.5.1, also 2.5.3
Neo4j version: standalone EE 5.13.0
Operating system: Amazon Linux (AMI ID: ami-0cd7323ab3e63805f)

Steps to reproduce the behavior (when using the Python GDS client):

from graphdatascience import GraphDataScience
gds = GraphDataScience(MY_URI, auth=MY_AUTH, database=MY_DB)
# Append here one of the above statements to create an LP pipeline.

Expected behavior The pipeline is created AND results are returned.

Observed behavior The call fails with the error message "Expected positive long value, got -24". However, the pipeline appears to be created anyway.

Here is the error when executing pipe = gds.lp_pipe("foo"):

---------------------------------------------------------------------------
DatabaseError                             Traceback (most recent call last)
Cell In[2], line 1
----> 1 pipe = gds.lp_pipe("foo")

File ~/pyenv/lib64/python3.9/site-packages/graphdatascience/pipeline/pipeline_endpoints.py:26, in PipelineEndpoints.lp_pipe(self, name)
     16 """
     17 Create a Link Prediction training pipeline, with all default settings.
     18 
   (...)
     23     A new instance of a Link Prediction pipeline object.
     24 """
     25 runner = PipelineBetaProcRunner(self._query_runner, f"{self._namespace}.beta.pipeline", self._server_version)
---> 26 p, _ = runner.linkPrediction.create(name)
     27 return p

File ~/pyenv/lib64/python3.9/site-packages/graphdatascience/pipeline/lp_pipeline_create_runner.py:16, in LPPipelineCreateRunner.create(self, name)
     14 query = f"CALL {self._namespace}($name)"
     15 params = {"name": name}
---> 16 result = self._query_runner.run_query(query, params).squeeze()
     18 return LPTrainingPipeline(name, self._query_runner, self._server_version), result

File ~/pyenv/lib64/python3.9/site-packages/graphdatascience/query_runner/neo4j_query_runner.py:70, in Neo4jQueryRunner.run_query(self, query, params, database, custom_error)
     63 # Though pandas support may be experimental in the `neo4j` package, it should always
     64 # be supported in the `graphdatascience` package.
     65 warnings.filterwarnings(
     66     "ignore",
     67     message=r"^pandas support is experimental and might be changed or removed in future versions$",
     68 )
---> 70 df = result.to_df()
     72 if self._NEO4J_DRIVER_VERSION < ServerVersion(5, 0, 0):
     73     self._last_bookmarks = [session.last_bookmark()]

File ~/pyenv/lib64/python3.9/site-packages/neo4j/_sync/work/result.py:748, in Result.to_df(self, expand, parse_dates)
    745 import pandas as pd  # type: ignore[import]
    747 if not expand:
--> 748     df = pd.DataFrame(self.values(), columns=self._keys)
    749 else:
    750     df_keys = None

File ~/pyenv/lib64/python3.9/site-packages/neo4j/_sync/work/result.py:603, in Result.values(self, *keys)
    585 def values(
    586     self, *keys: _TResultKey
    587 ) -> t.List[t.List[t.Any]]:
    588     """Return the remainder of the result as a list of values lists.
    589 
    590     :param keys: fields to return for each remaining record. Optionally filtering to include only certain values by index or key.
   (...)
    601     .. seealso:: :meth:`.Record.values`
    602     """
--> 603     return [record.values(*keys) for record in self]

File ~/pyenv/lib64/python3.9/site-packages/neo4j/_sync/work/result.py:603, in <listcomp>(.0)
    585 def values(
    586     self, *keys: _TResultKey
    587 ) -> t.List[t.List[t.Any]]:
    588     """Return the remainder of the result as a list of values lists.
    589 
    590     :param keys: fields to return for each remaining record. Optionally filtering to include only certain values by index or key.
   (...)
    601     .. seealso:: :meth:`.Record.values`
    602     """
--> 603     return [record.values(*keys) for record in self]

File ~/pyenv/lib64/python3.9/site-packages/neo4j/_sync/work/result.py:266, in Result.__iter__(self)
    264     yield self._record_buffer.popleft()
    265 elif self._streaming:
--> 266     self._connection.fetch_message()
    267 elif self._discarding:
    268     self._discard()

File ~/pyenv/lib64/python3.9/site-packages/neo4j/_sync/io/_common.py:180, in ConnectionErrorHandler.__getattr__.<locals>.outer.<locals>.inner(*args, **kwargs)
    178 def inner(*args, **kwargs):
    179     try:
--> 180         func(*args, **kwargs)
    181     except (Neo4jError, ServiceUnavailable, SessionExpired) as exc:
    182         assert not asyncio.iscoroutinefunction(self.__on_error)

File ~/pyenv/lib64/python3.9/site-packages/neo4j/_sync/io/_bolt.py:851, in Bolt.fetch_message(self)
    847 # Receive exactly one message
    848 tag, fields = self.inbox.pop(
    849     hydration_hooks=self.responses[0].hydration_hooks
    850 )
--> 851 res = self._process_message(tag, fields)
    852 self.idle_since = perf_counter()
    853 return res

File ~/pyenv/lib64/python3.9/site-packages/neo4j/_sync/io/_bolt5.py:376, in Bolt5x0._process_message(self, tag, fields)
    374 self._server_state_manager.state = self.bolt_states.FAILED
    375 try:
--> 376     response.on_failure(summary_metadata or {})
    377 except (ServiceUnavailable, DatabaseUnavailable):
    378     if self.pool:

File ~/pyenv/lib64/python3.9/site-packages/neo4j/_sync/io/_common.py:247, in Response.on_failure(self, metadata)
    245 handler = self.handlers.get("on_summary")
    246 Util.callback(handler)
--> 247 raise Neo4jError.hydrate(**metadata)

DatabaseError: {code: Neo.DatabaseError.Statement.ExecutionFailed} {message: Expected positive long value, got -24}

Additional context

  • This issue did not exist with GDS 2.5.0 and Neo4j 5.11.0.
  • Since the link prediction pipeline is created by .create even when it throws the error, I tried to use the Cypher GDS API on the Browser to call .addNodeProperty on the pipeline, but that also gave me the "Expected positive long value, got -24" error.
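To tell this specific failure apart from unrelated errors while testing, a small helper along the following lines may be useful. It is only a sketch: the function names `positive_long_error_value` and `try_create_lp_pipeline` are hypothetical, and the only client call it relies on is `gds.run_cypher`, used the same way as in the reproduction steps above. It does not work around the bug; it just reports whether the call failed with the "Expected positive long value" message and which value the server complained about.

```python
import re

# Matches the server-side message reported in this issue, e.g.
# "Expected positive long value, got -24".
_POSITIVE_LONG_ERROR = re.compile(r"Expected positive long value, got (-?\d+)")


def positive_long_error_value(exc):
    """Return the offending integer if exc carries the message, else None."""
    match = _POSITIVE_LONG_ERROR.search(str(exc))
    return int(match.group(1)) if match else None


def try_create_lp_pipeline(gds, name):
    """Attempt to create an LP pipeline; return (succeeded, offending_value)."""
    try:
        gds.run_cypher(
            "CALL gds.beta.pipeline.linkPrediction.create($name)",
            params={"name": name},
        )
    except Exception as exc:  # in the traceback above this is a DatabaseError
        value = positive_long_error_value(exc)
        if value is None:
            raise  # a different failure; do not hide it
        return False, value
    return True, None
```

With a live connection, `try_create_lp_pipeline(gds, "foo")` would be expected to return `(False, -24)` on an affected setup. Because the pipeline may have been created despite the error, it is worth checking the pipeline catalog before retrying with the same name.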

cybersam avatar Nov 14 '23 22:11 cybersam

Seems like a related bug was fixed in Neo4j 4.4.10 and 5.0.0:

Fix overflow in resource manager. Users could get errors like java.lang.IllegalArgumentException: Expected positive long value, got -8589934576 because of an overflow when trying to grow the number of tracked resources.
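For readers unfamiliar with this class of bug, the sketch below shows how a growth calculation done in fixed-width signed arithmetic can wrap around to a negative number. It is a generic illustration in Python, not Neo4j's code: the doubling policy, the 32-bit width, and the names `to_int32` and `grow_capacity` are all assumptions made for the example, and it does not claim to explain the -24 seen in this issue.

```python
def to_int32(value):
    """Wrap a Python int to a signed 32-bit value, as Java's int arithmetic does."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


def grow_capacity(capacity):
    """Double a capacity using 32-bit arithmetic (hypothetical growth policy)."""
    return to_int32(capacity * 2)


capacity = 1 << 30               # 1,073,741,824 tracked slots
print(grow_capacity(capacity))   # -2147483648
```

A size check that expects a positive value would then reject the wrapped result, which is the kind of error the quoted changelog entry describes.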

cybersam avatar Nov 15 '23 02:11 cybersam

We tried to reproduce this, and pipeline creation worked with GDS 2.5.3 and Neo4j 5.13: creating a link prediction pipeline on a fresh database succeeded. It is surprising that this caused a bug, as it's a basic operation that we have tests for. Are you able to run other GDS algorithms, for example PageRank?

Could you attach the neo4j logs including debug log?

breakanalysis avatar Nov 15 '23 10:11 breakanalysis

neo4j.log debug.log

I have attached the debug and neo4j logs.

I also tried the following, but still got the same error:

  • Using a fresh database (in same DBMS instance).
  • Removing all plugins except for bloom-plugin-5.x-2.10.0.jar and neo4j-graph-data-science-2.5.3.jar.
  • Running CALL gds.beta.pipeline.nodeClassification.create("pipe-nc") in the Browser.

However, all the PageRank algorithm examples in the docs succeeded and produced the expected results.

cybersam avatar Nov 15 '23 17:11 cybersam

In case the OS and its environment are related to this issue: I am using a standalone EE instance running on AWS EC2 with an Amazon Linux OS (AMI ID: ami-0cd7323ab3e63805f).

cybersam avatar Nov 15 '23 18:11 cybersam

I have created another AWS EC2 instance running Neo4j 5.11.0 and GDS 2.5.1, in which both CALL gds.beta.pipeline.linkPrediction.create("foo") and gds.lp_pipe("foo") succeed.

The same with Neo4j 5.11.0 and GDS 2.5.3.

However, they fail (in the same way as before) with Neo4j 5.12.0 and GDS 2.5.0. So this seems to imply that the root issue is actually in Neo4j 5.12.0 onwards.

cybersam avatar Nov 16 '23 02:11 cybersam

Thanks for the info @cybersam ! What JVM are you using?

adamnsch avatar Nov 16 '23 10:11 adamnsch

...$ java -version
openjdk version "17.0.9" 2023-10-17 LTS
OpenJDK Runtime Environment Corretto-17.0.9.8.1 (build 17.0.9+8-LTS)
OpenJDK 64-Bit Server VM Corretto-17.0.9.8.1 (build 17.0.9+8-LTS, mixed mode, sharing)

cybersam avatar Nov 16 '23 11:11 cybersam

Great, thank you! We're looking into this

adamnsch avatar Nov 16 '23 11:11 adamnsch

Any updates or recommendations for this issue? Ideally not involving downgrading the Neo4j version. Thank you

lukepereira avatar Dec 19 '23 21:12 lukepereira

@cybersam @lukepereira

The issue was with the Cypher runtime but should be fixed as of Neo4j DB version 5.16. Would you be able to try your code with this version?
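Putting the version observations from this thread together, a quick pre-flight check could look like the sketch below. The boundaries are taken from the comments above: 5.11.0 worked, 5.12.0 and 5.13.0 failed, and 5.16 is the stated fix version. Treating 5.14 and 5.15 as affected is an assumption, since nobody in the thread reported testing them, and the function name `likely_affected` is hypothetical.

```python
# Boundaries as reported in this thread (major, minor).
FIRST_FAILING = (5, 12)  # earliest Neo4j version reported to fail
FIRST_FIXED = (5, 16)    # version the maintainers say should contain the fix


def likely_affected(neo4j_version):
    """Return True if the Neo4j version falls in the reported failing range."""
    major_minor = tuple(int(part) for part in neo4j_version.split(".")[:2])
    return FIRST_FAILING <= major_minor < FIRST_FIXED


for version in ("5.11.0", "5.12.0", "5.13.0", "5.16"):
    print(version, likely_affected(version))
```

The server version string could be obtained, for example, from `CALL dbms.components()` and passed to the function.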

adamnsch avatar Jan 29 '24 13:01 adamnsch