arcadedb icon indicating copy to clipboard operation
arcadedb copied to clipboard

[Postgres driver] sending 5 queries = error

Open ExtReMLapin opened this issue 8 months ago • 14 comments

Hello !

When sending 5 queries queued with postgres driver it errors :


mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.130 INFO  [ArcadeDBServer] <ArcadeDB_0> - Postgres plugin started
mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.130 INFO  [HttpServer] <ArcadeDB_0> - Starting HTTP Server (host=0.0.0.0 port=2480-2489 httpsPort=2490-2499)...
mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.162 INFO  [undertow] starting server: Undertow - 2.3.18.Final
mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.168 INFO  [xnio] XNIO version 3.8.16.Final
mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.173 INFO  [nio] XNIO NIO Implementation Version 3.8.16.Final
mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.191 INFO  [threads] JBoss Threads version 3.5.0.Final
mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.245 INFO  [HttpServer] <ArcadeDB_0> - HTTP Server started (host=0.0.0.0 port=2480 httpsPort=2490)
mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.246 INFO  [ArcadeDBServer] <ArcadeDB_0> Available query languages: [sqlscript, mongo, gremlin, java, cypher, js, graphql, sql]
mars 25 14:15:28 cfia01 server.sh[255547]: 2025-03-25 14:15:28.247 INFO  [ArcadeDBServer] <ArcadeDB_0> ArcadeDB Server started in 'development' mode (CPUs=32 MAXRAM=29,97GB)
mars 25 14:15:53 cfia01 server.sh[255547]: 2025-03-25 14:15:28.260 INFO  [ArcadeDBServer] <ArcadeDB_0> Studio web tool available at http://192.168.25.25:2480 Unexpected message type '
mars 25 14:15:53 cfia01 server.sh[255547]: ' for message any
mars 25 14:15:53 cfia01 server.sh[255547]: com.arcadedb.postgres.PostgresProtocolException: Unexpected message type '
mars 25 14:15:53 cfia01 server.sh[255547]: ' for message any
mars 25 14:15:53 cfia01 server.sh[255547]:         at com.arcadedb.postgres.PostgresNetworkExecutor.readMessage(PostgresNetworkExecutor.java:1057)
mars 25 14:15:53 cfia01 server.sh[255547]:         at com.arcadedb.postgres.PostgresNetworkExecutor.run(PostgresNetworkExecutor.java:165)

Using latest release,

Python code to reproduce :

import psycopg

db_string_test = 'TEST_DATABASE'

count = 100
import time
with psycopg.connect(user="root", password="rootroot",
                    host='localhost',
                    port='5432',
                    dbname=db_string_test,
                    sslmode='disable'
                    ) as connection:
    
    connection.autocommit = True
    with connection.cursor() as cursor:
        cursor.execute("create vertex type `TEXT_EMBEDDING` if not exists;")
        
        cursor.execute("create property TEXT_EMBEDDING.str if not exists STRING;")
        cursor.execute("create property TEXT_EMBEDDING.embedding if not exists ARRAY_OF_FLOATS;")
    with connection.cursor() as cursor:
        for i in range(count):
            print(i)
            time.sleep(1)
            cursor.execute("INSERT INTO `TEXT_EMBEDDING` SET str = \"meow\", hash = [0.1, 0.2, 0.3] RETURN @rid")
            cursor.fetchall()
    

Traceback (most recent call last):
  File "/home/xxx/idextend/test_psgg.py", line 24, in <module>
    cursor.execute("INSERT INTO `TEXT_EMBEDDING` SET str = \"meow\", hash = [0.1, 0.2, 0.3] RETURN @rid")
  File "/home/xxx/idextend/venv/lib/python3.10/site-packages/psycopg/cursor.py", line 97, in execute
    raise ex.with_traceback(None)
psycopg.OperationalError: consuming input failed: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

ExtReMLapin avatar Mar 25 '25 14:03 ExtReMLapin

Debug code added in java allowed me to get those logs :

 got message with type p (0x70)
got message with type Q (0x51)
got message with type Q (0x51)
got message with type Q (0x51)
got message with type Q (0x51)
got message with type Q (0x51)
got message with type Q (0x51)
got message with type Q (0x51)
got message with type Q (0x51)
got message with type P (0x50)
got message with type S (0x53)
got message with type B (0x42)
got message with type  (0x0)
Unexpected message type '' (0x0) for message 'any'. Expected one of: 'P', 'B', 'E', 'Q', 'S', 'D', 'C', 'H', 'X'
com.arcadedb.postgres.PostgresProtocolException: Unexpected message type '' (0x0) for message 'any'. Expected one of: 'P', 'B', 'E', 'Q', 'S', 'D', 'C', 'H', 'X'
        at com.arcadedb.postgres.PostgresNetworkExecutor.readMessage(PostgresNetworkExecutor.java:1059)
        at com.arcadedb.postgres.PostgresNetworkExecutor.run(PostgresNetworkExecutor.java:165)


ExtReMLapin avatar Mar 25 '25 14:03 ExtReMLapin

Is this right?

            cursor.execute("INSERT INTO `TEXT_EMBEDDING` SET str = \"meow\", hash = [0.1, 0.2, 0.3] RETURN @rid")

I guess it shold be:

            cursor.execute("INSERT INTO `TEXT_EMBEDDING` SET str = \"meow\", embedding = [0.1, 0.2, 0.3] RETURN @rid")

embedding instead of hash

robfrank avatar Mar 26 '25 15:03 robfrank

You are right but even SELECT 1 causes an error

ExtReMLapin avatar Mar 26 '25 16:03 ExtReMLapin

While debugging I modified the query to identify it on the server side in this way:

                cursor.execute(f"INSERT INTO TEXT_EMBEDDING SET str = 'meow_{i}', embedding = [0.1,0.2,0.3] RETURN embedding;")

and all started to work fine. This is NOT the solution to the bug, of course, it's more a WT*-moment.

robfrank avatar Mar 27 '25 08:03 robfrank

Thanks for the temp fix, it's enough for me to do what I wanted : run benchmarks on http vs postgres.

ExtReMLapin avatar Mar 27 '25 08:03 ExtReMLapin

for anyone interested in the pg driver vs http rest

sql rest took 15.629427194595337
sql pgs took 2.1908366680145264

if __name__ == "__main__":
    from utils.config import get_config
    config = get_config("config.ini")
    CH = CypherHandler(config)
     
    import time
    import psycopg
    import numpy as np
    db_string_test = 'TEST_DATABASE'
    
    if db_string_test in CH.databases:
        CH.delete_database(db_string_test)
    
    new_db = CH.create_database(db_string_test)
    selected_case = new_db
    count = 5000

    with psycopg.connect(user="root", password="rootroot",
                        host='localhost',
                        port='5432',
                        dbname=db_string_test,
                        sslmode='disable'
                        ) as cursor:
        cursor.execute("create vertex type `TEXT_EMBEDDING` if not exists;")
        
        cursor.execute("create property TEXT_EMBEDDING.str if not exists STRING;")
        cursor.execute("create property TEXT_EMBEDDING.embedding if not exists ARRAY_OF_FLOATS;")
        
        

    _time = time.time()
    for i in range(count):
        random_array_size_3 = np.random.rand(3).tolist()
        selected_case.query('sql', f'INSERT INTO `TEXT_EMBEDDING` SET str = "meow{i}", embedding = {str(random_array_size_3)} RETURN @rid', is_command=True)
        
    print('sql rest took', time.time() - _time)
    
    _time = time.time()
    
    with psycopg.connect(user="root", password="rootroot",
                    host='localhost',
                    port='5432',
                    dbname=db_string_test,
                    sslmode='disable'
                    ) as cursor:
        for i in range(count):
            random_array_size_3 = np.random.rand(3).tolist()
            cursor.execute(f'INSERT INTO `TEXT_EMBEDDING` SET str = "meow{i}", embedding = {str(random_array_size_3)} RETURN @rid')
    print('sql pgs took', time.time() - _time)
    
    CH.delete_database(db_string_test)

ExtReMLapin avatar Mar 27 '25 10:03 ExtReMLapin

I replicated the scenarios in java using the postgres JDBC driver and our RemoteDatabase that uses http underneath. The differences are not so big, in your case pg is 7x faster compared to http. In my tests, with 100k insertions, Pg is 2x faster compared to RemoteDatabase.

PG (JDBC) keeps the connection open with the server and reuse it for every comnand/query executed, while in http a new connection is created for each command/query.

One strategy with http could be to batch the commands in a single sqlscript. I'll try with 100 inserts per batch

robfrank avatar Mar 28 '25 12:03 robfrank

After some experiments the numbers , for 100k insertions, on my machine, are in the order of 30 seconds for jdbc (Postgres protocol) an 1 minute for http. A I suggested, I tests the http with batching. Using batches of 100 insertions dropped down the time to less than seconds. Should you go on bathing? It depends on the insertion's rate. If you need to insert 100k per day, probably not, if you need yo insert 100k per minute probably yes.

robfrank avatar Mar 31 '25 07:03 robfrank

Should you go on bathing?

Depends, personnaly I take two shower a day (even more when I go to the swimming pool) 🤡


Jokes aside all that matters for us is that the PG drivers is faster for data insertion and mass data transfer ( https://github.com/ArcadeData/arcadedb/discussions/2005 ) .

While this code posted in the first post was created to do benchmarks, I just finished writing the "driver switch" for the python library and we met this issue because of recurring/similar queries using params.

So indeed, we meet this issue in 'real life' cases.

Right before getting this error, the last psygcop queries were :

        
pushing command {sqlscript}INSERT INTO `IMAGE` SET name = :name, file_path = :file_path, id_doc = :id_doc, mime_type = :mime_type, llava_flag = true, clip_flag = true, ocr_flag = true RETURN @rid;
pushing command {cypher}MATCH (a), (b) WHERE ID(a) = '#7:0' AND ID(b) = '#4:0' MERGE (a)-[r:in]->(b) RETURN a, b, r

pushing command {sqlscript}INSERT INTO `IMAGE` SET name = :name, file_path = :file_path, id_doc = :id_doc, mime_type = :mime_type, llava_flag = true, clip_flag = true, ocr_flag = true RETURN @rid;
pushing command {cypher}MATCH (a), (b) WHERE ID(a) = '#7:1' AND ID(b) = '#4:0' MERGE (a)-[r:in]->(b) RETURN a, b, r

pushing command {sqlscript}INSERT INTO `IMAGE` SET name = :name, file_path = :file_path, id_doc = :id_doc, mime_type = :mime_type, llava_flag = true, clip_flag = true, ocr_flag = true RETURN @rid;
pushing command {cypher}MATCH (a), (b) WHERE ID(a) = '#7:2' AND ID(b) = '#4:0' MERGE (a)-[r:in]->(b) RETURN a, b, r

pushing command {sqlscript}INSERT INTO `IMAGE` SET name = :name, file_path = :file_path, id_doc = :id_doc, mime_type = :mime_type, llava_flag = true, clip_flag = true, ocr_flag = true RETURN @rid;
pushing command {cypher}MATCH (a), (b) WHERE ID(a) = '#7:3' AND ID(b) = '#4:0' MERGE (a)-[r:in]->(b) RETURN a, b, r

pushing command {sqlscript}INSERT INTO `IMAGE` SET name = :name, file_path = :file_path, id_doc = :id_doc, mime_type = :mime_type, llava_flag = true, clip_flag = true, ocr_flag = true RETURN @rid;
pushing command {cypher}MATCH (a), (b) WHERE ID(a) = '#7:4' AND ID(b) = '#4:0' MERGE (a)-[r:in]->(b) RETURN a, b, r

pushing command {sqlscript}INSERT INTO `IMAGE` SET name = :name, file_path = :file_path, id_doc = :id_doc, mime_type = :mime_type, llava_flag = true, clip_flag = true, ocr_flag = true RETURN @rid;

ExtReMLapin avatar Mar 31 '25 08:03 ExtReMLapin

got a fix working, need to do more testing tho

ExtReMLapin avatar Mar 31 '25 12:03 ExtReMLapin

What's the status of this issue?

lvca avatar May 30 '25 16:05 lvca

Not going to lie, didn't test for a while, I can try back on monday, there is still the python code to reproduce

ExtReMLapin avatar May 30 '25 16:05 ExtReMLapin

@ExtReMLapin it would be great, thanks! For now I'm moving it to the next month release

lvca avatar May 30 '25 16:05 lvca

Well .... monday is going to turn into someday, sorry again for the delay, we're a little busy at the office

ExtReMLapin avatar Jun 15 '25 21:06 ExtReMLapin

@ExtReMLapin I'm closing this issue for now, if the problem is still there, please comment for a reopening.

lvca avatar Jul 04 '25 04:07 lvca