[CYPHER] UNWIND + data + type check doesn't execute whole batch.

Open camomiy opened this issue 10 months ago • 1 comment

Hello

ArcadeDB Version:

ArcadeDB Server v24.11.2 (build 055592c73d27d894c26f3faaf7df22e15c28f03d/1733838531445/main)

OS and JDK Version:

Running on Linux 5.15.0-105-generic - OpenJDK 64-Bit Server VM 17.0.13

Expected behavior

We want to execute the following query:

            UNWIND $batch as row
            MATCH (a:CHUNK) WHERE ID(a) = row.source_id
            MATCH (b) WHERE ID(b) = row.target_id
            MERGE (a)-[r:TEST_TEST_TEST_TEST]->(b)
            RETURN a, b, r

The $batch parameter is:

[
    {
        "source_id": "#4:0",
        "target_id": "#217:0",
        "features": {}
    },
    {
        "source_id": "#4:0",
        "target_id": "#52:0",
        "features": {}
    }
]

Note that source_id is the same in both entries.
Note also that the only check is that the a node (source_id) is a CHUNK node. In other words, since the source_ids are identical, if the check passes for one entry it should pass for both.
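To make the expectation concrete, here is a toy in-memory model of what the query should do, written in Python. It is only a sketch of the intended UNWIND / MATCH / MERGE semantics, not ArcadeDB code. The `run_batch` helper and the `NODES` table are hypothetical names introduced for illustration, and the node types are the ones visible in the results posted in this issue.

```python
# Toy model of the expected semantics. NOT ArcadeDB code.
# Node types are copied from the "@type" fields shown in this issue.
NODES = {
    "#4:0": "CHUNK",
    "#217:0": "PIPELINE_CONFIG",
    "#52:0": "IMAGE",
}

BATCH = [
    {"source_id": "#4:0", "target_id": "#217:0", "features": {}},
    {"source_id": "#4:0", "target_id": "#52:0", "features": {}},
]


def run_batch(batch, nodes, source_type="CHUNK", edge_type="TEST_TEST_TEST_TEST"):
    """Return one result row per batch entry whose two MATCH clauses succeed."""
    edges = set()  # MERGE semantics: a given edge is created at most once
    results = []
    for row in batch:  # UNWIND $batch AS row
        a, b = row["source_id"], row["target_id"]
        if nodes.get(a) != source_type:  # MATCH (a:CHUNK) WHERE ID(a) = row.source_id
            continue
        if b not in nodes:  # MATCH (b) WHERE ID(b) = row.target_id
            continue
        edge = (a, edge_type, b)
        edges.add(edge)  # MERGE (a)-[r:TEST_TEST_TEST_TEST]->(b)
        results.append({"a": a, "b": b, "r": edge})  # RETURN a, b, r
    return results


rows = run_batch(BATCH, NODES)
print(len(rows))  # prints 2: the type check on "#4:0" passes for both entries
```

Because the type check depends only on the source node, and both entries share the same source node, the model yields two rows, which is the behavior we expect from the real query.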

When executing the query, the ArcadeDB REST API returns an array of length 1:

[
    {
        "a": {
            "@rid": "#4:0",
            "@type": "CHUNK",
            "subtype": "CHUNK",
            "name": "document OCRized image, chunk 1",
            "text": "TRUNCATED TEXT",
            "index": 0
        },
        "b": {
            "@rid": "#217:0",
            "@type": "PIPELINE_CONFIG",
            "pipelines": [
                "data_ingestion_pipeline",
                "ocr"
            ]
        },
        "r": {
            "@in": "#217:0",
            "@out": "#4:0",
            "@rid": "#297:0",
            "@type": "TEST_TEST_TEST_TEST"
        }
    }
]

This is quite unexpected.

When the CHUNK check is removed from the query, i.e. MATCH (a:CHUNK) WHERE ID(a) = row.source_id becomes MATCH (a) WHERE ID(a) = row.source_id, there is no issue: the returned array has length 2 and both relationships are created correctly.

New query:

            UNWIND $batch as row
            MATCH (a) WHERE ID(a) = row.source_id
            MATCH (b) WHERE ID(b) = row.target_id
            MERGE (a)-[r:TEST_TEST_TEST_TEST_2222]->(b)
            RETURN a, b, r

New result, with the expected size:

[
    {
        "a": {
            "@rid": "#4:0",
            "@type": "CHUNK",
            "subtype": "CHUNK",
            "name": "document OCRized image, chunk 1",
            "text": "ANOTHER TRUNCATED",
            "index": 0
        },
        "b": {
            "@rid": "#217:0",
            "@type": "PIPELINE_CONFIG",
            "pipelines": [
                "data_ingestion_pipeline",
                "ocr"
            ]
        },
        "r": {
            "@in": "#217:0",
            "@out": "#4:0",
            "@rid": "#305:0",
            "@type": "TEST_TEST_TEST_TEST_2222"
        }
    },
    {
        "a": {
            "@rid": "#4:0",
            "@type": "CHUNK",
            "subtype": "CHUNK",
            "name": "document OCRized image, chunk 1",
            "text": "TRUNCATED TEXT",
            "index": 0
        },
        "b": {
            "@rid": "#52:0",
            "@type": "IMAGE",
            "name": "manchots-17.webp",
            "file_path": "TRUNCATED PATH",
            "id_doc": "#28:0",
            "mime_type": "image/webp",
            "last_modified": "ven. f\\u00e9vr. 07 08:13:07 +00:00 2025",
            "llava_flag": true,
            "clip_flag": true,
            "ocr_flag": false,
            "text": "TRUNCATED TEXT"
        },
        "r": {
            "@in": "#52:0",
            "@out": "#4:0",
            "@rid": "#306:0",
            "@type": "TEST_TEST_TEST_TEST_2222"
        }
    }
]
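For anyone reproducing this, a small Python helper like the following can show which batch entries were silently dropped from a response. It is a sketch only: `dropped_rows` is a hypothetical name, and the two response lists are trimmed copies of the results above that keep just the `@rid` fields.

```python
def dropped_rows(batch, results):
    """Return the batch entries that have no matching (a, b) row in the results."""
    returned = {(r["a"]["@rid"], r["b"]["@rid"]) for r in results}
    return [
        row for row in batch
        if (row["source_id"], row["target_id"]) not in returned
    ]


batch = [
    {"source_id": "#4:0", "target_id": "#217:0", "features": {}},
    {"source_id": "#4:0", "target_id": "#52:0", "features": {}},
]

# Trimmed copy of the response to the query with the CHUNK label (length 1).
with_label = [
    {"a": {"@rid": "#4:0"}, "b": {"@rid": "#217:0"}},
]

# Trimmed copy of the response to the query without the label (length 2).
without_label = [
    {"a": {"@rid": "#4:0"}, "b": {"@rid": "#217:0"}},
    {"a": {"@rid": "#4:0"}, "b": {"@rid": "#52:0"}},
]

print(dropped_rows(batch, with_label))     # the entry targeting "#52:0"
print(dropped_rows(batch, without_label))  # prints []
```

With the data from this issue, the helper reports the second entry (target "#52:0") as missing for the labelled query and nothing missing for the unlabelled one.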

Here is a database backup taken right before executing the queries. The chunk RIDs posted above already have some debug relations attached; you can ignore them.

POLAIRE_OCR2-backup-20250207-083229355.zip

Feb 07 '25 08:02 camomiy

Suspiciously similar to #1929

Feb 07 '25 08:02 ExtReMLapin