pgsync icon indicating copy to clipboard operation
pgsync copied to clipboard

search_phase_execution_exception is out of range for a long for postgrest varchar id

Open pigTom opened this issue 3 years ago • 1 comments

PGSync version: 2.3.2 Postgres version: PostgreSQL 13.2 (Debian 13.2-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit Elasticsearch version: 7.17.0 Redis version: Redis version=5.0.6, bits=64 Python version: python3.7 Problem Description: We have a table project(id, xxx) and member(id, user_id,project_id, role_id, deleted) and user(id, phone, email, deleted). and we have pgsync.yml bellow, and we found data synced to es successfully, but pgsync service restart somethings, the error message is search_phase_execution_exception is out of range for a long for postgrest varchar id.

                    {
                        "table": "project_member",
                        "schema": "public",
                        "columns": [
                            "user_id",
                            "role_id"
                        ],
                        "relationship": {
                            "variant": "object",
                            "type": "one_to_many",
                            "foreign_key": {
                                "child": [
                                    "project_id"
                                ],
                                "parent": [
                                    "id"
                                ]
                            }
                        },
                        "children":[
                            {
                                "table": "bees360_user",
                                "schema": "public",
                                "label": "name",
                                "columns": [
                                    "name"
                                ],
                                "relationship": {
                                    "variant": "scalar",
                                    "type": "one_to_one",
                                    "foreign_key": {
                                        "child": ["id"],
                                        "parent": ["user_id"]
                                    }
                                }
                
                            }
                        ]
                    },

Error Message (if any):

:ERROR:pgsync.elastichelper: Exception RequestError(400, 'search_phase_execution_exception', 'failed to create query: Value [[31 31 35 39 38 32 34 37 39 34 34 34 35 30 32 35 35 37 35 30 36]] is out of range for a long')

[
  "Traceback (most recent call last):\n  File \"/usr/local/lib/python3.7/site-packages/pgsync/elastichelper.py\", line 142, in bulk\n    raise_on_error=raise_on_error,\n  File \"/usr/local/lib/python3.7/site-packages/pgsync/elastichelper.py\", line 197, in _bulk\n    ignore_status=ignore_status,\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py\", line 484, in parallel_bulk\n    actions, chunk_size, max_chunk_bytes, client.transport.serializer\n  File \"/usr/local/lib/python3.7/multiprocessing/pool.py\", line 748, in next\n    raise value\n  File \"/usr/local/lib/python3.7/multiprocessing/pool.py\", line 121, in worker\n    result = (True, func(*args, **kwds))\n  File \"/usr/local/lib/python3.7/multiprocessing/pool.py\", line 140, in _helper_reraises_exception\n    raise ex\n  File \"/usr/local/lib/python3.7/multiprocessing/pool.py\", line 292, in _guarded_task_generation\n    for i, x in enumerate(iterable):\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py\", line 155, in _chunk_actions\n    for action, data in actions:\n  File \"/usr/local/lib/python3.7/site-packages/pgsync/sync.py\", line 836, in _payloads\n    extra,\n  File \"/usr/local/lib/python3.7/site-packages/pgsync/sync.py\", line 643, in _update_op\n    fields,\n  File \"/usr/local/lib/python3.7/site-packages/pgsync/elastichelper.py\", line 232, in _search\n    for hit in search.scan():\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch_dsl/search.py\", line 731, in scan\n    for hit in scan(es, query=self.to_dict(), index=self._index, **self._params):\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py\", line 555, in scan\n    body=query, scroll=scroll, size=size, request_timeout=request_timeout, **kwargs\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/client/utils.py\", line 168, in _wrapped\n    return func(*args, params=params, headers=headers, **kwargs)\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/client/__init__.py\", line 1675, in search\n    body=body,\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py\", line 415, in perform_request\n    raise e\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/transport.py\", line 388, in perform_request\n    timeout=timeout,\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py\", line 277, in perform_request\n    self._raise_error(response.status, raw_data)\n  File \"/usr/local/lib/python3.7/site-packages/elasticsearch/connection/base.py\", line 331, in _raise_error\n    status_code, error_message, additional_info\nelasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'failed to create query: Value [[31 31 35 39 38 32 34 37 39 34 34 34 35 30 32 35 35 37 35 30 36]] is out of range for a long')\n"
]

pigTom avatar Oct 13 '22 09:10 pigTom

we found the user_id like '110492135982403357918' will cause the exception.

pigTom avatar Oct 13 '22 09:10 pigTom

what type is user_id? The largest BIGINT you can store in Postgres is 9223372036854775807

toluaina avatar Oct 18 '22 18:10 toluaina

hello, user_id is varchar(32) in postgres.

pigTom avatar Oct 24 '22 06:10 pigTom

what type is user_id? The largest BIGINT you can store in Postgres is 9223372036854775807

hello, I find the code snippet is unused and will cause my problem, I will appreciate if that you can reply.

                # also handle foreign_keys
                if node.parent:
                    fields = defaultdict(list)

                    try:
                        foreign_keys = self.query_builder.get_foreign_keys(
                            node.parent,
                            node,
                        )
                    except ForeignKeyError:
                        foreign_keys = self.query_builder._get_foreign_keys(
                            node.parent,
                            node,
                        )

                    foreign_values = [
                        payload.new.get(k) for k in foreign_keys[node.name]
                    ]

                    for key in [key.name for key in node.primary_keys]:
                        for value in foreign_values:
                            if value:
                                fields[key].append(value)
                    # TODO: we should combine this with the filter above
                    # so we only hit Elasticsearch once
                    for doc_id in self.es._search(
                        self.index,
                        node.parent.table,
                        fields,
                    ):
                        where: dict = {}
                        params = doc_id.split(PRIMARY_KEY_DELIMITER)
                        for i, key in enumerate(
                            self.tree.root.model.primary_keys
                        ):
                            where[key] = params[i]
                        _filters.append(where)

pigTom avatar Nov 01 '22 04:11 pigTom

closing as this was addressed in commit 3fe7ffeb

toluaina avatar Dec 12 '22 22:12 toluaina