OpenMLDB icon indicating copy to clipboard operation
OpenMLDB copied to clipboard

After inserting data with precompiled SQL, when queried, certain fields have additional null characters ('\x00') appended at the end

Open yht520100 opened this issue 1 year ago • 2 comments

Bug Description Service Version: Code from the main branch after version 0.8.4, compiled and deployed on macOS 13.5.1. Client Version: The current project branch python-sdk, accessing the database using the dbapi method.

After inserting data with precompiled SQL, when queried, certain fields have additional null characters ('\x00') appended at the end, as shown in the fourth column: [(1001, '2022-05-01', 'province1', 'city1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 1, 1590738991000), (1002, '2022-05-02', 'province2', 'city2\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 2, 1590738992000), (1003, '2022-05-03', 'province3', 'city3\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 3, 1590738993000), (1004, '2022-05-04', 'province0', 'city4\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 4, 1590738994000)]

Expected Behavior The field contents should not have additional null characters appended and should match the inserted content.

Relation Case

Steps to Reproduce

  1. Use the python-sdk code from the main branch after version 0.8.4.
  2. Run the test cases below to reproduce the corresponding error.
import sys
import os
from openmldb.dbapi import connect
import pytest

from .case_conf import OpenMLDB_ZK_CLUSTER, OpenMLDB_ZK_PATH

class TestOpenmldbDBAPI:
    cursor_without_db = None
    cursor = None
    db_name = "dbapi_test"

    @classmethod
    def setup_class(cls):
        # connect without db to create it
        db = connect(zk=OpenMLDB_ZK_CLUSTER, zkPath=OpenMLDB_ZK_PATH)
        cls.cursor_without_db = db.cursor()
        cls.cursor_without_db.execute(
            f"create database if not exists {cls.db_name};")
        # then we can use the db to connect
        db = connect(database=cls.db_name,
                     zk=OpenMLDB_ZK_CLUSTER,
                     zkPath=OpenMLDB_ZK_PATH)
        cls.cursor = db.cursor()
        cls.cursor.execute("create database if not exists {};".format(cls.db_name))
        assert cls.db_name in cls.cursor.get_databases()

    def recreate_table(self, table, schema):
        if table in self.cursor.get_tables(self.db_name):
            self.cursor.execute("drop table {}".format(table))
        assert table not in self.cursor.get_tables(self.db_name)

        self.cursor.execute("create table {}({}) OPTIONS(partitionnum=1);".format(table, schema))
        assert table in self.cursor.get_tables(self.db_name)

    def test_more_parameterized_query(self):
        table = "test_param_query"
        schema_str = "col1 bigint, col2 date, col3 string, col4 string, col5 int, col6 timestamp, " \
                     "index(key=col3, ts=col1), index(key=col3, ts=col6)"
        self.recreate_table(table, schema_str)
        test_rows = [(1000 + i, '2022-05-0' + str(i), 'province' + str(i % 4),
                      'city' + str(i), i, (1590738990 + i) * 1000)
                     for i in range(1, 5)]
        self.cursor.executemany('insert into {} values (?, ?, ?, ?, ?, ?);'.format(table), test_rows)

        result = self.cursor.execute("select * from {};".format(table)).fetchall()
        result = sorted(list(result), key=lambda x: x[0])
        assert result == test_rows


if __name__ == "__main__":
    sys.exit(pytest.main(["-vv", os.path.abspath(__file__)]))

yht520100 avatar Feb 26 '24 07:02 yht520100

It does not work for parameter query like insert into {} values (?, ?, ?, ?, ?, ?). It may be the issue of this sql interface.

tobegit3hub avatar Feb 26 '24 07:02 tobegit3hub

Only python paramterized insert fails, cuz in python sdk level, when insert parameter is tuple: https://github.com/4paradigm/OpenMLDB/blob/bb059de2868478ec056f619042aa42108b9a7a96/python/openmldb_sdk/openmldb/dbapi/dbapi.py#L310-L312 So if you pass date str to it, it'll be calculated, be treated as a string column. The test is one date col and two string, so the str_size = len(date_str) + real_str_size, add more 10 char.

The right way ref dict: https://github.com/4paradigm/OpenMLDB/blob/bb059de2868478ec056f619042aa42108b9a7a96/python/openmldb_sdk/openmldb/dbapi/dbapi.py#L316-L323

I think we should handle None in tuple.

And no paramterized tests of date type before(just insert sql, it'll be build in router native level). https://github.com/4paradigm/OpenMLDB/blob/b1318e804555c0a4ff20f2741a107c41d0995bd8/python/openmldb_sdk/tests/openmldb_client_test.py#L110-L113 We should add more tests about parameteried insert or query.

vagetablechicken avatar Feb 27 '24 09:02 vagetablechicken