OpenMLDB
OpenMLDB copied to clipboard
After inserting data with precompiled SQL, when queried, certain fields have additional null characters ('\x00') appended at the end
Bug Description Service Version: Code from the main branch after version 0.8.4, compiled and deployed on macOS 13.5.1. Client Version: The current project branch python-sdk, accessing the database using the dbapi method.
After inserting data with precompiled SQL, when queried, certain fields have additional null characters ('\x00') appended at the end, as shown in the fourth column:
[(1001, '2022-05-01', 'province1', 'city1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 1, 1590738991000), (1002, '2022-05-02', 'province2', 'city2\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 2, 1590738992000), (1003, '2022-05-03', 'province3', 'city3\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 3, 1590738993000), (1004, '2022-05-04', 'province0', 'city4\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 4, 1590738994000)]
Expected Behavior The field contents should not have additional null characters appended and should match the inserted content.
Relation Case
Steps to Reproduce
- Use the python-sdk code from the main branch after version 0.8.4.
- Run the test cases below to reproduce the corresponding error.
import sys
import os
from openmldb.dbapi import connect
import pytest
from .case_conf import OpenMLDB_ZK_CLUSTER, OpenMLDB_ZK_PATH
class TestOpenmldbDBAPI:
cursor_without_db = None
cursor = None
db_name = "dbapi_test"
@classmethod
def setup_class(cls):
# connect without db to create it
db = connect(zk=OpenMLDB_ZK_CLUSTER, zkPath=OpenMLDB_ZK_PATH)
cls.cursor_without_db = db.cursor()
cls.cursor_without_db.execute(
f"create database if not exists {cls.db_name};")
# then we can use the db to connect
db = connect(database=cls.db_name,
zk=OpenMLDB_ZK_CLUSTER,
zkPath=OpenMLDB_ZK_PATH)
cls.cursor = db.cursor()
cls.cursor.execute("create database if not exists {};".format(cls.db_name))
assert cls.db_name in cls.cursor.get_databases()
def recreate_table(self, table, schema):
if table in self.cursor.get_tables(self.db_name):
self.cursor.execute("drop table {}".format(table))
assert table not in self.cursor.get_tables(self.db_name)
self.cursor.execute("create table {}({}) OPTIONS(partitionnum=1);".format(table, schema))
assert table in self.cursor.get_tables(self.db_name)
def test_more_parameterized_query(self):
table = "test_param_query"
schema_str = "col1 bigint, col2 date, col3 string, col4 string, col5 int, col6 timestamp, " \
"index(key=col3, ts=col1), index(key=col3, ts=col6)"
self.recreate_table(table, schema_str)
test_rows = [(1000 + i, '2022-05-0' + str(i), 'province' + str(i % 4),
'city' + str(i), i, (1590738990 + i) * 1000)
for i in range(1, 5)]
self.cursor.executemany('insert into {} values (?, ?, ?, ?, ?, ?);'.format(table), test_rows)
result = self.cursor.execute("select * from {};".format(table)).fetchall()
result = sorted(list(result), key=lambda x: x[0])
assert result == test_rows
if __name__ == "__main__":
sys.exit(pytest.main(["-vv", os.path.abspath(__file__)]))
It does not work for parameter query like insert into {} values (?, ?, ?, ?, ?, ?). It may be the issue of this sql interface.
Only python paramterized insert fails, cuz in python sdk level, when insert parameter is tuple:
https://github.com/4paradigm/OpenMLDB/blob/bb059de2868478ec056f619042aa42108b9a7a96/python/openmldb_sdk/openmldb/dbapi/dbapi.py#L310-L312
So if you pass date str to it, it'll be calculated, be treated as a string column. The test is one date col and two string, so the str_size = len(date_str) + real_str_size, add more 10 char.
The right way ref dict: https://github.com/4paradigm/OpenMLDB/blob/bb059de2868478ec056f619042aa42108b9a7a96/python/openmldb_sdk/openmldb/dbapi/dbapi.py#L316-L323
I think we should handle None in tuple.
And no paramterized tests of date type before(just insert sql, it'll be build in router native level). https://github.com/4paradigm/OpenMLDB/blob/b1318e804555c0a4ff20f2741a107c41d0995bd8/python/openmldb_sdk/tests/openmldb_client_test.py#L110-L113 We should add more tests about parameteried insert or query.