Can't decode dbobject fields from database with CL8MSWIN1251 character set
- What versions are you using?
  Oracle Database 19c (NLS_CHARACTERSET = CL8MSWIN1251)
  platform.platform: Windows-11-10.0.22631-SP0
  sys.maxsize > 2**32: True
  platform.python_version: 3.12.4
  oracledb.version: 2.3.0b1 (commit: c5c6b4f21443b599a458afff933bc4f54734d68f)
- Is it an error or a hang or a crash? It's an error.
- What error(s) or behavior you are seeing?
  File "src\oracledb\impl/thin/dbobject.pyx", line 489, in oracledb.thin_impl.ThinDbObjectImpl.get_attr_value
  File "src\oracledb\impl/thin/dbobject.pyx", line 192, in oracledb.thin_impl.ThinDbObjectImpl._ensure_unpacked
  File "src\oracledb\impl/thin/dbobject.pyx", line 308, in oracledb.thin_impl.ThinDbObjectImpl._unpack_data
  File "src\oracledb\impl/thin/dbobject.pyx", line 346, in oracledb.thin_impl.ThinDbObjectImpl._unpack_data_from_buf
  File "src\oracledb\impl/thin/dbobject.pyx", line 377, in oracledb.thin_impl.ThinDbObjectImpl._unpack_value
  File "src\oracledb\impl/base/buffer.pyx", line 746, in oracledb.base_impl.Buffer.read_str
  UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdf in position 0: unexpected end of data
- Does your application call init_oracle_client()? No, it uses Thin mode.
- Include a runnable Python script that shows the problem.
import oracledb

def main():
    oracledb.defaults.config_dir = ''
    with oracledb.connect(user='', password='', dsn='', mode=oracledb.AUTH_MODE_SYSDBA) as c:
        with c.cursor() as cur:
            sql = "select sys.tts_error_t('Я') from dual"
            for v in cur.execute(sql):
                print(v[0].VIOLATIONS)

if __name__ == "__main__":
    main()
xmltype has the same error:
import oracledb

def main():
    oracledb.defaults.config_dir = ""
    with oracledb.connect(user='', password='', dsn='', mode=oracledb.AUTH_MODE_SYSDBA) as c:
        with c.cursor() as cur:
            sql = "select xmltype('<a>Я</a>') from dual"
            for v in cur.execute(sql):
                print(v[0])

if __name__ == "__main__":
    main()
Thanks, can you share the packets containing the cursor execution for one of these two scenarios? That might be helpful. I can compare with the case when the database character set is AL32UTF8 and see what is going on. I'll try to get a database set up with that character set to see if I can replicate.
Thanks, that was helpful. I can see that in your output the string inside the object is encoded in windows-1251 (0xDF) while in my output the object is encoded in utf-8 (0xD0 0xAF). It looks like the conversion is not occurring in the server -- which suggests that this is a database bug. I'll ask internally and get back to you.
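For anyone following along, the byte difference described above can be reproduced with Python's built-in codecs alone. This is only a standalone sketch, not python-oracledb code and not what the driver does internally; the helper name `describe_encodings` is made up for illustration, and `cp1251` is assumed to be the Python codec matching CL8MSWIN1251.

```python
# Standalone illustration; it does not use python-oracledb or a database.
CHAR = "\u042f"  # CYRILLIC CAPITAL LETTER YA, the character used in the report


def describe_encodings(ch):
    """Return the cp1251 and UTF-8 byte sequences for a single character.

    Hypothetical helper, named here only for this example.
    """
    return ch.encode("cp1251"), ch.encode("utf-8")


cp1251_bytes, utf8_bytes = describe_encodings(CHAR)
print("cp1251:", cp1251_bytes.hex(), "utf-8:", utf8_bytes.hex())

# Decoding the single cp1251 byte as UTF-8 fails: 0xDF is a UTF-8 lead byte
# that must be followed by a continuation byte, and none is present.
try:
    cp1251_bytes.decode("utf-8")
    error_reason = None
except UnicodeDecodeError as exc:
    error_reason = exc.reason

print("decode error:", error_reason)

# Decoding with the assumed database character set recovers the text.
recovered = cp1251_bytes.decode("cp1251")
```

The single byte 0xDF is what appears in the reporter's packets, and the two-byte sequence 0xD0 0xAF is what an AL32UTF8 database sends, so the failure in `Buffer.read_str` is consistent with the object payload arriving unconverted.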
OK, thank you. I'll wait for the problem to be resolved, since we use databases with this character set together with object types.