pydantic-to-typescript
UnicodeDecodeError for Pydantic models with Chinese characters
Description of problem
Given a Pydantic model with an attribute named in Chinese characters:
from pydantic import BaseModel

class TrialRecord(BaseModel):
    學號: str
Or one with a Chinese alias:
from pydantic import BaseModel, Field

class TrialRecord(BaseModel):
    student_id: str = Field(alias="學號")
pydantic2ts fails while cleaning the output file, like so:
PS ...> pydantic2ts --module backend\src\models\trial_record.py --output backend\output.ts
2025-06-08 20:49:10,988 Finding pydantic models...
2025-06-08 20:49:11,032 Generating JSON schema from pydantic models...
2025-06-08 20:49:11,039 Converting JSON schema to typescript definitions...
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "...\.venv\Scripts\pydantic2ts.exe\__main__.py", line 7, in <module>
    sys.exit(main())
             ~~~~^^
  File "...\.venv\Lib\site-packages\pydantic2ts\cli\script.py", line 404, in main
    return generate_typescript_defs(
        args.module,
        ...<2 lines>...
        args.json2ts_cmd,
    )
  File "...\.venv\Lib\site-packages\pydantic2ts\cli\script.py", line 354, in generate_typescript_defs
    _clean_output_file(output)
    ~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "...\.venv\Lib\site-packages\pydantic2ts\cli\script.py", line 212, in _clean_output_file
    lines = f.readlines()
  File "...\AppData\Local\Programs\Python\Python313\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 165: character maps to <undefined>
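The same kind of failure can be reproduced without pydantic2ts. The snippet below is only an illustration: the sample string is an assumption, chosen because its UTF-8 encoding contains the byte 0x8f that appears in the traceback, and it is not taken from the real output file.

```python
# Illustrative sample only: "住" encodes to E4 BD 8F in UTF-8,
# and 0x8f has no mapping in Python's cp1252 codec.
text = "住址: string;"
data = text.encode("utf-8")

try:
    data.decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError as exc:
    cp1252_failed = True
    print(f"cp1252 could not decode the bytes: {exc.reason}")

# Decoding the same bytes as UTF-8 recovers the original text.
decoded = data.decode("utf-8")
print(decoded)
```

Only a handful of byte values are undefined in cp1252, so whether the error appears depends on which characters end up in the file.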
Suggested Solution
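Add encoding="utf-8" to the two with open() calls at lines 211 and 236 of /pydantic2ts/cli/script.py. Without an explicit encoding, open() falls back to the locale's default encoding, which is cp1252 on this Windows machine and cannot decode the non-ASCII bytes in the generated file.
This is the workaround I apply after installing the package via pip, but I'd like to see it become a permanent change. To that effect, I've also opened a pull request.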
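For reference, here is a minimal sketch of the pattern I mean. The clean_file helper and its blank-line filter are hypothetical stand-ins, not the actual body of _clean_output_file; the only point is that both the read and the write pass encoding="utf-8".

```python
import os
import tempfile


def clean_file(path: str) -> None:
    """Hypothetical read-modify-write helper; not the library's real code."""
    # Read with an explicit encoding instead of the locale default.
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()

    # Placeholder transformation: drop blank lines.
    kept = [line for line in lines if line.strip()]

    # Write back with the same explicit encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(kept)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.ts")
    with open(path, "w", encoding="utf-8") as f:
        f.write("export interface TrialRecord {\n\n  學號: string;\n}\n")

    clean_file(path)

    with open(path, encoding="utf-8") as f:
        result = f.read()

print(result)
```

With the encoding spelled out, the behaviour should no longer depend on the machine's locale settings.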