kotaemon
[BUG] - <title>Error: 'gbk' codec can't encode character
Description
The MS GraphRAG index is very hard to use. I uploaded several PDFs and none of them could be parsed; each one fails with the error below:
Error: 'gbk' codec can't encode character '\xa9' in position 238: illegal multibyte sequence
Reproduction steps
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
Screenshots

Logs
No response
Browsers
Chrome
OS
Windows
Additional information
No response
Same error here. This is the file I'm trying to upload: car-kn1.md
I believe my document is encoded as UTF-8, so why is gbk being used?
I deployed without Docker on my Windows 11 PC.
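A possible explanation, not confirmed against the kotaemon or GraphRAG code: the error says "encode", so it is likely raised when text is written out, not when the uploaded file is read. When Python opens a file in text mode without an explicit `encoding=`, it falls back to the locale's preferred encoding, and on a Simplified Chinese Windows install that is normally cp936, which Python treats as an alias of gbk. The character '\xa9' (the copyright sign) has no gbk representation. The sketch below only checks these facts on your own machine; the helper name `can_encode` is made up for illustration.

```python
import codecs
import locale
import sys

# Encoding that open() uses when no encoding= argument is given.
default_encoding = locale.getpreferredencoding(False)
print("locale default encoding:", default_encoding)
print("UTF-8 mode enabled:", bool(sys.flags.utf8_mode))

# Python resolves the Windows code page name cp936 to its gbk codec.
cp936_codec_name = codecs.lookup("cp936").name
print("cp936 resolves to:", cp936_codec_name)


def can_encode(text, encoding):
    """Hypothetical helper: True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# '\xa9' is the character named in the error message.
print("gbk can encode it:", can_encode("\xa9", "gbk"))
print("utf-8 can encode it:", can_encode("\xa9", "utf-8"))
```

If the first line prints cp936 or gbk on your PC, the input file being UTF-8 would not matter, because the failure would come from the output side.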
Same question. How can we change the encoding that is used?
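Two general Python options, neither verified as a fix for kotaemon itself. First, setting the environment variable `PYTHONUTF8=1` before starting the app turns on Python's UTF-8 mode (Python 3.7+), which makes UTF-8 the default for `open()`. Second, the code that writes the file can pass `encoding="utf-8"` explicitly. The sketch below illustrates the second option with a throwaway file; `write_text` and the file names are hypothetical and are not part of kotaemon or GraphRAG.

```python
import os
import tempfile

# Sample text containing the character from the error message.
text = "Copyright \xa9 2024"
workdir = tempfile.mkdtemp()


def write_text(path, content, encoding):
    """Hypothetical helper: write content, return the error message or None."""
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(content)
        return None
    except UnicodeEncodeError as exc:
        return str(exc)


# Simulates what happens when the locale default is gbk.
gbk_result = write_text(os.path.join(workdir, "gbk.txt"), text, "gbk")
print("gbk write:", gbk_result)

# An explicit UTF-8 encoding does not depend on the Windows locale.
utf8_path = os.path.join(workdir, "utf8.txt")
utf8_result = write_text(utf8_path, text, "utf-8")
print("utf-8 write:", utf8_result)

with open(utf8_path, encoding="utf-8") as f:
    round_trip = f.read()
```

The environment variable is the less invasive thing to try, since it needs no code changes. It has to be set in the same shell or session that launches the app.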