
[Bug] The local export is garbled and the import fails.

Open taurusduan opened this issue 1 year ago • 23 comments

Describe the bug
Version 2.9.13. After the local data is exported, the file is garbled and can no longer be imported.

To Reproduce
Export the local data, then import the exported file.

Screenshots
image

Deployment

  • [x] Docker
  • [ ] Vercel
  • [ ] Server

Desktop

  • OS: Windows 11
  • Browser: Google Chrome

taurusduan avatar Dec 18 '23 08:12 taurusduan


It seems that only the client side has this issue.

Hub-moon avatar Dec 21 '23 15:12 Hub-moon

This issue is related to https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web/issues/3395

H0llyW00dzZ avatar Dec 21 '23 16:12 H0llyW00dzZ

+1, the macOS client has this problem too.

sheng-di avatar Jan 15 '24 03:01 sheng-di

+1, I hope this gets fixed soon; otherwise I can only use it on one device.

TCOTC avatar Feb 02 '24 01:02 TCOTC

this issue related to #3395

This is the same bug. The problem is the way the text is converted to a Uint8Array here. charCodeAt returns a UTF-16 code unit (0 to 65535), but a Uint8Array element can only hold 0 to 255, so each value is reduced to its low 8 bits. English (ASCII) characters convert correctly, while the code units of Chinese characters are truncated.

https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web/blob/2c12be62c4574f7914ff4075560b52298fbe64c7/app/utils.ts#L54

Example:

const text = "你好世界hellworld";
console.log(new Uint8Array([...text].map((c) => c.charCodeAt(0))));
// Uint8Array(13) [96, 125, 22, 76, 104, 101, 108, 108, 119, 111, 114, 108, 100]

console.log(new Uint32Array([...text].map((c) => c.charCodeAt(0))));
// Uint32Array(13) [20320, 22909, 19990, 30028, 104, 101, 108, 108, 119, 111, 114, 108, 100]

The correct approach is to use TextEncoder:

const str = "你好世界hellworld";
const encoder = new TextEncoder();
const utf8Array = encoder.encode(str);
console.log(utf8Array);
// Uint8Array(21) [228, 189, 160, 229, 165, 189, 228, 184, 150, 231, 149, 140, 104, 101, 108, 108, 119, 111, 114, 108, 100]

This bug should be cross-platform: it is triggered whenever user data (chat content, conversation titles, etc.) contains Chinese characters, or any character whose code unit falls outside 0 to 255. In addition, because the data itself was truncated, content exported before the fix cannot simply be recovered (i.e., that data is permanently lost).
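To make the "cannot be recovered" point concrete, here is a minimal sketch. The helper names (`truncatingEncode`, `utf8Encode`, `utf8Decode`) are made up for illustration and are not the project's code; `truncatingEncode` only imitates the behavior described above by walking UTF-16 code units. It shows that two different characters collapse to the same byte under truncation, while a TextEncoder/TextDecoder pair round-trips the text unchanged.

```typescript
// Hypothetical helpers for illustration only; not taken from the project.

// Lossy: a Uint8Array element keeps only the low 8 bits of each code unit.
function truncatingEncode(text: string): Uint8Array {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    bytes[i] = text.charCodeAt(i); // values above 255 wrap modulo 256
  }
  return bytes;
}

// Lossless: UTF-8 encoding and decoding.
function utf8Encode(text: string): Uint8Array {
  return new TextEncoder().encode(text);
}

function utf8Decode(bytes: Uint8Array): string {
  return new TextDecoder("utf-8").decode(bytes);
}

const sample = "你好世界hellworld";

// "你" is U+4F60; its low byte 0x60 is also the code of "`",
// so after truncation the two characters are indistinguishable.
console.log(truncatingEncode("你")[0] === truncatingEncode("`")[0]); // true

// UTF-8 keeps every byte, so decoding gives back the original text.
console.log(utf8Decode(utf8Encode(sample)) === sample); // true
```

Since many code units map to each truncated byte, there is no function that could reverse `truncatingEncode` for non-ASCII text, which is why files exported by the affected versions cannot be repaired after the fact.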

jerrylususu avatar Feb 04 '24 15:02 jerrylususu

omg you are not smart, the bug is not in that part

H0llyW00dzZ avatar Feb 04 '24 17:02 H0llyW00dzZ

It's because this format is text, not JSON: https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web/blob/4511aa4d21eda4e6e0b5130d1e3222bb30734672/app/utils.ts#L68C1-L71C7

If you look at my fork, there is no problem whatever you export.

H0llyW00dzZ avatar Feb 04 '24 17:02 H0llyW00dzZ

This is the correct one:

https://github.com/H0llyW00dzZ/ChatGPT-Next-Web/blob/main/app/utils.ts#L40

H0llyW00dzZ avatar Feb 04 '24 17:02 H0llyW00dzZ

Also, this issue is a duplicate; yiidaa explained how to fix this before.

H0llyW00dzZ avatar Feb 04 '24 18:02 H0llyW00dzZ

v2.10.3 still produces garbled text when exporting Chinese.

shenyan-008 avatar Feb 22 '24 02:02 shenyan-008

v2.10.3 still has this issue. Is anyone paying attention to it?

NightmareZero avatar Feb 22 '24 08:02 NightmareZero

When my mask is written in Chinese and I export it, the file cannot be opened correctly because of encoding errors, and therefore cannot be imported. The same problem occurs when exporting Chinese chat data. I hope the developers can pay attention to the large number of Chinese users!

MuGeeee avatar Feb 26 '24 06:02 MuGeeee

@Yidadaa I hope this problem can be fixed. Since v2.9.8 it has not been possible to export correct JSON (importing it fails). If needed, I can provide, through a non-public channel, the same local data exported from v2.9.7 and from v2.9.8.

TCOTC avatar Mar 13 '24 10:03 TCOTC

Fixed by #3972.

With v2.11.3, export works correctly for me now; everyone can give it a try.

TCOTC avatar Mar 14 '24 06:03 TCOTC

fix by #3972

With v2.11.3, export works correctly for me now; everyone can give it a try.

I have already moved on to using chatbox.

NightmareZero avatar Mar 18 '24 08:03 NightmareZero

I synced the latest code and made a fresh export from Chrome on Windows 11. Importing it into Chrome on iOS 17 still reports "Import failed". I checked the JSON and found nothing unusual.

The exported file is 3112 KB, which does not exceed the 5 MB localStorage limit.

After the failure, an error screen shows: quotaexceedederror: the quota has been exceeded. It is followed by a long stack trace.

I am fairly sure this is a problem with the iOS browser; desktop Firefox and Chrome both work fine.

jiangying000 avatar Apr 17 '24 05:04 jiangying000
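On the QuotaExceededError report above: the size of a file on disk (UTF-8 bytes) and the length of the same text as a JavaScript string (UTF-16 code units) are different measures, so comparing the file size against a storage limit may not tell the whole story. How iOS browsers count their quota is not established in this thread. The following is only a sketch, assuming a hypothetical helper `tryWrite` and a hypothetical `KeyValueStore` interface (neither comes from the project), that separates a quota failure from other errors and reports both measures:

```typescript
// Hypothetical sketch; the names below are illustrative, not project code.

// Minimal structural type so the sketch also runs outside a browser.
// window.localStorage satisfies it.
interface KeyValueStore {
  setItem(key: string, value: string): void;
}

type WriteResult =
  | { ok: true }
  | { ok: false; reason: "quota" | "other"; utf16Units: number; utf8Bytes: number };

function tryWrite(store: KeyValueStore, key: string, value: string): WriteResult {
  try {
    store.setItem(key, value);
    return { ok: true };
  } catch (err) {
    // Browsers signal a full store with an exception named "QuotaExceededError".
    const isQuota = err instanceof Error && err.name === "QuotaExceededError";
    return {
      ok: false,
      reason: isQuota ? "quota" : "other",
      utf16Units: value.length, // length of the in-memory string
      utf8Bytes: new TextEncoder().encode(value).length, // size as a UTF-8 file
    };
  }
}

// In a browser this could be called as:
//   tryWrite(window.localStorage, "some-key", jsonText)
```

Logging `utf16Units` and `utf8Bytes` next to the failure would at least show which size the importing device was asked to store, which the file size alone does not.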