JetBrains plugin misinterprets UTF-8 responses as system default encoding, causing garbled non-ASCII characters on non-UTF-8 systems

Open PMExtra opened this issue 5 months ago • 2 comments

Relevant environment info

I am an IT administrator and have received several reports of this issue from different environments.
I then reproduced it in my own environment:

- OS: Windows 10 19045.6093
- Continue version: 1.0.33
- IDE version: DataGrip 2023.2.3
- Model: deepseek-v3 or kimi-k2
- config: irrelevant

Description

When using the Continue JetBrains plugin on Windows with a non-UTF-8 system locale (e.g., Chinese Windows systems where the default encoding is GBK), non-ASCII characters in model responses (Chinese, accented Latin letters, and so on) appear garbled.

This issue affects any environment where the system’s default charset is not UTF-8, not just Chinese Windows systems.

Related issues:

  • #904
  • #1465

Verified Workarounds

I have confirmed that the problem can be resolved by either of the following:

  1. Changing the system locale to “UTF-8 (Beta)” in Windows settings.
  2. Adding the following JVM option to idea64.exe.vmoptions (or equivalent) to force UTF-8 for the IDE process:
-Dfile.encoding=UTF-8

Both workarounds ensure that the Continue plugin reads UTF-8 responses correctly, but the first method changes the entire system locale and may break some legacy applications.


(AI-Generated) Suspected Cause & Suggested Fix

Based on a review of the source code, the plugin reads from and writes to the core process streams without explicitly specifying UTF-8, so it falls back to Charset.defaultCharset().

For example, in
extensions/intellij/src/main/kotlin/com/github/continuedev/continueintellijextension/continue/process/ContinueProcessHandler.kt:

private val writer = OutputStreamWriter(process.output)
private val reader = BufferedReader(InputStreamReader(process.input))

On non-UTF-8 systems, Charset.defaultCharset() will not be UTF-8 (e.g., GBK on Chinese Windows), resulting in incorrect decoding of UTF-8 output from the core process.
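
The effect described above can be reproduced outside the plugin. The following is a minimal sketch, not code from Continue: the function names decodeUtf8BytesAs and legacyCharset are hypothetical, and the ISO-8859-1 fallback is an assumption for runtimes that do not ship GBK. It encodes a short Chinese string as UTF-8 and decodes the same bytes once as UTF-8 and once with a legacy charset.

```kotlin
import java.nio.charset.Charset
import java.nio.charset.StandardCharsets

// Decodes the UTF-8 bytes of `text` with `charset`, mimicking a reader that was
// created without an explicit charset on a system whose default is `charset`.
fun decodeUtf8BytesAs(text: String, charset: Charset): String =
    String(text.toByteArray(StandardCharsets.UTF_8), charset)

// GBK is present in standard JDK builds but is not guaranteed on every runtime,
// so this illustration falls back to ISO-8859-1, which is always available.
fun legacyCharset(): Charset =
    if (Charset.isSupported("GBK")) Charset.forName("GBK") else StandardCharsets.ISO_8859_1

fun main() {
    // "\u4F60\u597D" is the two-character Chinese greeting "ni hao", written
    // with escapes so the example does not depend on the source file encoding.
    val reply = "\u4F60\u597D"
    println("decoded as UTF-8:  " + decodeUtf8BytesAs(reply, StandardCharsets.UTF_8))
    println("decoded as legacy: " + decodeUtf8BytesAs(reply, legacyCharset()))
}
```

The first line prints the original text; the second prints garbled characters, because the same byte sequence maps to different characters in the legacy charset.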

Suggested change:

import java.nio.charset.StandardCharsets

private val writer = OutputStreamWriter(process.output, StandardCharsets.UTF_8)
private val reader = BufferedReader(InputStreamReader(process.input, StandardCharsets.UTF_8))

This explicitly encodes and decodes the process streams as UTF-8 regardless of the system locale, fixing the problem without affecting other applications.
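
As a self-contained check of the same idea, the sketch below uses in-memory streams as stand-ins for the core process streams. This is an assumption for illustration: roundTripUtf8 is a hypothetical helper, not part of the plugin. It writes one line through an OutputStreamWriter and reads it back through a BufferedReader, with UTF-8 given explicitly on both sides.

```kotlin
import java.io.BufferedReader
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStreamReader
import java.io.OutputStreamWriter
import java.nio.charset.StandardCharsets

// Writes a single line to an in-memory sink and reads it back.
// Both the writer and the reader name UTF-8 explicitly, so the result
// does not depend on Charset.defaultCharset().
fun roundTripUtf8(line: String): String {
    val sink = ByteArrayOutputStream()
    OutputStreamWriter(sink, StandardCharsets.UTF_8).use { writer ->
        writer.write(line)
        writer.write("\n")
    }
    val source = ByteArrayInputStream(sink.toByteArray())
    return BufferedReader(InputStreamReader(source, StandardCharsets.UTF_8)).use { reader ->
        reader.readLine() ?: ""
    }
}

fun main() {
    // Chinese characters plus an accented Latin letter, written with escapes.
    val message = "\u4F60\u597D caf\u00E9"
    println(roundTripUtf8(message) == message)
}
```

Because neither side relies on the platform default, the comparison holds whether or not the JVM was started with -Dfile.encoding=UTF-8.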


To reproduce

  1. Start a chat with the AI assistant and send a message in Chinese.
  2. Observe that the AI assistant’s reply contains garbled text instead of the expected Chinese characters.
Image

PMExtra, Aug 12 '25 09:08

I believe this is a bug. The problem didn't exist in version 1.0.16, but it occurs in the latest versions (1.0.30 and 1.0.33; I haven't verified other versions). Setting -Dfile.encoding=UTF-8 works.

Magic-Ping, Aug 12 '25 10:08

This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.

github-actions[bot], Nov 11 '25 02:11