TypeScript Don't escape valid Unicode characters in strings

TypeScript Version: 3.7.4

Code

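import { createSourceFile, createPrinter, transform, ScriptTarget, EmitHint } from 'typescript'

const sf = createSourceFile(
  'aaa',
  'const a: string = "哈哈"',
  ScriptTarget.Latest
)
// Transformers would go here; an empty list is enough to reproduce the issue.
const result = transform(sf, [])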
const printer = createPrinter()
const printed = printer.printNode(
  EmitHint.SourceFile,
  result.transformed[0],
  sf
)
console.log(printed)

Expected behavior: const a: string = "哈哈"

Actual behavior: const a: string = "\u54C8\u54C8";

I am trying to use the compiler API to do some transforms, but the printer does not seem able to emit the unescaped Unicode characters. How do I do this correctly?

Jan 14 '20 03:01 sonacy

I am looking at the API here.

const realPath = path.resolve(__dirname, './utf8.ts')
const program = createProgram([realPath], {
  target: ScriptTarget.ES2017,
  module: ModuleKind.ES2015,
  allowJs: true,
  jsx: JsxEmit.Preserve,
})
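// Assumption: sf is the source file taken from the program.
const sf = program.getSourceFile(realPath)!
// Uncommenting the next line produces the expected (unescaped) output.
// program.getTypeChecker()
const result = transform(sf, [])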
const printer = createPrinter()
const printed = printer.printNode(
  EmitHint.SourceFile,
  result.transformed[0],
  sf
)
console.log(printed)

Same here with the program API. The file content is just: const a: string = "哈哈". The printed result is: const a: string = "\u54C8\u54C8"; However, when I call program.getTypeChecker() first, I get the expected output: const a: string = "哈哈". Why does this happen?
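One possible explanation, offered as an assumption rather than something confirmed in this thread: the printer appears to reuse the original source text of a literal only when the node has its parent pointer set, and creating the type checker binds the file, which sets those pointers. If that holds, passing true for the setParentNodes parameter of createSourceFile may have a similar effect. A minimal sketch to try, assuming the typescript package is installed:

```typescript
import * as ts from "typescript";

// Same repro as above, but with parent pointers set at parse time.
const sf = ts.createSourceFile(
  "example.ts",
  'const a: string = "哈哈"',
  ts.ScriptTarget.Latest,
  /* setParentNodes */ true
);

const result = ts.transform(sf, []);
const printed = ts
  .createPrinter()
  .printNode(ts.EmitHint.SourceFile, result.transformed[0], sf);
result.dispose();

// Whether the literal comes out unescaped may depend on the TypeScript version.
console.log(printed);
```

Nodes created by a transformer are synthesized and have no source text to fall back on, so this would only help for literals that pass through the transform untouched.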

Jan 14 '20 03:01 sonacy

It's not that you're doing anything wrong - our implementation just escapes any characters outside of the printable range of ASCII characters. Nowadays we might be equipped to do a little better given that we have the set of valid Unicode identifier characters.

Is there a reason this emit is a problem for you?
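To make the behavior described above concrete, here is a rough illustration of that kind of escaping. It is not TypeScript's actual implementation; escapeNonAscii is a hypothetical name, and the sketch ignores control characters and only handles code units above 0x7F.

```typescript
// Hypothetical helper: replaces every non-ASCII UTF-16 code unit with a \uXXXX escape.
function escapeNonAscii(text: string): string {
  return text.replace(/[\u0080-\uFFFF]/g, (ch: string) => {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase();
    // Left-pad to four hex digits.
    return "\\u" + ("0000" + hex).slice(-4);
  });
}

console.log(escapeNonAscii('const a: string = "哈哈"'));
// prints: const a: string = "\u54C8\u54C8"
```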

Jan 14 '20 07:01 DanielRosenwasser

We use the transform API to process our source code. For example:

const a: string = '哈哈' => const a: string = i18n('哈哈')

That way we can search our codebase and replace every Chinese string with an i18n call. But if TypeScript escapes every character outside the printable ASCII range, our codebase will look weird.

Is there any solution that lets me keep my Chinese strings? Thanks.
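For reference, the kind of transform described above could be sketched roughly as follows. This is an assumption-laden illustration, not our real code: it assumes TypeScript 4.0 or later (for ts.factory), wrapInI18n is a hypothetical name, and i18n is just the wrapper function from the example.

```typescript
import * as ts from "typescript";

// Matches any character outside the ASCII range.
const NON_ASCII = /[^\x00-\x7F]/;

// Hypothetical transformer: wraps string literals containing non-ASCII text in i18n(...).
// Caveat: it does not skip literals in positions where a call is invalid
// (import specifiers, property names), so a real version needs more checks.
const wrapInI18n: ts.TransformerFactory<ts.SourceFile> = (context) => {
  const visit = (node: ts.Node): ts.Node => {
    if (ts.isStringLiteral(node) && NON_ASCII.test(node.text)) {
      return ts.factory.createCallExpression(
        ts.factory.createIdentifier("i18n"),
        undefined,
        [node]
      );
    }
    return ts.visitEachChild(node, visit, context);
  };
  return (sourceFile) => ts.visitEachChild(sourceFile, visit, context);
};

const sf = ts.createSourceFile(
  "example.ts",
  "const a: string = '哈哈'",
  ts.ScriptTarget.Latest
);
const result = ts.transform(sf, [wrapInI18n]);
const printed = ts
  .createPrinter()
  .printNode(ts.EmitHint.SourceFile, result.transformed[0], sf);
result.dispose();

// The call is emitted, but the literal inside it may still be printed escaped.
console.log(printed);
```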

Jan 14 '20 09:01 GilbertSun

I don't think we should escape these unless there's some hard necessity.

Jan 14 '20 20:01 RyanCavanaugh

No, it was strictly ease of implementation at the time. I'm marking this as Difficult because any contribution needs very thorough test code.

Jan 14 '20 22:01 DanielRosenwasser

Hitting the same issue. Our workaround:

    let content = printer.printFile(file);
    content = unescape(content.replace(/\\u/g, "%u"));
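Since unescape is deprecated and the replacement above also rewrites escaped backslashes such as "\\u54C8", a slightly more careful variant might look like the sketch below. unescapeUnicode is a hypothetical helper, not part of any API. It assumes only 4-digit \uXXXX escapes need handling and deliberately leaves ASCII escapes and the line separators U+2028/U+2029 alone.

```typescript
// Hypothetical helper: turns \uXXXX escapes in printed source text back into characters.
function unescapeUnicode(printed: string): string {
  // Group 1: the whole run of backslashes before "u". Group 2: four hex digits.
  return printed.replace(
    /(\\+)u([0-9a-fA-F]{4})/g,
    (match: string, slashes: string, hex: string) => {
      // An even-length run means the last backslash is itself escaped,
      // so this is not a Unicode escape.
      if (slashes.length % 2 === 0) return match;
      const code = parseInt(hex, 16);
      // Keep ASCII escapes (e.g. \u000A) and line separators escaped;
      // inlining them could break the string literal.
      if (code < 0x80 || code === 0x2028 || code === 0x2029) return match;
      return slashes.slice(0, -1) + String.fromCharCode(code);
    }
  );
}

console.log(unescapeUnicode('const a: string = "\\u54C8\\u54C8";'));
// prints: const a: string = "哈哈";
```

Like the original workaround, this operates on the whole printed text, so it also rewrites escapes that were written deliberately in the source, including ones inside comments or regular expression literals.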

Mar 11 '20 10:03 git9am

backlog since 2020

Feb 24 '24 15:02 Grawl

Backlog = PRs accepted, be the change you want to see in the world 😇

Feb 25 '24 06:02 RyanCavanaugh

I'm now using recast to work around this issue:

import ts from "typescript";
import { parse, print, types } from "recast";

const output = ts.transpileModule("`你好`", {});
console.log("typescript output:\n", output.outputText);

let ast = parse(output.outputText);

types.visit(ast, {
  visitLiteral(path) {
    const node = path.node;

    if (typeof node.value === "string") {
      path.replace(types.builders.stringLiteral(node.value));
    }

    this.traverse(path);
  },
});

console.log("recast output:\n", print(ast).code);

This outputs:

typescript output:
 "\u4F60\u597D";

recast output:
 "你好";

Jun 18 '24 04:06 KevinWang15
