Don't escape valid Unicode characters in strings
TypeScript Version: 3.7.4
Code
import { createPrinter, createSourceFile, EmitHint, ScriptTarget, transform } from 'typescript'

const sf = createSourceFile(
  'aaa',
  'const a: string = "哈哈"',
  ScriptTarget.Latest
)
// Apply transformers here; an empty list leaves the tree unchanged.
const result = transform(sf, [])
const printer = createPrinter()
const printed = printer.printNode(
  EmitHint.SourceFile,
  result.transformed[0],
  sf
)
console.log(printed)
Expected behavior: const a: string = "哈哈"
Actual behavior: const a: string = "\u54C8\u54C8";
I am trying to use the compiler API to run some transforms, but the printer does not seem able to emit the unescaped Unicode characters. How do I do this correctly?
I am looking at the API here.
const realPath = path.resolve(__dirname, './utf8.ts')
const program = createProgram([realPath], {
  target: ScriptTarget.ES2017,
  module: ModuleKind.ES2015,
  allowJs: true,
  jsx: JsxEmit.Preserve,
})
// Uncommenting the next line makes the printer emit the expected output.
// program.getTypeChecker()
const result = transform(sf, [])
const printer = createPrinter()
const printed = printer.printNode(
  EmitHint.SourceFile,
  result.transformed[0],
  sf
)
console.log(printed)
Same here when using the program API. The file content is simply: 'const a: string = "哈哈"', but the result is: const a: string = "\u54C8\u54C8"; However, when I call program.getTypeChecker() first, I get the expected output: const a: string = "哈哈". Why does this happen?
It's not that you're doing anything wrong - our implementation just escapes any characters outside of the printable range of ASCII characters. Nowadays we might be equipped to do a little better given that we have the set of valid Unicode identifier characters.
Is there a reason this emit is a problem for you?
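To make the described behavior concrete, here is a minimal sketch of "escape everything outside printable ASCII". It is an illustration only, not the compiler's actual code; `escapeNonAscii` is a hypothetical name, and the real emitter also handles quotes, newlines, and other cases that this sketch ignores.

```typescript
// Hypothetical helper, not part of the TypeScript API.
// Replaces every UTF-16 code unit outside the printable ASCII range
// (0x20 to 0x7E) with a \uXXXX escape sequence.
function escapeNonAscii(text: string): string {
  return text.replace(/[^\x20-\x7E]/g, (ch) => {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase();
    // Left-pad to four hex digits.
    return "\\u" + ("0000" + hex).slice(-4);
  });
}

console.log(escapeNonAscii("哈哈")); // prints \u54C8\u54C8
```

Because the regular expression has no `u` flag, it works on UTF-16 code units, so a character outside the Basic Multilingual Plane would come out as two escapes (a surrogate pair) in this sketch.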
We use the transform API to process our source code, for example:
const a: string = '哈哈'
=> const a: string = i18n('哈哈')
That way we can search our codebase and replace every Chinese string with an i18n call. But if TypeScript escapes every character outside the printable ASCII range, our codebase ends up looking weird.
Is there any way to keep my Chinese strings? Thanks.
I don't think we should escape these unless there's some hard necessity.
No, it was strictly ease of implementation at the time. I'm marking this as Difficult because any contribution needs very thorough test code.
Hitting same issue. Our workaround:
let content = printer.printFile(file);
content = unescape(content.replace(/\\u/g, "%u"));
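The workaround above relies on the deprecated global `unescape`, and it also rewrites sequences such as `\\u4F60`, where the backslash is itself escaped and the text is not a Unicode escape. A more conservative variant might look like the sketch below. `unescapeUnicode` is a hypothetical helper name, and the sketch assumes the printed output only contains four-digit `\uXXXX` escapes; it still operates on the whole printed text rather than on string literals only.

```typescript
// Hypothetical helper, not part of the TypeScript API.
// Decodes \uXXXX escapes for non-ASCII characters in printed source text.
function unescapeUnicode(printed: string): string {
  // Capture the full run of backslashes before "uXXXX" so that an escaped
  // backslash followed by "u" (an even-length run) is left untouched.
  return printed.replace(
    /(\\+)u([0-9a-fA-F]{4})/g,
    (match: string, slashes: string, hex: string) => {
      if (slashes.length % 2 === 0) {
        return match; // the last backslash is itself escaped, so no escape here
      }
      const code = parseInt(hex, 16);
      // Leave ASCII escapes (quotes, control characters) and the line
      // separators U+2028/U+2029 as they are, to keep the output valid.
      if (code < 0x80 || code === 0x2028 || code === 0x2029) {
        return match;
      }
      // Drop the one backslash that belonged to the escape, keep the rest.
      return slashes.slice(1) + String.fromCharCode(code);
    }
  );
}

console.log(unescapeUnicode('const a: string = "\\u54C8\\u54C8";'));
// prints: const a: string = "哈哈";
```

It would be applied the same way as the original workaround, for example `content = unescapeUnicode(printer.printFile(file))`. Surrogate halves are decoded one code unit at a time, so a valid pair ends up as the intended character in the resulting string.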
backlog since 2020
Backlog = PRs accepted, be the change you want to see in the world 😇
I'm now using recast to work around this issue:
import ts from "typescript";
import { parse, print, types } from "recast";
const output = ts.transpileModule("`你好`", {});
console.log("typescript output:\n", output.outputText);
let ast = parse(output.outputText);
types.visit(ast, {
visitLiteral(path) {
const node = path.node;
if (typeof node.value === "string") {
path.replace(types.builders.stringLiteral(node.value));
}
this.traverse(path);
},
});
console.log("recast output:\n", print(ast).code);
outputs
typescript output:
"\u4F60\u597D";
recast output:
"你好";