
Wrong behavior when marshalling invalid UTF-8 bytes with EscapeHTML=false

Open kz-sher opened this issue 2 years ago • 0 comments

Background:

  • When EscapeHTML=true, marshalling an invalid UTF-8 byte yields \ufffd, which is correct.
  • When EscapeHTML=false, marshalling an invalid UTF-8 byte leaves the original byte unchanged, which differs from encoding/json.
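
For reference, the encoding/json behavior described above can be checked with the standard library alone. This is a minimal sketch: marshalStd is a helper name made up for this example, and the input "a\xffb" is just one arbitrary string containing an invalid UTF-8 byte. It does not call json-iterator; it only shows the output that json-iterator would presumably be expected to match.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// marshalStd (hypothetical helper) encodes v with encoding/json,
// toggling HTML escaping via Encoder.SetEscapeHTML.
func marshalStd(v interface{}, escapeHTML bool) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(escapeHTML)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode appends a trailing newline; trim it for easier comparison.
	return strings.TrimSuffix(buf.String(), "\n"), nil
}

func main() {
	s := "a\xffb" // 0xff can never appear in valid UTF-8
	for _, escape := range []bool{true, false} {
		out, err := marshalStd(s, escape)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		// Both settings print "a\ufffdb" with encoding/json.
		fmt.Printf("EscapeHTML=%v: %s\n", escape, out)
	}
}
```

With encoding/json, the replacement of the invalid byte does not depend on the HTML-escaping setting, which is the behavior this issue asks json-iterator to follow.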

Findings:

  • After reading the source code, I found that the condition here (https://github.com/json-iterator/go/blob/v1.1.12/stream_str.go#L318) seems incorrect.
  • When the fast path encounters an invalid UTF-8 byte, it should break out of the loop and fall through to the slow path.
  • Also, writeStringSlowPathWithHTMLEscaped and writeStringSlowPath would be better combined and kept in sync with https://go.dev/src/encoding/json/encode.go#L1029
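
The fast-path/slow-path idea above could look roughly like the following. This is only an illustrative sketch, not json-iterator's actual code or a proposed patch: appendJSONString is a hypothetical function, it bails out of the fast path on any non-ASCII byte (which covers invalid UTF-8 bytes), and it omits HTML escaping and the U+2028/U+2029 handling that encoding/json also performs.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

const hexDigits = "0123456789abcdef"

// appendJSONString (hypothetical) appends s to dst as a JSON string literal.
func appendJSONString(dst []byte, s string) []byte {
	dst = append(dst, '"')

	// Fast path: scan plain ASCII that needs no escaping.
	// Any other byte, including an invalid UTF-8 byte, ends the fast path.
	i := 0
	for i < len(s) {
		c := s[i]
		if c >= utf8.RuneSelf || c < 0x20 || c == '"' || c == '\\' {
			break
		}
		i++
	}
	dst = append(dst, s[:i]...)

	// Slow path: handle escapes and decode multi-byte sequences.
	for i < len(s) {
		c := s[i]
		if c < utf8.RuneSelf {
			switch {
			case c == '"' || c == '\\':
				dst = append(dst, '\\', c)
			case c == '\n':
				dst = append(dst, '\\', 'n')
			case c == '\r':
				dst = append(dst, '\\', 'r')
			case c == '\t':
				dst = append(dst, '\\', 't')
			case c < 0x20:
				dst = append(dst, '\\', 'u', '0', '0', hexDigits[c>>4], hexDigits[c&0xf])
			default:
				dst = append(dst, c)
			}
			i++
			continue
		}
		r, size := utf8.DecodeRuneInString(s[i:])
		if r == utf8.RuneError && size == 1 {
			// Invalid UTF-8 byte: emit the escaped replacement character.
			dst = append(dst, `\ufffd`...)
			i++
			continue
		}
		// Valid multi-byte rune: copy it through unchanged.
		dst = append(dst, s[i:i+size]...)
		i += size
	}
	return append(dst, '"')
}

func main() {
	fmt.Println(string(appendJSONString(nil, "a\xffb")))
}
```

Using a single slow path for both EscapeHTML settings, with the HTML escapes gated by a flag, would keep the invalid-byte handling in one place.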

Sample:

  • https://go.dev/play/p/uIKmsZQG18-

kz-sher, Mar 09 '23 04:03