Wrong behavior when marshalling invalid UTF-8 bytes with EscapeHTML=false
Background:
- When EscapeHTML=true, marshalling an invalid UTF-8 byte yields \ufffd, which is correct.
- When EscapeHTML=false, marshalling an invalid UTF-8 byte writes the original byte through unchanged, which differs from encoding/json.
Findings:
- After reading the source code, I found that the condition here (https://github.com/json-iterator/go/blob/v1.1.12/stream_str.go#L318) seems incorrect.
- When the fast path meets an invalid UTF-8 byte, it should break out of the loop and fall through to the slow path instead.
- Also, writeStringSlowPathWithHTMLEscaped and writeStringSlowPath would be better combined and kept in sync with https://go.dev/src/encoding/json/encode.go#L1029
Sample:
- https://go.dev/play/p/uIKmsZQG18-