swift-foundation
JSON issues with U+FEFF
Description
If we have a JSON file with the following 8 bytes, "\ufeff" (including the quotes), JSONSerialization decodes this as an empty string, rather than as String("\u{FEFF}"). (Note that this is valid JSON, not JSON data starting with a BOM (byte order mark). All 8 bytes are ASCII.)
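To make the "8 ASCII bytes, no BOM" point concrete, here is a minimal sketch that inspects the raw input using only the Swift standard library. The names `jsonText`, `bytes`, and `utf8BOM` are illustrative and not part of the original report.

```swift
// The JSON text from the report: a quote, the six-character escape \ufeff, a quote.
let jsonText = #""\ufeff""#
let bytes = Array(jsonText.utf8)

print(bytes.count)                      // 8
print(bytes.allSatisfy { $0 < 0x80 })   // true: every byte is ASCII

// A UTF-8 byte order mark would be the three bytes EF BB BF at the start of the data.
let utf8BOM: [UInt8] = [0xEF, 0xBB, 0xBF]
print(bytes.starts(with: utf8BOM))      // false: the input does not begin with a BOM
```

The escape sequence only becomes U+FEFF after the JSON parser unescapes it, so the character belongs to the string's content rather than to the encoding of the file.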
Steps to reproduce
import Foundation
let jsonString = #""\ufeff""#
let obj = try! JSONSerialization.jsonObject(
    with: jsonString.data(using: .utf8)!,
    options: [.allowFragments])
let string = obj as! String
print(string.count) // => 0, expected 1
Expected behavior
I would expect that for any String value, the "round-trip" of JSON encode + JSON decode should give you back an equal string, but here is an example where it does not:
import Foundation
let string1 = String("\u{FEFF}")
print(string1.count) // => 1
let jsonData = try! JSONEncoder().encode(string1)
let string2 = try! JSONDecoder().decode(String.self, from: jsonData)
print(string2.count) // => 0, expected string2 == string1
Note that here jsonData is the empty string encoded as JSON (""), so the character is already lost at the encoding step.
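The round-trip expectation can be written as a small reusable check. This is a sketch only: `roundTripsThroughJSON` is a hypothetical helper, not a Foundation API, and the result for "\u{FEFF}" depends on the Foundation version in use, so no expected output is given for it.

```swift
import Foundation

// Hypothetical helper: encodes a string as JSON, decodes it again,
// and reports whether the decoded value equals the original.
func roundTripsThroughJSON(_ value: String) throws -> Bool {
    let data = try JSONEncoder().encode(value)
    let decoded = try JSONDecoder().decode(String.self, from: data)
    return decoded == value
}

// Print the JSON bytes as well, so it is visible whether the character
// was already dropped by the encoder or only by the decoder.
for candidate in ["hello", "\u{FEFF}"] {
    do {
        let data = try JSONEncoder().encode(candidate)
        let ok = try roundTripsThroughJSON(candidate)
        print(Array(data), ok)
    } catch {
        print("round trip failed with error: \(error)")
    }
}
```

On an affected Foundation, the second line would be expected to show only the two quote bytes and `false`.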
As someone pointed out in this forum post, these problems arise from different behaviors of NSString and String:
import Foundation
print("\u{FEFF}".count) // => 1
print(("\u{FEFF}" as NSString).length) // => 0
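For comparison, the native Swift String side can be inspected without Foundation, so no NSString bridging is involved. This sketch only shows what String itself stores. It does not reproduce the NSString result above.

```swift
let s = "\u{FEFF}"

print(s.count)                  // 1 (one Character)
print(s.unicodeScalars.count)   // 1 (the scalar U+FEFF)
print(s.utf16.count)            // 1 (one UTF-16 code unit)
print(Array(s.utf8))            // [239, 187, 191], i.e. EF BB BF
```

All four views agree that the string holds exactly one scalar, which is consistent with the report's claim that the character is lost on the NSString side rather than in String.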
Environment
$ swiftc -version
swift-driver version: 1.75.2 Apple Swift version 5.8.1 (swiftlang-5.8.0.124.5 clang-1403.0.22.11.100)
Target: arm64-apple-macosx13.0