swift-foundation

JSON issues with U+FEFF

Open kts opened this issue 1 year ago • 0 comments

Description

If a JSON file consists of the following 8 bytes, "\ufeff" (including the quotes), JSONSerialization decodes it as an empty string rather than as String("\u{FEFF}").

(Note that this is valid JSON, not JSON data that starts with a BOM (byte order mark). All 8 bytes are ASCII.)
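To make the distinction concrete, here is a small sketch that inspects the bytes involved. It uses only the Swift standard library; the names jsonText, jsonBytes and bomBytes are illustrative and not part of the original report.

```swift
// The raw string literal holds the 8 characters "\ufeff" verbatim, quotes included.
let jsonText = #""\ufeff""#
let jsonBytes = Array(jsonText.utf8)

print(jsonBytes.count)                     // 8
print(jsonBytes.allSatisfy { $0 < 0x80 })  // true: every byte is ASCII

// For contrast, an actual UTF-8 byte order mark is the three non-ASCII bytes EF BB BF.
let bomBytes = Array("\u{FEFF}".utf8)
print(bomBytes.map { String($0, radix: 16, uppercase: true) })  // ["EF", "BB", "BF"]
```

So the input here is an escaped code point inside a JSON string, not a BOM prefix on the data.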

Steps to reproduce

import Foundation

let jsonString = #""\ufeff""#

let obj = try! JSONSerialization.jsonObject(
  with: jsonString.data(using: .utf8)!,
  options: [.allowFragments])

let string = obj as! String
print(string.count) // => 0, expected 1

Expected behavior

I would expect that, for any String value, a "round trip" of JSON encode followed by JSON decode gives back an equal string, but here is an example where it does not:

import Foundation

let string1 = String("\u{FEFF}")
print(string1.count) // => 1

let jsonData = try! JSONEncoder().encode(string1)

let string2 = try! JSONDecoder().decode(String.self, from: jsonData)
print(string2.count) // => 0, expected string2 == string1

Note that here jsonData is the empty string encoded as JSON ("").

As someone pointed out in this forum post, these problems arise from the different behaviors of NSString and String:

import Foundation
print("\u{FEFF}".count) // => 1
print(("\u{FEFF}" as NSString).length) // => 0
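For reference, the native String side of that comparison can be checked without Foundation. This is only a sketch of what the Swift standard library reports for U+FEFF (the variable name bom is illustrative); it does not exercise NSString or the JSON code paths.

```swift
// U+FEFF held in a native Swift String, inspected through each of its views.
let bom = "\u{FEFF}"

print(bom.isEmpty)               // false
print(bom.count)                 // 1 (one Character)
print(bom.unicodeScalars.count)  // 1 (one Unicode scalar)
print(bom.utf16.count)           // 1 (one UTF-16 code unit)
print(bom.utf8.count)            // 3 (bytes EF BB BF)
```

Every native view agrees that the string is non-empty, so a result of 0 would have to come from the bridged NSString side.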

Environment

$ swiftc -version
swift-driver version: 1.75.2 Apple Swift version 5.8.1 (swiftlang-5.8.0.124.5 clang-1403.0.22.11.100)
Target: arm64-apple-macosx13.0

kts, Oct 25 '23 03:10