cl-json
Surrogate pair
(yason:parse "\"\\uD840\\uDC0B\"")
;; => "𠀋"
(with-input-from-string (stream "\"\\uD840\\uDC0B\"")
  (cl-json:decode-json stream))
;; => "��"  (two separate surrogate characters, not one code point)
The bug is in decoder.lisp (around line 160):
((len rdx)
 (let ((code
        (let ((repr (make-string len)))
          (dotimes (i len)
            (setf (aref repr i) (read-char stream)))
          (handler-case (parse-integer repr :radix rdx)
            (parse-error ()
              (json-syntax-error stream esc-error-fmt
                                 (format nil "\\~C" c)
                                 repr))))))
   (restart-case
       (or (and (< code char-code-limit) (code-char code))
           (error 'no-char-for-code :code code))
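Each "\u" escape sequence is decoded on its own and turned into a separate character, so the high and low halves of a surrogate pair are returned as two characters instead of being combined into one code point. Surrogate pairs are simply not implemented in CL-JSON.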
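For reference, here is a minimal sketch of the arithmetic a decoder would need in order to combine the two halves. The function name combine-surrogates is hypothetical and is not part of CL-JSON or YASON; it works on plain integers, so it does not depend on how an implementation represents surrogate characters.

```lisp
;; Hypothetical helper, not part of CL-JSON or YASON.
(defun combine-surrogates (high low)
  "Combine a UTF-16 high and low surrogate (given as integers) into one code point."
  (unless (<= #xD800 high #xDBFF)
    (error "Not a high surrogate: ~X" high))
  (unless (<= #xDC00 low #xDFFF)
    (error "Not a low surrogate: ~X" low))
  ;; Each surrogate carries 10 payload bits; the result is offset by #x10000.
  (+ #x10000
     (ash (- high #xD800) 10)
     (- low #xDC00)))

(combine-surrogates #xD840 #xDC0B) ; => 131083, i.e. #x2000B
```

#x2000B is the code point of the character that YASON returns for the same input above.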
I've got a dirty hack:
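(let ((xxx (with-input-from-string (stream "\"\\uD83D\\uDE03\"")
             (cl-json:decode-json stream))))
  ;; XXX holds two characters: the high (c1) and low (c2) surrogate.
  ;; Take the low 10 bits of each and rebuild the real code point.
  (princ (code-char
          (let ((c1 (char-code (aref xxx 0)))
                (c2 (char-code (aref xxx 1))))
            (+ #x10000
               (ash (logand #x03FF c1) 10)
               (logand #x03FF c2))))))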
=>
😃
#\SMILING_FACE_WITH_OPEN_MOUTH
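The same hack can be generalized to a whole decoded string. The sketch below is a post-processing pass, not a fix to the decoder itself; merge-surrogate-pairs is a hypothetical name. It assumes an implementation such as SBCL, where surrogate code points are representable as characters and char-code-limit is above #x10FFFF.

```lisp
;; Hypothetical post-processing helper, not part of CL-JSON.
;; Assumes surrogate code points are valid characters (true on SBCL).
(defun merge-surrogate-pairs (string)
  "Return a copy of STRING with each high/low surrogate pair merged into one character."
  (with-output-to-string (out)
    (let ((i 0)
          (n (length string)))
      (loop while (< i n)
            do (let* ((c1 (char-code (char string i)))
                      ;; C2 is NIL when C1 is the last character.
                      (c2 (and (< (1+ i) n)
                               (char-code (char string (1+ i))))))
                 (cond ((and c2
                             (<= #xD800 c1 #xDBFF)
                             (<= #xDC00 c2 #xDFFF))
                        ;; Valid pair: emit one combined character, skip both halves.
                        (write-char (code-char (+ #x10000
                                                  (ash (logand #x03FF c1) 10)
                                                  (logand #x03FF c2)))
                                    out)
                        (incf i 2))
                       (t
                        ;; Ordinary character or unpaired surrogate: copy as-is.
                        (write-char (char string i) out)
                        (incf i))))))))
```

It could be applied to the result of cl-json:decode-json or json:decode-json-from-string before the string is printed. Unpaired surrogates are passed through unchanged.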
This also causes a rather nasty failure when handling output:
(json:decode-json-from-string "\"\\uD83D\\uDE02\\uD83D\\uDE02\"")
I suggest using YASON instead.