YAML load errors when parsing JSON that initially contained a \u0000 character sequence

Open gus opened this issue 13 years ago • 2 comments

When the incoming JSON to be parsed contains the character sequence \u0000, YAML blows up. unescape converts \u0000 to \x00 just fine, but YAML then chokes on it, basically because it treats the position where \u0000 was as the end of the string.

The same issue does not arise if \x00 is already in the initial string passed to JSON.parse.

Fix/pull-request forthcoming.

gus, Mar 01 '11 22:03

Here is a quick monkey patch that you can use to get around this problem. By no means is this presented as a permanent fix.

All it does is delete any \u0000 escape sequences from the string before it gets converted to YAML.

Not entirely elegant but it does the job...

##
## This is a monkey patch that fixes a bug with how Crack/YAML parses the unicode control character \u0000.
## To get around the Invalid JSON string error, we simply remove these from the string
##
module Crack
  class JSON
    def self.parse(json)
      # [uU] matches either case; the original [u|U] also matched a literal "|"
      YAML.load(unescape(convert_json_to_yaml(json.gsub(/\\[uU]0000/, ""))))
    rescue ArgumentError
      raise ParseError, "Invalid JSON string"
    end
  end
end
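
For anyone who wants to check what the stripping step does on its own, here is a minimal sketch of just the gsub, outside of Crack. The helper name strip_null_escapes and the constant NULL_ESCAPE are made up for illustration and are not part of Crack; the sample input is also invented.

```ruby
# Hypothetical helper, not part of Crack: removes the literal six-character
# escape sequence \u0000 (or \U0000) from raw JSON text before it is parsed.
NULL_ESCAPE = /\\[uU]0000/

def strip_null_escapes(json)
  json.gsub(NULL_ESCAPE, "")
end

# Single quotes keep the backslash literal, so this string really contains
# the characters \ u 0 0 0 0 rather than an actual NUL byte.
raw = '{"name":"foo\u0000bar"}'

puts strip_null_escapes(raw)
# prints {"name":"foobar"}
```

Note that this is a blunt text-level substitution: it also fires when the backslash is itself escaped in the JSON (a literal backslash followed by u0000), so it should be treated as a stopgap like the patch above, not a general fix.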

ianyamey, Aug 02 '11 21:08

You can see my pull request (as yet unmerged) that fixes this here. If you're using Crack through HTTParty, you can get around this by using a JSON parser other than Crack. See discussion here.

ericgj, Aug 02 '11 22:08