Loading UTF-8 file throws Encoding::CompatibilityError
Copied from Stack Overflow: http://stackoverflow.com/questions/27673655/json-load-throws-encodingcompatibilityerror
The file being loaded is UTF-8:
[ec2-user@ip-XXX-XXX-XXX-XXX vfs]$ file data/E03124/data.json
data/E03124/data.json: UTF-8 Unicode text, with very long lines, with no line terminators
Error message
Caught Encoding::CompatibilityError at '"{\"資産の部\":{': incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string)
Backtrace
json (1.8.1) lib/json/pure/parser.rb:242:in `rescue in parse_string'
json (1.8.1) lib/json/pure/parser.rb:213:in `parse_string'
json (1.8.1) lib/json/pure/parser.rb:257:in `parse_value'
json (1.8.1) lib/json/pure/parser.rb:121:in `parse'
json (1.8.1) lib/json/common.rb:155:in `parse'
json (1.8.1) lib/json/common.rb:334:in `load'
app/controllers/statements_controller.rb:13:in `block in getData'
app/controllers/statements_controller.rb:12:in `open'
app/controllers/statements_controller.rb:12:in `getData'
Rails code
def getData
  json_data = open("data/#{params[:code]}/data.json") do |io|
    JSON.load(io)
  end
  render :json => json_data
end
Ruby version is 2.0.0.
Rails version is 4.1.4.
According to AJcodez, the problem is in the json/pure parser.
The regex that matches a string uses the n option, which means the pattern is in ASCII-8BIT encoding. From the Ruby Regexp docs:
A regexp can be matched against a string when they either share an encoding, or the regexp’s encoding is US-ASCII and the string’s encoding is ASCII-compatible.
If a match between incompatible encodings is attempted an Encoding::CompatibilityError exception is raised.
json/pure/parser.rb, line 215:
    string = self[1].gsub(%r((?:\\[\\bfnrt"/]|(?:\\u(?:[A-Fa-f\d]{4}))+|\\[\x20-\xff]))n) do |c|
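The encoding rule quoted above can be reproduced outside the parser. The following is only a minimal sketch: `BINARY_PATTERN` is a cut-down version of the last alternative in the parser regex, and `try_match` is a hypothetical helper introduced here for illustration. It is not the json/pure code path itself.

```ruby
# A pattern compiled with the n option that contains a byte escape >= 0x80
# ends up with a fixed ASCII-8BIT (binary) encoding.
BINARY_PATTERN = /\\[\x20-\xff]/n

# Hypothetical helper: returns :ok if Ruby accepts the match attempt,
# or :incompatible if it refuses because of the encodings involved.
def try_match(pattern, string)
  pattern =~ string
  :ok
rescue Encoding::CompatibilityError
  :incompatible
end

ascii_only = '{"key":"value"}' # UTF-8 string, 7-bit characters only
non_ascii  = '{"資産の部":{}}'  # UTF-8 string with multibyte characters

puts try_match(BINARY_PATTERN, ascii_only)  # ok
puts try_match(BINARY_PATTERN, non_ascii)   # incompatible
puts try_match(BINARY_PATTERN, non_ascii.b) # ok (binary copy of the string)
```

A UTF-8 string only triggers the error once it contains non-ASCII characters, which would explain why the failure shows up on this data file but not necessarily on ASCII-only JSON.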
As a workaround, try using MultiJson instead.
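A rough sketch of what the MultiJson suggestion might look like follows. The helper name `load_json_file` is hypothetical, the multi_json gem is assumed to be in the Gemfile, and the fallback to `JSON.parse` is only there so the snippet still runs without the gem. Whether this avoids the error in the environment above has not been verified and depends on which adapter MultiJson picks.

```ruby
require "json"
require "tempfile"

begin
  require "multi_json" # optional gem, assumed to be listed in the Gemfile
rescue LoadError
  # MultiJson is not installed; the helper below falls back to JSON.parse.
end

# Hypothetical helper: read the file as UTF-8 text, then parse it.
def load_json_file(path)
  text = File.read(path, encoding: "UTF-8")
  defined?(MultiJson) ? MultiJson.load(text) : JSON.parse(text)
end

# Usage with a throwaway file that contains a non-ASCII key.
Tempfile.open(["data", ".json"]) do |file|
  file.write('{"資産の部":{"total":100}}')
  file.flush
  p load_json_file(file.path)
end
```

In the controller this would replace the `open ... JSON.load(io)` block, for example `render :json => load_json_file("data/#{params[:code]}/data.json")`.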