unknown encoding name - UTF-16:UTF-8 (ArgumentError)
I just upgraded ruby to 2.7.5 and am getting this error in one of my tests:
unknown encoding name - UTF-16:UTF-8 (ArgumentError)
The code that I am using looks like this:
CSV.parse(page.body, headers: true, encoding: 'UTF-16:UTF-8').size()
Settings:
- Old Ruby version: ruby 2.6.6p146
- New Ruby version: ruby 2.7.5p203 (2021-11-24 revision f69aeb8314)
- Rails 5.0.7.2
Could you show the version of old Ruby that works?
@kou Sure, the old Ruby version where it works fine is 2.6.6p146.
Thanks.
What is the encoding of page.body? UTF-16BE or UTF-16LE? Could you show a sample String that works with Ruby 2.6?
It looks like this:
"ID,Title,Description,Mod Date,Jurisdiction,Leg. Ref.,Sectors,Legislation Types,Legislation Status,English,Alternate Language,URL Link,Pub Date,Effective Date\n444,"Legislation4 in EN\nLegislation4 in FR","protect the facility, please\nprotect the facility, please",,CA,"LegEN4\nLegFR4",Mining,General,Published,no,no,"https://www.tes.org/en/ca/laws/stat/sc-19-c-33/latest/sc-19-c-33.html\nhttps://www.test.org/en/ca/laws/stat/sc-19-c-33/latest/sc-19-c-33.html",2017-08-04,2017-08-08\n"
Thanks, but could you attach the content as a file instead of pasting it here? Pasting loses the encoding information.
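One way to answer the encoding question and produce an attachable file is sketched below. The `body` string is a hypothetical stand-in for `page.body` (its UTF-16LE encoding is an assumption for illustration only), and the file name is made up.

```ruby
require 'tmpdir'

# Hypothetical stand-in for page.body; the real encoding is exactly what is being asked about.
body = "ID,Title\n444,Legislation4 in EN\n".encode(Encoding::UTF_16LE)

puts body.encoding          # the encoding Ruby has tagged the string with
puts body.valid_encoding?   # whether the bytes are valid for that encoding
p body.bytes.first(8)       # first raw bytes, useful for spotting a BOM

# Write the bytes unchanged (binary mode, no transcoding) so the file can be attached.
sample_path = File.join(Dir.tmpdir, 'page_body_sample.csv')
File.binwrite(sample_path, body)
```

Attaching the resulting file preserves the exact bytes, which a pasted string cannot do.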
I also have this issue, and it seems to be caused by these changes on the initialize method for CSV:
3.0.0/csv
Previously it used to be like this:
2.6.0/csv
Exactly. What is the benefit of calling @io.set_encoding there? Isn't that the purpose of tracking @encoding separately and doing @encoding = determine_encoding(encoding, internal_encoding) anyway after that?
This is the example code that no longer works after upgrading from 2.6.8 to 2.7.0 and then to 3.0.5:
CSV.parse(some_csv_string, headers: true, encoding: 'ISO-8859-1:UTF-8') do |row|
  # magic
end
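A possible workaround, not proposed by anyone in this thread, is to transcode the string yourself and call CSV.parse without the `external:internal` encoding option. The sample data below is hypothetical; it only stands in for `some_csv_string` and assumes the bytes really are ISO-8859-1.

```ruby
require 'csv'

# Hypothetical sample: raw ISO-8859-1 bytes (0xE9 is "é"), standing in for some_csv_string.
some_csv_string = "id,name\n1,caf\xE9\n".b

# Tag the bytes with their real encoding, then transcode to UTF-8 up front
# instead of passing encoding: 'ISO-8859-1:UTF-8' to CSV.parse.
utf8_csv = some_csv_string.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

rows = CSV.parse(utf8_csv, headers: true)
rows.each do |row|
  puts row['name']
end
```

This avoids the combined encoding string entirely, at the cost of holding a transcoded copy of the data in memory.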
Could you retry with the latest version? https://rubygems.org/gems/csv
Isn't this included with Ruby itself, though? Can I just override it in a Gemfile?
-- EDIT -- Alright, yeah that worked, thanks!
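For anyone confirming the same fix, a small check like the one below shows which csv version actually got loaded. It assumes Bundler manages the app's gems and that a line such as `gem 'csv'` was added to the Gemfile, followed by `bundle install`; the exact version that contains the fix is not stated in this thread.

```ruby
require 'csv'

# Assumed Gemfile entry (not runnable here): gem 'csv'
# CSV::VERSION reports the version of the csv library that was loaded,
# which should be the Gemfile version rather than the one shipped with Ruby.
puts "csv #{CSV::VERSION} on Ruby #{RUBY_VERSION}"
```

Run it through `bundle exec` (or in a Rails console) so the Gemfile version is the one being reported.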