Encoding::UndefinedConversionError on non UTF-8 filename characters

Status: Open. md-work opened this issue 3 years ago • 0 comments

seven_zip_ruby 1.3.0 on openSUSE-15.4 (Linux)
https://download.opensuse.org/distribution/leap/15.4/live/openSUSE-Leap-15.4-KDE-Live-x86_64-Media.iso

sudo zypper in ruby2.5-devel gcc-c++
sudo gem install --conservative --no-doc seven_zip_ruby

The problem:

mkdir in
touch in/non_"$(echo -ne '\x80')"_utf8.txt

ruby -e 'require "seven_zip_ruby";
    File.open("z.7z", "wb") { |file| SevenZipRuby::Writer.add_directory(file, "in".force_encoding(Encoding::ASCII_8BIT)) }'

result:

Traceback (most recent call last):
        11: from -e:1:in `<main>'
        10: from -e:1:in `open'
         9: from -e:1:in `block in <main>'
         8: from /usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:187:in `add_directory'
         7: from /usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:118:in `open'
         6: from /usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:188:in `block in add_directory'
         5: from /usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:403:in `add_directory'
         4: from /usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:403:in `glob'
         3: from /usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:403:in `glob'
         2: from /usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:410:in `block in add_directory'
         1: from /usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:340:in `add_file'
/usr/lib64/ruby/gems/2.5.0/gems/seven_zip_ruby-1.3.0/lib/seven_zip_ruby/seven_zip_writer.rb:340:in `encode': "\x80" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)

Without Encoding::ASCII_8BIT, the call fails with "invalid byte sequence in UTF-8 (ArgumentError)" instead.
As far as I know, filename strings in Ruby should always be Encoding::ASCII_8BIT, because file names on Linux are arbitrary byte sequences and are not guaranteed to be valid UTF-8. So setting Encoding::ASCII_8BIT is definitely the right thing to do here.
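
Both errors can be reproduced without the gem. The following is a minimal sketch using only core String methods. It assumes the gem hits the same two failure modes internally; the variable names are made up for illustration and are not taken from seven_zip_writer.rb.

```ruby
# File name from the reproduction above, tagged as binary (ASCII-8BIT).
name = "non_\x80_utf8.txt".b

# Case 1: transcoding binary to UTF-8 fails for any byte >= 0x80,
# which matches the error reported at seven_zip_writer.rb:340.
encode_error =
  begin
    name.encode(Encoding::UTF_8)
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end

# Case 2: relabeling the same bytes as UTF-8 does not raise by itself,
# but the resulting string is not valid UTF-8.
relabeled = name.dup.force_encoding(Encoding::UTF_8)
valid = relabeled.valid_encoding?

# Operations that have to decode characters, such as a regexp match,
# then raise ArgumentError on the invalid string.
match_error =
  begin
    relabeled =~ /utf8/
    nil
  rescue ArgumentError => e
    e.message
  end

puts encode_error
puts valid
puts match_error
```

So neither encoding label makes the name representable as UTF-8; the byte 0x80 has to be passed through unchanged or replaced explicitly.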

md-work, Jul 15 '22 11:07