ZipFile & ZipInfo functions raise IndexError for empty filenames

Open obfusk opened this issue 3 years ago • 2 comments

Bug report

ZipFile.writestr() raises IndexError for empty filenames unless one creates the ZipInfo manually:

>>> import zipfile
>>> zf = zipfile.ZipFile("x.zip", "w")
>>> zf.writestr("", "Hello, World!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/zipfile.py", line 1790, in writestr
    if zinfo.filename[-1] == '/':
IndexError: string index out of range
>>> zf.writestr(zipfile.ZipInfo(""), "Hello, World!")
>>> zf.namelist()
['']

One could argue that empty filenames should not be allowed at all (I'm sure they will cause bugs somewhere, if only a failure to extract), but in that case the restriction should be enforced explicitly and result in a more meaningful exception than IndexError.
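
Until the library enforces this itself, a caller could reject empty names before they reach writestr(). The following is only a sketch: `writestr_checked` is a hypothetical helper, not part of zipfile, and the choice of ValueError is an assumption.

```python
import io
import zipfile


def writestr_checked(zf, arcname, data):
    """Hypothetical helper: reject empty member names before writestr()."""
    # Accept either a plain name or a ZipInfo, like ZipFile.writestr() does.
    name = arcname.filename if isinstance(arcname, zipfile.ZipInfo) else arcname
    if not name:
        # ValueError is an assumption; the report only asks for "a different exception".
        raise ValueError("empty member name")
    zf.writestr(arcname, data)


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    writestr_checked(zf, "hello.txt", "Hello, World!")
    try:
        writestr_checked(zf, "", "Hello, World!")
    except ValueError as exc:
        rejected = str(exc)
    names = zf.namelist()
```

The check covers both the string and the ZipInfo form, so the manually created `ZipInfo("")` from the transcript above would be rejected as well.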

And apparently, ZipInfo.__repr__() also raises IndexError for empty filenames via ZipInfo.is_dir():

>>> zf.infolist()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/zipfile.py", line 399, in __repr__
    isdir = self.is_dir()
  File "/usr/lib/python3.10/zipfile.py", line 530, in is_dir
    return self.filename[-1] == '/'
IndexError: string index out of range
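
Because the failure comes from `__repr__()`, members can still be inspected by reading their attributes directly instead of printing the ZipInfo objects. A minimal sketch, where `describe_members` is a hypothetical helper and the archive is built in memory purely for illustration:

```python
import io
import zipfile


def describe_members(zf):
    """Hypothetical helper: summarize members without calling repr() on ZipInfo."""
    return [(info.filename, info.file_size) for info in zf.infolist()]


# Build an archive containing one empty-named member, created via ZipInfo
# as in the report, plus one ordinary member.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(zipfile.ZipInfo(""), "Hello, World!")
    zf.writestr("hello.txt", "Hello, World!")

with zipfile.ZipFile(buf) as zf:
    members = describe_members(zf)
```

This avoids `is_dir()` entirely, so it should behave the same whether or not the installed zipfile has the IndexError problem.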

ZipFile.extract() probably should fail in this case, but I think it should do so with a dedicated exception rather than via the IndexError from ZipInfo.is_dir():

>>> zf.extract("", "some_dir")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/zipfile.py", line 1628, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.10/zipfile.py", line 1693, in _extract_member
    if member.is_dir():
  File "/usr/lib/python3.10/zipfile.py", line 530, in is_dir
    return self.filename[-1] == '/'
IndexError: string index out of range
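
A caller who would rather keep going than fail could filter out empty-named members before extraction. This is a sketch of one possible policy, not what the report proposes for the library; `extract_named_members` is a hypothetical helper.

```python
import io
import os
import tempfile
import zipfile


def extract_named_members(zf, path):
    """Hypothetical helper: extract all members except empty-named ones.

    Returns the number of members that were skipped.
    """
    named = [info for info in zf.infolist() if info.filename]
    zf.extractall(path, members=named)
    return len(zf.infolist()) - len(named)


# In-memory archive with one empty-named and one ordinary member.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(zipfile.ZipInfo(""), "Hello, World!")
    zf.writestr("hello.txt", "Hello, World!")

with tempfile.TemporaryDirectory() as tmp, zipfile.ZipFile(buf) as zf:
    skipped = extract_named_members(zf, tmp)
    extracted = sorted(os.listdir(tmp))
```

Returning the skipped count lets the caller decide whether a non-zero value should be logged or treated as an error.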

Your environment

  • CPython versions tested on: 3.10.7
  • Operating system and architecture: Debian amd64

obfusk • Oct 15 '22 16:10

FWIW, zipinfo & unzip -l don't raise any errors:

$ zipinfo x.zip
Archive:  x.zip
Zip file size: 111 bytes, number of entries: 1
?rw-------  2.0 unx       13 b- stor 80-Jan-01 00:00
1 file, 13 bytes uncompressed, 13 bytes compressed:  0.0%
$ zipinfo -v x.zip
Archive:  x.zip
There is no zipfile comment.

End-of-central-directory record:
-------------------------------

  Zip archive file size:                       111 (000000000000006Fh)
  Actual end-cent-dir record offset:            89 (0000000000000059h)
  Expected end-cent-dir record offset:          89 (0000000000000059h)
  (based on the length of the central directory and its expected offset)

  This zipfile constitutes the sole disk of a single-part archive; its
  central directory contains 1 entry.
  The central directory is 46 (000000000000002Eh) bytes long,
  and its (expected) offset in bytes from the beginning of the zipfile
  is 43 (000000000000002Bh).


Central directory entry #1:
---------------------------



  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   2.0
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   2.0
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          no
  file last modified on (DOS date/time):          1980 Jan 1 00:00:00
  32-bit CRC value (hex):                         ec4ac3d0
  compressed size:                                13 bytes
  uncompressed size:                              13 bytes
  length of filename:                             0 characters
  length of extra field:                          0 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (000600 octal):            ?rw-------
  MS-DOS file attributes (00 hex):                none

  There is no file comment.
$ unzip -l x.zip
Archive:  x.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
       13  1980-01-01 00:00
---------                     -------
       13                     1 file

But of course extracting fails:

$ unzip x.zip
Archive:  x.zip
mapname:  conversion of  failed

obfusk • Oct 15 '22 16:10

I wrote a simple patch:

--- a/zipfile.py	2022-10-01 06:31:04.000000000 +0200
+++ b/zipfile.py	2022-10-16 17:27:45.585018849 +0200
@@ -527,7 +527,7 @@
 
     def is_dir(self):
         """Return True if this archive member is a directory."""
-        return self.filename[-1] == '/'
+        return self.filename.endswith('/')
 
 
 # ZIP encryption uses the CRC32 one-byte primitive for scrambling some
@@ -1682,6 +1682,9 @@
             # filter illegal characters on Windows
             arcname = self._sanitize_windows_name(arcname, os.path.sep)
 
+        if not arcname:
+            raise ValueError("Empty filename.")
+
         targetpath = os.path.join(targetpath, arcname)
         targetpath = os.path.normpath(targetpath)
 
@@ -1787,7 +1790,7 @@
                             date_time=time.localtime(time.time())[:6])
             zinfo.compress_type = self.compression
             zinfo._compresslevel = self.compresslevel
-            if zinfo.filename[-1] == '/':
+            if zinfo.filename.endswith('/'):
                 zinfo.external_attr = 0o40775 << 16   # drwxrwxr-x
                 zinfo.external_attr |= 0x10           # MS-DOS directory flag
             else:
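
The difference between the two expressions in the patch can be seen on plain strings. The function names below are illustrative only and do not exist in zipfile:

```python
def is_dir_by_index(filename):
    # Expression before the patch: indexing an empty string raises IndexError.
    return filename[-1] == '/'


def is_dir_by_endswith(filename):
    # Expression after the patch: endswith() simply returns False for "".
    return filename.endswith('/')


results = {}
for name in ("docs/", "docs/readme.txt", ""):
    try:
        old = is_dir_by_index(name)
    except IndexError:
        old = "IndexError"
    results[name] = (old, is_dir_by_endswith(name))
```

For non-empty names the two checks agree; they differ only for the empty string, which is the case this issue is about.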

obfusk • Oct 16 '22 15:10

Thanks. I agree these are meaningless cases and shouldn't happen, but at least this prevents mysterious-looking exceptions when someone does give you such a zip archive.

gpshead • Oct 29 '22 05:10