cpython
ZipFile & ZipInfo functions raise IndexError for empty filenames
Bug report
ZipFile.writestr() raises IndexError for empty filenames unless one creates the ZipInfo manually:
>>> import zipfile
>>> zf = zipfile.ZipFile("x.zip", "w")
>>> zf.writestr("", "Hello, World!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/zipfile.py", line 1790, in writestr
    if zinfo.filename[-1] == '/':
IndexError: string index out of range
>>> zf.writestr(zipfile.ZipInfo(""), "Hello, World!")
>>> zf.namelist()
['']
One could argue that empty filenames should not be allowed at all (I'm sure they cause bugs elsewhere, if only by failing to extract), but in that case the restriction should be enforced explicitly and raise a more meaningful exception than IndexError.
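To illustrate what explicit enforcement could look like on the caller's side, here is a minimal sketch. The helper name checked_writestr and its error message are hypothetical and not part of zipfile; the sketch only wraps the existing ZipFile.writestr() call and rejects empty names up front.

```python
import io
import zipfile


def checked_writestr(zf, arcname, data):
    """Hypothetical wrapper: reject empty member names with a clear error."""
    # writestr() accepts either a name or a ZipInfo, so handle both.
    name = arcname.filename if isinstance(arcname, zipfile.ZipInfo) else arcname
    if not name:
        raise ValueError("empty archive member name")
    zf.writestr(arcname, data)


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    checked_writestr(zf, "hello.txt", "Hello, World!")
    try:
        checked_writestr(zf, "", "Hello, World!")
    except ValueError as exc:
        print("rejected:", exc)
    names = zf.namelist()
```

Whether ValueError is the right exception type is a design choice for the library; the point is only that the caller gets an error that names the actual problem.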
ZipInfo.__repr__() also raises IndexError for empty filenames, via ZipInfo.is_dir():
>>> zf.infolist()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/zipfile.py", line 399, in __repr__
    isdir = self.is_dir()
  File "/usr/lib/python3.10/zipfile.py", line 530, in is_dir
    return self.filename[-1] == '/'
IndexError: string index out of range
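The root cause is plain string behaviour rather than anything ZIP-specific: indexing an empty string with [-1] raises IndexError, while str.endswith() simply returns False. A small standalone sketch, where the two function names are made up for illustration and only mirror the expression shown in the traceback:

```python
def is_dir_by_index(filename):
    # Same expression as in the traceback above; fails on "".
    return filename[-1] == '/'


def is_dir_by_endswith(filename):
    # Works for any string, including the empty one.
    return filename.endswith('/')


results = {}
for name in ["some_dir/", "file.txt", ""]:
    try:
        results[name] = is_dir_by_index(name)
    except IndexError:
        results[name] = "IndexError"
    print(repr(name), results[name], is_dir_by_endswith(name))
```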
ZipFile.extract() should probably fail in this case, but again with a more meaningful exception rather than the IndexError leaking out of ZipInfo.is_dir():
>>> zf.extract("", "some_dir")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/zipfile.py", line 1628, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.10/zipfile.py", line 1693, in _extract_member
    if member.is_dir():
  File "/usr/lib/python3.10/zipfile.py", line 530, in is_dir
    return self.filename[-1] == '/'
IndexError: string index out of range
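Until the library handles this itself, code that extracts untrusted archives can filter out empty names before calling extract(). The following is only a sketch of that idea; extract_named_members is a hypothetical helper, and it assumes that silently skipping such members is acceptable for the caller.

```python
import io
import os
import tempfile
import zipfile


def extract_named_members(zf, target_dir):
    """Hypothetical helper: extract every member except those with an empty name."""
    extracted, skipped = [], []
    for info in zf.infolist():
        if not info.filename:
            # Nothing sensible can be created on disk for an empty name.
            skipped.append(info)
            continue
        extracted.append(zf.extract(info, target_dir))
    return extracted, skipped


# Build an in-memory archive like the one in this report, plus a normal member.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(zipfile.ZipInfo(""), "Hello, World!")
    zf.writestr("hello.txt", "Hello, World!")

with zipfile.ZipFile(buf) as zf, tempfile.TemporaryDirectory() as tmp:
    extracted, skipped = extract_named_members(zf, tmp)
    extracted_names = [os.path.basename(p) for p in extracted]
    skipped_count = len(skipped)
```

A stricter variant could raise instead of skipping; which one is appropriate depends on how much the caller trusts the archive.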
Your environment
- CPython versions tested on: 3.10.7
- Operating system and architecture: Debian amd64
FWIW, zipinfo & unzip -l don't raise any errors:
$ zipinfo x.zip
Archive: x.zip
Zip file size: 111 bytes, number of entries: 1
?rw------- 2.0 unx 13 b- stor 80-Jan-01 00:00
1 file, 13 bytes uncompressed, 13 bytes compressed: 0.0%
$ zipinfo -v x.zip
Archive: x.zip
There is no zipfile comment.
End-of-central-directory record:
-------------------------------
Zip archive file size: 111 (000000000000006Fh)
Actual end-cent-dir record offset: 89 (0000000000000059h)
Expected end-cent-dir record offset: 89 (0000000000000059h)
(based on the length of the central directory and its expected offset)
This zipfile constitutes the sole disk of a single-part archive; its
central directory contains 1 entry.
The central directory is 46 (000000000000002Eh) bytes long,
and its (expected) offset in bytes from the beginning of the zipfile
is 43 (000000000000002Bh).
Central directory entry #1:
---------------------------
offset of local header from start of archive: 0
(0000000000000000h) bytes
file system or operating system of origin: Unix
version of encoding software: 2.0
minimum file system compatibility required: MS-DOS, OS/2 or NT FAT
minimum software version required to extract: 2.0
compression method: none (stored)
file security status: not encrypted
extended local header: no
file last modified on (DOS date/time): 1980 Jan 1 00:00:00
32-bit CRC value (hex): ec4ac3d0
compressed size: 13 bytes
uncompressed size: 13 bytes
length of filename: 0 characters
length of extra field: 0 bytes
length of file comment: 0 characters
disk number on which file begins: disk 1
apparent file type: binary
Unix file attributes (000600 octal): ?rw-------
MS-DOS file attributes (00 hex): none
There is no file comment.
$ unzip -l x.zip
Archive: x.zip
Length Date Time Name
--------- ---------- ----- ----
13 1980-01-01 00:00
--------- -------
13 1 file
But of course extracting fails:
$ unzip x.zip
Archive: x.zip
mapname: conversion of failed
I wrote a simple patch:
--- a/zipfile.py 2022-10-01 06:31:04.000000000 +0200
+++ b/zipfile.py 2022-10-16 17:27:45.585018849 +0200
@@ -527,7 +527,7 @@
     def is_dir(self):
         """Return True if this archive member is a directory."""
-        return self.filename[-1] == '/'
+        return self.filename.endswith('/')
 # ZIP encryption uses the CRC32 one-byte primitive for scrambling some
@@ -1682,6 +1682,9 @@
             # filter illegal characters on Windows
             arcname = self._sanitize_windows_name(arcname, os.path.sep)
+        if not arcname:
+            raise ValueError("Empty filename.")
+
         targetpath = os.path.join(targetpath, arcname)
         targetpath = os.path.normpath(targetpath)
@@ -1787,7 +1790,7 @@
                             date_time=time.localtime(time.time())[:6])
             zinfo.compress_type = self.compression
             zinfo._compresslevel = self.compresslevel
-            if zinfo.filename[-1] == '/':
+            if zinfo.filename.endswith('/'):
                 zinfo.external_attr = 0o40775 << 16 # drwxrwxr-x
                 zinfo.external_attr |= 0x10 # MS-DOS directory flag
             else:
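For interpreters that do not carry such a patch, the is_dir() part can be approximated without touching the standard library by overriding the method in a subclass. This is a sketch only: SafeZipInfo is a hypothetical name, and it covers ZipInfo objects you construct yourself, not the ones ZipFile creates internally when reading an archive or when writestr() is given a plain string.

```python
import zipfile


class SafeZipInfo(zipfile.ZipInfo):
    """Hypothetical subclass whose is_dir() tolerates an empty filename."""

    __slots__ = ()

    def is_dir(self):
        # Same check as in the patch: endswith() is safe on "".
        return self.filename.endswith('/')


empty = SafeZipInfo("")
directory = SafeZipInfo("some_dir/")

# __repr__ calls is_dir(), so it no longer depends on filename[-1].
text = repr(empty)
print(text, empty.is_dir(), directory.is_dir())
```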
Thanks. I agree these are meaningless cases that shouldn't happen, but at least this prevents mysterious-looking exceptions when someone does give you such a zip archive.