[🐛 Bug]: Fail to download PDF or zip file from remote to client on Remote webdriver
What happened?
error:
D:\Python\Python311\python.exe D:/OfflineaCare/ndb/program/test/test_oooooooo.py
Traceback (most recent call last):
File "D:\OfflineaCare\ndb\program\test\test_oooooooo.py", line 51, in <module>
Process finished with exit code 1
How can we reproduce the issue?
The code below clicks the button, which then downloads the .docx file (or a zip or PDF).
code:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
options = webdriver.ChromeOptions()
options.enable_downloads = True
driver = webdriver.Remote(command_executor='http://192.168.3.35:4444/wd/hub', options=options)
driver.maximize_window()
driver.implicitly_wait(5)
driver.get("http://127.0.0.1:8000/login_page")
driver.find_element(By.XPATH, "//button[text()='导出']").click()  # button label 导出 means "Export"
time.sleep(5)
file_names = driver.get_downloadable_files()
downloadable_file = file_names[0]
target_directory = r'D:\dtmp'
driver.download_file(downloadable_file, target_directory)
time.sleep(10)
node setting:
java -jar selenium-server-4.20.0.jar node --hub http://192.168.3.35:4444 --host 192.168.3.35 --port 5557 --enable-managed-downloads true
I looked at the source code in webdriver.py, and the download_file method (shown below) seems to have an issue.
If I hardcode the name to a zip name, e.g. file_name = 'package.zip', it runs successfully; without that line it fails:
contents = self.execute(Command.DOWNLOAD_FILE, {"name": file_name})["value"]["contents"]
# file_name = 'package.zip'
target_file = os.path.join(target_directory, file_name)
with open(target_file, "wb") as file:
    file.write(base64.b64decode(contents))
with zipfile.ZipFile(target_file, "r") as zip_ref:
    zip_ref.extractall(target_directory)
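One way to read the observation above: the payload is a zip archive that gets saved as target_directory/file_name, and if the archive contains a member with that same name, extraction writes onto the archive while it is still being read. The sketch below only checks for that path collision on a locally built archive; it does not talk to a Grid. The helper names colliding_members and demo_collision, and the file names report.docx and package.zip, are hypothetical and chosen for illustration.

```python
import io
import os
import tempfile
import zipfile


def colliding_members(archive_path: str, target_directory: str) -> list[str]:
    """Return archive members whose extraction path is the archive file itself."""
    archive_abs = os.path.abspath(archive_path)
    with zipfile.ZipFile(archive_path, "r") as zip_ref:
        return [
            name
            for name in zip_ref.namelist()
            if os.path.abspath(os.path.join(target_directory, name)) == archive_abs
        ]


def demo_collision() -> tuple[list[str], list[str]]:
    """Save the same archive under two names and check each for a collision."""
    # Build a tiny archive in memory with one member (hypothetical name).
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("report.docx", b"dummy payload")
    payload = buffer.getvalue()

    with tempfile.TemporaryDirectory() as target_directory:
        same_name = os.path.join(target_directory, "report.docx")
        other_name = os.path.join(target_directory, "package.zip")
        for path in (same_name, other_name):
            with open(path, "wb") as fh:
                fh.write(payload)
        return (
            colliding_members(same_name, target_directory),
            colliding_members(other_name, target_directory),
        )


if __name__ == "__main__":
    print(demo_collision())  # (['report.docx'], [])
```

Saving the archive under a different name, as the hardcoded 'package.zip' does, leaves no member path equal to the archive path, which would be consistent with that variant running successfully.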
Relevant log output
D:\Python\Python311\python.exe D:/OfflineaCare/ndb/program/test/test_oooooooo.py
Traceback (most recent call last):
File "D:\OfflineaCare\ndb\program\test\test_oooooooo.py", line 51, in <module>
driver.download_file(downloadable_file, target_directory)
File "D:\Python\Python311\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 1155, in download_file
zip_ref.extractall(target_directory)
File "D:\Python\Python311\Lib\zipfile.py", line 1679, in extractall
self._extract_member(zipinfo, path, pwd)
File "D:\Python\Python311\Lib\zipfile.py", line 1734, in _extract_member
shutil.copyfileobj(source, target)
File "D:\Python\Python311\Lib\shutil.py", line 197, in copyfileobj
buf = fsrc_read(length)
^^^^^^^^^^^^^^^^^
File "D:\Python\Python311\Lib\zipfile.py", line 953, in read
data = self._read1(n)
^^^^^^^^^^^^^^
File "D:\Python\Python311\Lib\zipfile.py", line 1021, in _read1
data += self._read2(n - len(data))
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Python\Python311\Lib\zipfile.py", line 1056, in _read2
raise EOFError
EOFError
Process finished with exit code 1
Operating System
WINDOWS10
Selenium version
selenium 4.20.0 python 3.11.3
What are the browser(s) and version(s) where you see this issue?
Chrome 124
What are the browser driver(s) and version(s) where you see this issue?
124.0.6367.61
Are you using Selenium Grid?
selenium-server-4.20.0.jar
@15975518086, thank you for creating this issue. We will troubleshoot it as soon as we can.
Hi!
I encountered the same problem when trying to download a zip file.
Also, while debugging I caught another error message (maybe it helps):
Operating System: Manjaro Linux
Selenium version: 4.21
Python version: 3.12
Browsers: Chrome, Firefox, Edge (latest versions of selenium/standalone)
Traceback:
tests/modules/test_internal_export.py:104:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.venv/lib/python3.12/site-packages/selenium/webdriver/remote/webdriver.py:1155: in download_file
zip_ref.extractall(target_directory)
../../../.pyenv/versions/3.12.0/lib/python3.12/zipfile/__init__.py:1720: in extractall
self._extract_member(zipinfo, path, pwd)
../../../.pyenv/versions/3.12.0/lib/python3.12/zipfile/__init__.py:1778: in _extract_member
shutil.copyfileobj(source, target)
../../../.pyenv/versions/3.12.0/lib/python3.12/shutil.py:203: in copyfileobj
while buf := fsrc_read(length):
../../../.pyenv/versions/3.12.0/lib/python3.12/zipfile/__init__.py:978: in read
data = self._read1(n)
../../../.pyenv/versions/3.12.0/lib/python3.12/zipfile/__init__.py:1046: in _read1
data += self._read2(n - len(data))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <zipfile.ZipExtFile [closed]>, n = 3094
    def _read2(self, n):
        if self._compress_left <= 0:
            return b''
        n = max(n, self.MIN_READ_SIZE)
        n = min(n, self._compress_left)
        data = self._fileobj.read(n)
        self._compress_left -= len(data)
        if not data:
>           raise EOFError
E           EOFError
../../../.pyenv/versions/3.12.0/lib/python3.12/zipfile/__init__.py:1081: EOFError
Docker-compose file
version: '3'
services:
  chrome:
    image: selenium/standalone-chrome
    shm_size: 2gb
    ports:
      - 4444:4444 # Selenium service
      - 5900:5900 # VNC server
      - 7900:7900 # VNC browser client
    environment:
      - SE_OPTS=--enable-managed-downloads true
We are also experiencing the same issue...
The root issue is that the downloaded zip content is written to disk under the same name as the desired file. When extraction starts, that "zip" file gets overwritten by the extracted member, so the archive being read ends up empty and the read fails with the EOFError.
At the moment we are bypassing it by calling self.execute directly, with a solution similar to what millin did in his PR:
def __download_file(self, file_name: str, target_directory: str) -> None:
    if not os.path.exists(target_directory):
        os.makedirs(target_directory)
    contents = self.execute(Command.DOWNLOAD_FILE, {"name": file_name})["value"]["contents"]
    zip_target_file = os.path.join(target_directory, f"{file_name}.zip")
    with open(zip_target_file, "wb") as file:
        file.write(base64.b64decode(contents))
    with zipfile.ZipFile(zip_target_file, "r") as zip_ref:
        zip_ref.extractall(target_directory)
    os.remove(zip_target_file)
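A possible variation on the workaround above, offered only as a sketch: decode the base64 payload and open it as a zip directly from memory, so no archive file is ever written into the target directory and nothing can be overwritten mid-extraction. This is not necessarily what the merged PR does. The names extract_download, demo_extract, and contents_b64 are hypothetical; in Selenium the payload would be the "contents" value returned by self.execute(Command.DOWNLOAD_FILE, ...) as in the snippet above, and here it is simulated with a locally built archive.

```python
import base64
import io
import os
import tempfile
import zipfile


def extract_download(contents_b64: str, target_directory: str) -> list[str]:
    """Decode a base64-encoded zip payload and extract it without an on-disk archive."""
    os.makedirs(target_directory, exist_ok=True)
    archive = io.BytesIO(base64.b64decode(contents_b64))
    with zipfile.ZipFile(archive, "r") as zip_ref:
        zip_ref.extractall(target_directory)
        return zip_ref.namelist()


def demo_extract() -> bytes:
    """Simulate the payload locally and return the extracted file's bytes."""
    original = b"x" * 100_000
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("report.docx", original)  # hypothetical file name
    contents_b64 = base64.b64encode(buffer.getvalue()).decode("ascii")

    with tempfile.TemporaryDirectory() as target_directory:
        names = extract_download(contents_b64, target_directory)
        with open(os.path.join(target_directory, names[0]), "rb") as fh:
            return fh.read()


if __name__ == "__main__":
    print(len(demo_extract()))
```

The trade-off is that the whole decoded archive is held in memory, which is already the case for the base64 string in the workaround, so this mainly removes the temporary file and the os.remove step.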
This issue is looking for contributors.
Please comment below or reach out to us through our IRC/Slack/Matrix channels if you are interested.
@titusfortner Fixed in #14031
I believe this issue can be closed, since the PR for it has already been merged, as @millin said.