pyppeteer
Set Download Location in Pyppeteer in headless mode
In puppeteer there is a way to choose where downloaded files are stored in headless mode, as described in this SO answer:
await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: './myAwesomeDownloadFolder'});
Is there a similar method in pyppeteer?
Doesn't the same method work in pyppeteer? It relies on a Chromium feature, so I'd guess it should work in pyppeteer as well.
It is indeed a Chrome feature; it sets the default download folder. Did you try getting the Response object as binary and saving it to a file on disk?
response = await page.goto("http://www.irs.gov/pub/irs-pdf/f1040.pdf")
with open("f1040.pdf", "wb") as new_file:
    # Response.buffer() resolves to the response body as bytes
    new_file.write(await response.buffer())
Note: I have not verified that this works; I'm only suggesting an approach.
It's been a while since this was asked, but I think what's covered in this feature will get you what you need by interfacing with the DevTools Protocol directly: https://github.com/GoogleChrome/puppeteer/pull/1770
After some debugging, I found that this works:
cdp = await page.target.createCDPSession()
await cdp.send('Page.setDownloadBehavior', {'behavior': 'allow', 'downloadPath': '/temp/'})
The reason is that Python's syntax differs from JavaScript's.
You can't use a bare behavior as a key in a Python dict; you need to quote it as 'behavior'.
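To make the quoting point concrete, here is a minimal sketch that wraps the call above in a helper. The names download_params and allow_downloads are hypothetical, and resolving the folder to an absolute path is my own assumption rather than something the answer requires.

```python
import os


def download_params(download_path):
    """Build the parameter dict for Page.setDownloadBehavior."""
    # In JavaScript, {behavior: 'allow'} quotes the key implicitly.
    # In Python a bare name is a variable lookup, so keys must be strings.
    return {
        "behavior": "allow",
        # Assumption: an absolute path avoids ambiguity about the base folder.
        "downloadPath": os.path.abspath(download_path),
    }


async def allow_downloads(page, download_path):
    """Send the download settings over a new CDP session.

    `page` is assumed to be a pyppeteer Page.
    """
    cdp = await page.target.createCDPSession()
    await cdp.send("Page.setDownloadBehavior", download_params(download_path))
```

It would be called as await allow_downloads(page, "./downloads") before triggering the download.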
I converted the JavaScript function from this comment to Python: https://github.com/puppeteer/puppeteer/issues/299#issuecomment-474435547
It isn't 1-to-1: the generated random directory name is different, but has the same number of characters. I had trouble figuring out how to convert JavaScript's Number.toString function to Python.
import asyncio
import os
import random


def base36encode(number, alphabet="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Convert an integer to a base-36 string."""
    base36 = ""
    sign = ""
    if number < 0:
        sign = "-"
        number = -number
    if 0 <= number < len(alphabet):
        return sign + alphabet[number]
    while number != 0:
        number, i = divmod(number, len(alphabet))
        base36 = alphabet[i] + base36
    return sign + base36


async def download_file(page, f):
    # Build a random directory name from the digits after "0." of a random float
    randNum = random.random()
    intPart = int(str(randNum)[2:])
    base36num = base36encode(intPart)
    downloadPath = f"{os.getcwd()}/download-{base36num}"
    try:
        os.mkdir(downloadPath)
    except OSError as err:
        print(f"Creation of directory {downloadPath} failed: {err}")
    else:
        print(f"Successfully created download directory: {downloadPath}")
    # Tell Chromium to save downloads into the new directory
    cdp = await page.target.createCDPSession()
    await cdp.send(
        "Page.setDownloadBehavior",
        {"behavior": "allow", "downloadPath": downloadPath},
    )
    # f is the coroutine function that triggers the download
    await f()
    print("Downloading...")
    fileName = ""
    theList = os.listdir(downloadPath)
    if len(theList) > 0:
        fileName = theList[0]
    # Poll until a file exists and is no longer a partial .crdownload file
    while fileName == "" or fileName.endswith(".crdownload"):
        # asyncio.sleep keeps the event loop running while we wait
        await asyncio.sleep(0.100)
        theList = os.listdir(downloadPath)
        if len(theList) > 0:
            fileName = theList[0]
    filePath = os.path.join(downloadPath, fileName)
    print(f"Downloaded file: {filePath}")
    return filePath
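If matching the JavaScript directory name is not important, the naming and waiting steps could be sketched with the standard library alone. This is a variation and not the original conversion: make_download_dir and wait_for_download are hypothetical names, tempfile.mkdtemp stands in for the base-36 conversion, and the timeout is an added assumption so the loop cannot spin forever if the download never completes.

```python
import asyncio
import os
import tempfile
import time


def make_download_dir(parent=None):
    """Create a uniquely named download directory and return its path."""
    # mkdtemp chooses a random suffix and creates the directory in one step.
    return tempfile.mkdtemp(prefix="download-", dir=parent or os.getcwd())


async def wait_for_download(download_dir, timeout=30.0, poll_interval=0.1):
    """Return the path of the first finished file in download_dir.

    Raises TimeoutError if no finished file appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Chromium keeps the .crdownload suffix while a download is in progress.
        finished = [
            name
            for name in sorted(os.listdir(download_dir))
            if not name.endswith(".crdownload")
        ]
        if finished:
            return os.path.join(download_dir, finished[0])
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No completed download in {download_dir}")
        # Yield to the event loop between checks instead of blocking it.
        await asyncio.sleep(poll_interval)
```

In download_file, the directory path would come from make_download_dir() and the polling loop would become filePath = await wait_for_download(downloadPath).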