scrapy-playwright
🎭 Playwright integration for Scrapy
Without a proxy, the cookie is applied correctly. But when I use a proxy (brightdata), the cookie is not applied. Did I miss anything?

```
class ScrapyTest(scrapy.Spider):
    name = 'scrapy test'

    def start_requests(self):
        ...
```
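For context, a minimal sketch of how a proxy and cookies are typically combined with scrapy-playwright; the proxy server, credentials, cookie value, and URL below are placeholders, and this sketch does not by itself explain the mismatch reported above:

```python
import scrapy


class CookieProxySpider(scrapy.Spider):
    name = "cookie_proxy_test"
    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
            "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        },
        "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
        # Route the Playwright browser itself through the proxy
        # (placeholder server and credentials, not real Bright Data values).
        "PLAYWRIGHT_LAUNCH_OPTIONS": {
            "proxy": {
                "server": "http://proxy.example.com:22225",
                "username": "user",
                "password": "pass",
            },
        },
    }

    def start_requests(self):
        yield scrapy.Request(
            "https://httpbin.org/cookies",
            # Cookies are set on the Scrapy request as usual; whether they
            # reach the browser context is exactly what this issue is about.
            cookies={"session": "placeholder-value"},
            meta={"playwright": True},
        )

    def parse(self, response):
        self.logger.info(response.text)
```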
Hi! I have a spider that uses Playwright with a proxy. **NOTE: the spider works as it should when the proxy is not needed, and the proxy works, as the...
I have had a hard time trying to follow links with scrapy-playwright while navigating a dynamic website. I want to write a crawl spider that will get all available...
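One possible approach, sketched below under the assumption that the links only become followable after rendering: tag every request a CrawlSpider rule extracts so it is downloaded through Playwright. The start URL, link pattern, and selectors are placeholders; the scrapy-playwright download handler and reactor settings are assumed to be configured as in the example above.

```python
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class DynamicCrawlSpider(CrawlSpider):
    name = "dynamic_crawl"
    start_urls = ["https://example.com"]  # placeholder

    rules = (
        Rule(
            LinkExtractor(allow=r"/catalog/"),  # placeholder pattern
            callback="parse_item",
            follow=True,
            process_request="use_playwright",
        ),
    )

    def use_playwright(self, request, response):
        # Tag every extracted request so the Playwright handler renders it.
        request.meta["playwright"] = True
        return request

    def start_requests(self):
        # The initial request also needs the flag; rules only apply to
        # links extracted from responses.
        for url in self.start_urls:
            yield scrapy.Request(url, meta={"playwright": True})

    def parse_item(self, response):
        yield {"url": response.url, "title": response.css("title::text").get()}
```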
Integrated playwright_stealth, with PLAYWRIGHT_STEALTH_ENABLED as an optional setting. Bot test results attached for **PLAYWRIGHT_STEALTH_ENABLED = True** and **PLAYWRIGHT_STEALTH_ENABLED = False**.
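For anyone wanting similar behavior without the patch, a hedged per-request sketch follows. It assumes the playwright-stealth package and a scrapy-playwright version that supports the playwright_page_init_callback request meta key; the target URL is just a common bot-detection test page, and the handler settings are assumed to be configured as above.

```python
import scrapy
from playwright_stealth import stealth_async


async def init_page(page, request):
    # Apply the stealth evasions before the page navigates.
    await stealth_async(page)


class StealthSpider(scrapy.Spider):
    name = "stealth_test"

    def start_requests(self):
        yield scrapy.Request(
            "https://bot.sannysoft.com",
            meta={
                "playwright": True,
                "playwright_page_init_callback": init_page,
            },
        )

    def parse(self, response):
        self.logger.info("rendered %d bytes", len(response.body))
```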
Hi. I think the results of using Playwright and scrapy-playwright differ in some situations. When I use plain Playwright, it works properly, but the same code in scrapy-playwright wasn't...
Hi, I have a strange issue where I am receiving a 400 response from Google after clicking on the "I agree" button on their consent form.  This issue however...
Hi. I crawl a website using scrapy-playwright. I use `wait_for_selector`, and when the page doesn't exist (status 404), scrapy-playwright waits until the `Timeout` and then raises an exception. Is there any...
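A sketch of one way to keep a missing page from stalling the crawl: bound the wait with an explicit timeout and attach an errback. The URL, selector, and 5-second value are placeholders, and this doesn't change the underlying behavior the issue describes.

```python
import scrapy
from scrapy_playwright.page import PageMethod


class WaitSpider(scrapy.Spider):
    name = "wait_test"

    def start_requests(self):
        yield scrapy.Request(
            "https://example.com/maybe-missing",
            meta={
                "playwright": True,
                "playwright_page_methods": [
                    # Fail after 5 s instead of the full default timeout.
                    PageMethod("wait_for_selector", "div.content", timeout=5000),
                ],
            },
            errback=self.errback,
        )

    def parse(self, response):
        yield {"url": response.url, "status": response.status}

    def errback(self, failure):
        # Reached when the selector never appears (e.g. on a 404 page).
        self.logger.warning("request failed: %r", failure.request.url)
```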
It seems that when using scrapy-playwright, Scrapy will not shut down cleanly on SIGINT (`cmd+c`), and you have to force a shutdown with a second `cmd+c`. If you use the...
Neither this [function](https://playwright.dev/python/docs/api/class-page#page-add-init-script) nor this [function](https://playwright.dev/python/docs/api/class-browsercontext#browser-context-add-init-script) is implemented in scrapy-playwright. I tried with coroutines (`evaluate`), but it doesn't give the same results as `add_init_script`, as the JS script...
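A possible workaround, sketched under the assumption of a scrapy-playwright version that supports the playwright_page_init_callback request meta key (which may postdate this issue); the script content and URL are placeholders.

```python
import scrapy


async def register_init_script(page, request):
    # Registered before navigation, so it runs in every new document,
    # matching Playwright's add_init_script semantics.
    await page.add_init_script(
        script="Object.defineProperty(navigator, 'webdriver', {get: () => undefined});"
    )


class InitScriptSpider(scrapy.Spider):
    name = "init_script_test"

    def start_requests(self):
        yield scrapy.Request(
            "https://example.com",
            meta={
                "playwright": True,
                "playwright_page_init_callback": register_init_script,
            },
        )

    def parse(self, response):
        self.logger.info("got %s", response.url)
```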
Hi, this issue is related to [#18](https://github.com/scrapy-plugins/scrapy-playwright/issues/18). The error still occurs with `scrapy-playwright 0.0.4`. The Scrapy script crawled about 2,500 of the 10k domains from [majestic](https://majestic.com/reports/majestic-million) and crashed with the last error...