
dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators

Results 43 dude issues

- Playwright: https://playwright.dev/docs/auth#reuse-authentication-state
- Check whether this can be done with other backends

enhancement

Modified Selenium that circumvents anti-bot. Repo: https://github.com/ultrafunkamsterdam/undetected-chromedriver

> Optimized Selenium Chromedriver patch which does not trigger anti-bot services like Distill Network / Imperva / DataDome / Botprotect.io Automatically downloads the...

enhancement
help wanted
good first issue

1. Download by file extension
2. Download by mimetype, e.g. `png` should also match `image/png` mimetype

```console
dude scrape ... --download png,jpg # download all png and jpg files
dude...
```

enhancement
help wanted
good first issue
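
The extension-or-mimetype matching requested in the download issue above could be sketched roughly as follows. This is only an illustration built on the standard library; `should_download` and `wanted_mimetypes` are hypothetical names and not part of dude, and how the `--download` option would call them is an assumption.

```python
import mimetypes
from pathlib import PurePosixPath
from typing import Iterable, Optional
from urllib.parse import urlparse


def wanted_mimetypes(extensions: Iterable[str]) -> set:
    """Map extensions such as "png" to mimetypes such as "image/png"."""
    result = set()
    for ext in extensions:
        # guess_type works on file names, so build a dummy name per extension.
        guessed, _ = mimetypes.guess_type(f"file.{ext.lower().lstrip('.')}")
        if guessed:
            result.add(guessed)
    return result


def should_download(url: str, content_type: Optional[str], extensions: Iterable[str]) -> bool:
    """Hypothetical filter: match by URL file extension first, then by mimetype."""
    exts = {e.lower().lstrip(".") for e in extensions}

    # 1. Match by the file extension found in the URL path.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower().lstrip(".")
    if suffix in exts:
        return True

    # 2. Fall back to the Content-Type header, ignoring parameters like charset.
    if content_type:
        mimetype = content_type.split(";", 1)[0].strip().lower()
        return mimetype in wanted_mimetypes(exts)
    return False
```

With this sketch, a URL without a file extension still matches `png` when the server reports `image/png`, which is the behavior item 2 asks for.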

[Autoscraper](https://github.com/alirezamika/autoscraper) is made for automatic web scraping to make scraping easy. I believe it would be incredible to also include it.

enhancement
help wanted
good first issue

## Possible format: ```python @select(sample="path/to/training/data") def handler(result): return {"data": result} ``` ## Potential backends: - https://github.com/lorey/mlscraper

enhancement
help wanted
good first issue

https://www.reddit.com/r/Python/comments/tc3x72/comment/i0xvy98/?utm_source=share&utm_medium=web2x&context=3

enhancement
help wanted
good first issue

https://github.com/browserless/chrome#hosting-providers

enhancement
help wanted
good first issue

- Set a value for Dude User-Agent instead of using the default values on each parser backend (e.g.: `pydude/{version} (+https://github.com/roniemartinez/dude)`)
- Add option to override the User-Agent
- For Playwright...

enhancement
help wanted
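
A minimal sketch of the User-Agent proposal above, assuming the format string quoted in the issue. `build_user_agent` and `make_request` are hypothetical helpers, not existing dude functions, and the request is built with the standard library only to show where the header would go.

```python
from typing import Optional
from urllib.request import Request

PROJECT_URL = "https://github.com/roniemartinez/dude"


def build_user_agent(version: str, override: Optional[str] = None) -> str:
    """Return the override if given, else the proposed default User-Agent."""
    if override:
        return override
    return f"pydude/{version} (+{PROJECT_URL})"


def make_request(url: str, version: str, user_agent: Optional[str] = None) -> Request:
    """Build (but do not send) a request carrying the chosen User-Agent."""
    headers = {"User-Agent": build_user_agent(version, user_agent)}
    return Request(url, headers=headers)
```

Each parser backend would need its own way of applying the resulting string; the sketch only covers building it and the override rule.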