Add login context helper to playwright and puppeteer crawlers
Which package is the feature request for? If unsure which one to select, leave blank
@crawlee/playwright (PlaywrightCrawler)
Feature
Add a new context helper for the Playwright and Puppeteer crawlers to handle simple login flows.
Motivation
Most logins are rather simple: they either contain the username/email and password fields in a single form, or use a two-step form (first provide the username/email, then the password). We want a simple context helper for those cases to make logging into protected sites easier.
Ideal solution or implementation, and any additional constraints
It should be a CrawlingContext helper added specifically for PlaywrightCrawler and PuppeteerCrawler.
const crawler = new PlaywrightCrawler({
    async requestHandler({ login }) {
        await login({ username: '...', password: '...' });
    },
});
- the implementation should detect whether there is a login form on the page; the heuristic should also be configurable as a callback
- if no login form is detected, the function should resolve, since there is nothing to do
- if a login form is detected, it should detect what kind of form it is (one or two step, feel free to consider other login form types), fill it in and submit it
- it should detect whether the login succeeded or failed and resolve/reject based on that
- the detection of a successful/failed login should be configurable as a callback; try to come up with a good default heuristic
- username and password will be the only required options; the page object will already be bound to the function, as in the other context helpers
- (optional) there should also be an option with a callback for dealing with captchas
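The requirements above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Crawlee's implementation: `createLogin`, `PageLike`, `LocatorLike`, the option names `detectLoginForm`, `isLoggedIn` and `solveCaptcha`, and all selectors are hypothetical. `PageLike` is a hand-written subset of the Playwright `Page` API (`locator`, `url`, `waitForLoadState`), so that the sketch does not depend on a browser.

```typescript
// Hypothetical subset of Playwright's Locator/Page API used by this sketch.
interface LocatorLike {
    count(): Promise<number>;
    first(): LocatorLike;
    fill(value: string): Promise<void>;
    click(): Promise<void>;
}

interface PageLike {
    url(): string;
    locator(selector: string): LocatorLike;
    waitForLoadState(): Promise<void>;
}

// Only `username` and `password` are required; the callbacks are assumed names.
interface LoginOptions {
    username: string;
    password: string;
    detectLoginForm?: (page: PageLike) => Promise<boolean>;
    isLoggedIn?: (page: PageLike) => Promise<boolean>;
    solveCaptcha?: (page: PageLike) => Promise<void>;
}

// Assumed default selectors; real sites will need a more robust heuristic.
const USERNAME_SELECTOR = 'input[type="email"], input[type="text"]';
const PASSWORD_SELECTOR = 'input[type="password"]';
const SUBMIT_SELECTOR = 'button[type="submit"], input[type="submit"]';

const exists = async (page: PageLike, selector: string): Promise<boolean> =>
    (await page.locator(selector).count()) > 0;

// Default heuristic: a login form has a username-like or a password field.
const defaultDetect = async (page: PageLike): Promise<boolean> =>
    (await exists(page, USERNAME_SELECTOR)) || (await exists(page, PASSWORD_SELECTOR));

// Default heuristic: login succeeded if the password field is gone.
const defaultIsLoggedIn = async (page: PageLike): Promise<boolean> =>
    !(await exists(page, PASSWORD_SELECTOR));

// Binds `page` to the helper, as the other context helpers do.
function createLogin(page: PageLike) {
    return async function login(options: LoginOptions): Promise<void> {
        const detect = options.detectLoginForm ?? defaultDetect;
        if (!(await detect(page))) return; // no login form, nothing to do

        const submit = async (): Promise<void> => {
            await page.locator(SUBMIT_SELECTOR).first().click();
            await page.waitForLoadState();
        };

        if (await exists(page, USERNAME_SELECTOR)) {
            await page.locator(USERNAME_SELECTOR).first().fill(options.username);
        }
        // Two-step form: the password field only appears after the first submit.
        if (!(await exists(page, PASSWORD_SELECTOR))) await submit();

        await page.locator(PASSWORD_SELECTOR).first().fill(options.password);
        if (options.solveCaptcha) await options.solveCaptcha(page);
        await submit();

        const isLoggedIn = options.isLoggedIn ?? defaultIsLoggedIn;
        if (!(await isLoggedIn(page))) {
            throw new Error(`Login failed at ${page.url()}`);
        }
    };
}
```

A real Playwright `Page` should be usable wherever `PageLike` is expected, but the default success check (password field no longer present) is a guess and will misfire on sites that keep the form in the DOM, which is why both heuristics are overridable.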
Alternative solutions or implementations
No response
Other context
For inspiration, see how other context helpers are implemented, e.g. parseWithCheerio.
This helper should be available only on the two browser crawlers. You can start with Playwright only; porting the code to Puppeteer is optional for the initial PR.
You can use these sites to test this:
- https://www.saucedemo.com/
- http://zero.webappsecurity.com/
- https://automationexercise.com/login
Examples of sites with the two-step login form:
- https://claude.ai/
- https://accounts.evernote.com/login
Hi @B4nan
We are students from CodeDay. We were working on issue #2261. Thanks to your feedback, we were able to close it quickly. Now we would like to take on this issue. Since this will be our second issue, are there any concerns or guidelines we should follow? Thank you.
I am a part of this CodeDay team. I am looking forward to working on this issue!
I’m super excited to work on this and would be truly grateful if you, @B4nan, could share any external resources that might help us better understand and implement all the required features, for example any architecture or design that Apify would like us to follow.
Looking forward to bringing this to life! :)