chromote
chromote copied to clipboard
Create a `ChromoteSession$nagivate()` method?
As described in the README under Loading a page reliably, one cannot use the following pattern to reliably wait for the page to load.
# Not reliable
b$Page$navigate("https://www.r-project.org/")
b$Page$loadEventFired() # Block until page has loaded
b$screenshot("browser.png")
Instead, we suggest an synchronous or asynchronous approach that involves correctly threading the $Page$loadEventFired() and $Page$navigate() commands.
In practice, however, most people just sys.sleep() for some reasonable or acceptable amount of time (#176).
Considering that we have a few convenience methods already, e.g. ChromoteSession$screenshot(), I'd like to propose that we offer a similar convenience method, ChromoteSession$navigate(), that automatically waits for the page load event and awaits the value (wait_ = TRUE) or returns the promise (wait_ = FALSE).
In the base case, that would simplify the above example code to just these two lines:
b$navigate("https://www.r-project.org/")
b$screenshot("browser.png")
I think this would be very helpful! FWIW, if the page is making API calls, $Page$loadEventFired() might not be enough for the page to really be "done". I don't have it fully loaded in my memory right now, but I remember that being the source of a lot of the complexity in {apisniffer}. I never felt like what I was doing was absolutely correct, but I was using callbacks for session$Network$requestWillBeSent and session$Network$responseReceived to make sure all of the calls that fired off had received a response (or a maximum wait time was reached) before considering the page fully loaded. That might be beyond the scope of this ticket, but I wanted to bring it up (because if you CAN handle that piece, too, that would be very cool 😊 )!
Great point @jonthegeek. I like what pupeteer does in taking a predefined lifecycle event. In R would be a wait_until argument that takes one of the following values
wait_until |
Event |
|---|---|
"load" |
Waits for the load event, i.e. b$Page$loadEventFired(). |
"DOMContentLoaded" |
Waits for the DOMContentLoaded event. |
"networkIdle0" |
Waits until there are no network connections for at least 500ms. |
"networkIdle2" |
Waits until there are no more than 2 network connections for at least 500ms. |
I also really like the page.waitFor*() methods (listed here) and would love to see some of those make it into chromote. In particular
waitForSelector()– waits for a CSS selector to have matching elementswaitForFunction()– waits for a function to return a truthy valuewaitForNetworkIdle()– see abovewaitForNavigation()– wait for the page to navigate to a new URLwaitForRequest()– wait for a request to a URL (or request identified by a predicate function)waitForResponse()– wait for a response from a URL (or a response identified by a predicate function)
Oooh, I like waitForNetworkIdle()! I think for my specific use case I also want to wait for responses to sent events (with a timeout), but I think mixing in network idle would help deal with part of my "but is that everything?" problem.
Oh, ha, I just saw waitForResponse() is in there, too!
Good to know! I hadn't included waitForResponse() in my smaller list because I wasn't sure if that would be immediately useful to chromote users. Glad to get that feedback right away. (I'll edit the list above.) I do think waitForRequest() and waitForResponse() are a package deal.
Yeah, definitely! I'd have to understand how it works a little better; what I do right now is log responses until a certain time has passed (simulating waitForNetworkIdle(), kinda), and then check that there are corresponding response events for each of those requests. I think what I'd want to do is fire off a waitForResponse() (with a timeout) as a callback on session$Network$requestWillBeSent(), disabling that callback with waitForNetworkIdle(). That assumes waitForResponse() ties (or at least can tie) to a particular request.
I very much doubt that my use case will fall neatly onto the standard path, but having the helpers would definitely make my code feel less hacky!