How to set pageLoadStrategy

Open ruimgbarros opened this issue 4 years ago • 2 comments

Hello everyone!

I'm sorry for opening an issue for this, since it is more of a question, but I don't know where else to ask. I need to stop the driver from waiting for the whole page to load before I start interacting with it (basically, some archived pages have broken JS that never finishes loading).

I've seen this solution but, to be honest, I have no idea how I can set the pageLoadStrategy with RSelenium...

Is there a way to do this?

ruimgbarros avatar Mar 25 '21 20:03 ruimgbarros

Yes, below is an example that is working for me.

driver <- rsDriver(port = 4568L, browser = "firefox", extraCapabilities = ffprof)
remote_driver <- driver[["client"]]
remote_driver$extraCapabilities$pageLoadStrategy <- "eager"
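
An alternative that may be worth trying is to request the strategy when the session is created, rather than setting it on the client afterwards. The following is a minimal sketch, not a confirmed fix: `page_load_caps` is a hypothetical helper, the port is arbitrary, and whether the browser honours the capability depends on the Selenium server and driver versions in use. The three accepted values are the ones defined by the W3C WebDriver specification.

```r
# Hypothetical helper: build an extraCapabilities list that requests a
# page load strategy. "normal", "eager" and "none" are the values defined
# by the W3C WebDriver specification.
page_load_caps <- function(strategy = c("normal", "eager", "none")) {
  strategy <- match.arg(strategy)  # errors on any other value
  list(pageLoadStrategy = strategy)
}

caps <- page_load_caps("eager")

# Assumption: a local Firefox/geckodriver setup is available. Guarded so the
# snippet does nothing in a non-interactive session or without RSelenium.
if (interactive() && requireNamespace("RSelenium", quietly = TRUE)) {
  driver <- RSelenium::rsDriver(
    port = 4568L,
    browser = "firefox",
    extraCapabilities = caps
  )
  remote_driver <- driver[["client"]]
}
```

If a Firefox profile is also needed, its entries would have to be combined with `caps` into a single list before the call.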

deathmaster9 avatar May 08 '21 07:05 deathmaster9

@ruimgbarros @deathmaster9 Depending on the website and the task, pageLoadStrategy does not always work. It does work in Firefox if you want to do something before the page has finished loading, but what you can do at that point is often limited.

Part of it has to do with the fact that RSelenium needs to be moved to Selenium 3.0/4.0. I suspect this will be fixed as the package is updated.

The other major issue is that some websites act up if you try to change the page load strategy, in order to prevent web scraping. This is why we include Sys.sleep() after every click or key entry in our RSelenium scripts. These are my workplace's standard web scraping guidelines:

  • 3 seconds after any method or function call on the Chrome/Firefox web driver.
  • 10 seconds for the first page to load or when changing the domain of the website.
  • 300 seconds (5 minutes) for any download. Though this can scale up to 18000 seconds (6 hours) depending on the website.
  • After 18000 seconds it's better to make a second script to complete the test.
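
The guidelines above could be encoded as named constants plus a small wrapper, so that the pauses are applied consistently. This is only a sketch: `wait_seconds` and `with_pause` are hypothetical names, not part of RSelenium, and the `sleep` argument exists only so the delay can be swapped out when testing.

```r
# Hypothetical constants mirroring the guidelines (all values in seconds).
wait_seconds <- c(
  action       = 3,      # after any web driver method or function call
  first_page   = 10,     # first page load, or when changing domain
  download     = 300,    # default allowance for a download
  download_max = 18000   # upper bound before splitting into a second script
)

# Run `action` (a function taking no arguments), then pause for the time
# associated with `kind`. Returns whatever `action` returned.
with_pause <- function(action, kind = "action", sleep = Sys.sleep) {
  kind <- match.arg(kind, names(wait_seconds))
  result <- action()
  sleep(wait_seconds[[kind]])
  result
}
```

For example, `with_pause(function() remote_driver$navigate(url), kind = "first_page")` would navigate and then wait 10 seconds, assuming `remote_driver` and `url` are defined elsewhere.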

mlane3 avatar Sep 16 '22 11:09 mlane3