RSelenium
Why does google search page source results crash RSelenium webdriver?
On Linux, why does requesting the page source of any Google search results page crash my browser?
The example below reproduces the error on a Linux Google Compute Engine VM.
Reading the href values of Google results with findElements works fine for any search, but calling remDr$getPageSource()[[1]] (to parse the page with rvest instead) always crashes my webdriver.
Reproduce webdriver:
library(RSelenium)
library(rvest)
library(httr)
# Pause for a random duration between `low` and `high` seconds
randsleep <- function(low=1,high=2){
  Sys.sleep(sample(seq(low,high,by=0.001),1))
}
system("sudo kill -9 $(lsof -t -i:4444)")
system("sudo kill -9 $(lsof -t -i:4445)")
eCaps <- list(chromeOptions = list(
args = c('--headless', '--disable-gpu', '--window-size=1280,800')
))
randsleep()
rD <- rsDriver(port=4445L, extraCapabilities = eCaps, browser=c("chrome"), chromever = "76.0.3809.68")
randsleep()
remDr <- rD$client
Reproduce search:
searchstring <- "dogs"
remDr$navigate("https://www.google.com")
randsleep(4,6)
googsearch <- remDr$findElement(using='xpath','//input[@name="q"]')
randsleep(2,3)
googsearch$sendKeysToElement(list(searchstring))
randsleep(2,3)
googsearch$submitElement()
randsleep(10,15)
WORKS FINE:
googres <- remDr$findElements(using='xpath','//div[@class="r"]/a')
reslinks <- sapply(seq_along(googres),FUN=function(x){googres[[x]]$getElementAttribute("href")})
CRASHES THE WEBDRIVER:
ps1 <- remDr$getPageSource()
#ERROR:
Undefined error in httr call. httr output: Failed to connect to localhost port 4445: Connection refused
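To narrow down where the failure happens, the call can be wrapped so the script keeps running and the error message is captured. This is only a minimal sketch: `safe_page_source` is a hypothetical helper, not part of RSelenium, and it does not fix the crash itself.

```r
# Hypothetical helper (not part of RSelenium): fetch the page source and
# return NULL with a message instead of stopping the script on error.
safe_page_source <- function(driver) {
  tryCatch(
    driver$getPageSource()[[1]],
    error = function(e) {
      message("getPageSource() failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Usage with the session above (needs a live webdriver and rvest):
# html <- safe_page_source(remDr)
# if (!is.null(html)) {
#   nodes <- rvest::html_nodes(rvest::read_html(html), xpath = '//div[@class="r"]/a')
#   reslinks <- rvest::html_attr(nodes, "href")
# }
```

If the helper returns NULL and later calls on remDr also fail with "Connection refused", the driver process itself has most likely exited rather than just the one request failing.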
Hi @nbarsch, I wasn't able to reproduce your example. I tried it on Ubuntu 18.04 with the same chrome browser and driver version as you (76.0.3809.68).
Have you tried using Docker? https://github.com/SeleniumHQ/docker-selenium https://ropensci.github.io/RSelenium/articles/docker.html
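A rough sketch of what the Docker route could look like from R, assuming a selenium/standalone-chrome container has already been started and mapped to port 4445 on localhost. The function name `connect_docker_selenium`, the port mapping, and the image choice are illustrative assumptions, not something confirmed in this thread.

```r
# Assumed to have been run beforehand in a shell (not from R):
#   docker run -d -p 4445:4444 --shm-size="2g" selenium/standalone-chrome

# Hypothetical helper: connect to a Selenium server running in Docker
# instead of starting a local driver with rsDriver().
connect_docker_selenium <- function(host = "localhost", port = 4445L) {
  remDr <- RSelenium::remoteDriver(
    remoteServerAddr = host,
    port = port,
    browserName = "chrome"
  )
  remDr$open()  # starts a browser session inside the container
  remDr
}

# Usage (needs the container to be running):
# remDr <- connect_docker_selenium()
# remDr$navigate("https://www.google.com")
# ps1 <- remDr$getPageSource()[[1]]
# remDr$close()
```

With this setup the browser and driver run inside the container, so the kill commands and the pinned chromever from the original script would not be needed.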