Run headless firefox inside dockerfile

Open • MislavSag opened this issue 6 years ago • 1 comment

Operating System

Windows 7, but the question is about Azure Batch.

Selenium Server version (selenium-server-standalone-3.0.1.jar etc.)

Browser version (firefox 50.1.0, chrome 54.0.2840.100 (64-bit) etc.)

Latest Firefox.

Other driver version (chromedriver 2.27, geckodriver v0.11.1, iedriver x64_3.0.0, PhantomJS 2.1.1 etc.)

Latest geckodriver version.

Expected behaviour

Run parallel tests in Azure Batch using the doAzureParallel package.

Actual behaviour

Can't run.

Steps to reproduce the behaviour

I know how to run headless Firefox in parallel using RSelenium inside a foreach loop:

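library(doParallel)  # also attaches foreach and parallel

# Use all but five cores as workers
cl <- parallel::makeCluster(parallel::detectCores() - 5)
registerDoParallel(cl)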

rD <- RSelenium::rsDriver(
  browser = "firefox",
  extraCapabilities = list(
    "moz:firefoxOptions" = list(
      args = list('--headless')
    )
  )
)

clusterExport(cl, "rD")

do_first <- TRUE
oib_foreach_loop <- foreach(i = 1:nrow(df), 
             .packages = c("RSelenium"),
             .combine = 'rbind',
             .export = c("df", "do_first")) %dopar% {
               
               if (do_first == TRUE) {
                 driver <<- rD$client
                 driver$open()
                 Sys.sleep(2)
                 driver$navigate("http://www.google.com")
                 Sys.sleep(1L)
                 do_first <<- FALSE
               }
               # test here
             }

Everything works fine, but I am not sure how to run the same script using the doAzureParallel package: https://github.com/Azure/doAzureParallel

I have written a Dockerfile that installs Java, RSelenium and Firefox:

## Start with the official rocker image (lightweight Debian) 
FROM rocker/r-base:latest 

RUN  apt-get update \
  && apt-get install -y --no-install-recommends \
   libxml2-dev \
   libcurl4-openssl-dev \
   libssl-dev \
   gnupg2 \
   libfftw3-dev \
   libtiff-dev \
   libx11-dev \
   libcairo2-dev \
   libxt-dev \
   firefox
 
#RUN add-apt-repository -y ppa:mozillateam/firefox-next

## Install Java 
RUN echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" \ 
		| tee /etc/apt/sources.list.d/webupd8team-java.list \ 
	&& echo "deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" \ 
		| tee -a /etc/apt/sources.list.d/webupd8team-java.list \ 
	&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys EEA14886 \ 
	&& echo "oracle-java8-installer shared/accepted-oracle-license-v1-1 select true" \ 
		| /usr/bin/debconf-set-selections \ 
	&& apt-get update \ 
	&& apt-get install -y oracle-java8-installer \ 
	&& update-alternatives --display java \ 
	&& rm -rf /var/lib/apt/lists/* \ 
	&& apt-get clean \ 
	&& R CMD javareconf	

## make sure Java can be found in rApache and other daemons not looking in R ldpaths 
RUN echo "/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/" > /etc/ld.so.conf.d/rJava.conf 
RUN /sbin/ldconfig
   
# Install the R Packages from CRAN
RUN Rscript -e 'install.packages(c("Cairo", "Rcpp", "RSelenium", "httr", "rvest", "imager", "RCurl"))'

But I don't know how to execute this command on the host:

rD <- RSelenium::rsDriver(
  browser = "firefox",
  extraCapabilities = list(
    "moz:firefoxOptions" = list(
      args = list('--headless')
    )
  )
)

Is it possible to somehow run headless Firefox using Dockerfile commands?

MislavSag, Nov 21 '18 13:11
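One possible direction, sketched here under stated assumptions rather than as a confirmed answer: instead of calling rsDriver() on the host, each worker could connect with RSelenium::remoteDriver() to a Selenium server that is assumed to be already running inside the container. The helper names headless_firefox_caps and open_headless_session, the host "localhost" and the port 4444 are all hypothetical choices for illustration.

```r
# Build the capability list that asks geckodriver to start Firefox headless.
# This is the same "moz:firefoxOptions" list used in the rsDriver() call above.
headless_firefox_caps <- function() {
  list("moz:firefoxOptions" = list(args = list("--headless")))
}

# Hypothetical helper: open one browser session per worker.
# ASSUMPTION: a Selenium server is already listening on host:port
# inside the container; nothing here starts that server.
open_headless_session <- function(host = "localhost", port = 4444L) {
  if (!requireNamespace("RSelenium", quietly = TRUE)) {
    stop("RSelenium is not installed")
  }
  driver <- RSelenium::remoteDriver(
    remoteServerAddr = host,
    port = port,
    browserName = "firefox",
    extraCapabilities = headless_firefox_caps()
  )
  driver$open(silent = TRUE)
  driver
}
```

Inside the foreach loop, a worker would then call open_headless_session() once and reuse the returned driver, in place of exporting rD from the host.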

Hi, I will try to restate my long question as one very simple one: what is the recommended way to run many RSelenium tests? Say I want to run 1000 tests and each test takes 1 hour. Running the tests one by one takes a long time (24 tests per day, so about 42 days in total). I know how to use the doParallel and foreach packages to run tests in parallel on my machine: https://stackoverflow.com/questions/38950958/run-rselenium-in-parallel But sometimes this is not enough: I would like to run around 100 tests in parallel. I tried to use Azure Batch for that, but I get lots of errors on some nodes when starting the Selenium server. What would be the general advice for situations where RSelenium has to run lots of tests in parallel?

MislavSag, Nov 26 '18 10:11
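The run-time estimate in the comment above can be checked with a few lines of R. This is a back-of-the-envelope sketch: it assumes every test takes exactly 1 hour and ignores node startup and Selenium server overhead, and wall_clock_days is a hypothetical helper name.

```r
# Estimate wall-clock days for a batch of equally long tests.
wall_clock_days <- function(n_tests, hours_per_test, n_parallel = 1L) {
  # Tests run in waves of n_parallel; the last wave may be partly empty.
  waves <- ceiling(n_tests / n_parallel)
  waves * hours_per_test / 24
}

sequential_days <- wall_clock_days(1000, 1)                   # 1000 / 24, about 42 days
parallel_days   <- wall_clock_days(1000, 1, n_parallel = 100) # 10 waves of 1 hour each
```

Under these assumptions, 100 workers would bring the total down from roughly 42 days to about 10 hours, which is why reliable Selenium server startup on every node matters.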