
object 'results' not found

Open · ericchansen opened this issue on Nov 28, 2018 · 8 comments

Before submitting a bug please check the following:

  • [x] Start a new R session
  • [x] Check your credentials file
  • [x] Install the latest doAzureParallel package
  • [x] Submit a minimal, reproducible example
  • [x] Run sessionInfo()

Updates

EDIT (11/29/2018): Added more examples, corrected a typo, and improved formatting.

Description

Can someone please explain why I get the error object 'results' not found? Full code below.

Ideally, I need to return two objects from inside the loop. One is a data frame that gets row-bound via rbind, and the other is a list that needs to become a list of lists. In the example below, I'm only returning the data frame (I will work on adding the list as additional output once this issue is resolved).

Instructions to reproduce the problem

Example 1

remove.packages("rAzureBatch")
remove.packages("doAzureParallel")
devtools::install_github("Azure/rAzureBatch", force = TRUE)
devtools::install_github("Azure/doAzureParallel", force = TRUE)

library(doAzureParallel)
setVerbose(TRUE)

setCredentials(file.path(getwd(), "credentials.json"))
cluster <- makeCluster(file.path(getwd(), "cluster.json"), fullName=TRUE)
registerDoAzureParallel(cluster)
getDoParWorkers()

# my_results <- foreach(t = 1:3) %do% { # Works.
# my_results <- foreach(t = 1:3, .combine = 'rbind') %do% { # Works.
# my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(autoDeleteJob = FALSE)) %dopar% { # Works.
# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
  my_results_df <- data.frame("x" = runif(2), "trial" = replicate(2, t))
  my_results_list <- runif(3)
  return(my_results_df)
}

sessionInfo()

Example 2 featuring superfluous use of setAutoDeleteJob(FALSE)

library(doAzureParallel)
setVerbose(TRUE)
setAutoDeleteJob(FALSE)

setCredentials(file.path(getwd(), "credentials.json"))
cluster <- makeCluster(file.path(getwd(), "cluster.json"), fullName=TRUE)
registerDoAzureParallel(cluster)
getDoParWorkers()

# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE)) %dopar% { # object 'results' not found
# my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE)) %dopar% { # object 'results' not found
my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
  my_results_df <- data.frame("x" = runif(2), "trial" = replicate(2, t))
  my_results_list <- runif(3)
  return(my_results_df)
}

Output from sessionInfo()

R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] doAzureParallel_0.7.2 iterators_1.0.9       foreach_1.4.5        
 [4] RevoUtilsMath_10.0.1  RevoUtils_10.0.7      RevoMods_11.0.0      
 [7] MicrosoftML_9.3.0     mrsdeploy_1.1.3       RevoScaleR_9.3.0     
[10] lattice_0.20-35       rpart_4.1-11         

loaded via a namespace (and not attached):
 [1] codetools_0.2-15       CompatibilityAPI_1.1.0 digest_0.6.13         
 [4] rAzureBatch_0.6.2      mime_0.5               bitops_1.0-6          
 [7] grid_3.4.3             R6_2.2.2               jsonlite_1.5          
[10] httr_1.3.1             curl_3.1               rjson_0.2.15          
[13] tools_3.4.3            RCurl_1.95-4.9         yaml_2.1.16           
[16] compiler_3.4.3         mrupdate_1.0.1       

Output from error

==============================================================================
Id: job20181128173916
chunkSize: 1
enableCloudCombine: FALSE
errorHandling: stop
wait: TRUE
autoDeleteJob: TRUE
==============================================================================
Submitting tasks (3/3)
Waiting for tasks to complete. . .
| Progress: 100.00% (3/3) | Running: 0 | Queued: 0 | Completed: 3 | Failed: 0 |
Tasks have completed. 
Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : 
  object 'results' not found
Called from: e$fun(obj, substitute(ex), parent.frame(), e$data)

ericchansen avatar Nov 28 '18 17:11 ericchansen

Have a look at issue 284. I have just been running into this myself, and it seems that using the option setAutoDeleteJob(FALSE) together with .options.azure = list(enableCloudCombine = FALSE) will solve your issue. The link has more details, but basically you merge the results yourself by reading from blob storage directly.

Pullarg avatar Nov 29 '18 00:11 Pullarg

@Pullarg I tried my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found, which you can see in my original post. That should behave the same way as using setAutoDeleteJob(FALSE). That being said, I tested several variations with setAutoDeleteJob(FALSE) anyway (code below). All resulted in the same error (also shown below).

Error message

==============================================================================
Id: job20181129155730
chunkSize: 1
enableCloudCombine: FALSE
errorHandling: stop
wait: TRUE
autoDeleteJob: FALSE
==============================================================================
Submitting tasks (3/3)
Waiting for tasks to complete. . .
| Progress: 100.00% (3/3) | Running: 0 | Queued: 0 | Completed: 3 | Failed: 0 |
Tasks have completed. 
Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : 
  object 'results' not found
Called from: e$fun(obj, substitute(ex), parent.frame(), e$data)

Sample code

library(doAzureParallel)
setVerbose(TRUE)
setAutoDeleteJob(FALSE)

setCredentials(file.path(getwd(), "credentials.json"))
cluster <- makeCluster(file.path(getwd(), "cluster.json"), fullName=TRUE)
registerDoAzureParallel(cluster)
getDoParWorkers()

# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE)) %dopar% { # object 'results' not found
# my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE)) %dopar% { # object 'results' not found
my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found

  my_results_df <- data.frame("x" = runif(2), "trial" = replicate(2, t))
  my_results_list <- runif(3)
  return(my_results_df)
}

ericchansen avatar Nov 29 '18 16:11 ericchansen

Hi @ericchansen

If you remove the enableCloudCombine flag, you will get your results. The object 'results' not found error occurs because no file is found on Azure Storage that contains the merged result (the RDS file containing all of the tasks' results, since enableCloudCombine is disabled). I will add better error handling for this case.

Below: This example works

my_results <- foreach(t = 1:3, .combine = 'rbind') %dopar% {
  my_results_df <- data.frame("x" = runif(2), "trial" = replicate(2, t))
  my_results_list <- runif(3)
  return(my_results_df)
}

my_results

brnleehng avatar Nov 29 '18 17:11 brnleehng

@brnleehng Yep, that's what I've been doing.

I suppose I don't understand the use case for enableCloudCombine = FALSE. How should we be using this option?

The documentation doesn't have any clear examples besides what's mentioned here. Looking at that example, I feel like that would also trigger this error.

ericchansen avatar Nov 29 '18 17:11 ericchansen

@brnleehng I need to return a list rather than bind rows. Is there any way to skip the merge step entirely? I am getting the same failure. I would like to perform the merge on the head node (non-cloud side) by reading from the storage account directly.

Update: Dumb question; I just needed to turn the result back into a list. After getting the rbind result, I can convert it from a data.frame to a list using split(rbind.df, seq(nrow(rbind.df))).
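That conversion can be sketched with plain base R; the data frame below is made up to mirror the shape of the rbind'ed loop output, and the names are illustrative:

```r
# Illustrative stand-in for the data frame produced by the rbind'ed foreach loop.
rbind.df <- data.frame(x = c(0.11, 0.52, 0.93), trial = c(1, 1, 2))

# split() with one group per row converts the data frame into a named list
# of single-row data frames, one element per original row.
list.result <- split(rbind.df, seq(nrow(rbind.df)))

length(list.result)        # 3
list.result[["2"]]$trial   # 1
```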

Pullarg avatar Nov 30 '18 01:11 Pullarg

Hi @ericchansen

The case for enableCloudCombine = FALSE is to avoid merging all your results onto one VM while the other VMs sit idle (unless you are using autoscale). There are cases where your tasks produce so many large files that the merge task runs out of memory, causing your job to fail.

Hi @Pullarg, you should use the getJobResult function to download all of the results locally; it will merge them into a list for you.

> getJobResult("job20181205211937")
Getting job results...
enableCloudCombine is set to FALSE, we will merge job result locally
[[1]]
[1] 2

[[2]]
[1] 3

[[3]]
[1] 4
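Putting the pieces together, a rough sketch of the whole pattern (this assumes the async wait = FALSE submission described in the doAzureParallel docs, needs valid Azure credentials and a registered cluster to actually run, and the loop body is illustrative):

```r
library(doAzureParallel)

# Submit asynchronously: with wait = FALSE, foreach returns the job id
# instead of the results, and enableCloudCombine = FALSE skips the
# cloud-side merge so each task's output stays in blob storage.
job_id <- foreach(t = 1:3,
                  .options.azure = list(enableCloudCombine = FALSE,
                                        wait = FALSE)) %dopar% {
  data.frame(x = runif(2), trial = replicate(2, t))
}

# Once the job finishes, merge the per-task outputs locally into a list.
my_results <- getJobResult(job_id)
```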

Thanks, Brian

brnleehng avatar Dec 05 '18 21:12 brnleehng

I'm getting the same error even with enableCloudCombine = FALSE. In my code, I am not returning any results from the %dopar% block; instead, I just write my result data frame to disk. My code runs correctly, but the error still appears. Is there a way to avoid this error when the code intentionally does not return a result?

angusrtaylor avatar Jan 04 '19 13:01 angusrtaylor

Can you add NULL at the end of the %dopar% block?

I'm looking into fixing the enableCloudCombine = FALSE path.
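A sketch of that workaround for side-effect-only loops (the file names are illustrative; this needs a registered doAzureParallel backend to run in parallel):

```r
# Each task writes its result to disk and ends with NULL, so the backend
# has a trivial value to collect and no real result needs to be merged.
foreach(t = 1:3) %dopar% {
  my_results_df <- data.frame(x = runif(2), trial = replicate(2, t))
  write.csv(my_results_df, sprintf("result_%d.csv", t), row.names = FALSE)
  NULL
}
```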

brnleehng avatar Jan 08 '19 17:01 brnleehng