Is modification of shared objects parallel-safe?

Open DarwinAwardWinner opened this issue 1 year ago • 5 comments

The documentation mentions that the package allows reading and writing of shared memory, but it would be good to spell out whether or not it's safe for multiple R processes to write to the same object in parallel, and what the semantics of such operations would be. For example:

library(future)
library(future.apply)
library(future.callr)
plan(callr, workers = 20)
library(SharedObject)
library(assertthat)

## Parallel-safety test 1: add 1 to x 100 times, in parallel, in place
x <- share(0, minLength = 0, mustWork = TRUE, copyOnWrite = FALSE)
n <- 100
invisible(future_replicate(
    n,
    x[1] <- x[1] + 1,
    future.scheduling = FALSE
))
assert_that(x == n)
#> [1] TRUE

## Parallel-safety test 2: add 1 to x 100 times, in parallel, in
## place, but with an intermediate variable and a delay
x <- share(0, minLength = 0, mustWork = TRUE, copyOnWrite = FALSE)
n <- 100
invisible(future_replicate(
    n,
    {
        newval <- x[1] + 1
        Sys.sleep(0.1)
        x[1] <- newval
    },
    future.scheduling = FALSE
))
assert_that(x == n)
#> Error: x not equal to n

Created on 2023-08-08 with reprex v2.0.2

On the other hand, I would assume it should always be safe to write to different elements of the same vector in parallel, but maybe that's a bad assumption, e.g. if shared memory is written in "chunks".

DarwinAwardWinner, Aug 08 '23 14:08

Related question: can a shared variable be used as a mutex or semaphore?

DarwinAwardWinner, Aug 08 '23 14:08

Note: I don't actually expect that the 2nd example should ever work without some kind of mutex or similar feature. My main point is that the semantics should be clearly documented so I can reason about whether or not such code will work as I want it to.
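
To make the failure in the second test concrete: the read (`x[1] + 1`) and the write (`x[1] <- newval`) are separate steps, so two workers can both read the old value before either writes. The following is a minimal single-process sketch of that interleaving. It uses an ordinary variable, `shared`, as a stand-in for the shared object and makes no SharedObject calls; the names `newval_a` and `newval_b` are illustrative.

```r
# Ordinary variable standing in for the shared counter (illustration only)
shared <- 0

# Step 1: both "workers" read the current value before either one writes
newval_a <- shared + 1  # worker A reads 0 and computes 1
newval_b <- shared + 1  # worker B also reads 0 and computes 1

# Step 2: both "workers" write back their result, which is now stale
shared <- newval_a
shared <- newval_b

shared  # 1 rather than 2: one increment was lost
```

The `Sys.sleep(0.1)` in the second test only widens the window between the read and the write; the same interleaving is possible in principle whenever the two steps are not performed as one atomic operation.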

DarwinAwardWinner, Aug 08 '23 15:08

Thanks for your observations. You raise two points:

  1. Supporting mutex or semaphore feature
  2. Documenting the behavior of a SharedObject when multiple processes are working on it

I think both are good suggestions. For the second one, my original intent was to support concurrent writes to the same object at different locations (e.g. process 1 writes the first column of a matrix while process 2 writes the second column). I agree that the documentation is not clear about your case. I'll clarify it in the next version.
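
The disjoint-location pattern described here might look like the following sketch. It assumes a PSOCK cluster from the base `parallel` package, and it assumes that in-place writes made with `copyOnWrite = FALSE` are visible to the parent process, as the first test in this thread suggests. The names `m`, `cl`, and `fill_column` are illustrative, and nothing here is a documented guarantee of the package.

```r
library(SharedObject)
library(parallel)

# Shared 4 x 2 numeric matrix; copyOnWrite = FALSE so workers modify it in place
m <- share(matrix(0, nrow = 4, ncol = 2), copyOnWrite = FALSE)

cl <- makeCluster(2)
invisible(clusterEvalQ(cl, library(SharedObject)))
clusterExport(cl, "m")  # each worker gets a handle to the same shared memory

# Hypothetical helper: worker j writes to column j only, so no two
# workers ever touch the same element
fill_column <- function(j) {
  m[, j] <- as.numeric(j)
  NULL
}
invisible(clusterApply(cl, 1:2, fill_column))
stopCluster(cl)

m  # inspect the result in the parent process
```

Because each worker owns its own column, there is no read-modify-write on a common element, which is what distinguishes this case from the counter examples earlier in the thread.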

Jiefei-Wang, Aug 09 '23 02:08

To be more specific, I imagine implementing a mutex something like this (except with the implementation details hidden from the user, obviously):

library(BiocParallel)
library(SharedObject)
library(assertthat)

my_mutex <- share(0, minLength = 0, mustWork = TRUE, copyOnWrite = FALSE)

do_thing_with_mutex <- function(mutex, delay = 1, x) {
  while (mutex != 0) {
    Sys.sleep(0.1)
  }
  mutex[1] <- 1
  message("Job ", x, " has the mutex!")
  Sys.sleep(delay)
  message("Job ", x, " releases the mutex!")
  mutex[1] <- 0
  x
}

n <- 50
d <- 0.5
register(MulticoreParam(n))
## Should take at least N * D seconds
systime <- system.time(bplapply(seq_len(n), do_thing_with_mutex, mutex = my_mutex, delay = d))
message("Elapsed time: ", (systime["elapsed"]))
#> Elapsed time: 19.096
assert_that(systime["elapsed"] >= n * d)
#> Error: systime["elapsed"] not greater than or equal to n * d

Created on 2023-08-08 with reprex v2.0.2

As shown by the failed assertion, this doesn't seem to work with the current code, which isn't surprising since you said the code wasn't designed for it.
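
One likely reason the polling approach fails is that the check (`mutex != 0`) and the set (`mutex[1] <- 1`) are two separate steps, so several workers can pass the check before any of them sets the flag. A lock needs a single atomic test-and-set. As a rough sketch outside of SharedObject, the base R function `dir.create()` can serve that role, since it returns `FALSE` when the directory already exists. The helpers `acquire_lock`, `release_lock`, and `with_lock` are hypothetical names, and the approach assumes a local filesystem where directory creation is atomic, which may not hold on network filesystems.

```r
# Hypothetical helpers, not part of SharedObject: a lock directory is the mutex.

acquire_lock <- function(lock_dir, poll = 0.05) {
  # dir.create() tests and creates in one step; it returns FALSE while
  # another process holds the lock (i.e. the directory already exists)
  while (!dir.create(lock_dir, showWarnings = FALSE)) {
    Sys.sleep(poll)
  }
  invisible(TRUE)
}

release_lock <- function(lock_dir) {
  unlink(lock_dir, recursive = TRUE)
  invisible(TRUE)
}

with_lock <- function(lock_dir, expr) {
  acquire_lock(lock_dir)
  on.exit(release_lock(lock_dir))  # release even if expr raises an error
  force(expr)                      # expr is lazy, so it runs while the lock is held
}

# Usage: run a critical section under the lock
lock_dir <- file.path(tempdir(), "demo.lock")
result <- with_lock(lock_dir, {
  Sys.sleep(0.1)
  "done"
})
```

A worker function like `do_thing_with_mutex` could wrap its critical section in `with_lock()`, provided every worker sees the same `lock_dir` path. If a process dies while holding the lock, the directory is left behind and has to be removed manually.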

DarwinAwardWinner, Aug 09 '23 02:08

To be clear: I'm not necessarily asking for a mutex implementation, since that might be out of scope for this package. I'm asking for documentation indicating whether or not this is an appropriate usage of the package.

DarwinAwardWinner, Aug 09 '23 02:08