What is causing the `external pointer is not valid` error in `parallel::parSapply`?

I am trying to pass an R6 object (in this case, an osqp model) to a number of workers created using parallel::makePSOCKcluster(), and I get:

Error in checkForRemoteErrors(val) : 
  one node produced an error: external pointer is not valid

Based on this post by Henrik Bengtsson:

[...] there is a set of object types that cannot be passed on to another R process and be expected to work there.

I want to understand whether the object I am trying to pass falls in this category and, if so, what my options are.

Here is an MRE:

Scenario 1: (working) Creating the model object inside each worker.

(function() {
    # Create cluster.
    cluster <- parallel::makePSOCKcluster(parallel::detectCores() - 1)

    # Stop cluster.
    on.exit(parallel::stopCluster(cluster))

    # Bare minimum data.
    x <- matrix(rnorm(100), 10, 10)
    y <- runif(10)

    # Run operation.
    result <- parallel::parSapply(cluster, c(1), function(i) {
        # The 'osqp' object.
        model <- osqp::osqp(P = crossprod(x), q = -crossprod(x, y), pars = list(verbose = FALSE))

        # Calling the solver.
        return(model$Solve()$x)
    })

    # Inspect result.
    print(result)
})()

Scenario 2: (not working) Creating the model object in the main and passing it to the workers.

(function() {
    # Create cluster.
    cluster <- parallel::makePSOCKcluster(parallel::detectCores() - 1)

    # Stop cluster.
    on.exit(parallel::stopCluster(cluster))

    # Bare minimum data.
    x <- matrix(rnorm(100), 10, 10)
    y <- runif(10)

    # The 'osqp' object.
    model <- osqp::osqp(P = crossprod(x), q = -crossprod(x, y), pars = list(verbose = FALSE))

    # Run operation.
    result <- parallel::parSapply(cluster, c(1), function(i) {
        # Calling the solver.
        return(model$Solve()$x)
    })

    # Inspect result.
    print(result)
})()

Scenario 1 works, so it seems I can use osqp inside the workers. However, when I instead create the object outside and pass it to the workers (i.e., Scenario 2), it fails.

To provide a bit more context, I have no control over the model creation. I am receiving an instance created elsewhere and I am only allowed to call a few methods on that instance (e.g., $Update()).


Update 1

It does not seem to be related to the fact that R6 instances are environments. The following still works as intended.

# Create mock model class.
ModelMock <- R6::R6Class("ModelMock",
    public = list(
        Solve = function() {
            return(list(x = "Mocked model output."))
        }
    )
)

(function() {
    # Create cluster.
    cluster <- parallel::makePSOCKcluster(parallel::detectCores() - 1)

    # Stop cluster.
    on.exit(parallel::stopCluster(cluster))

    # The mocked 'osqp' object.
    model <- ModelMock$new()

    # Run operation.
    result <- parallel::parSapply(cluster, c(1), function(i) {
        # Calling the solver.
        return(model$Solve()$x)
    })

    # Inspect result.
    print(result)
})()

1 answer

  • answered 2021-06-23 10:08 Mihai

    Roland pointed out that environment(model$Solve) contains a private environment that contains an externalptr object .work:

    typeof(model$.__enclos_env__$private$.work)
    # "externalptr"
    

    This pointer .work is created using compiled code, i.e., via an Rcpp export (see this export).

    This pointer is managed by compiled code and, as such, cannot be used within the workers. It is fine to call the compiled code and create the pointer from within a worker. What is not fine is to create the pointer in another process (i.e., the main process) and then pass it to the worker processes. Each PSOCK worker is a separate R process with its own memory space, so an address that is valid in the main process means nothing there. R also does not preserve the address when it serializes an external pointer, so the worker receives a null pointer, which is what the `external pointer is not valid` error reports.
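
The loss of the address can be seen without osqp or a cluster. The sketch below is only an illustration, not part of the original answer: it borrows the handle of the loaded 'stats' shared library as a stand-in for a non-null external pointer (an assumption made for convenience), and round-trips it through serialize(), the mechanism PSOCK clusters use to ship objects to workers.

```r
# Make sure the 'stats' shared library is loaded (it normally already is).
loadNamespace("stats")

# Stand-in external pointer: the handle of the loaded 'stats' library.
# unclass() is needed because `$` on a DLLInfo object looks up native symbols.
dll <- unclass(getLoadedDLLs()[["stats"]])
ptr <- dll$handle

# Round-trip through serialize()/unserialize(), as happens when an object
# is sent from the main process to a PSOCK worker.
ptr_copy <- unserialize(serialize(ptr, connection = NULL))

# The copy is still an external pointer object ...
typeof(ptr_copy)

# ... but it does not hold the original address any more.
identical(ptr, ptr_copy)

# Printing both shows the original address next to the null pointer.
print(ptr)
print(ptr_copy)
```

The same thing happens to the .work pointer inside the osqp model when the model is sent to a worker.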

    One workaround that might work, as Roland pointed out, is to copy the data behind that pointer into a regular R object, pass that to the workers, and rebuild the pointer there. This is not ideal, and it likely requires support in the package's Rcpp code.
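
If the inputs used to build the model are available (which may not hold in the situation described in the question), a simpler alternative is to ship the plain data and build the model once per worker. The following is a minimal sketch under that assumption, reusing the calls from Scenario 1; the names solve_once and run_per_worker_models are hypothetical.

```r
# Worker task. It is defined at top level so that `model` is looked up in
# each worker's own global environment, not copied from the main process.
solve_once <- function(i) {
    model$Solve()$x
}

# Hypothetical helper: builds one model per worker, then solves on each.
run_per_worker_models <- function(n_workers = 2) {
    cluster <- parallel::makePSOCKcluster(n_workers)
    on.exit(parallel::stopCluster(cluster))

    # Bare minimum data, as in the question.
    x <- matrix(rnorm(100), 10, 10)
    y <- runif(10)

    # Send only plain R data to the workers.
    parallel::clusterExport(cluster, c("x", "y"), envir = environment())

    # Create the model inside each worker, so the external pointer is
    # allocated in that worker's own memory. Return NULL to avoid sending
    # the model (and its pointer) back to the main process.
    parallel::clusterEvalQ(cluster, {
        model <- osqp::osqp(P = crossprod(x), q = -crossprod(x, y),
                            pars = list(verbose = FALSE))
        NULL
    })

    # One task per worker; each task uses that worker's own model.
    parallel::parSapply(cluster, seq_len(n_workers), solve_once)
}
```

Calling run_per_worker_models() should return one column of solver output per task. Because the model lives in each worker's global environment, it can be reused across several parSapply calls on the same cluster.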

    For those interested in this particular package, you may also follow this issue on GitHub.