
Passing variables by reference to thread jobs is not correctly documented

Open martinprikryl opened this issue 3 years ago • 4 comments

Documentation Issue

The PowerShell Scopes documentation says this about passing variables to thread jobs:

The Using scope modifier is supported in the following contexts:

  • ...
  • Thread jobs, started via Start-ThreadJob or ForEach-Object -Parallel (separate thread session)

Depending on the context, embedded variable values are either independent copies of the data in the caller's scope or references to it.
...
In thread sessions, they are passed by reference. This means it is possible to modify call scope variables in a different thread. To safely modify variables requires thread synchronization.

To me, coming from a C#/C++ background, passing by reference means that you can assign to these variables and have the assigned value be available in the calling code.

Yet the following fails to run:

$foo = 1

Start-ThreadJob {
    Write-Host $using:foo
    $using:foo = 2
} | Wait-Job | Out-Null

Write-Host $foo

It errors on $using:foo = 2 with:

The assignment expression is not valid. The input to an assignment operator must be an object that is able to accept assignments, such as a variable or a property.

I assume this is not a bug in PowerShell, but rather that the documentation does not correctly describe how the variable can be modified. The variable itself cannot be modified, but if you pass something like an object or a hash table, you can modify its fields or contents. That is, it is conceptually more like passing a pointer to an object by value than passing a variable by reference.
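A minimal sketch of that distinction, assuming PowerShell 7 or later (where Start-ThreadJob is available by default). The name $state is made up for illustration. Assigning to $using:state would fail exactly as above, but changing an entry of the hash table it refers to is visible to the caller:

```powershell
# The caller's variable holds a reference-type value (a hash table).
$state = @{ Value = 1 }

Start-ThreadJob {
    # $using:state expands to the same hash table instance the caller holds.
    $shared = $using:state
    $shared.Value = 2
} | Receive-Job -Wait -AutoRemoveJob

Write-Host $state.Value   # prints 2
```

Only one thread writes here, so no synchronization is involved. With several writers that would no longer hold.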

Context of the issue

https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_scopes#the-using-scope-modifier

Originally posted on Stack Overflow:
Modifying PowerShell $using scope variables in thread started with Start-ThreadJob

martinprikryl, Mar 25 '21 07:03

Hey @martinprikryl, thanks for the feedback here. This is a fair point. We don't have an article about concurrency specifically related to PowerShell, but we have an issue open here to address that.

A good solution here will be to link to that article once it's created.

chasewilson, Mar 25 '21 15:03

The problem is that https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_scopes#the-using-scope-modifier confuses variables with their values.

This means it is possible to modify call scope variables in a different thread.

A $using: reference only ever expands to a variable's value - there is no way to refer to the caller's variables themselves from out-of-runspace code.

Therefore, you can never update a caller's variable.

What you can do - in thread-based parallelism only - is to modify an object that a caller's variable references, which only applies if the variable value happens to be an instance of a .NET reference type, such as a hash table.
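The "thread-based parallelism only" part can be illustrated with a short comparison. This is a sketch, assuming PowerShell 7 or later, and the variable names are hypothetical. Start-Job runs in a child process and receives a deserialized copy, while Start-ThreadJob receives the caller's object itself:

```powershell
$state = @{ Value = 1 }

# Process-based job: $using:state is a deserialized copy of the hash table.
Start-Job {
    $copy = $using:state
    $copy.Value = 2
} | Receive-Job -Wait -AutoRemoveJob

$afterProcessJob = $state.Value   # still 1: only the copy was changed

# Thread-based job: $using:state is the caller's hash table instance.
Start-ThreadJob {
    $shared = $using:state
    $shared.Value = 3
} | Receive-Job -Wait -AutoRemoveJob

$afterThreadJob = $state.Value    # 3: the shared object was changed

Write-Host "After Start-Job: $afterProcessJob; after Start-ThreadJob: $afterThreadJob"
```

In both cases the variable $state itself is never reassigned from the job. Only the object it references can change, and only in the thread-based case.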

mklement0, Sep 20 '21 22:09

You can update the PSVariable instance, the same way you can update any other reference type:

$foo = 1
$refOfFoo = Get-Variable foo

Start-ThreadJob {
    ($using:refOfFoo).Value = 2
} | Receive-Job -Wait -AutoRemoveJob

Write-Host $foo

Worth noting that this is clearly not a thread-safe operation. Looking at the answer you got on SO, it's also worth noting that a synchronized hash table does not ensure that updating the same key from multiple threads is thread-safe; that claim is incorrect. Each individual read or write is synchronized, but an increment is a read followed by a write, so two threads can interleave and lose an update. A locking mechanism must be used while updating it. A simple way to demonstrate this:

$attempts = 0

do {
    $attempts++

    $foo = [hashtable]::Synchronized(@{
        Value = 0
    })

    0..10 | ForEach-Object -Parallel {
        Start-Sleep -Milliseconds 200
        ($using:foo).Value++
    } -ThrottleLimit 11
}
until($foo['Value'] -ne 11)

"It took $attempts attempts to make this fail."
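One possible way to add the locking mentioned above, as a sketch rather than the only approach: take a lock on the hash table's SyncRoot around the increment with System.Threading.Monitor. This assumes PowerShell 7 or later for ForEach-Object -Parallel, and the local name $table is made up for illustration.

```powershell
$foo = [hashtable]::Synchronized(@{
    Value = 0
})

0..10 | ForEach-Object -Parallel {
    $table = $using:foo
    Start-Sleep -Milliseconds 200

    # Hold the lock for the whole read-modify-write, not just each half of it.
    [System.Threading.Monitor]::Enter($table.SyncRoot)
    try {
        $table.Value++
    }
    finally {
        [System.Threading.Monitor]::Exit($table.SyncRoot)
    }
} -ThrottleLimit 11

Write-Host $foo.Value   # prints 11
```

Because the read and the write now happen under a single lock, the eleven increments cannot overlap, so the final value is always 11.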

santisq, Apr 16 '23 04:04

I don't get this demonstration. Why do you expect $foo['Value'] to be updated in sequence? The only thing you need to care about is that two different threads don't try to update the value at the exact same moment, right?

And shouldn't you be using one of the thread-safe methods?

From what I can tell, thread safety here is mostly about adding and removing objects?
MS Learn - Thread-safe collections
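For the add/remove case, a sketch using one of the .NET thread-safe collections might look like the following. It assumes PowerShell 7 or later, and the name $bag is hypothetical. It covers concurrent adds only; a read-modify-write such as the ++ in the demonstration above is a different situation.

```powershell
# ConcurrentBag supports Add from multiple threads without external locking.
$bag = [System.Collections.Concurrent.ConcurrentBag[int]]::new()

1..10 | ForEach-Object -Parallel {
    $shared = $using:bag
    $shared.Add($_)
}

Write-Host $bag.Count   # prints 10
```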

DennisL68, Jun 22 '23 16:06