dbatools icon indicating copy to clipboard operation
dbatools copied to clipboard

Module Import Not Thread Safe

Open RandyInMarin opened this issue 1 year ago • 11 comments

Verified issue does not already exist?

I have searched and found no existing issue

What error did you receive?

Import-Command: The process cannot access the file 'C:\Users\rstone\Documents\PowerShell\Modules\dbatools\2.0.3\dbatools.dat' because it is being used by another process.

Steps to Reproduce

$test = 1..10 | ForEach-Object -ThrottleLimit 5 -Parallel {
  $thread = [System.Threading.Thread]::CurrentThread
  $mod = Get-Module -Name DBATools
  $dbstools = $null -ne $mod
  if (-not $dbstools) {
      Import-Module -Name dbatools -Function 'Find-DbaInstance'
  }
  $count = (Get-Command -Module dbatools).Count
  [PSCustomObject]@{
    PID             = $PID
    ManagedThreadId = $thread.ManagedThreadId
    dbstools        = $dbstools
    count           = $count
  }
}
$test | ft *

Please confirm that you are running the most recent version of dbatools

Major  Minor  Build  Revision
-----  -----  -----  --------
2      0      3      -1

Other details or mentions

All errors go away if I provide locking code to force imports to be sequential. This might seem really bad, but it only is required to load the modules on the the "ThrottleLimit" number of threads.

The data file is locked for write? Should it be able to share the read? If not, provide thread safety where required rather than on the entire module?

The results still looked okay for the "1..10 | ForEach-Object -ThrottleLimit 5 -Parallel" test.

  PID ManagedThreadId dbstools count
  --- --------------- -------- -----
27848              34    False     1
27848              32    False     1
27848              40     True     1
27848              41     True     1
27848              42     True     1
27848              43     True     1
27848              44     True     1
27848              36    False     1
27848              33    False     1
27848              35    False     1

When pushed, the result can be different on different threads. For example "1..100 | ForEach-Object -ThrottleLimit 10 -Parallel" has a lot of errors and the cmdlet counts are not always that expected. I've done tests importing all cmdlets and see only 10 loaded on some threads. Here, I have been trying to load 1 and see all loaded sometimes. Never know what you get when thread unsafe.

  PID ManagedThreadId dbstools count
  --- --------------- -------- -----
...
27848             110     True     1
27848             109     True     1
27848             108     True     1
27848              32    False     1
27848              16    False     1
27848               7    False   687
27848              23    False     1
27848              29    False   687
27848              31    False     1
27848              18    False   687
27848              49    False   687
...

Here is a thread safe version. No errors. The first 2 threads process everything before 8 threads are done importing the module. In a real case, searching for instances on many servers would take long enough to allow all threads time to get some real work done.

$fileLock = [System.Threading.ReaderWriterLockSlim]::new()

$test = 1..100 | ForEach-Object -ThrottleLimit 10 -Parallel {
    $managedThreadId = [System.Threading.Thread]::CurrentThread.ManagedThreadId 
    $mod = Get-Module -Name DBATools
    $dbstools = $null -ne $mod
    if (-not $dbstools) {
        $wait = $true
        $lock = $using:fileLock
        try {
            Write-Host "Thread $managedThreadId`: Waiting for lock..."
            $lock.EnterWriteLock()
            Write-Host "Thread $managedThreadId`: Importing module..."
            Import-Module -Name dbatools -Function 'Find-DbaInstance'
        }
        finally {
            if ($lock.IsWriteLockHeld) {
                $lock.ExitWriteLock()
                Write-Host "Thread $managedThreadId`: Lock released."
            }      
        }
    }
    else {
        $wait = $false
    }
    $count = (Get-Command -Module dbatools).Count
    [PSCustomObject]@{
        PID             = $PID
        ManagedThreadId = $managedThreadId
        dbstools        = $dbstools
        count           = $count
        wait            = $wait
    }
}
$test | ft *
  PID ManagedThreadId dbstools count  wait
  --- --------------- -------- -----  ----
27848             123    False     1  True
27848              82     True     1 False
27848              83     True     1 False
...
27848               3     True     1 False
27848             125    False     1  True
27848              37     True     1 False
...
27848             153     True     1 False
27848             154     True     1 False
27848             124    False     1  True
27848             131    False     1  True
27848             132    False     1  True
27848             134    False     1  True
27848              78    False     1  True
27848              79    False     1  True
27848              80    False     1  True
27848              81    False     1  True

What PowerShell host was used when producing this error

VS Code (integrated terminal)

PowerShell Host Version

Name                           Value
----                           -----
PSVersion                      7.3.4
PSEdition                      Core
GitCommitId                    7.3.4
OS                             Microsoft Windows 10.0.19045
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

SQL Server Edition and Build number

NA

.NET Framework Version

.NET 7.0.5

PSChildName                      Version
-----------                      -------
v2.0.50727                       2.0.50727.4927
v3.0                             3.0.30729.4926
Windows Communication Foundation 3.0.4506.4926
Windows Presentation Foundation  3.0.6920.4902
v3.5                             3.5.30729.4926
Client                           4.8.04084
Full                             4.8.04084
Client                           4.0.0.0

RandyInMarin avatar Jun 07 '23 17:06 RandyInMarin

thanks for the report, that's a bummer, i put a check in. any suggestions?

https://github.com/dataplat/dbatools/blob/development/dbatools.psm1#L180

potatoqualitee avatar Jun 07 '23 18:06 potatoqualitee

hi @potatoqualitee . This may be know to you but for the casual reader it may be beneficial a recap. If Import-Command indeed uses File.Open(Path, FileMode.Open, FileAccess.Read) and it cannot be changed there's no way around it. Without passing a third argument to File.Open (the FileShare enum) the lock is exclusive, as its the default.

It's easily reproducible opening two PS consoles and doing

$fpath = "c:\some\path\to\file.txt"
$x = [io.file]::Open($fpath, 'Open', 'Read')

you'll see the second blocked, until you call $x.Close() on the first one.

If we could tweak a bit the importer in some way, the lock-free way to do it would be

$fpath = "c:\some\path\to\file.txt"
$x = [io.file]::Open($fpath, 'Open', 'Read', 'Read')

if you try this version it'll work in both consoles.

edit: fffffffffff . I now see Import-Command comes from the dbatools.library. If we can edit that, https://github.com/dataplat/dbatools.library/blob/d2519ad5e0429f5f43cbc1d7cdbb0925fd119314/project/dbatools/Commands/ImportCommandCommand.cs#L38 needs to be using (FileStream fs = File.Open(Path, FileMode.Open, FileAccess.Read, FileShare.Read));

niphlod avatar Jun 07 '23 19:06 niphlod

Should there still be a loop to wait a few seconds just in case something else locks the file (e.g., a virus checker scanning script)? Is using a test for write access to much when read access is required?

RandyInMarin avatar Jun 07 '23 20:06 RandyInMarin

I'd say leave the check for things out of dbatools' control, but given we can make something better allowing import-command to share the .dat while reading ... why not ?

BTW: a simple test for read access should be enough, so ::OpenRead( instead of ::OpenWrite( should suffice

niphlod avatar Jun 07 '23 20:06 niphlod

Is the dbatools.library something I can change on my system to test this? Seems like there used to be a "all" file of some sort with everything. I could hack it to make a simple change. Don't see that now.

RandyInMarin avatar Jun 07 '23 20:06 RandyInMarin

The library is now a separate module that will also import into each thread you create. AllCommands.ps1 is no longer used, that was replaced with the dat file.

wsmelton avatar Jun 07 '23 22:06 wsmelton

working on this now

potatoqualitee avatar Jun 08 '23 11:06 potatoqualitee

Hello,

is some progress on this issue with multithread? As we also are experiencing this issue from long time and waiting for new DBATOOL version with this issue fixed.

ToGoKuWr avatar Jul 03 '23 14:07 ToGoKuWr

I'm wondering if the write access acts as a thread block and prevents more problems than it creates. What other shared resources (e.g., temp files, tables, settings, etc.) are being used by multiple threads during the module import and after?

Some preference variable values are reset, so I suspect the global and module scopes are local to each thread. At least memory is thread safe. BTW, here is something that should scare anybody using -parallel. (Some preference variables are copied to each thread correctly, but the most important one is not.)

$global:WhatIfPreference = $true
$WhatIfPreference
1..2 | ForEach-Object -Parallel {$WhatIfPreference}

True
False
False
1..2 | ForEach-Object -Parallel {$WhatIfPreference=$using:WhatIfPreference; $WhatIfPreference} # fix

RandyInMarin avatar Jul 05 '23 19:07 RandyInMarin

Hello guys,

is there any discussion to fix this problem soon that in next release we can find this fixed?

ToGoKuWr avatar Aug 03 '23 14:08 ToGoKuWr

No updates to provide at this time.

wsmelton avatar Aug 03 '23 15:08 wsmelton