
LocalCluster fails to spawn requested number of workers (capped at 5 workers on Windows)

jianlinshi opened this issue 2 months ago · 3 comments

Description

When creating a LocalCluster with n_workers > 5, the cluster consistently spawns exactly 5 workers, regardless of the requested number. This appears to be a hard limit on Windows systems.

Environment

  • OS: Windows
  • Python version: (output from python --version)
  • Dask version: 2025.10.0
  • Distributed version: 2025.10.0
  • Installation method: conda/miniforge

Minimal Reproducible Example

from dask.distributed import LocalCluster, Client
import time

if __name__ == '__main__':
    print('=== TESTING DASK LOCAL CLUSTER WITHOUT PLUGINS ===')
    
    for n_workers in [5, 7, 10, 20, 50]:
        print(f'\n--- Testing {n_workers} workers ---')
        
        cluster = LocalCluster(
            n_workers=n_workers,
            threads_per_worker=1,
            processes=True,
            memory_limit='500MB',
            silence_logs=True
        )
        client = Client(cluster)
        
        # Wait for workers to connect
        time.sleep(5)
        
        # Check how many workers registered
        info = client.scheduler_info()
        num_connected = len(info['workers'])
        print(f'Requested: {n_workers}, Connected: {num_connected}')
        
        client.close()
        cluster.close()
        time.sleep(2)
    
    print('\n=== TEST COMPLETE ===')

Expected Behavior

The LocalCluster should spawn the number of workers specified by the n_workers parameter. For example:

  • n_workers=7 should create 7 workers
  • n_workers=10 should create 10 workers
  • n_workers=50 should create 50 workers

Actual Behavior

Regardless of the n_workers parameter, exactly 5 workers are created and connected:

=== TESTING DASK LOCAL CLUSTER WITHOUT PLUGINS ===

--- Testing 5 workers ---
Requested: 5, Connected: 5

--- Testing 7 workers ---
Requested: 7, Connected: 5

--- Testing 10 workers ---
Requested: 10, Connected: 5

--- Testing 20 workers ---
Requested: 20, Connected: 5

--- Testing 50 workers ---
Requested: 50, Connected: 5

=== TEST COMPLETE ===

jianlinshi · Nov 03 '25 02:11

The number of workers returned by client.scheduler_info() was capped at 5 in #9045 to avoid sending too much information on large clusters. You can call client.scheduler_info(n_workers=-1) explicitly to remove this limit.

If you do this in your script, are you still seeing the problem?
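
As a minimal sketch, here is the reproducer's check with the cap removed; it reuses the LocalCluster setup above, swaps the sleep for Client.wait_for_workers, and compares the default (capped) call against n_workers=-1:

from dask.distributed import LocalCluster, Client

if __name__ == '__main__':
    cluster = LocalCluster(
        n_workers=10,
        threads_per_worker=1,
        processes=True,
        memory_limit='500MB',
        silence_logs=True
    )
    client = Client(cluster)
    # Block until all 10 workers have registered with the scheduler
    client.wait_for_workers(10)

    # Default call: the workers dict is truncated to 5 entries since #9045
    capped = client.scheduler_info()
    # Passing n_workers=-1 removes the cap and returns every worker
    full = client.scheduler_info(n_workers=-1)
    print(f"Capped: {len(capped['workers'])}, Uncapped: {len(full['workers'])}")

    client.close()
    cluster.close()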

jacobtomlinson · Nov 03 '25 12:11

Thanks. Why not make the signature def scheduler_info(self, n_workers: int = -1, **kwargs: Any) instead of def scheduler_info(self, n_workers: int = 5, **kwargs: Any)? Capping at 5 without explicit configuration is really confusing.

jianlinshi · Nov 03 '25 14:11

See #9045 for the reasoning behind this decision.

We generally don't recommend that people use scheduler_info directly; it should probably be named _scheduler_info. You can use the n_workers property directly instead.
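
For example, a minimal sketch that avoids scheduler_info entirely by counting workers on the cluster object; it uses cluster.workers, LocalCluster's mapping of the worker objects it spawned, since the exact location of the n_workers property is not shown in this thread:

from dask.distributed import LocalCluster, Client

if __name__ == '__main__':
    cluster = LocalCluster(n_workers=10, threads_per_worker=1, processes=True)
    client = Client(cluster)
    client.wait_for_workers(10)

    # Count workers from the cluster object itself; no scheduler RPC
    # and no scheduler_info cap is involved
    print(f"Spawned workers: {len(cluster.workers)}")

    client.close()
    cluster.close()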

Where were you seeing the limit? Was it in Jupyter or somewhere else? Or was it due to using scheduler_info directly? It would help to know where you ran into this so we could potentially reconsider how we expose this information.

jacobtomlinson · Nov 04 '25 10:11