Add goroutine core affinity support for RP2040/RP2350 systems

Open amken3d opened this issue 1 month ago • 7 comments

This PR proposes:

  • Support for CPU core pinning and affinity for tasks and goroutines.
  • A scheduler update that respects affinity constraints, with separate queues for pinned and shared tasks.
  • New runtime API functions LockToCore, UnlockFromCore, GetAffinity, and CurrentCPU.
  • An example program that demonstrates pinned and unpinned execution.
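
To illustrate the idea of separate queues for pinned and shared tasks, here is a minimal toy model in plain Go. It is not the scheduler code from this PR: the types task and sched, the helpers enqueue and next, and the policy of draining pinned work before shared work are all assumptions made for illustration.

```go
package main

import "fmt"

// anyCore marks an unpinned task (mirrors the -1 convention used in this PR).
const anyCore = -1

// task is a hypothetical stand-in for a goroutine's scheduler record.
type task struct {
	name     string
	affinity int // anyCore, or a core index
}

// sched is a toy model: one queue per core for pinned tasks, plus one shared queue.
type sched struct {
	pinned [][]*task
	shared []*task
}

func newSched(cores int) *sched {
	return &sched{pinned: make([][]*task, cores)}
}

// enqueue routes a pinned task to its core's queue and any other task to the shared queue.
func (s *sched) enqueue(t *task) {
	if t.affinity == anyCore {
		s.shared = append(s.shared, t)
		return
	}
	s.pinned[t.affinity] = append(s.pinned[t.affinity], t)
}

// next returns the next task for the given core, preferring pinned work
// (an assumed policy), or nil if there is nothing this core may run.
func (s *sched) next(core int) *task {
	if q := s.pinned[core]; len(q) > 0 {
		s.pinned[core] = q[1:]
		return q[0]
	}
	if len(s.shared) > 0 {
		t := s.shared[0]
		s.shared = s.shared[1:]
		return t
	}
	return nil
}

func main() {
	s := newSched(2)
	s.enqueue(&task{name: "main", affinity: 0})
	s.enqueue(&task{name: "worker", affinity: 1})
	s.enqueue(&task{name: "free", affinity: anyCore})
	for core := 0; core < 2; core++ {
		for t := s.next(core); t != nil; t = s.next(core) {
			fmt.Printf("core %d runs %s\n", core, t.name)
		}
	}
}
```

In this model a core never picks up a task pinned to the other core, while unpinned tasks go to whichever core asks first.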

API Functions

runtime.NumCPU() int

Returns the number of CPU cores available (returns 2 on RP2040/RP2350).

runtime.CurrentCPU() int

Returns the current CPU core number (0 or 1).

runtime.LockToCore(core int)

Pins the current goroutine to the specified core:

  • core = 0 - Pin to core 0
  • core = 1 - Pin to core 1
  • core = -1 - Unpin (allow running on any core)

Panics if core is invalid (not -1, 0, or 1).

runtime.UnlockFromCore()

Unpins the current goroutine, allowing it to run on any core. Equivalent to runtime.LockToCore(-1).

runtime.GetAffinity() int

Returns the current goroutine's CPU affinity:

  • Returns -1 if not pinned (can run on any core)
  • Returns 0 or 1 if pinned to that specific core
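
The semantics described above can be modelled in a few lines of ordinary Go. This is only a sketch of the documented behaviour, not the runtime implementation: goroutineState and its methods are hypothetical names, and numCores is hard-coded to 2 on the assumption of an RP2040/RP2350.

```go
package main

import "fmt"

// numCores is assumed to be 2, as on RP2040/RP2350.
const numCores = 2

// goroutineState is a hypothetical per-goroutine record holding the affinity.
type goroutineState struct {
	affinity int // -1 means "not pinned"
}

// newGoroutineState returns a record for an unpinned goroutine.
func newGoroutineState() *goroutineState {
	return &goroutineState{affinity: -1}
}

// lockToCore models LockToCore: it accepts -1, 0, or 1 and panics otherwise.
func (g *goroutineState) lockToCore(core int) {
	if core < -1 || core >= numCores {
		panic(fmt.Sprintf("invalid core %d", core))
	}
	g.affinity = core
}

// unlockFromCore models UnlockFromCore as lockToCore(-1).
func (g *goroutineState) unlockFromCore() { g.lockToCore(-1) }

// getAffinity models GetAffinity.
func (g *goroutineState) getAffinity() int { return g.affinity }

func main() {
	g := newGoroutineState()
	fmt.Println(g.getAffinity()) // -1
	g.lockToCore(1)
	fmt.Println(g.getAffinity()) // 1
	g.unlockFromCore()
	fmt.Println(g.getAffinity()) // -1
}
```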

An example program is included in the examples directory.

  • Tested on both pico and pico2

  • Output of the example program:

=== Core Pinning Example ===                                             
Number of CPU cores: 2                                                   
Main starting on core: 0                                                 
                                                                         
Main pinned to core: 0

Core 0 (main): 0 on CPU 0
Worker pinned to core: 1
  Core 1 (worker): 0 on CPU 0
Unpinned worker starting, affinity: 0
    Unpinned worker: 0 on CPU 0
Core 0 (main): 1 on CPU 0
    Unpinned worker: 1 on CPU 0
Core 0 (main): 2 on CPU 0
  Core 1 (worker): 2 on CPU 1
    Unpinned worker: 2 on CPU 0                                                 
Core 0 (main): 3 on CPU 0                                                       
  Core 1 (worker): 3 on CPU 1
Core 0 (main): 4 on CPU 0                                                       
  Core 1 (worker): 4 on CPU 1                                                   
    Unpinned worker: 3 on CPU 0                                                 
Core 0 (main): 5 on CPU 0                                                       
  Core 1 (worker): 5 on CPU 1                                                   
    Unpinned worker: 4 on CPU 0                                                 
Core 0 (main): 6 on CPU 0                                                       
  Core 1 (worker): 6 on CPU 1                                                   
Core 0 (main): 7 on CPU 0                                                       
  Core 1 (worker): 7 on CPU 1                                                   
    Unpinned worker: 5 on CPU 0                                                 
Core 0 (main): 8 on CPU 0                                                       
  Core 1 (worker): 8 on CPU 1                                                   
    Unpinned worker: 6 on CPU 0                                                 
Core 0 (main): 9 on CPU 0                                                       
  Core 1 (worker): 9 on CPU 1                                                   
    Unpinned worker: 7 on CPU 0                                                 
                                                                                
Main unpinned, affinity: -1                                                     
Unpinned main on CPU 0                                                          
  Core 1 worker finished                                                        
Unpinned main on CPU 0                                                          
    Unpinned worker: 8 on CPU 0                                                 
Unpinned main on CPU 0                                                          
    Unpinned worker: 9 on CPU 0                                                 
Unpinned main on CPU 0                                                          
Unpinned main on CPU 1                                                          
    Unpinned worker finished                                                    
                                                                                
Example complete!

amken3d avatar Nov 16 '25 23:11 amken3d

Do you actually care about the particular core? If not, are the existing runtime.LockOSThread and runtime.UnlockOSThread calls sufficient to lock/unlock a goroutine to a core?

eliasnaur avatar Nov 17 '25 10:11 eliasnaur

This is what I see for LockOSThread:

// LockOSThread wires the calling goroutine to its current operating system thread.
// Stub for now
// Called by go1.18 standard library on windows, see https://github.com/golang/go/issues/49320
func LockOSThread() {
}

// UnlockOSThread undoes an earlier call to LockOSThread.
// Stub for now
func UnlockOSThread() {
}

There seems to be no implementation behind it.

For the RP2, since it is a symmetric multiprocessor, it probably doesn't matter which exact core is used. But for something like the STM32H7, it would matter which core you pin to. (I know we don't support multicore on it yet.)

amken3d avatar Nov 17 '25 12:11 amken3d

I know. What I'm saying is to change LockOSThread to mean "lock the current goroutine to a core" (when using the cores scheduler).
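
For reference, this is the usual calling pattern for runtime.LockOSThread and runtime.UnlockOSThread in upstream Go. Both functions exist in the standard runtime package; the helper runLocked is a hypothetical name used only for this sketch. Whether the same code would pin a goroutine to a core under TinyGo's cores scheduler depends on the change suggested here and is an assumption, not current behaviour.

```go
package main

import (
	"fmt"
	"runtime"
)

// runLocked runs fn on a new goroutine that stays wired to one OS thread
// for the duration of fn, then waits for it to finish.
func runLocked(fn func()) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		fn()
	}()
	<-done
}

func main() {
	runLocked(func() {
		fmt.Println("running while locked")
	})
}
```

Note that neither call takes an argument, so this pattern cannot express which thread (or core) to lock to.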

eliasnaur avatar Nov 17 '25 12:11 eliasnaur

Fair point. That seems reasonable to me. I can use those function names instead. The only issue I see is that LockOSThread can't take any arguments. I'd like to be able to specify which core to lock to.

amken3d avatar Nov 17 '25 13:11 amken3d

Right. So LockOSThread is enough for use cases where you only care about exclusive access to some core. For heterogeneous cores, I suggest:

  • Move the API to package machine.
  • Drop CurrentCPU - it's racy (its result may be invalidated at any time).
  • Drop GetAffinity - it seems that code that pins itself to a particular core should know what it's doing.
  • Drop the -1 special case from LockToCore.
  • Rename LockToCore to LockCore to mimic LockOSThread naming. Rename UnlockFromCore to UnlockCore for the same reason.

An important issue to think about: what happens if the requested core is busy? LockOSThread doesn't have this problem (some core must be running it).

eliasnaur avatar Nov 17 '25 16:11 eliasnaur

Looks like it passed all checks except the macos(13) test, which failed with this error:

This is a scheduled macos-13 brownout. The macOS-13 based runner images are being deprecated. For more details, see https://github.com/actions/runner-images/issues/13046.

amken3d avatar Nov 18 '25 15:11 amken3d

@amken3d I just created https://github.com/tinygo-org/tinygo/pull/5093 to address the macOS 13 runner deprecation.

deadprogram avatar Nov 18 '25 15:11 deadprogram