opensbi

Livelock condition under high load with many TLB shootdowns

Open clausecker opened this issue 5 months ago • 2 comments

On riscv64 FreeBSD, I have been experiencing problems when running the system under high load. The system hangs, apparently with all threads in a livelock condition. This condition reproduces on a SiFive Unmatched board, as well as on QEMU and RVVM.

Evaluating many hangs suggests a link to multithreaded applications that cause lots of TLB shootdowns, specifically the Go toolchain. Stack traces of a half-stuck system, where some threads seem to be in a livelock but others are fine, look like this:

db> show active trace                        
                                                           
Tracing command clock pid 2 tid 100029 td 0xffffffc0adc0a140 (CPU 0)                                                   
ipi_stop() at ipi_stop+0x2c                                                                                            
intr_ipi_dispatch() at intr_ipi_dispatch+0x50  
sbi_ipi_intr() at sbi_ipi_intr+0x70                  
intr_event_handle() at intr_event_handle+0x88 
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c
intc_intr() at intc_intr+0x42                                                                                          
intr_irq_handler() at intr_irq_handler+0x54                                                                             
do_trap_supervisor() at do_trap_supervisor+0x78 
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74                                            
--- interrupt 1                                                                                                        
spinlock_exit() at spinlock_exit+0x3c                      
mi_switch() at mi_switch+0x17a                                                                                          
softclock_thread() at softclock_thread+0x76                                                                            
fork_exit() at fork_exit+0x68                
fork_trampoline() at fork_trampoline+0xa                                                                               
                                                           
Tracing command pagedaemon pid 8 tid 100115 td 0xffffffc0adc2bcc0 (CPU 6)                                              
ipi_stop() at ipi_stop+0x2c                                                                                            
intr_ipi_dispatch() at intr_ipi_dispatch+0x50
sbi_ipi_intr() at sbi_ipi_intr+0x70            
intr_event_handle() at intr_event_handle+0x88                                                                           
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c                                                                        
intc_intr() at intc_intr+0x42                        
intr_irq_handler() at intr_irq_handler+0x54    
do_trap_supervisor() at do_trap_supervisor+0x78
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74                                            
--- interrupt 1                                                                                                        
__rw_wlock_hard() at __rw_wlock_hard+0x442
_rw_wlock_cookie() at _rw_wlock_cookie+0x94    
pmap_ts_referenced() at pmap_ts_referenced+0xa4
$x() at $x+0x62c                              
vm_pageout() at vm_pageout+0x1c2        
fork_exit() at fork_exit+0x68                                                                                           
fork_trampoline() at fork_trampoline+0xa                                                                    

Tracing command sh pid 76959 tid 120011 td 0xffffffc103152140 (CPU 5)                                                  
ipi_stop() at ipi_stop+0x2c                                                                                            
intr_ipi_dispatch() at intr_ipi_dispatch+0x50           
sbi_ipi_intr() at sbi_ipi_intr+0x70          
intr_event_handle() at intr_event_handle+0x88                                                                          
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c
intc_intr() at intc_intr+0x42                  
intr_irq_handler() at intr_irq_handler+0x54            
do_trap_supervisor() at do_trap_supervisor+0x78
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74                                            
--- interrupt 1                                                                                                        
sbi_remote_fence_i() at sbi_remote_fence_i+0x22                                                                        
pmap_enter_quick() at pmap_enter_quick+0x6e    
vm_fault_prefault() at vm_fault_prefault+0x180 
vm_fault() at vm_fault+0x164c                               
vm_fault_trap() at vm_fault_trap+0x4a                       
page_fault_handler() at page_fault_handler+0x1c4            
do_trap_user() at do_trap_user+0xf0                         
cpu_exception_handler_user() at cpu_exception_handler_user+0x72                                                         
--- exception 12, tval = 0x12a763f87e                       

Tracing command sh pid 76969 tid 120402 td 0xffffffc1030735c0 (CPU 4)                                                   
ipi_stop() at ipi_stop+0x2c                                 
intr_ipi_dispatch() at intr_ipi_dispatch+0x50               
sbi_ipi_intr() at sbi_ipi_intr+0x70                         
intr_event_handle() at intr_event_handle+0x88               
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c             
intc_intr() at intc_intr+0x42                               
intr_irq_handler() at intr_irq_handler+0x54                 
do_trap_supervisor() at do_trap_supervisor+0x78             
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74                                             
--- interrupt 1                                             
sbi_remote_fence_i() at sbi_remote_fence_i+0x22             
pmap_enter_quick() at pmap_enter_quick+0x6e                 
vm_fault_prefault() at vm_fault_prefault+0x180              
vm_fault() at vm_fault+0x164c                               
vm_fault_trap() at vm_fault_trap+0x4a                       
page_fault_handler() at page_fault_handler+0x1c4            
do_trap_user() at do_trap_user+0xf0                         
cpu_exception_handler_user() at cpu_exception_handler_user+0x72                                                         
--- exception 12, tval = 0x8337c0c0                         

Tracing command sh pid 76971 tid 120096 td 0xffffffc1030c75c0 (CPU 7)                                                   
ipi_stop() at ipi_stop+0x2c                                 
intr_ipi_dispatch() at intr_ipi_dispatch+0x50               
sbi_ipi_intr() at sbi_ipi_intr+0x70                         
intr_event_handle() at intr_event_handle+0x88               
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c             
intc_intr() at intc_intr+0x42                               
intr_irq_handler() at intr_irq_handler+0x54                 
do_trap_supervisor() at do_trap_supervisor+0x78             
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74                                             
--- interrupt 1                                             
sbi_remote_fence_i() at sbi_remote_fence_i+0x22             
pmap_enter_object() at pmap_enter_object+0xda               
vm_map_pmap_enter() at vm_map_pmap_enter+0x280              
vm_map_insert1() at vm_map_insert1+0x438                    
vm_map_fixed() at vm_map_fixed+0x112                        
vm_mmap_object() at vm_mmap_object+0x130                    
vn_mmap() at vn_mmap+0xec                                   
kern_mmap() at kern_mmap+0x46e                              
sys_mmap() at sys_mmap+0x38                                 
do_trap_user() at do_trap_user+0x1e4                        
cpu_exception_handler_user() at cpu_exception_handler_user+0x72                                                         
--- syscall (477, FreeBSD ELF64, mmap)                      

Tracing command sh pid 76972 tid 120164 td 0xffffffc1030bfcc0 (CPU 1)                                                   
ipi_stop() at ipi_stop+0x2c                                 
intr_ipi_dispatch() at intr_ipi_dispatch+0x50               
sbi_ipi_intr() at sbi_ipi_intr+0x70                         
intr_event_handle() at intr_event_handle+0x88               
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c             
intc_intr() at intc_intr+0x42                               
intr_irq_handler() at intr_irq_handler+0x54                 
do_trap_supervisor() at do_trap_supervisor+0x78             
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74                                             
--- interrupt 1                                             
sbi_remote_fence_i() at sbi_remote_fence_i+0x22             
pmap_enter_quick() at pmap_enter_quick+0x6e                 
vm_fault_prefault() at vm_fault_prefault+0x180              
vm_fault() at vm_fault+0x164c                               
vm_fault_trap() at vm_fault_trap+0x4a                       
page_fault_handler() at page_fault_handler+0x1c4            
do_trap_user() at do_trap_user+0xf0                         
cpu_exception_handler_user() at cpu_exception_handler_user+0x72                                                         
--- exception 12, tval = 0x1a0c22baf0                       

Tracing command sh pid 76973 tid 100429 td 0xffffffc1030c9840 (CPU 3)                                                   
kdb_alt_break_internal() at kdb_alt_break_internal+0x15c    
kdb_alt_break() at kdb_alt_break+0xe                        
uart_intr_rxready() at uart_intr_rxready+0x7e               
uart_intr() at uart_intr+0x104                              
intr_event_handle() at intr_event_handle+0x88               
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c             
plic_intr() at plic_intr+0x80                               
intr_event_handle() at intr_event_handle+0x88               
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c             
intc_intr() at intc_intr+0x42                               
intr_irq_handler() at intr_irq_handler+0x54                 
do_trap_supervisor() at do_trap_supervisor+0x78             
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74                                             
--- interrupt 9                                             
__rw_wlock_hard() at __rw_wlock_hard+0x446                  
_rw_wlock_cookie() at _rw_wlock_cookie+0x94                 
pmap_enter() at pmap_enter+0x614                            
vm_fault() at vm_fault+0x133c                               
vm_fault_trap() at vm_fault_trap+0x4a                       
page_fault_handler() at page_fault_handler+0x1c4            
do_trap_user() at do_trap_user+0xf0                         
cpu_exception_handler_user() at cpu_exception_handler_user+0x72                                                         
--- exception 15, tval = 0x80a34a28                         

Tracing command cc pid 76974 tid 118022 td 0xffffffc1030d7cc0 (CPU 2)                                                   
ipi_stop() at ipi_stop+0x2c                                 
intr_ipi_dispatch() at intr_ipi_dispatch+0x50               
sbi_ipi_intr() at sbi_ipi_intr+0x70                         
intr_event_handle() at intr_event_handle+0x88               
intr_isrc_dispatch() at intr_isrc_dispatch+0x2c             
intc_intr() at intc_intr+0x42                               
intr_irq_handler() at intr_irq_handler+0x54                 
do_trap_supervisor() at do_trap_supervisor+0x78             
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74                                             
--- interrupt 1                                             
sbi_remote_fence_i() at sbi_remote_fence_i+0x22             
pmap_enter_object() at pmap_enter_object+0xda               
vm_map_pmap_enter() at vm_map_pmap_enter+0x280              
vm_map_insert1() at vm_map_insert1+0x438                    
vm_map_fixed() at vm_map_fixed+0x112                        
elf64_map_insert() at elf64_map_insert+0x16e                
elf64_load_sections() at elf64_load_sections+0x1ae          
exec_elf64_imgact() at exec_elf64_imgact+0x75c              
$x() at $x+0x47c                                            
sys_execve() at sys_execve+0x52                             
do_trap_user() at do_trap_user+0x1e4                        
cpu_exception_handler_user() at cpu_exception_handler_user+0x72                                                         
--- syscall (59, FreeBSD ELF64, execve)                     
db> cont                                                    
^BFeb 18 23:23:53 freebsd syslogd: exiting on signal 15     
KDB: enter: Break to debugger                               
timeout stopping cpus                                       
[ thread pid 76965 tid 119128 ]                             
Stopped at      kdb_alt_break_internal+0x15e:   sd      zero,-1164(s1)                                                  
db>    

(A trace of a full hang cannot be obtained, because the kernel debugger cannot be entered while the system is hung.)

Project member @jrtc27 has suggested that this may be connected to an unhandled livelock possibility in the following OpenSBI code:

	if (ret == SBI_FIFO_UNCHANGED &&
	    sbi_fifo_enqueue(tlb_fifo_r, data, false) < 0) {
		/**
		 * For now, Busy loop until there is space in the fifo.
		 * There may be case where target hart is also
		 * enqueue in source hart's fifo. Both hart may busy
		 * loop leading to a deadlock.
		 * TODO: Introduce a wait/wakeup event mechanism to handle
		 * this properly.
		 */
		tlb_process_once(scratch);
		sbi_dprintf("hart%d: hart%d tlb fifo full\n", curr_hartid,
			    sbi_hartindex_to_hartid(remote_hartindex));
		return SBI_IPI_UPDATE_RETRY;
	}
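
To make the scenario described in that comment concrete, here is a toy, single-threaded model of two harts that each try to push requests into the other's bounded FIFO. It is only a sketch: `toy_fifo`, `toy_hart`, `toy_run`, and `FIFO_CAP` are hypothetical names invented for illustration, not OpenSBI code, and the model makes no claim to reproduce the hang reported here. It only contrasts a pure busy loop on a full FIFO with the "drain one entry of your own FIFO, then retry" pattern that the quoted code follows.

```c
#include <assert.h>
#include <stdbool.h>

#define FIFO_CAP 2 /* hypothetical capacity, tiny on purpose */

struct toy_fifo {
	int count; /* requests currently pending in this FIFO */
};

struct toy_hart {
	struct toy_fifo fifo; /* requests other harts have sent TO this hart */
	int to_send;          /* requests this hart still wants to send */
	int processed;        /* requests this hart has handled so far */
};

/* Try to add one request; fails when the FIFO is full. */
static bool toy_enqueue(struct toy_fifo *f)
{
	assert(f->count >= 0 && f->count <= FIFO_CAP);
	if (f->count == FIFO_CAP)
		return false;
	f->count++;
	return true;
}

/* Handle at most one pending request from the hart's own FIFO. */
static void toy_process_once(struct toy_hart *h)
{
	if (h->fifo.count > 0) {
		h->fifo.count--;
		h->processed++;
	}
}

/* One scheduling step of a hart in the toy model. */
static void toy_step(struct toy_hart *self, struct toy_hart *peer,
		     bool drain_on_full)
{
	if (self->to_send == 0) {
		/* Nothing left to send: just service incoming requests. */
		toy_process_once(self);
		return;
	}
	if (toy_enqueue(&peer->fifo))
		self->to_send--;
	else if (drain_on_full)
		toy_process_once(self); /* free a slot for the peer, retry later */
	/* else: pure busy loop, no state changes in this step */
}

/* Returns true if both harts finish all work within max_steps. */
bool toy_run(int requests_per_hart, bool drain_on_full, int max_steps)
{
	struct toy_hart a = { { 0 }, requests_per_hart, 0 };
	struct toy_hart b = { { 0 }, requests_per_hart, 0 };

	for (int i = 0; i < max_steps; i++) {
		toy_step(&a, &b, drain_on_full);
		toy_step(&b, &a, drain_on_full);
		if (a.to_send == 0 && b.to_send == 0 &&
		    a.fifo.count == 0 && b.fifo.count == 0)
			return true;
	}
	return false;
}
```

In this model, `toy_run(8, false, 1000)` never completes, because both FIFOs fill up and neither hart changes any state afterwards, while `toy_run(8, true, 1000)` does complete. Since the quoted code already drains via `tlb_process_once()`, the toy by itself does not explain the reported hang; any real cause would have to involve details the model leaves out, such as true concurrency and IPI delivery.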

clausecker, Jul 21 '25 12:07