connecting to virtio sockets from VM hangs HyperKit

Open rn opened this issue 8 years ago • 0 comments

Running just the client from the virtsock stress tests in the VM, with no one listening, seems to hang HyperKit (or at least the virtio socket backend).

With DfM, follow the instructions in the stress test README to create a Dockerfile, then run:

docker build -t stress . && docker run -it --rm --net=host --privileged stress -c 2

This will cause some messages like this to be printed:

Client connecting to 00000002.00005653
2017/07/03 09:30:31 [00000] Failed to Dial: 00000002.00005653 failed connect() to 00000002.00005653: connection reset by peer
2017/07/03 09:30:31 [00001] Failed to Dial: 00000002.00005653 failed connect() to 00000002.00005653: connection reset by peer
2017/07/03 09:30:31 [00002] Failed to Dial: 00000002.00005653 failed connect() to 00000002.00005653: connection reset by peer
2017/07/03 09:30:31 [00003] Failed to Dial: 00000002.00005653 failed connect() to 00000002.00005653: connection reset by peer
2017/07/03 09:30:31 [00004] Failed to Dial: 00000002.00005653 failed connect() to 00000002.00005653: connection reset by peer
2017/07/03 09:30:31 [00005] Failed to Dial: 00000002.00005653 failed connect() to 00000002.00005653: connection reset by peer
2017/07/03 09:30:31 [00006] Failed to Dial: 00000002.00005653 failed connect() to 00000002.00005653: connection reset by peer
2017/07/03 09:30:31 [00007] Failed to Dial: 00000002.00005653 failed connect() to 00000002.00005653: connection reset by peer

and eventually it will hang. Subsequently, docker commands, or anything else going over virtio sockets, will hang.
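For readers unfamiliar with the address in the log lines, here is a minimal sketch of what one dial attempt to `00000002.00005653` could look like from inside a Linux VM. It assumes the two fields are a hexadecimal CID and port separated by a dot (CID 2 is the well-known host CID). The function names are hypothetical and this is not the stress test's own code.

```python
import socket


def parse_vsock_addr(text):
    """Split 'CID.PORT' into two integers, assuming both fields are hexadecimal."""
    cid_hex, port_hex = text.split(".")
    return int(cid_hex, 16), int(port_hex, 16)


def dial_once(cid, port, timeout=1.0):
    """Attempt one vsock connection; return None on success, else the error."""
    # AF_VSOCK is only exposed by Python builds on Linux (3.7 and later).
    family = getattr(socket, "AF_VSOCK", None)
    if family is None:
        return OSError("AF_VSOCK is not supported by this Python build")
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect((cid, port))
        return None
    except OSError as exc:
        # With nothing listening, the connection attempt is expected to fail.
        return exc
```

Calling `dial_once(*parse_vsock_addr("00000002.00005653"))` in a loop approximates the repeated failed dials shown in the log, though whether it triggers the same hang is untested.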

Here is the thread backtrace when hyperkit is in this state:

lldb -p 36032
(lldb) process attach --pid 36032
Process 36032 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fffa161ed96 libsystem_kernel.dylib`kevent + 10
libsystem_kernel.dylib`kevent:
->  0x7fffa161ed96 <+10>: jae    0x7fffa161eda0            ; <+20>
    0x7fffa161ed98 <+12>: movq   %rax, %rdi
    0x7fffa161ed9b <+15>: jmp    0x7fffa1616caf            ; cerror_nocancel
    0x7fffa161eda0 <+20>: retq

Executable module set to "/Applications/Docker.app/Contents/Resources/bin/hyperkit".
Architecture set to: x86_64h-apple-macosx.
(lldb) thread backtrace all
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fffa161ed96 libsystem_kernel.dylib`kevent + 10
    frame #1: 0x000000010972f35d hyperkit`main + 10856
    frame #2: 0x00007fffa14ef235 libdyld.dylib`start + 1
    frame #3: 0x00007fffa14ef235 libdyld.dylib`start + 1

  thread #2
    frame #0: 0x00007fffa161deb6 libsystem_kernel.dylib`__select + 10
    frame #1: 0x0000000109867938 hyperkit`unix_select + 305
    frame #2: 0x00000001097a7123 hyperkit`camlLwt_engine__fun_3017 + 35
    frame #3: 0x00000001097a6d1a hyperkit`camlLwt_engine__fun_2956 + 442
    frame #4: 0x00000001097a93d8 hyperkit`camlLwt_main__run_1327 + 136
    frame #5: 0x00000001097e7689 hyperkit`camlThread__fun_1564 + 137
    frame #6: 0x00000001098624cc hyperkit`caml_start_program + 92
    frame #7: 0x0000000109846cb5 hyperkit`caml_thread_start + 107
    frame #8: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #9: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #10: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #3
    frame #0: 0x00007fffa161deb6 libsystem_kernel.dylib`__select + 10
    frame #1: 0x0000000109846d73 hyperkit`caml_thread_tick + 76
    frame #2: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #3: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #4: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #4
    frame #0: 0x00007fffa161dbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fffa17097fa libsystem_pthread.dylib`_pthread_cond_wait + 712
    frame #2: 0x000000010984376b hyperkit`worker_loop + 123
    frame #3: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #4: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #5: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #5, name = 'callout'
    frame #0: 0x00007fffa161dbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fffa1709833 libsystem_pthread.dylib`_pthread_cond_wait + 769
    frame #2: 0x000000010971373b hyperkit`callout_thread_func + 195
    frame #3: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #4: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #5: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #6, name = 'net:ipc:tx'
    frame #0: 0x00007fffa161dbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fffa17097fa libsystem_pthread.dylib`_pthread_cond_wait + 712
    frame #2: 0x00000001097232f9 hyperkit`pci_vtnet_tx_thread.887 + 246
    frame #3: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #4: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #5: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #7, name = 'net:ipc:rx'
    frame #0: 0x00007fffa161deb6 libsystem_kernel.dylib`__select + 10
    frame #1: 0x0000000109723788 hyperkit`pci_vtnet_tap_select_func.888 + 662
    frame #2: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #3: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #4: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #8, name = 'blk:3:0'
    frame #0: 0x00007fffa161dbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fffa17097fa libsystem_pthread.dylib`_pthread_cond_wait + 712
    frame #2: 0x0000000109716986 hyperkit`blockif_thr + 256
    frame #3: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #4: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #5: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #9, name = 'vsock:tx'
    frame #0: 0x00007fffa161dcca libsystem_kernel.dylib`__psynch_rw_wrlock + 10
    frame #1: 0x00007fffa1706d77 libsystem_pthread.dylib`_pthread_rwlock_lock + 478
    frame #2: 0x00000001097254ba hyperkit`pci_vtsock_tx_thread + 3364
    frame #3: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #4: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #5: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #10, name = 'vsock:rx'
    frame #0: 0x00007fffa161dc22 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fffa1708dfa libsystem_pthread.dylib`_pthread_mutex_lock_wait + 100
    frame #2: 0x0000000109726433 hyperkit`get_sock + 9
    frame #3: 0x000000010972596e hyperkit`pci_vtsock_rx_thread + 302
    frame #4: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #5: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #6: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #11, name = 'vcpu:0'
    frame #0: 0x00007fffa161dbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fffa1709833 libsystem_pthread.dylib`_pthread_cond_wait + 769
    frame #2: 0x0000000109711a29 hyperkit`xh_vm_run + 1641
    frame #3: 0x00000001097300f3 hyperkit`vcpu_thread + 1215
    frame #4: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #5: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #6: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #12
    frame #0: 0x00007fffa161e44e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #1: 0x00007fffa170848e libsystem_pthread.dylib`_pthread_wqthread + 1023
    frame #2: 0x00007fffa170807d libsystem_pthread.dylib`start_wqthread + 13

  thread #13, name = 'vcpu:1'
    frame #0: 0x00007fffa161dbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fffa1709833 libsystem_pthread.dylib`_pthread_cond_wait + 769
    frame #2: 0x0000000109711a29 hyperkit`xh_vm_run + 1641
    frame #3: 0x00000001097300f3 hyperkit`vcpu_thread + 1215
    frame #4: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #5: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #6: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #14, name = '9p:db'
    frame #0: 0x00007fffa161f246 libsystem_kernel.dylib`read + 10
    frame #1: 0x0000000109720092 hyperkit`pci_vt9p_thread + 392
    frame #2: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #3: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #4: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #15, name = '9p:port'
    frame #0: 0x00007fffa161f246 libsystem_kernel.dylib`read + 10
    frame #1: 0x0000000109720092 hyperkit`pci_vt9p_thread + 392
    frame #2: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #3: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #4: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13

  thread #16
    frame #0: 0x00007fffa161dbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fffa17097fa libsystem_pthread.dylib`_pthread_cond_wait + 712
    frame #2: 0x000000010984376b hyperkit`worker_loop + 123
    frame #3: 0x00007fffa170893b libsystem_pthread.dylib`_pthread_body + 180
    frame #4: 0x00007fffa1708887 libsystem_pthread.dylib`_pthread_start + 286
    frame #5: 0x00007fffa170808d libsystem_pthread.dylib`thread_start + 13
(lldb)
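
In the backtrace above, two threads are stopped in lock-acquisition calls rather than in idle waits: `vsock:tx` in `__psynch_rw_wrlock` and `vsock:rx` in `__psynch_mutexwait`. The following is a small sketch for picking such threads out of `thread backtrace all` output. The helper name is hypothetical, and the filter only highlights candidates; it does not by itself establish the cause of the hang.

```python
import re

# Top-frame symbols that mean a thread is blocked trying to acquire a lock.
LOCK_WAIT_SYMBOLS = ("__psynch_mutexwait", "__psynch_rw_wrlock")

# Matches "thread #9, name = 'vsock:tx'" as well as "* thread #1, queue = ...".
THREAD_RE = re.compile(r"^\s*\*?\s*thread #(\d+)(?:, name = '([^']*)')?")
# Captures the symbol name of frame #0, i.e. where the thread is stopped.
FRAME0_RE = re.compile(r"frame #0: \S+ \S+`(\w+)")


def lock_waiters(backtrace):
    """Return (thread id, name, symbol) for threads whose frame #0 is a lock wait."""
    waiters = []
    current = None
    for line in backtrace.splitlines():
        match = THREAD_RE.match(line)
        if match:
            current = (int(match.group(1)), match.group(2) or "")
            continue
        match = FRAME0_RE.search(line)
        if match and current and match.group(1) in LOCK_WAIT_SYMBOLS:
            waiters.append((current[0], current[1], match.group(1)))
    return waiters
```

Threads parked in `__psynch_cvwait`, `__select`, `kevent`, or `read` are deliberately ignored, since those are ordinary idle states for worker and I/O threads.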

rn, Jul 03 '17 09:07