GCC OpenMP applications fail due to direct futex syscall instructions

Open prp opened this issue 4 years ago • 0 comments

When running the OpenMP sample application from here, it only runs correctly in sw mode. In hw mode it fails because a syscall instruction is executed inside the enclave, which SGX does not permit:

 t-pepiet@msrc-cc06:~/devel/sgx-lkl-oe.git/apps/languages/openmp$ make run-hw
SGXLKL_GETTIME_VDSO=0 SGXLKL_VERBOSE=1 SGXLKL_KERNEL_VERBOSE=0 SGXLKL_TRACE_SIGNAL=0 SGXLKL_TRACE_HOST_SYSCALL=0 SGXLKL_TRACE_LKL_SYSCALL=0 SGXLKL_TRACE_MMAP=0 SGXLKL_TRACE_THREAD=0 ../../../build/sgx-lkl-run-oe --hw-debug sgxlkl-openmp.img app/openmp-test
[   SGX-LKL  ] SGX-LKL (OE) Git version 679cb50-dirty [DEBUG build (-O0)] [HARDWARE DEBUG]
[   SGX-LKL  ] nproc=8 ETHREADS=8 CMDLINE="mem=32M" GETTIME_VDSO=0
[   SGX-LKL  ] HW TLS support: conf->fsgsbase=1
[   SGX-LKL  ] Registering disk 0 (path='sgxlkl-openmp.img', mnt='/', [RW   ])
[   SGX-LKL  ] No tap device specified, networking will not be available.
[   SGX-LKL  ] get_signed_libsgxlkl_path... result=/home/t-pepiet/devel/sgx-lkl-oe.git/build_musl/./libsgxlkl.so.signed
[   SGX-LKL  ] oe_create_enclave... result=0 (OE_OK)
[   SGX-LKL  ] sgxlkl_enclave_init(ethread_id=0)
[[  SGX-LKL ]] sgxlkl_enclave_show_attribute(): enclave base=0x7ff980000000 size=1.043 GB
[[  SGX-LKL ]] sgxlkl_enclave_show_attribute(): enclave heap base=0x7ff980b62000 size=1024.00 M end=0x7ff9c0b62000
[[  SGX-LKL ]] _register_enclave_signal_handlers(): Registering OE exception handler...
[[  SGX-LKL ]] lkl_start_init(): kernel command line: 'mem=32M console=hvc0 quiet'
[[  SGX-LKL ]] lkl_start_init(): lkl_start_kernel() called
[[  SGX-LKL ]] lkl_start_init(): lkl_start_kernel() finished
[[  SGX-LKL ]] init_random(): Adding entropy to entropy pool
[[  SGX-LKL ]] wg0 has public key BaCecg+GExs6wJWECCYfwgtoXidxxSKlPa85U8r/whA=
[[  SGX-LKL ]] aas_release_resources(): aas_release_resources: deallocate all resources
[[  SGX-LKL ]] lkl_mount_disk(): lkl_mount_disk(dev="/dev/vda", mnt="/mnt/vda", ro=0)
[[  SGX-LKL ]] lkl_mount_disks(): Set working directory: /
[[  SGX-LKL ]] libc_start_main_stage2(): Calling app main: app/openmp-test
Running with following number of threads: 8
[[  SGX-LKL ]] FAIL: [[  SGX-LKL ]] FAIL: Encountered an illegal instruction inside enclave (opcode=0x50f)
2020-03-18T15:40:26.707203Z [(H)ERROR] tid(0x7ffa489b7700) | :OE_ENCLAVE_ABORTING [../host/calls.c:oe_call_enclave_function_by_table_id:91]
[   SGX-LKL  ] FAIL: 2020-03-18T15:40:26.707209Z [(H)ERROR] tid(0x7ffa471b4700) | :OE_ENCLAVE_ABORTING [../host/calls.c:oe_call_enclave_function_by_table_id:91]
sgxlkl_ethread_init() failed (ethread_id=1 result=19 (OE_ENCLAVE_ABORTING))
[   SGX-LKL  ] FAIL: Segmentation fault (core dumped)
Makefile:41: recipe for target 'run-hw' failed
make: *** [run-hw] Error 139
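
For reference, the opcode=0x50f in the failure message matches the x86-64 syscall instruction, which is encoded as the two bytes 0F 05; read as a little-endian 16-bit value, that is 0x050f. A minimal sketch of that check follows. The helper names are hypothetical and are not taken from the SGX-LKL or Open Enclave sources.

```c
#include <stdint.h>

/* x86-64 encoding of the `syscall` instruction: 0x0F 0x05. */
#define SYSCALL_BYTE0 0x0f
#define SYSCALL_BYTE1 0x05

/* Hypothetical helper: combine the first two instruction bytes into a
 * 16-bit value the way a little-endian load would, which is the form
 * that appears in the log line (opcode=0x50f). */
static uint16_t opcode_as_logged(const unsigned char *insn)
{
    return (uint16_t)(insn[0] | (insn[1] << 8));
}

/* Hypothetical helper: returns 1 if the bytes at insn encode `syscall`. */
static int is_syscall_insn(const unsigned char *insn)
{
    return insn[0] == SYSCALL_BYTE0 && insn[1] == SYSCALL_BYTE1;
}
```

An in-enclave handler for the illegal instruction exception could use a check of this kind to recognise the faulting instruction before deciding how to handle it.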

OpenMP (at least under gcc) makes futex syscalls directly: libgomp issues them with inline assembly instead of going through libc:

https://github.com/atgreen/gcc/blob/master/libgomp/config/linux/x86/futex.h

The same problem occurs when running certain versions of PyTorch, probably also due to OpenMP.

We have two options here:

  1. Emulate the syscall inside of the enclave. This is possible but would carry a (significant) performance penalty.
  2. Change OpenMP/gcc to use the futex libc wrapper instead of using the syscall directly. This would require patching libgomp, unless there is a flag that changes this behaviour at compile time or at runtime?
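
To make the difference between the two call styles concrete, here is a minimal sketch of a futex wake done both ways on x86-64 Linux. It is an illustration only, not the libgomp code: the function names are hypothetical. The direct variant executes the syscall instruction itself, which is what traps inside the enclave. The second variant goes through the libc syscall() function, so whether a syscall instruction is ever executed is up to the libc; the assumption here is that the libc used inside SGX-LKL handles such calls within the enclave.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <linux/futex.h>   /* FUTEX_WAKE, FUTEX_PRIVATE_FLAG */
#include <sys/syscall.h>   /* SYS_futex */
#include <unistd.h>        /* syscall() */

#if defined(__x86_64__)
/* Style 1 (hypothetical name): raw `syscall` instruction via inline asm.
 * Returns the kernel result directly: >= 0 on success, -errno on error. */
static long futex_wake_direct(int *addr, int count)
{
    long res;
    register long r10 __asm__("r10") = 0; /* timeout argument, unused for wake */
    __asm__ volatile("syscall"
                     : "=a"(res)
                     : "0"((long)SYS_futex), "D"(addr),
                       "S"((long)(FUTEX_WAKE | FUTEX_PRIVATE_FLAG)),
                       "d"((long)count), "r"(r10)
                     : "memory", "cc", "rcx", "r11");
    return res;
}
#endif

/* Style 2 (hypothetical name): the same request through libc.
 * Returns >= 0 on success, or -1 with errno set on error. */
static long futex_wake_libc(int *addr, int count)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE | FUTEX_PRIVATE_FLAG,
                   count, NULL, NULL, 0);
}
```

Note that the two styles report errors differently (negative errno versus -1 plus errno), so any patch along the lines of option 2 would need to adjust the callers' error handling as well.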

(Related to https://github.com/lsds/sgx-lkl/issues/131)

prp, Jun 25 '20 10:06