
Graphene-SGX: Regression Failures on Upstream Linux Kernel 5.12

Open anjalirai-intel opened this issue 3 years ago • 8 comments

Config

Graphene: commit f9ed396dba0fc3bb988c9cd53375306300a66178
Linux: 5.12.0 upstream kernel

Issues

PAL Failures

TC-1: test_200_event. The test is expected to return “TEST OK”, but it returned “Error at line 71, pal_errno: -1, debug: DkProcessExit: Returning exit code 1”.

TC-2: test_900_misc

Expected Result:
Query System Time OK
Delay Execution for 10000 Microseconds OK
Delay Execution for 3 Seconds OK
Generate Random Bits OK

Actual Result (missing "Delay Execution for 3 Seconds OK"):
Query System Time OK
Sleeped 10078 Microseconds
Delay Execution for 10000 Microseconds OK
Sleeped 2993086 Microseconds
Generate Random Bits OK

LibOS Failures

TC-1: test_041_futex_timeout

Expected Result:
futex correctly timed out

Actual Result:
invoke futex syscall with a 1 second timeout
Slept for 997926 microseconds, which is less than 1 seconds
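The kind of check this test performs (sleep with a timeout, then verify that at least the requested time has elapsed) can be approximated outside Graphene to see whether the host itself wakes up early. The following is only a minimal sketch under stated assumptions: it uses Python's `time.sleep` rather than the futex syscall, and `measure_sleep` is a hypothetical helper name, not part of the Graphene test suite.

```python
import time


def measure_sleep(requested_us):
    """Sleep for requested_us microseconds.

    Returns (elapsed_us, shortfall_us), measured with the monotonic clock.
    A positive shortfall means the sleep returned before the requested time.
    """
    start_ns = time.monotonic_ns()
    time.sleep(requested_us / 1_000_000)
    elapsed_us = (time.monotonic_ns() - start_ns) // 1_000
    return elapsed_us, requested_us - elapsed_us


if __name__ == "__main__":
    requested = 100_000  # 100 ms, an arbitrary value chosen for illustration
    elapsed, shortfall = measure_sleep(requested)
    verdict = "woke up early" if shortfall > 0 else "ok"
    print(f"requested {requested} us, slept {elapsed} us: {verdict}")
```

Running the same script natively and inside the enclave may help separate a host kernel timing change from a Graphene-side one, although this has not been verified against the failures reported here.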

LTP Test Failures:

TC-1: alarm05

Expected Result:
TINFO: Timeout per run is 0h 05m 00s
TPASS: alarm() returned zero
TPASS: alarm() returned remainder correctly
TPASS: alarm handler fired once

Summary: passed 0 failed 0 skipped 0 warnings 0

Actual Result:
TINFO: Timeout per run is 0h 05m 00s
TPASS: alarm() returned zero
TFAIL: alarm() returned wrong remained 10
TPASS: alarm handler fired once

Summary: passed 0 failed 0 skipped 0 warnings 0

TC-2: alarm06

Expected Result:
TINFO: Timeout per run is 0h 05m 00s
TPASS: Received 0 alarms
TPASS: alarm(0) returned 1

Summary: passed 0 failed 0 skipped 0 warnings 0

Actual Result:
TINFO: Timeout per run is 0h 05m 00s
TPASS: Received 0 alarms
TFAIL: alarm(0) returned 2, expected 1

Summary: passed 0 failed 0 skipped 0 warnings 0

TC-3: sigtimedwait01

Expected Result:
TINFO: Timeout per run is 0h 05m 00s
TPASS: Wait interrupted by expected signal
TPASS: struct siginfo is correct
TPASS: struct siginfo is correct
TPASS: sigwaitinfo restored the original mask
TPASS: Wait interrupted by expected signal
TPASS: Wait interrupted by expected signal
TPASS: sigwaitinfo restored the original mask
TPASS: Fault occurred while accessing the buffers
TPASS: Child exited with expected code
TPASS: Fault occurred while accessing the buffers
TPASS: Wait interrupted by timeout

Summary: passed 0 failed 0 skipped 0 warnings 0

Actual Result:
TINFO: Timeout per run is 0h 05m 00s
TPASS: Wait interrupted by expected signal
TPASS: struct siginfo is correct
TPASS: struct siginfo is correct
TPASS: sigwaitinfo restored the original mask
TPASS: Wait interrupted by expected signal
TPASS: Wait interrupted by expected signal
TPASS: sigwaitinfo restored the original mask
TPASS: Fault occurred while accessing the buffers
graphene/LibOS/shim/test/ltp/ltp_src/lib/tst_test.c:1300: TBROK: Test killed by SIGPWR!

Summary: passed 0 failed 0 skipped 0 warnings 0

error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
error: Failed to initialize secure pipe 152772: -6

TC-4: rt_sigtimedwait01

Expected Result:
TINFO: Timeout per run is 0h 05m 00s
TINFO: Testing variant: syscall with old kernel spec
TPASS: Wait interrupted by expected signal
TPASS: struct siginfo is correct
TPASS: struct siginfo is correct
TPASS: sigwaitinfo restored the original mask
TPASS: Wait interrupted by expected signal
TPASS: Wait interrupted by expected signal
TPASS: sigwaitinfo restored the original mask
TPASS: Fault occurred while accessing the buffers
TPASS: Child exited with expected code
TPASS: Fault occurred while accessing the buffers
TPASS: Wait interrupted by timeout
TPASS: struct siginfo is correct
TPASS: sigwaitinfo restored the original mask
TFAIL: Expected error number EAGAIN, got: EINTR (4)
TPASS: struct siginfo is correct
TPASS: sigwaitinfo restored the original mask
TPASS: struct siginfo is correct
TPASS: struct siginfo is correct
TPASS: sigwaitinfo restored the original mask

Summary: passed 0 failed 0 skipped 0 warnings 0

Actual Result:
TINFO: Timeout per run is 0h 05m 00s
TINFO: Testing variant: syscall with old kernel spec
TPASS: Wait interrupted by expected signal
TPASS: struct siginfo is correct
TPASS: struct siginfo is correct
TPASS: sigwaitinfo restored the original mask
TPASS: Wait interrupted by expected signal
TPASS: Wait interrupted by expected signal
TPASS: sigwaitinfo restored the original mask
TPASS: Fault occurred while accessing the buffers
TBROK: Test killed by SIGPWR!

Summary: passed 0 failed 0 skipped 0 warnings 0

TC-5: futex_wait_bitset01

Actual Result:
TINFO: Timeout per run is 0h 05m 00s
TINFO: Testing variant: syscall with old kernel spec
TINFO: testing futex_wait_bitset() timeout with CLOCK_MONOTONIC
TFAIL: futex_wait_bitset() woken up prematurely 99918us, expected 100010us
TINFO: testing futex_wait_bitset() timeout with CLOCK_REALTIME
TFAIL: futex_wait_bitset() woken up prematurely 99933us, expected 100010us

Summary: passed 0 failed 0 skipped 0 warnings 0

TC-6: alarm03

Actual Result:
TINFO: Timeout per run is 0h 05m 00s
TFAIL: alarm(100), fork, alarm(0) parent's alarm returned 99
TPASS: alarm(100), fork, alarm(0) child's alarm returned 0

Summary: passed 0 failed 0 skipped 0 warnings 0
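To put the sleep and futex failures side by side, the microsecond figures quoted in the outputs above can be turned into shortfalls. This is only arithmetic on the reported numbers, not new measurements; the helper name `shortfall` is made up for illustration, and for futex_wait_bitset01 the "expected" value printed by LTP is used as the requested time.

```python
# (label, requested_us, measured_us), copied from the outputs quoted above.
REPORTED = [
    ("PAL test_900_misc, 10000 us delay", 10_000, 10_078),
    ("PAL test_900_misc, 3 s delay", 3_000_000, 2_993_086),
    ("LibOS test_041_futex_timeout", 1_000_000, 997_926),
    ("LTP futex_wait_bitset01, CLOCK_MONOTONIC", 100_010, 99_918),
    ("LTP futex_wait_bitset01, CLOCK_REALTIME", 100_010, 99_933),
]


def shortfall(requested_us, measured_us):
    """Return (shortfall in us, shortfall as a percentage of the request).

    Positive values mean the wait ended early; negative values mean it
    ran longer than requested.
    """
    diff = requested_us - measured_us
    return diff, 100.0 * diff / requested_us


if __name__ == "__main__":
    for label, requested, measured in REPORTED:
        diff, pct = shortfall(requested, measured)
        print(f"{label}: {diff:+d} us ({pct:+.3f}%)")
```

By this arithmetic, the early wakeups range from 77 us to 6914 us, or roughly 0.08% to 0.23% of the requested interval, while the 10000 us PAL delay ran 78 us longer than requested.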

Steps to reproduce

PAL Failures:
cd Pal/regression
make SGX=1 regression

LibOS Failures:
cd LibOS/shim/test/regression
sudo make SGX=1 regression

Failure Trace Logs.zip

LTP Failures:
cd LibOS/shim/test/ltp
make -j8 SGX=1 all sgx-tokens
make -j8 ltp-sgx.xml

anjalirai-intel avatar Jun 15 '21 09:06 anjalirai-intel

So is this regression because of:

  • moving to the new Linux kernel, or
  • moving to a new Graphene commit?

dimakuv avatar Jun 15 '21 14:06 dimakuv

There seems to be something wrong with timings in this new Linux 5.12 kernel. Looks like all new regressions are about timings that are very slightly off.

dimakuv avatar Jun 15 '21 14:06 dimakuv

@dimakuv This regression is caused by moving to the new Linux kernel.

anjalirai-intel avatar Jun 16 '21 04:06 anjalirai-intel

Updated the LTP test failures.

anjalirai-intel avatar Jun 16 '21 13:06 anjalirai-intel

@anjalirx-intel I don't remember if we agreed on some way of solving this issue... Is anyone looking at it? I think someone volunteered to check this. I will assign priority P1 for now and mark it as a bug.

dimakuv avatar Jul 22 '21 08:07 dimakuv

No @dimakuv, we haven't agreed on any way of solving this issue.

anjalirai-intel avatar Jul 22 '21 09:07 anjalirai-intel

@dimakuv, please assign this to me. I've started to have a look.

Satya1493 avatar Aug 04 '21 08:08 Satya1493

@Satya1493 This issue is also seen on a RHEL setup with kernel 5.12.

anjalirai-intel avatar Aug 20 '21 11:08 anjalirai-intel