liburing

ppc64le: Tests failed (2): <no-mmap-inval.t> <reg-fd-only.t>

Open vt-alt opened this issue 1 year ago • 4 comments

JFYI, two tests fail on ppc64 on Linux v6.6.46 for liburing-2.7. They succeed or are skipped on x86_64.

+ make runtests
...
[00:01:36] Running test no-mmap-inval.t                                        Got -2, wanted -EFAULT
[00:01:36] Test no-mmap-inval.t failed with ret 1
...
[00:01:49] Running test reg-fd-only.t                                          ring setup failed
[00:01:49] test 8 failed
[00:01:49] Test reg-fd-only.t failed with ret 1

On x86_64:

[00:01:22] Running test no-mmap-inval.t                                        0 sec
...
[00:01:29] Running test reg-fd-only.t                                          Enable huge pages to test big rings
[00:01:29] Skipped

vt-alt avatar Aug 17 '24 11:08 vt-alt

Also, the same version suddenly started failing on aarch64:

[00:00:40] Running test accept-non-empty.t                                     Test accept-non-empty.t failed with ret 78
...
[00:02:40] Tests failed (1): <accept-non-empty.t>

vt-alt avatar Aug 19 '24 14:08 vt-alt

Pushed a fix for accept-non-empty; that was a bug in the test.

For ppc64, the -ENOENT for no-mmap-inval is very (very) odd. For reg-fd-only, I'll push a commit to dump 'ret'. Can you try and re-run it? I'm wondering what it's returning. Maybe both are the same arch oddity and it'll be -ENOENT?!

axboe avatar Aug 19 '24 15:08 axboe

Thanks. Now with ebd6c8ff4bbc05492185f937f002b119a8f91964 it failed only on ppc64le (and on i586, where I usually exclude sqpoll-sleep.t). The kernel also changed to v6.6.47.

  • ppc64le :
[00:01:36] Running test no-mmap-inval.t                                        Got -2, wanted -EFAULT
[00:01:36] Test no-mmap-inval.t failed with ret 1
...
[00:01:50] Running test reg-fd-only.t                                          ring setup failed: -2
[00:01:50] test 8 failed
[00:01:50] Test reg-fd-only.t failed with ret 1
...
[00:01:52] Running test send-zerocopy.t                                        invalid cqe->res -90 expected 65536
[00:01:52] send failed fixed buf 0, conn 0, addr 1, cork 0
[00:01:52] test_inet_send() failed (defer_taskrun 0)
[00:01:52] Test send-zerocopy.t failed with ret 1
...
[00:02:12] Tests failed (3): <no-mmap-inval.t> <reg-fd-only.t> <send-zerocopy.t>
  • i586:
[00:02:01] Running test sqpoll-sleep.t                                         Test sqpoll-sleep.t failed with ret 1

vt-alt avatar Aug 19 '24 15:08 vt-alt

Thanks. Now with ebd6c8f it failed only on ppc64le (and on i586, where I usually exclude sqpoll-sleep.t). The kernel also changed to v6.6.47.

* ppc64le :

...

[00:01:52] Running test send-zerocopy.t invalid cqe->res -90 expected 65536
[00:01:52] send failed fixed buf 0, conn 0, addr 1, cork 0

It's UDP, for which we "expect" 65536 bytes in a datagram, more than UDP usually supports. Looks like the test wasn't prepared for 16K pages.

if (!tcp && len > 4 * page_sz)
	continue; // skip test
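
For illustration, here is a sketch of a skip check that does not depend on the page size. It assumes IPv4 without IP options, where the largest UDP payload is 65507 bytes (65535 minus a 20-byte IP header and an 8-byte UDP header). `MAX_UDP4_PAYLOAD` and `skip_send_len()` are hypothetical names, and this is not necessarily the change that went into the test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Largest IPv4 UDP payload: 65535 - 20 (IP header) - 8 (UDP header). */
#define MAX_UDP4_PAYLOAD 65507

/*
 * Return true if a send of `len` bytes should be skipped because it
 * cannot fit in a single UDP datagram. TCP sends are never skipped.
 * Hypothetical helper; assumes IPv4 without IP options.
 */
static bool skip_send_len(bool tcp, size_t len)
{
	if (tcp)
		return false;
	return len > MAX_UDP4_PAYLOAD;
}
```

With this check, the 65536-byte UDP case from the log above would be skipped on any page size, while TCP sends of the same length would still run.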

isilence avatar Aug 19 '24 16:08 isilence

Test failures for 2.8 on ppc64le on Linux 6.11.5:

[00:01:39] Running test no-mmap-inval.t                                        Got -2, wanted -EFAULT
[00:01:39] Test no-mmap-inval.t failed with ret 1

[00:02:01] Running test reg-fd-only.t                                          ring setup failed: -2
[00:02:01] test 8 failed
[00:02:01] Test reg-fd-only.t failed with ret 1

[00:02:03] Running test recvsend_bundle.t                                      failed recv cqe: -105
[00:02:03] test d failed
[00:02:03] TCP test case (classic=0) failed
[00:02:03] Test recvsend_bundle.t failed with ret 1

[00:03:11] Running test timeout.t                                              child failed 0
[00:03:11] test_timeout_link_cancel failed
[00:03:11] Test timeout.t failed with ret 1

[00:03:17] Tests failed (4): <no-mmap-inval.t> <reg-fd-only.t> <recvsend_bundle.t> <timeout.t>

Additionally, one test fails on i586:

[00:02:32] Running test sqpoll-sleep.t                                         Test sqpoll-sleep.t failed with ret 1

[00:02:49] Tests failed (1): <sqpoll-sleep.t>

Temporary build logs: https://git.altlinux.org/tasks/361211/build/100/ppc64le/log https://git.altlinux.org/tasks/361211/build/100/i586/log

On x86_64 and aarch64, with the same build environment and the same kernel version, the tests do not fail.

vt-alt avatar Oct 31 '24 00:10 vt-alt

I additionally tested on 6.12-rc5, and the list of failed tests is identical across all architectures.

[00:03:44] Test run complete, kernel: 6.12.0-6.12-alt0.rc5 #1 SMP Sun Oct 27 23:47:43 UTC 2024
[00:03:44] Tests failed (5): <no-mmap-inval.t> <reg-fd-only.t> <recvsend_bundle.t> <recvsend_bundle-inc.t> <timeout.t>
[00:03:15] Test run complete, kernel: 6.12.0-6.12-alt0.rc5 #1 SMP PREEMPT_DYNAMIC Sun Oct 27 23:46:51 UTC 2024
[00:03:15] Tests failed (1): <sqpoll-sleep.t>

Temporary build logs: https://git.altlinux.org/tasks/361212/build/100/ppc64le/log https://git.altlinux.org/tasks/361212/build/100/i586/log

vt-alt avatar Oct 31 '24 00:10 vt-alt

Thanks for running these. I'll check x86, but I don't have any powerpc to test on... Oh maybe this is a page size thing. What page size is your ppc box running?

axboe avatar Oct 31 '24 00:10 axboe

$ getconf PAGESIZE
65536
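
The same value can be read at runtime from C via `sysconf(_SC_PAGESIZE)`, so a test can size buffers and offsets from it instead of assuming 4096. A minimal sketch follows; `get_page_size()` is a hypothetical helper name, and the 4096 fallback is an assumption for the unlikely case that `sysconf()` fails.

```c
#include <unistd.h>

/*
 * Return the system page size in bytes, as reported by sysconf().
 * Falls back to 4096 if the query fails (an assumption made for this
 * sketch). Hypothetical helper, not a liburing function.
 */
static long get_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	return sz > 0 ? sz : 4096;
}
```

On the ppc64le machine above this would return 65536, matching the `getconf PAGESIZE` output.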

vt-alt avatar Oct 31 '24 00:10 vt-alt

Can you test sqpoll-sleep after the commit I just made?

axboe avatar Oct 31 '24 00:10 axboe

Can you try and strace no-mmap-inval on ppc and attach it here? It should be using page size dependent code already.

axboe avatar Oct 31 '24 00:10 axboe

Pushed some fixes, hopefully fixing some of them.

axboe avatar Oct 31 '24 01:10 axboe

Thanks. With the updates applied, up to 59c0cb3 (2024-10-30, "test/timeout: properly loop around waitpid() status"), i586 still has a failure:

[00:04:23] Tests failed (1): <sqpoll-sleep.t>

ppc64le:

[00:03:52] Tests failed (3): <no-mmap-inval.t> <recvsend_bundle.t> <recvsend_bundle-inc.t>
strace -v -f test/no-mmap-inval.t
[00:00:24] #1 SMP Sun Oct 2++ strace -v -f test/no-mmap-inval.t
[00:00:25] execve("test/no-mmap-inval.t", ["test/no-mmap-inval.t"], ["RPM_PYTHON_COMPILE_INCLUDE=/usr/"..., "RPM_FIXUP_TOPDIR=", "RPM_PYTHON3_SELF_PROV_PATH=", "RPM_SOURCE_DIR=/usr/src/RPM/SOUR"..., "RPM_PKG_CONTENTS_INDEX_BIN=/.hos"..., "RPM_PYTHON_LIB_PATH=", "RPM_PYTHON3_VERSION=unknown", "G_BROKEN_FILENAMES=1", "HISTSIZE=999", "HOSTNAME=localhost.localdomain", "RPM_DEBUGINFO_STRIPPED_TERMINATE"..., "RPM_PYTHON3_SITELIBDIR=/usr/lib6"..., "RPM_PYTHON=/usr/bin/python2.7", "RPM_PYTHON3_COMPILE_EXCLUDE=/usr"..., "RPM_PYTHON_COMPILE_DEEP=20", "RPM_PYTHON3_SITELIBDIR_NOARCH=/u"..., "RPM_PYTHON_COMPILE_SKIP_X=1", "RPM_PYTHON_COMPILE_EXCLUDE=/usr/"..., "RPM_LIB=lib64", "RPM_FIXUP_METHOD=binconfig pkgco"..., "RPM_LD_PRELOAD_python=/usr/lib64"..., "PWD=/usr/src/RPM/BUILD/liburing-"..., "RPM_CLEANUP_TOPDIR=", "SOURCE_DATE_EPOCH=1730337221", "LOGNAME=root", "RPM_PYTHON3_COMPILE_SKIP_X=1", "RPM_VERIFY_ELF_METHOD=strict", "RPM_FILES_TO_LD_PRELOAD_python= "..., "RPM_CLEANUP_SKIPLIST=", "RPM_ARCH=ppc64le", "RPM_PYTHON3_COMPILE_CLEAN=1", "RPM_DATADIR=/usr/share", "HOME=/usr/src", "RPM_FILES_TO_LD_PRELOAD_python3="..., "PERL_USE_UNSAFE_INC=1", "RPM_PERL_REQ_METHOD=normal", "RPM_PYTHON_REQ_METHOD=slight", "RPM_TARGET_ARCH=ppc64le", "RPM_PYTHON3=/usr/bin/python3", "RPM_FINDPROV_LIB_PATH=", "RPM_COMPRESS_TOPDIR=/usr", "TMPDIR=/tmp", "RPM_VERIFY_ELF_TOPDIR=", "RPM_CHECK_CONTENTS_METHOD=defaul"..., "RPM_FINDPROV_TOPDIR=", "RPM_PACKAGE_RELEASE=alt1.test.2", "RPM_DEBUGINFO_SKIPLIST=", "RPM_PYTHON_COMPILE_CLEAN=1", "RPM_OS=linux", "RPM_VERIFY_ELF_SKIPLIST=", "MAKEFLAGS=-w -O PAM_SO_SUFFIX=", "RPM_CHECK_CONTENTS_SKIPLIST=", "TERM=dumb", "RPM_PYTHON3_REQ_METHOD=slight", "USER=root", "RPM_PYTHON3_LIBDIR=/usr/lib64/py"..., "RPM_PYTHON3_PATH=/usr/lib64/pyth"..., "PAM_SO_SUFFIX=", "RPM_TARGET_OS=linux", "SHLVL=3", "RPM_BUILD_DIR=/usr/src/RPM/BUILD", "RPM_FINDREQ_TOPDIR=", "SCRIPT=/usr/src/tmp/vm.yp4UP1yb8"..., "RPM_FIXUP_SKIPLIST=", "PAM_NAME_SUFFIX=", "RPM_PYTHON2_PATH=/usr/lib64/pyth"..., 
"RPM_PYTHON3_COMPILE_DEEP=20", "RPM_OPT_FLAGS=-pipe -frecord-gcc"..., "RPM_PYTHON3_REQ_HIER=yes", "RPM_FINDREQ_SKIPLIST=/usr/share/"..., "RPM_PYTHON3_IMPORT_PATH=", "RPM_DOC_DIR=/usr/share/doc", "RPM_VERIFY_INFO_METHOD=normal", "RPM_PACKAGE_VERSION=2.8", "RPM_STRICT_INTERDEPS=sisyphus.36"..., "RPM_PYTHON3_COMPILE_INCLUDE=/usr"..., "RPM_PYTHON_MODULE_DECLARED=", "RPM_PYTHON_REQ_SKIP=", "RPM_LIBDIR=/usr/lib64", "RPM_LD_PRELOAD_python3=/usr/lib6"..., "RPM_PYTHON_COMPILE_METHOD=ALL", "RPM_PYTHON3_REQ_SKIP=", "PATH=/usr/src/bin:/usr/bin:/bin:"..., "RPM_CHECK_CONTENTS_TOPDIR=", "HISTFILESIZE=9999", "MAIL=/var/mail/builder", "RPM_FINDPROV_SKIPLIST=/usr/share"..., "RPM_FINDPACKAGE_PATH=", "RPM_COMPRESS_SKIPLIST=", "RPM_PACKAGE_NAME=liburing", "RPM_CLEANUP_METHOD=auto", "RPM_BUILD_ROOT=/usr/src/tmp/libu"..., "OLDPWD=/usr/src/RPM/BUILD/liburi"..., "RPM_COMPRESS_METHOD=auto", "RPM_PYTHON3_LIB_PATH=", "_=/usr/bin/strace"]) = 0
[00:00:25] brk(NULL)                               = 0x10016f70000
[00:00:25] access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
[00:00:25] openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
[00:00:25] newfstatat(3, "", {st_dev=makedev(0, 0x14), st_ino=53631, st_mode=S_IFREG|0644, st_nlink=1, st_uid=1031, st_gid=1031, st_blksize=196608, st_blocks=128, st_size=8599, st_atime=1730337271 /* 2024-10-31T01:14:31.422995818+0000 */, st_atime_nsec=422995818, st_mtime=1730337271 /* 2024-10-31T01:14:31.421995880+0000 */, st_mtime_nsec=421995880, st_ctime=1730337271 /* 2024-10-31T01:14:31.421995880+0000 */, st_ctime_nsec=421995880}, AT_EMPTY_PATH) = 0
[00:00:25] mmap(NULL, 8599, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fffaf200000
[00:00:25] close(3)                                = 0
[00:00:25] openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
[00:00:25] read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0\240\243\2\0\0\0\0\0"..., 832) = 832
[00:00:25] newfstatat(3, "", {st_dev=makedev(0, 0x14), st_ino=2969, st_mode=S_IFREG|0755, st_nlink=1, st_uid=1031, st_gid=1031, st_blksize=196608, st_blocks=4864, st_size=2439000, st_atime=1730337245 /* 2024-10-31T01:14:05.859595745+0000 */, st_atime_nsec=859595745, st_mtime=1714366800 /* 2024-04-29T05:00:00+0000 */, st_mtime_nsec=0, st_ctime=1730337243 /* 2024-10-31T01:14:03.321754580+0000 */, st_ctime_nsec=321754580}, AT_EMPTY_PATH) = 0
[00:00:25] mmap(NULL, 2482960, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fffaefa0000
[00:00:25] mmap(0x7fffaf1e0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x240000) = 0x7fffaf1e0000
[00:00:25] close(3)                                = 0
[00:00:25] set_tid_address(0x7fffaf2a2e10)         = 176
[00:00:25] set_robust_list(0x7fffaf2a2e20, 24)     = 0
[00:00:25] rseq(0x7fffaf2a3460, 0x20, 0, 0xfe5000b) = 0
[00:00:25] mprotect(0x7fffaf1e0000, 65536, PROT_READ) = 0
[00:00:25] mprotect(0x100ad0000, 65536, PROT_READ) = 0
[00:00:25] mprotect(0x7fffaf290000, 65536, PROT_READ) = 0
[00:00:25] prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
[00:00:25] munmap(0x7fffaf200000, 8599)            = 0
[00:00:25] getrandom("\x57\x15\x16\x31\xd6\xa0\x58\xe8", 8, GRND_NONBLOCK) = 8
[00:00:25] brk(NULL)                               = 0x10016f70000
[00:00:25] brk(0x10016fb0000)                      = 0x10016fb0000
[00:00:25] mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7fffaf200000
[00:00:25] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOENT (No such file or directory)
[00:00:25] munmap(0x7fffaf200000, 1)               = 0
[00:00:25] write(2, "Got -2, wanted -EFAULT\n", 23Got -2, wanted -EFAULT
[00:00:25] ) = 23
[00:00:25] exit_group(1)                           = ?
[00:00:25] +++ exited with 1 +++

Temporary build logs https://git.altlinux.org/tasks/361214/build/100/i586/log https://git.altlinux.org/tasks/361214/build/100/ppc64le/log

vt-alt avatar Oct 31 '24 01:10 vt-alt

The bundle one needs a bit more investigation. no-mmap-inval should skip now too on ppc. I'll check the sqpoll-sleep on x86, that's very odd.
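
For context, the strace above shows the 2 MiB `MAP_HUGETLB` mmap failing with ENOENT on this machine, which the test reported as a failure rather than a skip. Below is a minimal sketch of probing for huge pages first and treating any failure as a reason to skip. `probe_huge_pages()` is a hypothetical helper, and this is not necessarily how the actual fix was written.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Try to map `len` bytes of huge-page-backed anonymous memory.
 * Returns 0 if the mapping succeeded (it is unmapped again before
 * returning), or a negative errno such as -ENOENT or -ENOMEM if huge
 * pages of that size are not usable and the test should be skipped.
 * Hypothetical helper, not a liburing function.
 */
static int probe_huge_pages(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	if (p == MAP_FAILED)
		return -errno;
	munmap(p, len);
	return 0;
}
```

A caller would check `probe_huge_pages(2 * 1024 * 1024)` up front and skip the test on any negative return, instead of comparing a later error against -EFAULT.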

axboe avatar Oct 31 '24 01:10 axboe

Pushed another fix for sqpoll-sleep, can you give it a spin on x86?

axboe avatar Nov 01 '24 16:11 axboe

Thanks. For 0733494ef1f9ee8a593d86ff40e0aeb18a17cce2, all tests now pass on i586. On ppc64le:

[00:02:13] Running test recvsend_bundle.t                                      failed recv cqe: -105
[00:02:13] test d failed
[00:02:13] TCP test case (classic=0) failed
[00:02:13] Test recvsend_bundle.t failed with ret 1

[00:03:27] Tests failed (1): <recvsend_bundle.t>

vt-alt avatar Nov 03 '24 03:11 vt-alt

Thanks, so we're just down to the bundle test. That one will probably linger for a while until I get access to a ppc (or similar) system. Pretty sure it's a test issue, so I'd say just ignore it for now.

axboe avatar Nov 03 '24 13:11 axboe

I see. Thanks for the help. I decided to try with the latest commit 37a38802059420b7a204a27e2f6be6acd360893a, and ppc suddenly reported an additional failure:

[00:02:10] Running test recv-multishot.t                                       connect failed
[00:02:10] t_create_socket_pair failed: 4
[00:02:10] test stream=1 wait_each=1 recvmsg=0 early_error=4  defer=1 failed
[00:02:10] Test recv-multishot.t failed with ret 1

[00:03:25] Tests failed (2): <recv-multishot.t> <recvsend_bundle.t>

A repeated run didn't show the failure, so perhaps it's intermittent.

vt-alt avatar Nov 03 '24 13:11 vt-alt

JFYI: besides these old tests, which I just skip, a new test failed on all architectures (for liburing-2.9-rc1) and has not been reported yet:

[00:01:42] Running test read-inc-file.t                                        fail buffer check loop 0
[00:01:42] Test read-inc-file.t failed with ret 1

This is on Linux v6.12.8 (inside of kvm, with v6.12.6 on the host).

vt-alt avatar Jan 08 '25 19:01 vt-alt

That's expected, it's fixed by:

https://git.kernel.dk/cgit/linux/commit/?h=io_uring-6.13&id=ed123c948d06688d10f3b10a7bce1d6fbfd1ed07

which is upstream but hasn't made it into stable just yet.

axboe avatar Jan 08 '25 19:01 axboe

I'm taking it that the issue is resolved. Closing; please reopen or let us know if there are any problems.

isilence avatar Apr 03 '25 08:04 isilence