zephyr icon indicating copy to clipboard operation
zephyr copied to clipboard

tests: tests/posix/headers: portability.posix.headers.newlib.with_posix_api build failure

Open hakehuang opened this issue 1 year ago • 2 comments

Describe the bug tests/posix/headers build failure

Please also mention any information which could help others to understand the problem you're facing:

  • What target platform are you using? this fails on all NXP RT series RT11xx/10xx/600/500
  • What have you tried to diagnose or workaround this issue? git bisect
  • Is this a regression? If yes, have you been able to "git bisect" it to a specific commit?
a9a909c558c8c2069742aeee55aba831740826cd is the first bad commit
commit a9a909c558c8c2069742aeee55aba831740826cd
Author: Chris Friedt <[email protected]>
Date:   Sat Jun 15 08:27:12 2024 -0400

    fdtable: read, write, close: only execute methods if non-NULL
    
    Only invoke vtable methods read, write, and close if they are
    non-NULL.
    
    The close() vtable method is optional, so that should not return
    an error if zvfs_close() is called and that method is
    unimplemented.
    
    Otherwise, if zvfs_read() or zvfs_write() are called and the
    corresponding vtable method is unimplemented, fail setting
    errno to EIO.
    
    Signed-off-by: Chris Friedt <[email protected]>

 lib/os/fdtable.c | 72 ++++++++++++++++++++++++++------------------------------
 1 file changed, 33 insertions(+), 39 deletions(-)

To Reproduce Steps to reproduce the behavior:

twister -i -p mimxrt1170_evk/mimxrt1176/cm7 -s portability.posix.headers.newlib.with_posix_api

Expected behavior build PASS

Impact POSIX compatible

Logs and console output

/home/jenkins/agent/workspace/bitbucket_build_platform/modules/crypto/mbedtls/library/platform_util.c
In file included from /home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:116,
                 from /home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:12,
                 from /home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/net/socket_select.h:22,
                 from /home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/sys/select.h:10,
                 from /opt/toolchains/zephyr-sdk-0.16.8/arm-zephyr-eabi/arm-zephyr-eabi/sys-include/sys/types.h:50,
                 from /opt/toolchains/zephyr-sdk-0.16.8/arm-zephyr-eabi/arm-zephyr-eabi/sys-include/stdio.h:61,
                 from /home/jenkins/agent/workspace/bitbucket_build_platform/modules/crypto/mbedtls/include/mbedtls/platform.h:58,
                 from /home/jenkins/agent/workspace/bitbucket_build_platform/modules/crypto/mbedtls/library/platform_util.c:26:
/opt/toolchains/zephyr-sdk-0.16.8/arm-zephyr-eabi/arm-zephyr-eabi/sys-include/time.h:56:1: error: unknown type name 'clock_t'
   56 | clock_t    clock (void);
      | ^~~~~~~
/opt/toolchains/zephyr-sdk-0.16.8/arm-zephyr-eabi/arm-zephyr-eabi/sys-include/time.h:30:1: note: 'clock_t' is defined in header '<time.h>'; did you forget to '#include <time.h>'?
   29 | #include <sys/timespec.h>
  +++ |+#include <time.h>
   30 | 
In file included from /home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/posix_types.h:15,
                 from /home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:61:
/opt/toolchains/zephyr-sdk-0.16.8/arm-zephyr-eabi/arm-zephyr-eabi/sys-include/sys/_pthreadtypes.h:182:3: error: unknown type name 'clock_t'
  182 |   clock_t  clock;             /* specifiy clock for timeouts */
      |   ^~~~~~~
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/posix_types.h:76:9: error: unknown type name 'clockid_t'
   76 |         clockid_t clock;
      |         ^~~~~~~~~
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:93:19: error: unknown type name 'clockid_t'; did you mean 'clock_pfd_t'?
   93 | int clock_gettime(clockid_t clock_id, struct timespec *ts);
      |                   ^~~~~~~~~
      |                   clock_pfd_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:94:18: error: unknown type name 'clockid_t'; did you mean 'clock_pfd_t'?
   94 | int clock_getres(clockid_t clock_id, struct timespec *ts);
      |                  ^~~~~~~~~
      |                  clock_pfd_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:95:19: error: unknown type name 'clockid_t'; did you mean 'clock_pfd_t'?
   95 | int clock_settime(clockid_t clock_id, const struct timespec *ts);
      |                   ^~~~~~~~~
      |                   clock_pfd_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:96:36: error: unknown type name 'clockid_t'; did you mean 'clock_pfd_t'?
   96 | int clock_getcpuclockid(pid_t pid, clockid_t *clock_id);
      |                                    ^~~~~~~~~
      |                                    clock_pfd_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:98:18: error: unknown type name 'clockid_t'; did you mean 'clock_pfd_t'?
   98 | int timer_create(clockid_t clockId, struct sigevent *evp, timer_t *timerid);
      |                  ^~~~~~~~~
      |                  clock_pfd_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/posix/time.h:105:21: error: unknown type name 'clockid_t'; did you mean 'clock_pfd_t'?
  105 | int clock_nanosleep(clockid_t clock_id, int flags,
      |                     ^~~~~~~~~
      |                     clock_pfd_t
In file included from /home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/sys/fdtable.h:13,
                 from /home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/net/socket_select.h:26:
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/fs/fs.h:368:1: error: unknown type name 'ssize_t'; did you mean '_ssize_t'?
  368 | ssize_t fs_read(struct fs_file_t *zfp, void *ptr, size_t size);
      | ^~~~~~~
      | _ssize_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/fs/fs.h:389:1: error: unknown type name 'ssize_t'; did you mean '_ssize_t'?
  389 | ssize_t fs_write(struct fs_file_t *zfp, const void *ptr, size_t size);
      | ^~~~~~~
      | _ssize_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/fs/fs.h:409:36: error: unknown type name 'off_t'; did you mean '_off_t'?
  409 | int fs_seek(struct fs_file_t *zfp, off_t offset, int whence);
      |                                    ^~~~~
      |                                    _off_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/fs/fs.h:425:1: error: unknown type name 'off_t'; did you mean '_off_t'?
  425 | off_t fs_tell(struct fs_file_t *zfp);
      | ^~~~~
      | _off_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/fs/fs.h:447:40: error: unknown type name 'off_t'; did you mean '_off_t'?
  447 | int fs_truncate(struct fs_file_t *zfp, off_t length);
      |                                        ^~~~~
      |                                        _off_t
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/sys/fdtable.h:48:17: error: expected specifier-qualifier-list before 'ssize_t'
   48 |                 ssize_t (*read)(void *obj, void *buf, size_t sz);
      |                 ^~~~~~~
/home/jenkins/agent/workspace/bitbucket_build_platform/zephyr/include/zephyr/sys/fdtable.h:52:17: error: expected specifier-qualifier-list before 'ssize_t'
   52 |                 ssize_t (*write)(void *obj, const void *buf, size_t sz);
      |                 ^~~~~~~
ninja: build stopped: subcommand failed.

Environment (please complete the following information):

  • OS: (e.g. Linux)
  • Toolchain (e.g Zephyr SDK, ...) zephyr-sdk-0.16.8
  • Commit SHA or Version used: v3.7.0-rc1-551-g5b4c8945fd62

hakehuang avatar Jun 29 '24 10:06 hakehuang

It's not really clear to me how commit a9a909c558c8c2069742aeee55aba831740826cd could affect anything to do with mbedtls or posix headers (so I think there was a misstep in the git bisect, likely missing a west update), but this test does fail to build using the command below on main, so I'll continue to debug.

twister -i -p mimxrt1170_evk/mimxrt1176/cm7 -s portability.posix.headers.newlib.with_posix_api

cfriedt avatar Jun 29 '24 12:06 cfriedt

A bisect at my end identified 94d712e5cfe001996d7280c028e0e0303cf7b747 as the offending commit, which is also just as arbitrary and unrelated.

The search continues!

cfriedt avatar Jun 29 '24 13:06 cfriedt

The bisect I ran points the same PR (https://github.com/zephyrproject-rtos/zephyr/pull/73978) however, the first commit of that PR does not break posix thing but fdtable.c.

The error is:

.../zephyr/lib/os/fdtable.c: In function 'zvfs_rw':
.../zephyr/lib/os/fdtable.c:315:41: error: 'const struct fd_op_vtable' has no member named 'write_offset'; did you mean 'write_offs'?
  315 |                 if (fdtable[fd].vtable->write_offset == NULL) {
      |                                         ^~~~~~~~~~~~
      |                                         write_offs
.../zephyr/lib/os/fdtable.c:319:51: error: 'const struct fd_op_vtable' has no member named 'write_offset'; did you mean 'write_offs'?
  319 |                         res = fdtable[fd].vtable->write_offset(fdtable[fd].obj, buf, sz,
      |                                                   ^~~~~~~~~~~~
      |                                                   write_offs
.../zephyr/lib/os/fdtable.c:327:51: error: 'const struct fd_op_vtable' has no member named 'read_offset'; did you mean 'read_offs'?
  327 |                         res = fdtable[fd].vtable->read_offset(fdtable[fd].obj, buf, sz,
      |                                                   ^~~~~~~~~~~
      |                                                   read_offs
.../zephyr/lib/os/fdtable.c:335:1: error: label 'unlock' defined but not used [-Werror=unused-label]
  335 | unlock:
      | ^~~~~~

the following commits in that PR do affect posix related implementation in various ways so my guess would be that it's more error compounding in what's been reported by @hakehuang

EDIT: this error gets fixed in a later commit in that same PR: https://github.com/cfriedt/zephyr/commit/88e631dc82a129853542c3dfbeadb995ac13714f#diff-c5e475ef95ce582f4a29b7c9669794a903686ea1d141f6095239e51176c99a5e

EDIT2: after fixing this error in the commit that introduced it, a new bisect points to https://github.com/zephyrproject-rtos/zephyr/pull/73978/commits/df00883bfb5682e5113675adb6390e1ff5d02a7a which also isn't directly related to the error reported here. Not having all commits to build makes git bisect a bit less useful to identify the offending commit.

ithinuel avatar Jul 01 '24 10:07 ithinuel

I have the same problem. The sys/types.h from Newlib includes sys/select.h (provided by Zephyr), when __BSD_VISIBLE is set to 1. The sys/select.h later includes zephyr/net/socket_select.h which includes time.h and zephyr/sys/fdtable.h, which need off_t, ssize_t and clockid_t.

Missing types should be provided by sys/types.h, but sys/select.h is included before these types are defined.

Including posix_features.h using -imacro doesn't solve the issue. We can't also rely on POSIX headers from Newlib, because some types have different representation (e.g. pthread_rwlockattr_t).

For me, creating sys/types.h with the following content worked (sys/types.h from Newlib never includes sys/select.h).

diff --git a/include/zephyr/posix/sys/types.h b/include/zephyr/posix/sys/types.h
new file mode 100644
index 00000000000..9a13cfbda64
--- /dev/null
+++ b/include/zephyr/posix/sys/types.h
@@ -0,0 +1,19 @@
+/*
+ * Copyright (c) 2024 Google LLC
+ *
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+#ifndef ZEPHYR_INCLUDE_POSIX_SYS_TYPES_H_
+#define ZEPHYR_INCLUDE_POSIX_SYS_TYPES_H_
+
+#ifdef CONFIG_NEWLIB_LIBC
+
+#include <sys/features.h>
+#undef __BSD_VISIBLE
+
+#endif /* CONFIG_NEWLIB_LIBC */
+
+#include_next <sys/types.h>
+
+#endif /* ZEPHYR_INCLUDE_POSIX_SYS_TYPES_H_ */

duda-patryk avatar Jul 02 '24 15:07 duda-patryk

Just got back (I have quite a commute :-/). Looking at it again.

cfriedt avatar Jul 02 '24 22:07 cfriedt

I think part of the issue is that posix types are being used where they probably shouldn't be.

off_t, ssize_t and <sys/types.h> are all part of posix but are being used in fdtable.h (below the posix api).

There is even a comment in fdtable.h that says something along the lines of FIXME: For native_posix ssize_t, off_t, so clearly using those types in the base OS has been a challenge we have faced in the past.

The problem is compounded because all of the fdtable.h consumers are also therefore forced to use posix types creating a bit of a layering rat's nest.

I have a fairly simple workaround (just define the types locally). The off_t and ssize_t types should be removed after v3.7.0.

cfriedt avatar Jul 03 '24 00:07 cfriedt

Very strange how this somehow managed to go through CI without triggering any failures on the original set of PRs...

cfriedt avatar Jul 03 '24 01:07 cfriedt