flex icon indicating copy to clipboard operation
flex copied to clipboard

inclusion of WinFlex changes (drop need for posix functions fork/wait, htonl/htons)

Open GitMensch opened this issue 6 years ago • 16 comments

https://github.com/lexxmark/winflexbison/tree/master/flex/src includes some changes for non-GCC (obviously especially WIN32) compilers.

What do you think about integrating them (after cleanup)?

A good starting example seems to be filter.c (and flexdef.h) to use a different signature of struct filter and filter_apply_chain (). These changes could be integrated if one or both functions fork()/wait() are not available (at least with 2.6.4 there is a check in configure for them but neither a configure time failure nor a #ifdef HAVE_FORK in the code).

Similar could be done for tables.c and htonl / htons (which seem to have no check during configure).

I know this would mean to inspect all the changes and merge them one by one, but at least this would be a start for a more common code base.

So... was a discussion about this before?

GitMensch avatar Jun 20 '18 21:06 GitMensch

note: the fork (filter.c), wait (filter.c, main.c) parts may simply be replaced depending on #if defined (HAVE_SYS_WAIT_H) && defined (HAVE_FORK) as I guess those make only sense if both are defined.

Opinions?

Note: I actually try to build flex with OrangeC with the goal to use flex' testsuite as an additional test bed for this compiler (compilation works [when cheating an empty sys/wait.h] after their latest changes, linking stops on the non-posix parts that seem to be replaced in the flex parts of WinFlexBison, which ideally should have PR'd their changes in the first place...

GitMensch avatar Jun 20 '18 21:06 GitMensch

Note: if this is agreed upon I'd add some PR with getting similar changes in that are needed to compile and run with OrangeC and VS (as it seems to be the main target of the WinFlexBison changes).

GitMensch avatar Jun 22 '18 20:06 GitMensch

@GitMensch I have discovered the non-portability of htonl / htons, but I'm not sure how WinFlexBison solves it. (htonl and htons are used in skeleton, so adding a check of them in configure is useless. We need other workaround for the case.) Can you give a better pointer?

Regarding the others, I can't comment at the moment.

Explorer09 avatar Jun 24 '18 01:06 Explorer09

Regarding the others, I can't comment at the moment.

We likely should split the issue into multiple PR. I assume the fork/wait part are not included in the generated sources giving at least two PRs to handle...

@Explorer09 What do you mean by "used in the skeleton"? Isn't it only used in src/tables.c?

Obviously htonl and its family would be usable on many Windows compilers by including Winsock2.h (possibly also by including winsock.h), but this would still not be portable (just adds an option to compile with more [WIN32] compilers).

I have discovered the non-portability of htonl / htons, but I'm not sure how WinFlexBison solves it.

It doesn't use the functions but defines instead. This would be a first step in any case. Then define it depending on #if defined (htonl) || (defined(_POSIX_C_SOURCE) && _POSIX_C_SOURCE >= 200112L). The define WinFlex uses may not be ideal (as it always assumes it has to swap) but a starter:

#define SWAP_BYTE_ORDER_UINT(val)   (((val << 24) & 0xFF000000) |\
                                     ((val <<  8) & 0x00FF0000) |\
                                     ((val >>  8) & 0x0000FF00) |\
                                     ((val >> 24) & 0x000000FF) )

#define SWAP_BYTE_ORDER_USHORT(val) (((val << 8) & 0xFF00) |\
                                     ((val >> 8) & 0x00FF) )

byte swapping functions are included in many headers, for example libcob/common.h:

/* Byte swap functions */

/*
   The original idea for the byteswap routines was taken from GLib.
   (Specifically glib/gtypes.h)
   GLib is licensed under the GNU Lesser General Public License.
*/

/* Generic swapping functions */

#undef	COB_BSWAP_16_CONSTANT
#undef	COB_BSWAP_32_CONSTANT
#undef	COB_BSWAP_64_CONSTANT
#undef	COB_BSWAP_16
#undef	COB_BSWAP_32
#undef	COB_BSWAP_64

#define COB_BSWAP_16_CONSTANT(val)	((cob_u16_t) (		\
    (((cob_u16_t)(val) & (cob_u16_t) 0x00FFU) << 8) |		\
    (((cob_u16_t)(val) & (cob_u16_t) 0xFF00U) >> 8)))

#define COB_BSWAP_32_CONSTANT(val)	((cob_u32_t) (		\
    (((cob_u32_t) (val) & (cob_u32_t) 0x000000FFU) << 24) |	\
    (((cob_u32_t) (val) & (cob_u32_t) 0x0000FF00U) <<  8) |	\
    (((cob_u32_t) (val) & (cob_u32_t) 0x00FF0000U) >>  8) |	\
    (((cob_u32_t) (val) & (cob_u32_t) 0xFF000000U) >> 24)))

#define COB_BSWAP_64_CONSTANT(val)	((cob_u64_t) (		\
    (((cob_u64_t) (val) &					\
      (cob_u64_t) COB_U64_C(0x00000000000000FF)) << 56) |	\
    (((cob_u64_t) (val) &					\
      (cob_u64_t) COB_U64_C(0x000000000000FF00)) << 40) |	\
    (((cob_u64_t) (val) &					\
      (cob_u64_t) COB_U64_C(0x0000000000FF0000)) << 24) |	\
    (((cob_u64_t) (val) &					\
      (cob_u64_t) COB_U64_C(0x00000000FF000000)) <<  8) |	\
    (((cob_u64_t) (val) &					\
      (cob_u64_t) COB_U64_C(0x000000FF00000000)) >>  8) |	\
    (((cob_u64_t) (val) &					\
      (cob_u64_t) COB_U64_C(0x0000FF0000000000)) >> 24) |	\
    (((cob_u64_t) (val) &					\
      (cob_u64_t) COB_U64_C(0x00FF000000000000)) >> 40) |	\
    (((cob_u64_t) (val) &					\
      (cob_u64_t) COB_U64_C(0xFF00000000000000)) >> 56)))

/* Machine/OS specific overrides */

#ifdef	__GNUC__

#if	__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)

#define COB_BSWAP_16(val) (COB_BSWAP_16_CONSTANT (val))
#define COB_BSWAP_32(val) (__builtin_bswap32 (val))
#define COB_BSWAP_64(val) (__builtin_bswap64 (val))

#elif	defined(__i386__)

#define COB_BSWAP_16(val) (COB_BSWAP_16_CONSTANT (val))
#define COB_BSWAP_32(val)					\
       (__extension__						\
	({ register cob_u32_t __v,				\
	     __x = ((cob_u32_t) (val));				\
	   if (__builtin_constant_p (__x))			\
	     __v = COB_BSWAP_32_CONSTANT (__x);			\
	   else							\
	     __asm__ ("bswap %0"				\
		      : "=r" (__v)				\
		      : "0" (__x));				\
	    __v; }))
#define COB_BSWAP_64(val)					\
       (__extension__						\
	({ union { cob_u64_t __ll;				\
		   cob_u32_t __l[2]; } __w, __r;		\
	   __w.__ll = ((cob_u64_t) (val));			\
	   if (__builtin_constant_p (__w.__ll))			\
	     __r.__ll = COB_BSWAP_64_CONSTANT (__w.__ll);	\
	   else							\
	     {							\
	       __r.__l[0] = COB_BSWAP_32 (__w.__l[1]);		\
	       __r.__l[1] = COB_BSWAP_32 (__w.__l[0]);		\
	     }							\
	   __r.__ll; }))

#elif defined (__ia64__)

#define COB_BSWAP_16(val) (COB_BSWAP_16_CONSTANT (val))
#define COB_BSWAP_32(val)					\
       (__extension__						\
	 ({ register cob_u32_t __v,				\
	      __x = ((cob_u32_t) (val));			\
	    if (__builtin_constant_p (__x))			\
	      __v = COB_BSWAP_32_CONSTANT (__x);		\
	    else						\
	     __asm__ __volatile__ ("shl %0 = %1, 32 ;;"		\
				   "mux1 %0 = %0, @rev ;;"	\
				    : "=r" (__v)		\
				    : "r" (__x));		\
	    __v; }))
#define COB_BSWAP_64(val)					\
       (__extension__						\
	({ register cob_u64_t __v,				\
	     __x = ((cob_u64_t) (val));				\
	   if (__builtin_constant_p (__x))			\
	     __v = COB_BSWAP_64_CONSTANT (__x);			\
	   else							\
	     __asm__ __volatile__ ("mux1 %0 = %1, @rev ;;"	\
				   : "=r" (__v)			\
				   : "r" (__x));		\
	   __v; }))

#elif defined (__x86_64__)

#define COB_BSWAP_16(val) (COB_BSWAP_16_CONSTANT (val))
#define COB_BSWAP_32(val)					\
      (__extension__						\
	({ register cob_u32_t __v,				\
	     __x = ((cob_u32_t) (val));				\
	   if (__builtin_constant_p (__x))			\
	     __v = COB_BSWAP_32_CONSTANT (__x);			\
	   else							\
	    __asm__ ("bswapl %0"				\
		     : "=r" (__v)				\
		     : "0" (__x));				\
	   __v; }))
#define COB_BSWAP_64(val)					\
       (__extension__						\
	({ register cob_u64_t __v,				\
	     __x = ((cob_u64_t) (val));				\
	   if (__builtin_constant_p (__x))			\
	     __v = COB_BSWAP_64_CONSTANT (__x);			\
	   else							\
	     __asm__ ("bswapq %0"				\
		      : "=r" (__v)				\
		      : "0" (__x));				\
	   __v; }))

#else /* Generic gcc */

#define COB_BSWAP_16(val) (COB_BSWAP_16_CONSTANT (val))
#define COB_BSWAP_32(val) (COB_BSWAP_32_CONSTANT (val))
#define COB_BSWAP_64(val) (COB_BSWAP_64_CONSTANT (val))

#endif

#elif defined(_MSC_VER)

#define COB_BSWAP_16(val) (_byteswap_ushort (val))
#define COB_BSWAP_32(val) (_byteswap_ulong (val))
#define COB_BSWAP_64(val) (_byteswap_uint64 (val))

#else /* Generic */

#define COB_BSWAP_16(val) (COB_BSWAP_16_CONSTANT (val))
#define COB_BSWAP_32(val) (COB_BSWAP_32_CONSTANT (val))
#define COB_BSWAP_64(val) (COB_BSWAP_64_CONSTANT (val))

#endif

/* End byte swap functions */

GitMensch avatar Jun 24 '18 09:06 GitMensch

@GitMensch Maybe I was wrong about htons/htonl pair, but their inverse, ntohs and ntohl, have to be included in the skeleton as it is part of the decoder of the flex table file format.

I believe making htons/htonl macro defines would cause more portability problem than it solves, since (I think) some compilers would recognize htons/htonl functions and optimize them into inline assembly without generating function calls. Repeating what a compiler would do in a header is a bad idea.

I think with portability to Windows, including <winsock.h> would be the way to go. But for even more exotic platforms, I don't know if we would need this naïve fallback:

unsigned short htons(unsigned short val) {
    unsigned char arr[sizeof(short)] = {0};
    int i;
    for (i = sizeof(short) - 1; i >= 0; i--) {
        arr[i] = (unsigned char)(val);
        val >>= CHAR_BIT;
    }
    return (*(unsigned short *)arr);
}

(Yes, you get it. This is the most portable htons implementation within ANSI C constraints.)

Explorer09 avatar Jun 24 '18 10:06 Explorer09

If we're going to provide an implementation of a function, we should use the libobj mechanism.

On Sunday, 24 June 2018, 3:39 am -0700, "Kang-Che Sung (宋岡哲)" [email protected] wrote:

@GitMensch Maybe I was wrong about htons/htonl pair, but their inverse, ntohs and ntohl, have to be included in the skeleton as it is part of the decoder of the flex table file format.

I believe making htons/htonl macro defines would cause more portability problem than it solves, since (I think) some compilers would recognize htons/htonl functions and optimize them into inline assembly without generating function calls. Repeating what a compiler would do in a header is a bad idea.

I think with portability to Windows, including <winsock.h> would be the way to go. But for even more exotic platforms, I don't know if we would need this naive fallback:

short htons(short val) {
    unsigned char arr[sizeof(short)] = {0};
    int i;
    for (i = sizeof(short) - 1; i >= 0; i--) {
        arr[i] = (unsigned char)(val);
        val >>= CHAR_BIT;
    }
    return (*(short *)arr);
}

(Yes, you get it. This is the most portable htons implementation within ANSI C constraints.)

-- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/westes/flex/issues/367#issuecomment-399746765

-- Will Estes [email protected]

westes avatar Jun 24 '18 13:06 westes

@westes I think the licensing problem (it's LGPL) already prevents us from including libcob's header directly info flex's source.

Explorer09 avatar Jun 24 '18 13:06 Explorer09

libcob?

On Sunday, 24 June 2018, 6:29 am -0700, "Kang-Che Sung (宋岡哲)" [email protected] wrote:

@westes I think the licensing problem (it's LGPL) already prevents us from including libcob's header directly info flex's source.

-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/westes/flex/issues/367#issuecomment-399756660

-- Will Estes [email protected]

westes avatar Jun 24 '18 13:06 westes

@westes I was talking about libcob that @GitMensch mentioned. And I had no idea what is "libobj mechanism" you are talking about.

Explorer09 avatar Jun 24 '18 13:06 Explorer09

libobj as handled by automake. It's the way to provide portability functions when absent from the system at compile time.

On Sunday, 24 June 2018, 6:36 am -0700, "Kang-Che Sung (宋岡哲)" [email protected] wrote:

@westes I was talking about libcob that @GitMensch mentioned. And I had no idea what is "libobj mechanism" you are talking about.

-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/westes/flex/issues/367#issuecomment-399757294

-- Will Estes [email protected]

westes avatar Jun 24 '18 13:06 westes

@westes Gosh. There're many similarly named "libobj" library out there on the web, so it was not easy to see which one you were talking about. Well, I could make a PR about adding htons and htonl to automake $LIBOBJS, but that wouldn't solve the problem of ntohs and ntohl in the generated scanner code. I'm not sure what to do with the latter case. Adding a fallback version of yyflex_ntohs and yyflex_ntohl into the skeleton?

Explorer09 avatar Jun 24 '18 13:06 Explorer09

Hello everyone,

when I ported flex to Windows I used htons/htonl functions from <winsock.h>. then I decided to remove dependency from Ws2_32.dll by replacing htons/htonl with macro (because wintel has only one byte order).

I can easily revert back my "improvement" and use htons/htonl again.

ntohs/ntohl are used in generated code only. I didn't changed this code so parsers are dependent from Ws2_32.dll under windows now. And I think it's not a big deal.

In summary, I don't think ntohs/ntohl is a problem at all.

lexxmark avatar Sep 13 '18 09:09 lexxmark

@lexxmark Please revert this change to use functions from winsock.h. I'll then try to create a PR to this repo which includes the checks for #if defined (HAVE_SYS_WAIT_H) && defined (HAVE_FORK) and HAVE_WINSOCK_H (including configure.ac to set these).

This should be a good start to minimize the differences between this repo and your port.

GitMensch avatar Sep 13 '18 09:09 GitMensch

@lexxmark If it's me, I would like to have htons/htonl function names preserved. The reason is they're POSIX, even though they're not part of standard (ISO) C (I wish it become so, or at least reserve the names, so that when people refer to htonl they will always mean this function). I've read your code (this example), and I don't like the idea of hard-coding SWAP_BYTE_ORDER_UINT(), as it would bring more portability problems than it solves.

If you read my post above this issue report, I have presented a code that does endian-independant appraroch to byte swapping (or see this article on the web). If we are to add a htons or htonl in Flex, it would be that most portable version. Otherwise, I would better tell compilers to fix their optimization rather than implement a workaround that would always be inferior to a machine instruction.

Explorer09 avatar Sep 13 '18 10:09 Explorer09

@Explorer09 lexxmark wrote about using htons/htonl in winflexbison again (with additional winsock.h and link to Ws2_32.dll). As soon as this is done I'll do something similar for flex using defines and checks in configure.ac + PR here.

GitMensch avatar Sep 13 '18 13:09 GitMensch

I've reverted htons/htonl (see my commit)

lexxmark avatar Sep 14 '18 07:09 lexxmark