
What happens at the end of the file for lex? - input() return value

Open spth opened this issue 4 years ago • 2 comments

I wonder what is supposed to happen when a lex lexer reaches the end of the input while calling input().

This information needs to be included in the manual. The only information I found in the flex manual states:

If 'input()' encounters an end-of-file the normal 'yywrap()' processing is done. A "real" end-of-file is returned by 'input()' as 'EOF'.

That part is the same for flex 2.5.4 and flex 2.6.4, but the two versions behave quite differently. Also, what is an "end-of-file" vs. a "'real' end-of-file"?

The breaking change between 2.5.4 and 2.6.4 should be documented in the manual. And I wonder why it was made.

This small example reproduces the difference:


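%{
#include <stdio.h>   /* printf */
%}
%%
.       {for(int i = 0; i < 4; i++) {int ch = input(); printf("%d\n", ch);}}

%%
int
main (void)
{
  yylex();
  return 0;
}

/* Called by the scanner at end of input; returning 1 means no more input. */
int
yywrap (void)
{
  printf("yywrap!\n");
  return 1;
}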

Invoked on input containing only one character, I see

philipp@notebook5:/tmp$ ./a.out < test.c
10
yywrap!
-1
yywrap!
-1
yywrap!
-1
yywrap!

i.e. input() returning EOF for flex 2.5.4, and

philipp@notebook5:/tmp$ ./a.out < test.c
10
yywrap!
0
yywrap!
0
yywrap!
0
yywrap!

i.e. input() returning 0 for flex 2.6.4.
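Until the behavior is documented, a scanner that has to build with both versions could treat either return value as end of input. The following is only a minimal sketch of that idea: `normalize_input_result` is a hypothetical helper, not part of flex, and it assumes the input never contains a literal NUL byte, since a NUL would also be mapped to end of input.

```c
#include <stdio.h>   /* EOF */

/*
 * Hypothetical helper (not part of flex): maps the two end-of-input
 * conventions observed above onto a single value.
 *   flex 2.5.4: input() returned EOF (-1) at end of input
 *   flex 2.6.4: input() returned 0 at end of input
 * Assumption: the scanned input contains no literal NUL bytes, because
 * a NUL is indistinguishable from the 2.6.4 end-of-input value here.
 */
int normalize_input_result(int ch)
{
    if (ch == 0 || ch == EOF)
        return EOF;   /* report end of input the same way for both versions */
    return ch;        /* ordinary character: pass through unchanged */
}
```

In an action this would be used as `int ch = normalize_input_result(input());`, followed by a check for `ch == EOF`. This only hides the difference in the caller; it does not answer which return value is the intended one.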

There is also a Debian bug report about this: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911415. A previous issue was reported here as well, but it was closed without comment: https://github.com/westes/flex/issues/394

This change breaks, for example, the Small Device C Compiler.

spth, Jun 03 '20 08:06

The change was made here, but there is no information as to why, and no corresponding change in documentation: https://github.com/westes/flex/commit/f863c9490e6912ffcaeb12965fb3a567a10745ff

spth, Jun 03 '20 08:06

This also breaks the scanner used by libdtrace in FreeBSD.

markjdb, Feb 12 '21 19:02