inih icon indicating copy to clipboard operation
inih copied to clipboard

Seems to not discard the rest of a long line?

Open xlucn opened this issue 1 year ago • 4 comments

The problem:

The issue assumes we are dealing with a long line in an ini file.

In line 136, reader (fgets) will read max_line chars. If using a heap, then in line 139 and forward, the heap will resize and all the rest of the line will be read:

https://github.com/benhoyt/inih/blob/5e1d9e2625842dddb3f9c086a50f22e4f45dfc2b/ini.c#L136-L141

But, in the default case with a stack (fixed size array), fgets will read max_line and do nothing with the rest of the line, which I think will stay in the input stream. Then, in the next iteration of while on line 136, fgets will continue to read that rest of the line, but inih treats those text as the next line. The code will most likely find no : or = in those text, thus report an error on line #(line number of the long line + 1).

Discussion:

So, is this expected behavior, or should inih discard the rest of a lone line? The latter might be something like this:

    while (reader(line, (int)max_line, stream) != NULL) {
#if INI_ALLOW_REALLOC && !INI_USE_STACK
... ...
#else
if (strchr(line, '\n') == NULL)
    while (fgetc(stream) != '\n')
        ;
#endif

xlucn avatar Oct 23 '22 14:10 xlucn

Hmm, it's a fair question. However, I'm not sure silently discarding the rest of the line is better than returning an error (and inih will keep parsing on error by default). I guess we could discard the rest of the line to avoid unexpected behaviour, and still return an error?

benhoyt avatar Oct 23 '22 21:10 benhoyt

I guess we could discard the rest of the line to avoid unexpected behaviour, and still return an error?

I guess that works, too. Only that "not having read all the texts" is not as severe as a format error or handler-returned error. And currently inih does not provide any other information about the error type (similar to errno).

Anyway, the rest of a long line should be regarded no matter what, in case those texts represent another valid section or key-value entry, thus contaminating the result.

xlucn avatar Oct 24 '22 02:10 xlucn

Could use some test cases like this:

a=<197 * '1'>b=222

and see if b=222 is parsed as a new entry. My result with current inih is (in this case, no error is reported due to the rest of the line is successfully parsed)

section: , name: a, value: 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
section: , name: b, value: 222
0

with this test program

#include <stdio.h>
#include <ini.h>

int handler(void *user, const char *s, const char *n, const char *v)
{
	printf("section: %s, name: %s, value: %s\n", s, n, v);
	return 1;
}

int main (int argc, char *argv[])
{
	printf("%d\n", ini_parse("testini.ini", handler, NULL));
	return 0;
}

xlucn avatar Oct 24 '22 03:10 xlucn

Thanks for this. I don't have time to fix this properly now, but I'd be happy to review a quality PR (code and tests) to address this. I think we should probably discard the rest of the line, keep parsing, and return an error.

However, it's a little bit tricky. We need to distinguish the case where the last line in the file doesn't have a '\n' at the end -- that shouldn't be an error. Note also that we can't use fgetc, as we need to stick with the "reader" interface (fgets).

Also note that strchr(line, '\n') will scan the string an extra time -- it'd be good to avoid that if possible. Maybe we can combine the rstrip(start), which does a strlen(s), with a check for '\n'?

benhoyt avatar Nov 10 '22 23:11 benhoyt