neomutt icon indicating copy to clipboard operation
neomutt copied to clipboard

check return value of sscanf()

Open roccoblues opened this issue 1 year ago • 1 comments
trafficstars

This fixes the Missing return-value check for a 'scanf'-like function alerts from Github code scanning.

There are a lot more places where we don't check the return value of sscanf() but somehow Github doesn't complain about those. Maybe there's a limit of alerts per rule?

roccoblues avatar May 13 '24 18:05 roccoblues

I don't think we should check the calls of sscanf on the result of a successful regex match. Failing to parse those would indicate an error in the code, not in the provided data. If anything, we could assert that sscanf succeeds, but I don't think we do that very often.

gahr avatar May 13 '24 20:05 gahr

If we're going to improve sscanf(3) calls, I'd go a step further and remove them.

sscanf(3) says:

BUGS
   Numeric conversion specifiers
     Use of the numeric conversion specifiers produces Undefined Behav‐
     ior for invalid input.  See C11 7.21.6.2/10.  This is a bug in the
     ISO  C  standard,  and  not an inherent design issue with the API.
     However, current implementations are not safe from that bug, so it
     is not recommended to use  them.   Instead,  programs  should  use
     functions  such  as  strtol(3)  to  parse numeric input.  Alterna‐
     tively, mitigate it by specifying a maximum field width.

alejandro-colomar avatar May 13 '24 21:05 alejandro-colomar

In https://github.com/shadow-maint/shadow/, I've been developing a set of macros that wrap strtol(3) et al. to do numeric parsing safely:

https://github.com/shadow-maint/shadow/pull/893

And wrote a library, hosted in https://git.kernel.org/pub/scm/libs/liba2i/liba2i.git/.

alejandro-colomar avatar May 13 '24 21:05 alejandro-colomar

I don't think we should check the calls of sscanf on the result of a successful regex match.

Failing to parse those would indicate an error in the code, not in the provided data.

sscanf(3) could fail due to ENOMEM at run-time. An error doesn't necessarily indicate a programmer error.

If anything, we could assert that sscanf succeeds, but I don't think we do that very often.

I would assert programmer errors (or just ignore them), but for run-time errors, usual error handling is more appropriate, I think.

alejandro-colomar avatar May 13 '24 22:05 alejandro-colomar

Honestly, I don't think ENOMEM in sscanf is a realistic scenario (nowadays). But sure, let's take a nodiscard approach everywhere, if that's what we want to focus on.

Edit 1: btw, sscanf producing ENOMEM might be a Linux detail? POSIX doesn't seem to mention it (it does for fscanf), and the FreeBSD manual page doesn't either.

Edit 2: I have double check the source of the FreeBSD implementation - just to have an empirical data point, and it never fails on ENOMEM. Which makes sense: what memory would sscanf need to allocate?

gahr avatar May 14 '24 04:05 gahr

If roccoblues is willing, I suggest:

  1. Test the retval of every scanf() This means making a decision, for each case, on how to handle an error. ASSERT() is ok where we've already done some checking, e.g. regex.

  2. Simplify every numeric scanf() into an atoi() call. Use gahr's wrappers in mutt/atoi.h

  3. Future: Consider using liba2i

flatcap avatar May 14 '24 07:05 flatcap

If roccoblues is willing, I suggest:

sure, this might take a while but it's an easy to work on incremental.

1. Test the retval of every `scanf()`
   This means making a decision, for each case, on how to handle an error.
   `ASSERT()` is ok where we've already done some checking, e.g. regex.

ok, I've updated this PR.

2. Simplify every numeric `scanf()` into an `atoi()` call.
   Use gahr's wrappers in [mutt/atoi.h](https://github.com/neomutt/neomutt/blob/main/mutt/atoi.h)

just for me to be sure: in the current changes those couldn't be used right?

roccoblues avatar May 14 '24 08:05 roccoblues

in the current changes those [atoi functions] couldn't be used right?

We can use them, I was suggesting that we sort out the error paths first. s/we/you/ :-)

flatcap avatar May 14 '24 08:05 flatcap

As discussed on IRC: this feels like a lot of busywork without producing much value. I'm going to just drop this.

roccoblues avatar May 14 '24 09:05 roccoblues

Honestly, I don't think ENOMEM in sscanf is a realistic scenario (nowadays). But sure, let's take a nodiscard approach everywhere, if that's what we want to focus on.

Edit 1: btw, sscanf producing ENOMEM might be a Linux detail? POSIX doesn't seem to mention it (it does for fscanf),

sscanf(3posix):

DESCRIPTION
     Refer to fscanf().

fscanf(3posix):

ERRORS
     For the conditions under which the fscanf() functions fail and may
     fail, refer to fgetc() or fgetwc().

fgetc(3posix):

     The fgetc() function may fail if:

     ENOMEM
            Insufficient storage space is available.

and the FreeBSD manual page doesn't either.

I don't find an ERRORS section in FreeBSD's sscanf(3). :|

Edit 2: I have double check the source of the FreeBSD implementation - just to have an empirical data point, and it never fails on ENOMEM. Which makes sense: what memory would sscanf need to allocate?

I honestly don't know. Supposedly, via fgetc(3), according to POSIX.

alejandro-colomar avatar May 14 '24 10:05 alejandro-colomar

that chain of logic makes no sense. the f*() functions can fail if they cannot allocate a buffer. sscanf does not allocate buffers.

ossilator avatar May 14 '24 10:05 ossilator

3. Future: Consider using liba2i

I hope to get it packaged into the first distributions during this year or the next, although it depends on how much time it takes to to convince distros, and how much time they have.

Also, I'll wait to do that after shadow has finished migrating to those macros.

But when that's closer to be done, I'll come back here and do the migration to liba2i, if you want.

alejandro-colomar avatar May 14 '24 10:05 alejandro-colomar

that chain of logic makes no sense. the f*() functions can fail if they cannot allocate a buffer.

sscanf does not allocate buffers.

I see the following on musl (I chose musl because it's usually easier to read than glibc):

$ grepc -tfd sscanf . | grep scanf
./src/stdio/sscanf.c:int sscanf(const char *restrict s, const char *restrict fmt, ...)
	ret = vsscanf(s, fmt, ap);
$ grepc -tfd vsscanf . | grep scanf
./src/stdio/vsscanf.c:int vsscanf(const char *restrict s, const char *restrict fmt, va_list ap)
	return vfscanf(&f, fmt, ap);

It is just a wrapper to f*() functions, and so it can allocate, and so it can ENOMEM. I didn't check if the paths that allocate are unreachable when called from vsscanf(3), though.

$ grepc -tfd vfscanf . | grep alloc
	int alloc = 0;
		alloc = 0;
			alloc = !!dest;
			alloc = 0;
				if (alloc) {
					wcs = malloc(k*sizeof(wchar_t));
					if (!wcs) goto alloc_fail;
					if (alloc && i==k) {
						wchar_t *tmp = realloc(wcs, k*sizeof(wchar_t));
						if (!tmp) goto alloc_fail;
			} else if (alloc) {
				s = malloc(k);
				if (!s) goto alloc_fail;
						char *tmp = realloc(s, k);
						if (!tmp) goto alloc_fail;
			if (alloc) {
alloc_fail:
		if (alloc) {

alejandro-colomar avatar May 14 '24 10:05 alejandro-colomar