
folding snprintf with n greater than INT_MAX succeeds and fails to set errno

Open msebor opened this issue 2 years ago • 2 comments

As the following test case shows, LLVM folds a call to snprintf whose format string is a constant with no formatting directives: the call is replaced by a store of the string and a constant return value equal to the string's length. The fold is done whenever the bound (the second argument, n, to the function) is greater than the string length. This is both overly aggressive and needlessly restrictive. This issue is about the former.

@s = constant [4 x i8] c"123\00"

declare i32 @snprintf(i8*, i64, i8*, ...)

define i32 @f(i8* %d) {
  %f = getelementptr [4 x i8], [4 x i8]* @s, i32 0, i32 0
  %n = call i32 (i8*, i64, i8*, ...) @snprintf(i8* %d, i64 2147483648, i8* %f)
  ret i32 %n
}

The resulting IR of the function:

define i32 @f(i8* %d) {
  %1 = bitcast i8* %d to i32*
  store i32 3355185, i32* %1, align 1
  ret i32 3
}
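
For reference, the stored constant 3355185 is 0x00333231, i.e. the four bytes of "123\0" packed with the first byte least significant, which suggests the fold was done for a little-endian target. A minimal C sketch to check the arithmetic (the helper name `pack_le32` is hypothetical, not anything in LLVM):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack four bytes into a 32-bit value in
   little-endian order (bytes[0] ends up in the low 8 bits). */
static uint32_t pack_le32(const char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)(unsigned char)bytes[0]
         | (uint32_t)(unsigned char)bytes[1] << 8
         | (uint32_t)(unsigned char)bytes[2] << 16
         | (uint32_t)(unsigned char)bytes[3] << 24;
}
```

Calling `pack_le32("123")` reads '1', '2', '3' and the terminating NUL, and yields the same value as the `store i32` above.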

However, as an extension to C, POSIX specifies that

The snprintf() function shall fail if:

[EOVERFLOW] [CX] The value of n is greater than {INT_MAX}.

This requirement is violated by the folding above. To conform, the optimization either needs to fold the call to -1 and arrange for errno to be set to EOVERFLOW on POSIX systems, or it needs to avoid the folding for n in excess of INT_MAX. (GCC does the latter, after issuing a warning for the excessive bound.)
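
The two conforming options can be sketched in C roughly as follows. This is only a model of the intended semantics, assuming a POSIX system where `EOVERFLOW` is defined; the names `may_fold_snprintf` and `folded_snprintf_model` are made up for illustration and are not LLVM or libc functions.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Second option: decline to fold when the bound exceeds INT_MAX.
   len is the length of the directive-free format string. */
static int may_fold_snprintf(size_t n, size_t len)
{
    return len < n && n <= (size_t)INT_MAX;
}

/* First option: what a folded call would have to do on POSIX.
   Models snprintf(dst, n, lit) where lit has no directives and
   (by assumption) strlen(lit) fits in an int. */
static int folded_snprintf_model(char *dst, size_t n, const char *lit)
{
    assert(dst != NULL && lit != NULL);
    if (n > (size_t)INT_MAX) {
        errno = EOVERFLOW;      /* POSIX [CX] requirement */
        return -1;
    }
    size_t len = strlen(lit);
    if (n > 0) {
        /* Copy at most n - 1 bytes and always NUL-terminate. */
        size_t copy = len < n ? len : n - 1;
        memcpy(dst, lit, copy);
        dst[copy] = '\0';
    }
    return (int)len;
}
```

With the test case above, `may_fold_snprintf(2147483648u, 3)` is false, so the call would be left alone under the second option; under the first, the model returns -1 with errno set to EOVERFLOW instead of 3.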

msebor, Jul 18 '22 21:07

As a data point, glibc has opted not to follow the POSIX requirement due to PR #14771.

msebor, Jul 25 '22 15:07

Fixed.

msebor, Aug 15 '22 22:08