ebpf
                                
errors: remove verifier truncation heuristic, rely on ENOSPC
I was getting strange results when re-running prog loads with progressively larger buffers to fit the verifier logs of large Cilium programs. The verifier always writes a NUL byte to the last byte of the log buffer, so if the next-to-last byte happened to be a newline (incredibly common), a truncated log was incorrectly marked as non-truncated.
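To illustrate the failure mode described above, here is a minimal sketch. `looksTruncated` and `fillLog` are hypothetical helpers written for this example, not the library's code; the heuristic is simplified to "the log is complete if the text before the trailing NUL ends in a newline", and `fillLog` only imitates a kernel that fills the buffer and terminates it with NUL.

```go
package main

import (
	"bytes"
	"fmt"
)

// looksTruncated is a hypothetical, simplified version of the newline
// heuristic: a log counts as truncated unless the text before the
// trailing NUL bytes ends in a newline.
func looksTruncated(log []byte) bool {
	log = bytes.TrimRight(log, "\x00")
	return len(log) > 0 && !bytes.HasSuffix(log, []byte("\n"))
}

// fillLog imitates a verifier writing the log `full` into a buffer of
// `size` bytes (size must be at least 1). The last byte is always NUL,
// so at most size-1 bytes of the log survive.
func fillLog(full string, size int) []byte {
	buf := make([]byte, size)
	copy(buf, full)
	buf[size-1] = 0
	return buf
}

func main() {
	first := "0: (b7) r0 = 0\n"
	full := first + "1: (95) exit\nprocessed 2 insns\n"

	// The buffer ends exactly one byte after the first newline, so the
	// log is cut off but the heuristic still sees a trailing newline.
	cutAtNewline := fillLog(full, len(first)+1)
	fmt.Printf("%q truncated=%t\n", cutAtNewline, looksTruncated(cutAtNewline))

	// A cut in the middle of a line is detected as expected.
	cutMidLine := fillLog(full, len(first)+5)
	fmt.Printf("%q truncated=%t\n", cutMidLine, looksTruncated(cutMidLine))
}
```

Because verifier logs are line-oriented, a buffer boundary that lands right after a newline is not a rare corner case, which is why the heuristic cannot be trusted on its own.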
There is no indication that kernels 5.18 and up don't return ENOSPC for BTF loads. This commit removes the newline heuristic to determine log truncation and relies on ENOSPC exclusively for that.
ErrorWithLog now takes an ancillary error from a second BPF load operation that is used to determine truncation status. This second error will not be wrapped into the VerifierError.
@lmb I've preserved the existing tests for the heuristic for now, since I wanted to know what you think of the approach first. We simply cannot rely on checking for one or more newlines. The only other thing we could do is check for "processed" in the last line, but I'm not sure that works for BTF logs.
> There is no indication that kernels 5.18 and up don't return ENOSPC for BTF loads.
Have you actually managed to get ENOSPC from loading BTF? In my experience an EINVAL or similar will override ENOSPC when loading invalid BTF. Maybe you're talking about ENOSPC from valid BTF with a small log?
Agreed re bringing the check for ENOSPC back, but I think we might still need the heuristic for BTF. We could fall back if the logErr is nil, or we could factor it out into a separate function and call that from the BTF loading code.
> Have you actually managed to get ENOSPC from loading BTF? In my experience an EINVAL or similar will override ENOSPC when loading invalid BTF. Maybe you're talking about ENOSPC from valid BTF with a small log?
I see, I misread your "syscall will never return ENOSPC as of 5.18-rc4" comment as "starting with 5.18-rc4...", implying something changed in 5.18. This should probably be ENOSPC until at least 6.0, as this is still the case on master. In btf.c, at the end of btf_parse(), ENOSPC is only returned if the log space check is hit, which requires a successful BTF parse.
I'll think of something else.
> We could fall back if the logErr is nil, or we could factor it out into a separate function and call that from the BTF loading code.
This feels clearer to me. BPF and BTF verifier logs have different styles and their APIs have different semantics.