Allow filter_input to work again
There was a small logic bug in frankenphp.c where if should_filter_var is true, it would never execute the next part of the if-statement, and thus never call sapi_module.input_filter(). This PR simply adjusts the logic to ensure if should_filter_var is true, then call sapi_module.input_filter(), otherwise, only enter the body if should_filter_var is false.
closes #1657
filter_input in combination with $_SERVER and $_ENV is disabled by default since 1.3 if nothing is explicitly set in the php.ini. It makes global registration very inefficient, even though it's probably not used 99% of the time with $_SERVER and $_ENV.
It still works if any filter is defined explicitly in the php.ini (also filter.default=unsafe_raw), it probably makes sense to document this.
with any filter_vars:
without filter_vars:
The slowdown comes from the fact that filter_var allocates the whole $_SERVER array 2 additional times if I remember correctly.
I imagine there will be some kind of optimization coming in PHP 9, since the ini directive is also apparently deprecated.
Hmmm. I was going to look into that. filter.default is deprecated and going away in php 9, so I'm not sure if that's a good idea to rely on it. (And if filter.default is not set, there is still a default filter IIRC).
Yeah the default filter is unsafe_raw, which doesn't do anything without flags (it will just copy $_SERVER in case filter_input(INPUT_SERVER) is called during the request).
The slowdown is probably not worth it, and I'd recommend using filter_var($_SERVER["REQUEST_METHOD"] ?? "" directly if this stays unoptimized in PHP 9. But hard to say, maybe PHP 9 will even have a request object.
Hmmm. So, if you don’t set a default right now, filter_input doesn’t do anything in FrankenPHP? It just returns an empty string? I’m not sure if that’s better or worse for libraries/frameworks that expect it to at least return unsafe-raw if the extension is available.
filter_input is the only way to get around hackers/frameworks/libraries messing with $_SERVER (and other super globals) in memory. It doesn’t access them at all, is immutable, and should return the "raw/filtered" data.
filter_input still works with all globals, only for $_SERVER and $_ENV filter.default has to be explicitly defined in the php.ini, else it will always return NULL.
I think this is definitely the correct default for worker mode, since worker mode necessarily has to register $_SERVER on every request. Production grade $_SERVER arrays often end up being multiple KB in size and there would be no way to avoid those additional allocations, even though all frameworks supporting a worker mode don't make use of filter_input(INPUT_SERVER).
That being said, I would be fine with enabling it in 'regular mode', if you think the performance hit is worth being more compatible with FPM defaults.
As a sidenote:
Searching for $_SERVER on github yields 3.4 million results, while searching for filter_input(INPUT_SERVER only yields 6300. There's also an (unrelated) note of caution in the manual against using it with $_SERVER.
I have strong opinions about neutering built-in extensions just to eek out a little extra performance. If users want greater performance, they can build php with --disable-filter or we should build it without that extension available, so users have to install it themselves.
Otherwise, if the extension is available, we should behave just like any SAPI, even if it hurts performance (or make changes to php-src so that the performance impact is mitigated).
Searching for $_SERVER on github yields 3.4 million results, while searching for filter_input(INPUT_SERVER only yields 6300.
As mentioned earlier, filter_input is the only way to ensure nothing has modified input data (handy in untrusted environments like WordPress plugins). Just because most code is built to run in trusted environments (i.e. an application), doesn’t make the point moot.
There's also an (unrelated) note of caution in the manual against using it with $_SERVER.
That bug was fixed ~15 years ago. However, many frameworks also inject things into $_ENV and $_SERVER, such as .env file loaders, making it somewhat still relevant in those frameworks.
wdyt @dunglas?
I agree with @withinboredom rationale. However it would be nice to fix that in php-src.
I guess I disagree, at least for worker mode.
I think the defaults for worker mode should be fast, performance is the whole point of its existence. Building PHP without --filter isn't really an option, many libraries and applications still rely on the filter extension (it's a good extension).
Only the filter_input(INPUT_SERVER combination is niche enough to not be worth the slower default. I also think this should be optimized in php-src, first the deprecated global filters need to be removed entirely in PHP 9 though.
However it would be nice to fix that in php-src.
@AlliBalliBaba Can you give this branch+fork of php-src + this branch of FrankenPHP a go? (you should be able to apply the diff to any branch of PHP as well).
I think the defaults for worker mode should be fast, performance is the whole point of its existence.
It shouldn't sacrifice correctness for performance. The whole point is to make it faster while still being correct. Otherwise, we could just delete half PHP to make it faster. (edit to add: though I actually think I agree with you -- I was mostly just arguing on principles. Worker mode is already quirky.)
Looks like this branch would at least avoid 1 $_SERVER copy, I'll give it a go 👍
Hard to say from looking at the flamegraphs alone, but I think your branch makes this a bit more efficient. Last time I looked at it, it was actually copying the whole memory twice over, I think your branch only copies it once. Haven't tested it for correctness, but this probably also speeds up all other SAPIs.
I still think it would be fine to break with FPM default behavior in worker mode.
your fork + this branch
main
Should we merge this one?
filter_input is also on the chopping block for deprecations in 8.5 (not yet approved though)
Let's see how that RFC shakes out then. I'm just getting back from a short vacation and have a bajillion emails to read through -- I hope it doesn't get removed, but rather it may be better to replace it with something else.
Looks like none of the filter deprecations passed.