fatfree
Cache Configs?
I like the idea of using config files, especially for routes when they are very long. I also like the idea of using opcode caching like APC, etc.
When I include my routes in the PHP file, they are cached in the opcode cache. When I keep them in a config file, however, there is a file read for every hit.
Since these change infrequently, does it not make sense to provide the option to cache configs? I have some code I am testing in my fork https://github.com/richgoldmd/fatfree-core/commit/fd3315936330530ba55be1fcf1562ef1c8bb286a
basically:
function config($file,$allow=FALSE, $cache_ttl=0) {
    $cache = Cache::instance();
    if (!$cache_ttl || !$cache->exists($hash=$this->hash($file),$matches)) {
        preg_match_all(
            '/(?<=^|\n)(?:'.
                '\[(?<section>.+?)\]|'.
                '(?<lval>[^\h\r\n;].*?)\h*=\h*'.
                '(?<rval>(?:\\\\\h*\r?\n|.+?)*)'.
            ')(?=\r?\n|$)/',
            $this->read($file),
            $matches,PREG_SET_ORDER);
    }
    if ($matches) {
        if ($cache_ttl) {
            $cache->set($hash, $matches, $cache_ttl);
        }
        .....
:+1:
As for the implementation, I'm not sure if the ttl makes sense. We don't care much if the file gets cached for 1h or 1 week, but we do care that it gets refreshed as soon as we change one parameter.
So maybe something with `filemtime`, like in `Preview::render`?
Also, config files that contain dynamic tokens (cf. the `$allow` parameter) can't be cached.
I think we could adopt the same strategy as in `Preview::render`: compile config files into plain PHP files. Once a file is compiled, we would just check that its `filemtime` hasn't changed and simply `require` it.
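For illustration, the compile-to-PHP idea could look roughly like the sketch below, written outside the framework. It rests on several assumptions: `compiled_config()` is a hypothetical helper (not an F3 method), it uses PHP's built-in `parse_ini_file()` instead of F3's own parser, and it ignores dynamic tokens entirely.

```php
<?php
// Hypothetical helper, not part of F3: compile an .ini file into a plain
// PHP file and reuse that file while it is newer than the source.
function compiled_config(string $ini, string $tmpDir): array {
    $compiled = $tmpDir.'/'.md5($ini).'.php';
    // Recompile when no compiled copy exists or it is not newer than the
    // .ini (mtime has 1-second resolution, so "equal" counts as stale)
    if (!is_file($compiled) || filemtime($compiled) <= filemtime($ini)) {
        $data = parse_ini_file($ini, true);
        if ($data === false)
            throw new RuntimeException("Cannot parse $ini");
        file_put_contents(
            $compiled,
            '<?php return '.var_export($data, true).';',
            LOCK_EX
        );
    }
    // A required PHP file is eligible for the opcode cache
    return require $compiled;
}

// Usage with a throwaway config file
$dir = sys_get_temp_dir();
$ini = tempnam($dir, 'cfg');
file_put_contents($ini, "[globals]\nUI = ui/\nTEMP = tmp/\n");
$config = compiled_config($ini, $dir);
```

Note that an opcode cache may itself serve a stale compiled file for a short while, depending on how it is configured to revalidate timestamps.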
I think that makes sense, but if the caching mechanism is the filesystem I think there is no real performance gain (since the parsing is trivial and the bulk of the cycles are spent reading the file, IMO). So unlike `Preview::render`, I would skip the whole thing if there is no caching and maybe also if the cache mechanism is the filesystem.
I think for this to be meaningful, we would cache the last `filemtime` along with `$matches` and compare that to the `filemtime` of the actual config file.
Caching `$matches` still works because dynamic tokens can still be resolved. Converting the whole process to a runnable PHP file seems like overkill.
It remains to be seen if the cache-check/cache-load + stat on a file is more efficient than simply reading the file.
Thoughts?
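A minimal sketch of that mtime check, under stated assumptions: `ArrayCache` is a stand-in for a real cache backend (F3's `Cache` class is not used here), `cached_matches()` is a hypothetical helper, and the regex is a simplified key/value pattern rather than the one from `config()`.

```php
<?php
// Stand-in for a cache backend (assumption: not F3's Cache class)
final class ArrayCache {
    private array $items = [];
    public function get(string $key) { return $this->items[$key] ?? null; }
    public function set(string $key, $value): void { $this->items[$key] = $value; }
}

// Hypothetical helper: reuse parsed matches while the file's mtime is unchanged
function cached_matches(string $file, ArrayCache $cache, callable $parse): array {
    clearstatcache(true, $file);        // do not trust PHP's stat cache
    $mtime = filemtime($file);
    $key = 'config.'.md5($file);
    $entry = $cache->get($key);
    if ($entry !== null && $entry['mtime'] === $mtime)
        return $entry['matches'];       // hit: no file read, no regex
    $matches = $parse(file_get_contents($file));
    $cache->set($key, ['mtime' => $mtime, 'matches' => $matches]);
    return $matches;
}

// Usage: the parser runs only once for two lookups
$file = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($file, "UI = ui/\n");
$calls = 0;
$parse = function (string $text) use (&$calls): array {
    $calls++;
    preg_match_all('/^(?<lval>[^=;\s]+)\h*=\h*(?<rval>.*)$/m',
        $text, $m, PREG_SET_ORDER);
    return $m;
};
$cache = new ArrayCache();
$first = cached_matches($file, $cache, $parse);
$second = cached_matches($file, $cache, $parse);
```

Each lookup still costs one `stat` plus one cache read, which is exactly the trade-off questioned above.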
@richgoldmd if you cache only `$matches`, you just skip the preg_match + file read, which is not the most time-consuming part. I just made a quick test with the definition of 50 variables + 50 routes:
- ini file: ~6 ms
- php file: ~3 ms
- ini file with cached matches: ~6 ms
Actually I'm pretty sure that the most time-consuming task is the call to `$f3->route`. I made another test, where `ROUTES` is cached and restored using a simple `$f3->ROUTES+=$cached['ROUTES']`, and I get an average of 0.3 ms for the same test as above!
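The `+=` in that test is PHP's array union operator. A small sketch of its behaviour, using plain arrays as stand-ins for `$f3->ROUTES` and the cached copy (the route entries are simplified placeholders, not F3's real internal structure):

```php
<?php
// Plain arrays standing in for $f3->ROUTES and the cached copy
$routes = ['/' => 'Home->index'];
$cached = ['ROUTES' => [
    '/'      => 'Stale->handler',   // key already defined: ignored by the union
    '/about' => 'Page->about',
    '/blog'  => 'Blog->index',
]];

// Array union: existing keys on the left win, missing keys are added
$routes += $cached['ROUTES'];
```

The whole routing table is restored in one operation instead of one `route()` call per line, and routes already defined in code are not overwritten.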
@stehlo you're right. The only thing is that it's safer to hand an ini file to a not-so-technical client than to give them index.php.
@xfra35 thanks for running the timings. What cache engine are you using in the test?
Are you using an opcode cache as well?
@richgoldmd APC and yes, it has an opcode cache. This was just a quick test to get a rough order of magnitude.
Well, then @xfra35, you are correct. @stehlo is also correct.
It seems the best compromise if there is a concern about performance at this point (in the absence of rendering out a PHP file that handles the routes as well, which is likely not worth the effort) is to keep the configs, especially the routes, in a PHP file. Opcode caching can be leveraged as well by putting the details in a separate, explicitly `require`d file. This would likely achieve the same goal of keeping the bootstrap file clean, making configs/routes easy to maintain, and leveraging caching to avoid repetitive file reads.
Thanks to you both for the input.
-Rich
Hey guys, before we close the topic, I've made a quick benchmark to find out what could significantly speed up the configuration phase.
In short, I've compared the loading times of:
- 1 config file without caching
- 2 smaller config files (globals + routes) with caching of routes (easier to cache than globals)
- 1 config file with caching (routes + globals)
Each test is performed twice: the first one against an .ini file, and the second one against a .php file. Config data contains 50 vars + 50 routes.
Here are the results (averages of 20 runs):
case | format | APC + op. cache | File cache |
---|---|---|---|
1 file not cached | .ini | 6.3 ms | 6.4 ms |
1 file not cached | .php | 4.6 ms | 4.7 ms |
1 file cached + 1 not cached | .ini | 5.3 ms | 5.2 ms |
1 file cached + 1 not cached | .php | 3.4 ms | 4.0 ms |
1 file cached | .ini | 1.2 ms | 0.7 ms |
1 file cached | .php | 1.1 ms | 0.7 ms |
Looks like the initial suggestion of caching config files makes total sense.
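For anyone wanting to reproduce this kind of measurement, a timing harness could be sketched as follows. This is not the script used for the table above: `average_ms()` is a hypothetical helper, and the workload is PHP's built-in `parse_ini_string()` on generated data rather than `$f3->config()`.

```php
<?php
// Hypothetical helper: average wall-clock time of $fn over $runs runs, in ms
function average_ms(callable $fn, int $runs = 20): float {
    $total = 0;
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);          // monotonic clock, nanoseconds
        $fn();
        $total += hrtime(true) - $start;
    }
    return $total / $runs / 1e6;        // ns -> ms
}

// Generated stand-in data: 50 variables in a [globals] section
$ini = "[globals]\n".implode("\n", array_map(
    fn($i) => "VAR$i = value$i", range(1, 50)));

$ms = average_ms(fn() => parse_ini_string($ini, true));
printf("%.3f ms\n", $ms);
```

Absolute numbers from a harness like this depend heavily on the machine and on whether an opcode cache is active, so only comparisons made on the same host are meaningful.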
@xfra35 Can you share how you cached the configs? I imagine from our discussion you cached more than just `$matches`?
Didn't see the paste link. Got it!
OK, that's interesting. To be clear, "File cache" is caching the files in APC when you set `$f3->CACHE=TRUE`, and the "APC + op. cache" column has `$f3->CACHE=FALSE`, so no caching is happening there, just opcode caching.
Sorry I should have clarified the 2 columns:
- APC + op. cache: test run on a VPS with APC + opcode cache enabled
- File cache: test run locally with no in-memory cache (thus file cache is used)
So that we're comparing apples to apples, can you run it on your VPS with `$f3->CACHE=FALSE` and also with the `folder` DSN to force file caching?
Why do you say so? We are comparing the standard `config()` command (which doesn't cache anything even when `CACHE=TRUE`) to semi and fully cache-aware `config()` in a cached environment.
The reason why I've run a 2nd test on a local machine after the VPS test was to make sure that the results were not biased by the opcode cache. And also to check if the performance gain would be as obvious with a file cache. It turns out to be similar.
What would a test with `CACHE=FALSE` bring?
Was it 6 ms without any caching at all (the current way of v3.5.0)?
You're correct, `CACHE=FALSE` is redundant. (@ikkez - case 1 is the default framework mechanism.) However, I am concerned that your local machine with file caching performed better than your VPS with opcode + memory caching - it would be useful to compare it on the same server with file-based caching.
> I am concerned that your local machine with file caching performed better than your VPS with OPcode + memory caching

That surprised me too ^^. On the other hand, that's the cheapest VPS I could find. I'll try on a faster one when I have time. You're free to run the script on your side too.
@ikkez without caching of config settings (but with `CACHE=TRUE`)
Is the ini file you used something you can share?
Sure: here it is.
I just love this hair-splitting exercise, and I am elated that the community is still looking for ways to make this speedy framework faster - things that I never got around to (nor bothered to) refine further. @xfra35 is right. Routes take up the bulk of the parsing time. Perhaps an `array_diff_assoc()` on `$this->hive['ROUTES']` within the `config()` method can help?
More hair-splitting!
Here is data from an M3.Large server on AWS with locally installed Redis vs. file caching:
Mode | FileCache | Redis | FileCache (excl. cache setup) | Redis (excl. cache setup) |
---|---|---|---|---|
1 | 4.58 | 7.29 | | |
2 | 3.04 | 5.75 | | |
3 | 3.64 | 6.57 | 3.44 | 6.11 |
4 | 2.65 | 5.48 | 2.39 | 5.05 |
5 | 0.73 | 0.80 | 0.49 | 0.43 |
6 | 0.65 | 1.08 | 0.40 | 0.43 |
(Values are msec). APC opcache active. Caching where used is with redis.
The last two columns are timed from after the cache value is set, so they discount the overhead of connecting to the cache (which needs to be distributed over all caching operations, not just configs). Interesting that the cache setup overhead is more than the file access.
After looking more into it, it turns out that most of the time is spent in `$f3->set()`. Therefore the big gap observed between 3-4 and 5-6 in the previous test is mostly due to the fact that we replace 50 calls to `$f3->set()` by 2.
I've made another test where we keep the 50 calls and the performance gain is smaller:
case | format | VPS (APC) | Local (File) |
---|---|---|---|
1 file not cached | .ini | 5.8 ms | 5.7 ms |
1 file not cached | .php | 3.5 ms | 4.4 ms |
1 file cached + 1 not cached | .ini | 5.2 ms | 4.6 ms |
1 file cached + 1 not cached | .php | 2.9 ms | 3.6 ms |
1 file cached | .ini | 2.4 ms | 3.6 ms |
1 file cached | .php | 2.4 ms | 3.6 ms |
Also I've added 2 more cases to compare `$f3->set()` and `$f3->route()`:
case | VPS | Local |
---|---|---|
50x set | 2.2 ms | 3.2 ms |
50x route | 1.2 ms | 0.8 ms |
Looks like if you want to speed things up a bit, you should first have a look at the `set()` method.
When cache is enabled, this line takes most of the time spent in `set()`. If we skip the `$cache->exists`, we get:
case | format | VPS (APC) | Local (File) |
---|---|---|---|
1 file not cached | .ini | 5.6 ms | 3.3 ms |
1 file not cached | .php | 2.6 ms | 2.1 ms |
1 file cached + 1 not cached | .ini | 4.1 ms | 2.2 ms |
1 file cached + 1 not cached | .php | 2.2 ms | 1.4 ms |
1 file cached | .ini | 1.9 ms | 1.2 ms |
1 file cached | .php | 1.7 ms | 1.2 ms |
case | VPS | Local |
---|---|---|
50x set | 1.6 ms | 1.0 ms |
50x route | 1.0 ms | 0.8 ms |
Please draw some conclusions.. I'm a bit confused about what we're trying to solve here ^^
The original question was whether there is a performance hit when using configs because the contents don't benefit from the opcode cache and instead incur another file read. I think we've demonstrated that that is true, but that the caching of configs is complicated, and much of the cost is in the parsing - more so than in the retrieval of the configs from the file or the cache.
I think the horse is dead. To properly realize the performance gains, the routes and globals need to be handled separately and the implementation is application specific (e.g. are there multiple route config files? Are the globals prefixed? What about maps?)
For my own part, I think I'll separate route configs, and cache routes per @xfra35's prototype and @bcosca's suggestion regarding `array_diff_assoc()` in the bootstrap code.
Thanks for weighing in.
Why don't we simply cache the whole HIVE (or its changes) after the config was parsed and processed, and restore that var from cache as long as the config file's modified time has not changed? That reduces all config actions to one set call.
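A rough sketch of the "cache the changes" variant, with plain arrays standing in for the hive. `hive_changes()` is a hypothetical helper, and the hive keys and route value are simplified placeholders. A strict per-key comparison is used here instead of `array_diff_assoc()`, because the latter compares values as strings and does not handle nested arrays such as `ROUTES` well.

```php
<?php
// Hypothetical helper: keys added or changed between two hive snapshots
function hive_changes(array $before, array $after): array {
    $changes = [];
    foreach ($after as $key => $value)
        if (!array_key_exists($key, $before) || $before[$key] !== $value)
            $changes[$key] = $value;
    return $changes;
}

$hive = ['DEBUG' => 0, 'UI' => './', 'ROUTES' => []];
$before = $hive;

// Stand-in for the config phase modifying the hive
$hive['DEBUG'] = 3;
$hive['ROUTES']['/'] = 'Home->index';

// This diff is what would be cached together with the config file's mtime
$changes = hive_changes($before, $hive);

// On a later request, one merge restores the configured state
$restored = array_replace($before, $changes);
```

Keys removed during the config phase are not tracked by this sketch, and values that cannot be serialized (closures, for instance) would need separate handling before being written to a cache.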
Just as a little follow-up:
> xfra35: When cache is enabled, this line takes most of the time spent in `set()`

Indeed, this has changed in v3.6 now, so maybe the issue isn't that big anymore.
So @ikkez asked if I was bored enough to look into this. I came up with a slightly different solution that did have a good impact on performance, even with a small config file. I think there are still some things to improve, but the bulk of it is here.
// updated config method
/**
 * Configure framework according to .ini-style file settings;
 * If optional 2nd arg is provided, template strings are interpreted
 * @return object
 * @param $source string|array
 * @param $allow bool
 **/
function config($source,$allow=FALSE,$config_ttl=0) {
    if (is_string($source))
        $source=$this->split($source);
    if ($allow)
        $preview=Preview::instance();
    $is_caching_enabled = $config_ttl !== 0;
    $has_routes = false;
    foreach ($source as $file) {
        // pretty much this if statement
        if($is_caching_enabled) {
            $Cache = Cache::instance();
            $config_array = [];
            // other cache keys could be implemented to account for unique template vars if $allow = true
            $cache_key = $this->hash($file).'.ini';
            $Cache->exists($cache_key, $config_array);
            if(!empty($config_array)) {
                $this->mset($config_array);
                continue;
            }
        }
        preg_match_all(
            '/(?<=^|\n)(?:'.
                '\[(?<section>.+?)\]|'.
                '(?<lval>[^\h\r\n;].*?)\h*=\h*'.
                '(?<rval>(?:\\\\\h*\r?\n|.+?)*)'.
            ')(?=\r?\n|$)/',
            $this->read($file),
            $matches,PREG_SET_ORDER);
        $lvals = [];
        if ($matches) {
            $sec='globals';
            $cmd=[];
            foreach ($matches as $match) {
                if ($match['section']) {
                    $sec=$match['section'];
                    if (preg_match(
                        '/^(?!(?:global|config|route|map|redirect)s\b)'.
                        '(.*?)(?:\s*[:>])/i',$sec,$msec) &&
                        !$this->exists($msec[1]))
                        $this->set($msec[1],NULL);
                    preg_match('/^(config|route|map|redirect)s\b|'.
                        '^(.+?)\s*\>\s*(.*)/i',$sec,$cmd);
                    continue;
                }
                if ($allow)
                    foreach (['lval','rval'] as $ndx)
                        $match[$ndx]=$preview->
                            resolve($match[$ndx],NULL,0,FALSE,FALSE);
                if (!empty($cmd)) {
                    isset($cmd[3])?
                    $this->call($cmd[3],
                        [$match['lval'],$match['rval'],$cmd[2]]):
                    call_user_func_array(
                        [$this,$cmd[1]],
                        array_merge([$match['lval']],
                            str_getcsv($cmd[1]=='config'?
                            $this->cast($match['rval']):
                                $match['rval']))
                    );
                    // and this one
                    if($is_caching_enabled && $cmd[0] === 'routes') {
                        $has_routes = true;
                    }
                }
                else {
                    $rval=preg_replace(
                        '/\\\\\h*(\r?\n)/','\1',$match['rval']);
                    $ttl=NULL;
                    if (preg_match('/^(.+)\|\h*(\d+)$/',$rval,$tmp)) {
                        array_shift($tmp);
                        list($rval,$ttl)=$tmp;
                    }
                    $args=array_map(
                        function($val) {
                            $val=$this->cast($val);
                            if (is_string($val))
                                $val=strlen($val)?
                                    preg_replace('/\\\\"/','"',$val):
                                    NULL;
                            return $val;
                        },
                        // Mark quoted strings with 0x00 whitespace
                        str_getcsv(preg_replace(
                            '/(?<!\\\\)(")(.*?)\1/',
                            "\\1\x00\\2\\1",trim($rval)))
                    );
                    preg_match('/^(?<section>[^:]+)(?:\:(?<func>.+))?/',
                        $sec,$parts);
                    $func=isset($parts['func'])?$parts['func']:NULL;
                    $custom=(strtolower($parts['section'])!='globals');
                    if ($func)
                        $args=[$this->call($func,$args)];
                    if (count($args)>1)
                        $args=[$args];
                    if (isset($ttl))
                        $args=array_merge($args,[$ttl]);
                    call_user_func_array(
                        [$this,'set'],
                        array_merge(
                            [
                                ($custom?($parts['section'].'.'):'').
                                $match['lval']
                            ],
                            $args
                        )
                    );
                    if($is_caching_enabled) {
                        $lvals[] = $match['lval'];
                    }
                }
            }
        }
        // and this one
        if($is_caching_enabled) {
            $config_array = $this->hive();
            $config_array = array_intersect_key($config_array, array_flip($lvals));
            if($has_routes) {
                $config_array['ROUTES'] = $this->get('ROUTES');
            }
            $Cache->set($cache_key, $config_array, $config_ttl);
        }
    }
    return $this;
}
This is the example config file I had:
[globals]
CACHE = true
UI = ui/
TEMP = tmp/
ENVIRONMENT = DEVELOPMENT
DEBUG = {{ @ENVIRONMENT === 'DEVELOPMENT' ? 3 : 0}}
LOG = log/
[routes]
GET / = Test_Controller->testEndpoint
One hiccup I did run into was that the `CACHE` hive variable had to be set before `$f3->config()` was called. I'm sure we could easily work around this though.
All results were run with ab -n 10000 -c 500 http://localhost:8000/
The results with no caching enabled: https://fpaste.me/aH3bAIVf6K
Percentage of the requests served within a certain time (ms)
50% 926
66% 944
75% 952
80% 957
90% 965
95% 980
98% 1002
99% 1006
100% 1009 (longest request)
The results with caching enabled: https://fpaste.me/U2CO5VfLAM
Percentage of the requests served within a certain time (ms)
50% 519
66% 529
75% 539
80% 548
90% 556
95% 590
98% 633
99% 652
100% 656 (longest request)
The results with caching and igbinary enabled: https://fpaste.me/HjuqnAoMd6
Percentage of the requests served within a certain time (ms)
50% 500
66% 507
75% 517
80% 522
90% 528
95% 544
98% 559
99% 561
100% 565 (longest request)
Like others have mentioned (and as I've tested on my own), caching the routes brings the biggest performance gain. https://fpaste.me/GjoLv0hzTA shows how much extra processing time goes into just route processing.
From Slack Convo Aug 13th:
Very similar results if I change it from a closure to a Class->method route. If I take the routes and simply cache `$fw->ROUTES` at the end, then comment out all the routes and call back the cache for `$fw->ROUTES`, I get a nice little boost from about 350-400ms average request to 230-270ms average request. I get very similar results if I don't use the Cache class, but instead just use `file_get_contents` and `file_put_contents` of `$fw->ROUTES` being serialized. I bet it'd be a little better with igbinary or msgpack. Fancy..... with igbinary, that drops down to about 185-200ms per request. So with just a couple tweaks, you can cut the request time in half.
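The plain-file variant mentioned in that conversation could be sketched like this. `save_routes()` and `load_routes()` are hypothetical helpers, the cache path is arbitrary, and the routes array is a simplified placeholder rather than the real content of `$fw->ROUTES`.

```php
<?php
// Hypothetical helper: persist a routes array with plain file functions
function save_routes(string $path, array $routes): void {
    file_put_contents($path, serialize($routes), LOCK_EX);
}

// Hypothetical helper: returns NULL when nothing usable is cached
function load_routes(string $path): ?array {
    if (!is_file($path))
        return null;
    // allowed_classes=false: never instantiate objects from the cache file
    $routes = unserialize(file_get_contents($path),
        ['allowed_classes' => false]);
    return is_array($routes) ? $routes : null;
}

// Usage with a placeholder routes array
$path = sys_get_temp_dir().'/routes.'.getmypid().'.cache';
$routes = ['/' => ['GET' => 'Test_Controller->testEndpoint']];
save_routes($path, $routes);
$loaded = load_routes($path);
```

Routes whose handlers are closures cannot be stored this way, since PHP refuses to serialize `Closure` objects. With the igbinary extension installed, `igbinary_serialize()` and `igbinary_unserialize()` can be swapped in for the built-in pair.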