quickstart-bitnami-wordpress
Bad performance of EFS might be fixed with cachefilesd
EFS performs very badly with WordPress due to the high amount of file access and the latency involved, and this is widely reported. It might be fixed with cachefilesd and the fsc option in fstab, which I have been testing. How can this be set up/configured when provisioning new instances? I reckon it should be part of the image.
Is it safe to use cloud-init to run commands as a part of provisioning?
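For reference, this is roughly what I have been testing by hand; a minimal user-data sketch only (assumptions: a Debian/Ubuntu based image, a placeholder EFS ID and region, and /mnt/efs as the mount point):
#!/bin/bash
# Hypothetical provisioning sketch, not part of the Bitnami image.
# Placeholders: fs-XXXXXXXX (EFS ID), us-east-1 (region), /mnt/efs (mount point).

# Install the FS-Cache userspace daemon and the NFS client tools.
apt-get update -y
apt-get install -y cachefilesd nfs-common

# On Debian/Ubuntu the daemon refuses to start until RUN=yes is set.
echo "RUN=yes" >> /etc/default/cachefilesd
systemctl enable --now cachefilesd

# Mount EFS over NFSv4.1 with the fsc option so reads go through FS-Cache.
mkdir -p /mnt/efs
echo "fs-XXXXXXXX.efs.us-east-1.amazonaws.com:/ /mnt/efs nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noatime,fsc 0 0" >> /etc/fstab
mount /mnt/efs
In this sketch cachefilesd is started before the fsc-flagged mount; whether running this from cloud-init during provisioning is safe is exactly the question above.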
Hi @adionditsak, thanks for sharing this. I am interested in your findings; could you share your testing and your configuration? It is true that NFS can be slow when accessing files, but once PHP has cached the compiled file this should not be an issue. I would like to know whether this is specific to your use case, or whether it will hit every use case.
Answering your question: it should be safe. According to the cloud-init documentation, the script only runs at boot time, so it should not interfere with the rest of the services.
My concern is whether cachefilesd needs the NFS filesystem to be mounted already, and whether the mounted filesystems are available at that stage.
Thanks in advance, Rafael Rios Saavedra
Hi @adionditsak, We have enabled opcache by default, and I have checked that it is working.
You can check whether opcache is enabled by creating a file info.php with the following code:
<?php
phpinfo();
?>
and then accessing the URL http://your_ip_or_domain/info.php. Look for a section called Zend OPcache. This is an example of what I got:
Opcode Caching | Up and Running
-- | --
Optimization | Enabled
SHM Cache | Enabled
File Cache | Disabled
Startup | OK
Shared memory model | mmap
Cache hits | 198464
Cache misses | 274
Used memory | 49994192
Free memory | 150485480
Wasted memory | 846920
Interned Strings Used memory | 1850792
Interned Strings Free memory | 14926424
Cached scripts | 266
Cached keys | 281
Max keys | 7963
OOM restarts | 0
Hash keys restarts | 0
Manual restarts | 0
As you can see, I got almost 200k cache hits after running a benchmark test with ab:
ab -t 60 -c 10 http://your_ip_or_domain/
So, PHP should not be accessing the file system for each of your requests.
Could you check that you have opcache enabled and whether you see similar behavior?
Best regards, Rafael Rios Saavedra
Hi @rafariossaa,
Thanks for your responses.
opcache is working as expected, so this is not the problem. I suspect plugins or something similar. I even experimented with a CDN, memcached and cachefilesd. I will investigate further in a few days and get back to you.
What is the expected average response time with your stack? After the migration, the application's TTFB went from 800ms to 4-6 seconds. I will see what I can do, and if necessary you should eventually expand the documentation to include optional cache layers.
Hi @adionditsak, I would suggest you have a look at the load on the database. Sometimes many queries are needed to render one page. There are many factors that affect the average response time: the load on the servers, the load on AWS, the load on the database, the size of the servers and the database, whether a backup is running, the theme you are using, the plugins, etc. I am looking forward to your results.
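If it helps, a quick way to see whether the database is part of the problem could be to enable the slow query log while loading a few pages (just a sketch; the credentials, log path and the 0.5s threshold are placeholders, and on RDS/Aurora this would be done through a parameter group instead):
# Hypothetical sketch: turn on the MySQL/MariaDB slow query log at runtime.
mysql -u root -p -e "
  SET GLOBAL slow_query_log = 'ON';
  SET GLOBAL long_query_time = 0.5;
  SET GLOBAL slow_query_log_file = '/tmp/mysql-slow.log';"

# Load some pages, then check which queries are slow and how many there are.
sudo tail -f /tmp/mysql-slow.log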
Thanks in advance, Rafael Rios Saavedra
Hi @adionditsak, I am going to perform a study on the performance, and I would appreciate it if you could give me some details about the WordPress you are running.
I would like to know:
- specs of the deployment you have done (number and type of nodes, region, whether you are using memcached, etc ...)
- whether the WP you tested is the default one, or whether you migrated a WP from somewhere else. In that case, I would also like to know the size of the WordPress site, the size of the database, and the plugins in use.
Thanks in advance, Rafael Rios Saavedra
@adionditsak @rafariossaa - We're also interested in data you can provide around EFS performance.
Hi @adionditsak and @agamzn, I did some performance analysis on the impact of EFS in this solution and found that it doesn't have an impact. Measuring this is complex because there are many variables involved, like the size of the site, the number of requests per second, the amount of persisted files (media files), the plugins used, the JS used, etc. So in order to have a baseline I used a WordPress with the default theme and generated some random content. I did not use any plugins, nor anything that caches requests (e.g. memcached, W3 Total Cache) or helps with serving (e.g. a CDN). This is to measure raw power; the only things helping are the Linux filesystem cache and opcache (in PHP).
In this scenario, I simulated 100 concurrent users loading the page. I tested using Max I/O and provisioned throughput, and it made no difference in how the page was served. In our solution, the core of WP is on a local filesystem, and only the persisted data (media, installed plugins, etc.) is retrieved from EFS. PHP files, once compiled by php-fpm, are served very quickly, and I got cache hits in the order of 99.36%, so very little is done by filesystem access. What I found is that performance is very sensitive to what happens on the client side, and most of that is theme related.
I hope this sheds some light. Anyway, I am still interested in what is happening in your case. Please, could you give me more information about your scenario?
Best regards, Rafael Rios Saavedra.
Hi everyone, I have been testing a similar environment without luck; EFS is definitely the problem.
Here is the thing: you probably don't want HA with auto scaling for small WordPress sites, but for big ones. That means lots of plugins, lots of assets and probably logged-in users (which makes caching a nightmare).
So in that scenario each request would access tons of files and hit EFS hard. EFS's latency will make the site unusable. So, we can't have WP residing on EFS...
I will try cachefilesd (I will check https://blog.lawrencemcdaniel.com/tuning-aws-efs-for-wordpress/)... I hope to share some info next week!
Best regards, Mauricio
@rusowyler - I'm circling back on this. Did cachefilesd work?
@andrew-glenn I have tested this, but it really has no radical impact. I am still interested in hearing from others, as it might be due to the configuration.
Hi everyone, I have been testing in ap-northeast-2. Very slow performance compared to us-east-2.
Of course I used opcache.
It's the EFS problem.
I have just switched from multi-node EC2 on NFS to a single node on EBS and my website is running 4x faster. Don't waste time with the Bitnami HA solution.
I am having the same problem here and it is a challenge. I also switched to EBS and it works perfectly; I would say more than 10x the speed. However, I don't have a choice, I have to run HA WordPress. Is there a way we can extract the media directory, and any other directory that requires sharing, into a separate location? That would at least give us the possibility of mounting it in a shared location with RWX access while still allowing the site to run on EBS. With this, only access to those files would have to hit the shared location. It is a terrible experience to just sit and wait for a site to load.
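Something like the following is what I had in mind (just a sketch; the paths and the EFS mount point are assumptions, and code/plugin files would still need their own sync mechanism):
# Hypothetical sketch: keep WordPress core/plugins/themes on local EBS and
# share only the media uploads over EFS.
# Assumes EFS is already mounted at /mnt/efs and WP lives in /var/www/html.
sudo mkdir -p /mnt/efs/wp-uploads

# One-time copy of the existing media to the shared location.
sudo rsync -a /var/www/html/wp-content/uploads/ /mnt/efs/wp-uploads/

# Bind-mount the shared directory over the local uploads path
# (a symlink also works, but some plugins dislike symlinked uploads).
sudo mv /var/www/html/wp-content/uploads /var/www/html/wp-content/uploads.local
sudo mkdir /var/www/html/wp-content/uploads
echo "/mnt/efs/wp-uploads /var/www/html/wp-content/uploads none bind 0 0" | sudo tee -a /etc/fstab
sudo mount /var/www/html/wp-content/uploads
With this, only media reads and writes hit EFS, while the PHP files stay on fast local storage on every node.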
Now that it is possible to provision higher throughput with EFS without uploading 100GB of dummy files, has anyone tried that?
Is EFS for wordpress helping or not?
From everything I've read, the main issue is speed, even with file caching, etc. At re:Invent 2019, Amazon announced that it would increase the base performance by 5x and solve the burst credit issues that everyone is complaining about. We'll have to see how the new setup performs in the real world, but it could be good enough for hosting WordPress on EFS.
Can someone try benchmarking it again? It might be "fixed" now: https://aws.amazon.com/de/about-aws/whats-new/2020/04/amazon-elastic-file-system-announces-increase-in-read-operations-for-general-purpose-file-systems/
FYI - multi-attach EBS is now available for Provisioned IOPS EBS volumes. This might be a decent solution to replace EFS. https://aws.amazon.com/about-aws/whats-new/2020/02/ebs-multi-attach-available-provisioned-iops-ssd-volumes/
Has anyone benchmarked the new EFS performance or tried multi-attach EBS with WordPress?
I am also interested in hosting WordPress on AWS, but I have read many posts online about the terrible filesystem performance.
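If anyone wants a raw number outside of WordPress itself, a small random-read micro-benchmark against the EFS mount versus local storage could be enough to see the latency gap (sketch only; assumes fio is installed, EFS is mounted at /mnt/efs, and /var/tmp sits on EBS):
# Hypothetical sketch: 4k random reads as a rough proxy for the many small
# file reads and stat() calls WordPress issues per request.
sudo apt-get install -y fio

fio --name=efs-randread --directory=/mnt/efs --rw=randread --bs=4k \
    --size=256M --numjobs=4 --iodepth=16 --ioengine=libaio --direct=1 \
    --runtime=60 --time_based --group_reporting

fio --name=ebs-randread --directory=/var/tmp --rw=randread --bs=4k \
    --size=256M --numjobs=4 --iodepth=16 --ioengine=libaio --direct=1 \
    --runtime=60 --time_based --group_reporting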
@collimarco , I ran tests with EFS burst and provisioned throughput. Here are my conclusions (for my use case, a large WP site):
- PHP page generation goes from ~300ms to ~1200ms just by going from EBS to EFS, changing nothing else (yes, NFS caching and opcode caching were ON).
- Burst throughput can go high, but not for very long
- Provisioning higher than ~20MBps EFS didn't yield higher performance, probably because WP is not an app that can create a lot of concurrency needed for EFS to scale.
- Copying data using standard commands like cp or rsync is excruciatingly slow. Again, it scales if you can create hundreds of threads to copy stuff.
So, "it can work", but I didn't find it to be convenient, and I'm not against the wall where I have to scale horizontally.
A friend of mine mentioned that it is possible to cache NFS files using Memcache, but I haven't looked at it.
multi-attach EBS doesn't work with EXT4, and only runs on the Nitro instances, so not 100% sure how practical it is, but I'd love to know!
@hubertnguyen Shouldn't opcode caching alleviate any bottleneck from the filesystem in use? Warming up the opcode cache can be dead slow, but once things are cached, it does very little I/O. At least that's the idea behind all these approaches, i.e. cache the PHP files in some way or another on the server itself.
Can you confirm that, with opcode caching, you turned off the stat functionality that checks each file for modification on every request before simply serving it from the cache?
What do you think is causing the EFS to be a bottleneck here?
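For reference, these are the directives I mean (sketch only; the ini path and the php-fpm service name depend on the stack and are assumptions here):
# Hypothetical sketch: stop opcache from stat()ing every cached script on
# each request, so cached code never touches NFS again until a reset/reload.
cat <<'EOF' | sudo tee /etc/php/7.2/fpm/conf.d/90-opcache-no-stat.ini
opcache.enable=1
; 0 = never check file timestamps; code changes need an explicit reset/reload
opcache.validate_timestamps=0
; ignored when validate_timestamps=0
opcache.revalidate_freq=0
EOF
sudo systemctl reload php7.2-fpm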
@ashfame if you turn off file checks completely I guess that you break core / plugin updates... is there any solution?
@ashfame , on paper it seems like "it should", and once you run traffic, most of the needed files should be warmed up after page view #1 (WP core, your template, etc.). I can confirm opcode caching and NFS file caching were "on" -- I don't remember specifically disabling the per-request checking though, so if you find out, let me know.
@collimarco , the default TTL is ~60sec for opcode caching. As you point out, it's probably not a great idea to significantly extend it. I was worried about the same thing.
I'm not sure what the ~900ms gap is. Unfortunately, during my research I have not found an example where people ran with it without some kind of significant trade-off.
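On the point above about breaking core/plugin updates: one possible workaround (untested by me; the file name and token are made up) is to keep validate_timestamps=0 and reset the cache explicitly after every update. Running opcache_reset() from the CLI only clears the CLI's own cache, so the reset has to happen inside php-fpm, e.g. via a tiny protected endpoint:
# Hypothetical sketch: drop a flush endpoint into the web root and call it
# over HTTP after core/plugin updates.
cat <<'EOF' | sudo tee /var/www/html/flush-opcache.php
<?php
// Require a shared secret so the endpoint cannot be hit by just anyone.
if (($_GET['token'] ?? '') !== 'change-me') {
    http_response_code(403);
    exit;
}
// Runs inside php-fpm, so this clears the cache that web traffic uses.
var_dump(opcache_reset());
EOF

# After running an update:
curl -s "https://example.com/flush-opcache.php?token=change-me"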
Update: Checking the Opcache statistics showed that 16MB of memory wasn't enough to hold all the scripts. Bumping it to 32MB shows much better results, which are at the bottom.
Cache hits | 466415
Cache misses | 306
Used memory | 25762104
Free memory | 7792328
I'm running a very small WordPress installation (four plugins, 1.2GB of content, a few hundred megs of DB). I decided to break down each performance improvement I had in place.
The TL;DR for the below: it seems a t3.micro is a bad idea for a reasonably busy website? Performance goes down as more performance features are enabled. The usual response times I get after each upgrade are around 300-400ms; I've added them at the bottom. It's the very first ab run before all this experimenting, which may accidentally have had Opcache completely disabled. 😆
Test machine:
- t3.nano in ca-central-1
- With Opcache enabled 232MB of RAM active
- 1GB of swap, ~100MB active
- Ubuntu 18.04.4 LTS
- Many many retests happened, T3 credits dipped from 144 (max) to 124
I didn't reboot the machine, but I did rebuild my docker-compose setup between each of the runs below. Every test had an identical ab call between them. Also, I didn't remove the fsc argument from fstab; instead I chose to disable the cachefilesd daemon.
Starting with everything off:
me@web-01:~/web-service$ cat docker-compose/php-fpm/opcache.ini
; From https://laravel-news.com/php-opcache-docker
;
[opcache]
opcache.enable=0
; 0 means it will check on every request
; 0 is irrelevant if opcache.validate_timestamps=0 which is desirable in production
opcache.revalidate_freq=600
opcache.validate_timestamps=1
opcache.max_accelerated_files=983
opcache.memory_consumption=16
opcache.max_wasted_percentage=10
opcache.interned_strings_buffer=8
me@web-01:~/web-service$ ab -n 1000 -c 20 https://example.com/index.php
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking example.com (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software: nginx
Server Hostname: example.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name: example.com
Document Path: /index.php
Document Length: 0 bytes
Concurrency Level: 20
Time taken for tests: 234.205 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 217000 bytes
HTML transferred: 0 bytes
Requests per second: 4.27 [#/sec] (mean)
Time per request: 4684.098 [ms] (mean)
Time per request: 234.205 [ms] (mean, across all concurrent requests)
Transfer rate: 0.90 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 3 9.0 2 199
Processing: 963 4648 5207.1 3752 48704
Waiting: 963 4648 5207.1 3751 48704
Total: 985 4651 5210.2 3754 48708
Percentage of the requests served within a certain time (ms)
50% 3754
66% 3779
75% 3797
80% 3807
90% 3840
95% 3873
98% 27081
99% 38119
100% 48708 (longest request)
Now with Opcache enabled but revalidate_freq set to zero:
; From https://laravel-news.com/php-opcache-docker
;
[opcache]
opcache.enable=1
; 0 means it will check on every request
; 0 is irrelevant if opcache.validate_timestamps=0 which is desirable in production
opcache.revalidate_freq=0
opcache.validate_timestamps=1
opcache.max_accelerated_files=983
opcache.memory_consumption=16
opcache.max_wasted_percentage=10
opcache.interned_strings_buffer=8
me@web-01:~/web-service$ ab -n 1000 -c 20 https://example.com/index.php
... removing the repeated info ...
Concurrency Level: 20
Time taken for tests: 94.065 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 217000 bytes
HTML transferred: 0 bytes
Requests per second: 10.63 [#/sec] (mean)
Time per request: 1881.296 [ms] (mean)
Time per request: 94.065 [ms] (mean, across all concurrent requests)
Transfer rate: 2.25 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 3 3.7 3 30
Processing: 489 1859 120.5 1859 2536
Waiting: 489 1859 120.5 1859 2536
Total: 515 1862 119.5 1862 2565
Percentage of the requests served within a certain time (ms)
50% 1862
66% 1882
75% 1894
80% 1903
90% 1927
95% 1966
98% 2084
99% 2187
100% 2565 (longest request)
Now with revalidate_freq set to ten minutes:
me@web-01:~/web-service$ cat docker-compose/php-fpm/opcache.ini
; From https://laravel-news.com/php-opcache-docker
;
[opcache]
opcache.enable=1
; 0 means it will check on every request
; 0 is irrelevant if opcache.validate_timestamps=0 which is desirable in production
opcache.revalidate_freq=600
opcache.validate_timestamps=1
opcache.max_accelerated_files=983
opcache.memory_consumption=16
opcache.max_wasted_percentage=10
opcache.interned_strings_buffer=8
me@web-01:~/web-service$ ab -n 1000 -c 20 https://example.com/index.php
... removing the repeated info ...
Concurrency Level: 20
Time taken for tests: 92.347 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 217000 bytes
HTML transferred: 0 bytes
Requests per second: 10.83 [#/sec] (mean)
Time per request: 1846.940 [ms] (mean)
Time per request: 92.347 [ms] (mean, across all concurrent requests)
Transfer rate: 2.29 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 3 3.6 3 31
Processing: 464 1829 114.7 1832 2312
Waiting: 464 1829 114.7 1832 2311
Total: 478 1832 113.6 1835 2343
Percentage of the requests served within a certain time (ms)
50% 1835
66% 1854
75% 1867
80% 1876
90% 1902
95% 1929
98% 1939
99% 1981
100% 2343 (longest request)
Now turning the cachefilesd daemon back on:
me@web-01:~/web-service$ sudo service cachefilesd start
me@web-01:~/web-service$ ab -n 1000 -c 20 https://example.com/index.php
... removing the repeated info ...
Concurrency Level: 20
Time taken for tests: 94.687 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 217000 bytes
HTML transferred: 0 bytes
Requests per second: 10.56 [#/sec] (mean)
Time per request: 1893.749 [ms] (mean)
Time per request: 94.687 [ms] (mean, across all concurrent requests)
Transfer rate: 2.24 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 3 2.7 3 28
Processing: 505 1871 111.8 1874 2368
Waiting: 505 1871 111.8 1874 2368
Total: 521 1874 110.9 1876 2392
Percentage of the requests served within a certain time (ms)
50% 1876
66% 1895
75% 1906
80% 1912
90% 1929
95% 1950
98% 1975
99% 2019
100% 2392 (longest request)
Now turning off access time in fstab:
me@web-01:~/web-service$ cat /etc/fstab
LABEL=cloudimg-rootfs / ext4 defaults,discard 0 0
/var/swap swap swap defaults 0 0
fs-abcd1234.efs.ca-central-1.amazonaws.com:/ /mnt/efs nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,async,noatime,fsc 0 0
me@web-01:~/web-service$ sudo umount /mnt/efs && sudo mount /mnt/efs
me@web-01:~/web-service$ ab -n 1000 -c 20 https://example.com/index.php
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking example.com (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software: nginx
Server Hostname: example.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name: example.com
Document Path: /index.php
Document Length: 0 bytes
Concurrency Level: 20
Time taken for tests: 92.908 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 217000 bytes
HTML transferred: 0 bytes
Requests per second: 10.76 [#/sec] (mean)
Time per request: 1858.155 [ms] (mean)
Time per request: 92.908 [ms] (mean, across all concurrent requests)
Transfer rate: 2.28 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 3 3.5 3 30
Processing: 604 1837 110.2 1839 2416
Waiting: 604 1837 110.3 1838 2416
Total: 615 1841 109.7 1841 2446
Percentage of the requests served within a certain time (ms)
50% 1841
66% 1863
75% 1876
80% 1884
90% 1904
95% 1927
98% 1974
99% 2165
100% 2446 (longest request)
What I saw at the beginning of these tests:
me@web-01:~/web-service$ ab -n 1000 -c 20 https://example.com/index.php
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking example.com (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software: nginx
Server Hostname: example.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name: example.com
Document Path: /index.php
Document Length: 0 bytes
Concurrency Level: 20
Time taken for tests: 11.618 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 217000 bytes
HTML transferred: 0 bytes
Requests per second: 86.07 [#/sec] (mean)
Time per request: 232.365 [ms] (mean)
Time per request: 11.618 [ms] (mean, across all concurrent requests)
Transfer rate: 18.24 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 3 3.5 3 29
Processing: 32 227 22.8 227 515
Waiting: 32 227 22.8 227 515
Total: 58 231 21.7 230 541
Percentage of the requests served within a certain time (ms)
50% 230
66% 236
75% 240
80% 242
90% 248
95% 256
98% 273
99% 281
100% 541 (longest request)
For completeness, here are my graphs for EFS and t3 credits:
Results after increasing Opcache buffers:
Concurrency Level: 20
Time taken for tests: 11.507 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 217000 bytes
HTML transferred: 0 bytes
Requests per second: 86.91 [#/sec] (mean)
Time per request: 230.136 [ms] (mean)
Time per request: 11.507 [ms] (mean, across all concurrent requests)
Transfer rate: 18.42 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 4 4.4 3 36
Processing: 38 225 20.8 226 264
Waiting: 38 225 20.8 226 264
Total: 50 228 18.9 229 285
Percentage of the requests served within a certain time (ms)
50% 229
66% 235
75% 238
80% 240
90% 247
95% 253
98% 259
99% 265
100% 285 (longest request)
We are also trying to run our PHP-based application (not WordPress) over EFS, and we found there is a significant difference between EC2 instance types due to their network bandwidth. We benchmarked t3.large (5Gb/s), m5.large (10Gb/s) and m5n.large (25Gb/s). These all have the same number of vCPUs and the same amount of memory.
- t3.large -> m5.large: +10% performance
- m5.large -> m5n.large: +10% performance
- t3.large -> m5n.large: +20% performance
Be aware that the smaller t3 instances have 1Gb/s, which means network latency will be considerably higher.
AWS has an HA setup using LightSail with media offload. No reason for EFS. Has anyone tried the LightSail approach?
@minumula , how do they keep the code in sync between multiple nodes?
Lightsail only guarantees 20% or 30% of the CPU, the rest can be stolen by other tenants. I did my tests with Lightsail, but don't want to use it in production.
@hubertnguyen There is an article and a video published on AWS Online Tech Talks. There is a plugin that offloads media to S3 including the uploads folder. Other than that, there is no file sync between the instances.
I am reading the AWS PDF whitepaper and referencing other architectures; WordPress HA is such a challenge. Do we just mount wp-content on EFS and put the rest of the files on individual EC2 servers?
Got it. If you have the link, could you share it? If not, I'll do a search later. Not having the code in sync is kind of a big problem, unless your deployment method is to have a master which is then cloned, and you rebuild the cluster every time.
Even then, the DB and code may be out of sync and cause problems, so you'd have to clone the DB too.
With EFS or some kind of file sync, your WP runs like one system. You can mount only /uploads/ on EFS if you want, but you still need to deal with the code sync. If you mount EFS on the web root, then EFS does the code sync for you (no extra sync needed).
EFS is exposed as an NFS filesystem, so you can find more literature on WP+NFS.