
apcu_fetch(): Failed to acquire read lock on production testing

Open razvanphp opened this issue 2 years ago • 6 comments

Hello,

We are trying to replace php-fpm with NGINX Unit, running inside Docker, but after tuning both setups to the same resources and configuration, we get this error from the APCu library:

Noticed exception 'ErrorException' with message 'apcu_fetch(): Failed to acquire read lock' in /var/www/.../ApcuStore.php:23

The code there is very simple:

    /**
     * @param array $items
     * @param int $ttl
     */
    public function store(array $items, int $ttl): void
    {
        apcu_store($items, null, $ttl);
    }

    public function fetch($identifiers): array
    {
        return apcu_fetch($identifiers);
    }

Any idea why this is happening? Do we need to change something with regard to concurrency locks now that PHP is running in embed mode?

Thank you in advance! R

razvanphp avatar May 10 '22 13:05 razvanphp

Hi @razvanphp, do you have some sources to reproduce this error/issue? We will have a look at it.

tippexs avatar May 11 '22 10:05 tippexs

What do you mean by sources? A complete reproducible setup/code? Not really, but the relevant PHP part is already pasted above.

razvanphp avatar May 11 '22 10:05 razvanphp

Hi, I am a colleague of @razvanphp's. Unfortunately, we can't provide the complete code to reproduce it, but I can tell you a bit more about the issue.

We have a product that has categories. Categories are shared across many products, and to avoid fetching them from other sources we store them in APCu.

A typical request coming to our service has 100 products with around 20 categories. If a category is not found in APCu, it is fetched from Redis and stored in APCu. There is no rocket science in how it works; it is just the fetch/store methods from APCu. I believe the problem lies somewhere in parallel execution, as another product with the same category may be requested by another thread at the same time. We also cannot really see the issue if there is no traffic, so it is only visible under some load (10k rpm is enough).
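For context, a minimal sketch of that fetch-or-store flow; fetchFromRedis() and the key names are hypothetical stand-ins, not our actual code:

<?php

// Minimal sketch of the fetch-or-store flow described above.
// fetchFromRedis() is a hypothetical stand-in for the Redis fallback.
function getCategories(array $categoryKeys): array
{
    // Try the local APCu cache first; it returns only the entries it found.
    $cached  = apcu_fetch($categoryKeys) ?: [];
    $missing = array_diff($categoryKeys, array_keys($cached));

    if ($missing !== []) {
        // Fall back to Redis for the misses and cache them locally.
        $fresh = fetchFromRedis($missing);
        apcu_store($fresh, null, 300);   // array form: key => value pairs
        $cached = array_merge($cached, $fresh);
    }

    return $cached;
}

?>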

Unfortunately, there is not much I can say about the issue on top of that :(

arku31 avatar May 11 '22 12:05 arku31

Hi @arku31 - thanks for making that clear. We will try to create a piece of software that will be able to reproduce it (fill the APCu cache with data and try to read and write from/to it), but on top of that - could you also share the Unit debug log?

What version of Unit are you using, and what PHP version? Are you using our Docker images? Just asking to be able to reproduce the issue.

However, based on what you described, I have a feeling it is about writing to the cache while another thread is trying to read a cache entry by a given key. That would explain why you only see this when the app is under more load. But we need to check this to be sure. Looking forward to getting more information from you.

Good night Timo

tippexs avatar May 11 '22 21:05 tippexs

Hi, sorry for the late answer.

We were using PHP 8.0.18 and NGINX Unit 1.25.0, running in Docker on AWS behind a normal nginx. We used a custom Docker image, but the NGINX Unit part was copied from this repo: https://github.com/nginx/unit/blob/1.25.0/pkg/docker/Dockerfile.php8.0

arku31 avatar May 19 '22 19:05 arku31

Hi, no problem. We were also busy with KubeCon EU last week, but this issue is on my list this week (alongside some other PHP-related ones). I'll keep you posted, and thanks for sharing the information with us.

tippexs avatar May 23 '22 06:05 tippexs

Hi, I am facing the same issue (apcu_fetch(): Failed to acquire read lock) when routing some of the production traffic, over ~500 req/s, to an app based on the Lumen framework with the APCu cache driver enabled.

Actually, it makes APCu unusable with PHP 8 for higher-performance apps.

PHP 8.0.25 (cli) (built: Oct 28 2022 18:38:41) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.25, Copyright (c) Zend Technologies
    with Zend OPcache v8.0.25, Copyright (c), by Zend Technologies

php.ini (apcu section):

apc.coredump_unmap => Off => Off
apc.enable_cli => On => On
apc.enabled => On => On
apc.entries_hint => 4096 => 4096
apc.gc_ttl => 3600 => 3600
apc.mmap_file_mask => no value => no value
apc.preload_path => no value => no value
apc.serializer => php => php
apc.shm_segments => 1 => 1
apc.shm_size => 32M => 32M
apc.slam_defense => Off => Off
apc.smart => 0 => 0
apc.ttl => 0 => 0
apc.use_request_time => Off => Off

config.json:

"applications": {
    "lumen": {
        "type": "php",
        "root": "/var/www/public/",
        "script": "index.php",
        "limits": {
            "timeout": 92,
            "requests": 10000
        },
        "processes": {
            "max": 64,
            "spare": 64,
            "idle_timeout": 30
        }
    }
}

Any updates on this issue?

hbernaciak avatar Nov 05 '22 21:11 hbernaciak

Hi @hbernaciak, we haven't looked into the APCu behaviour again; we are currently working on other features of the Unit project. But as you mentioned, this issue seems to still be present. We would need to create some sort of lab for this issue to trace it down, and that is currently not on the priority list. Any code/configuration you can share to reproduce the issue would be appreciated.

tippexs avatar Nov 07 '22 07:11 tippexs

I'll see if I can somehow reproduce this.

ac000 avatar Nov 10 '22 13:11 ac000

Well that was easy! :) :(

ac000 avatar Nov 10 '22 13:11 ac000

Great @ac000! If you still need anything, just let me know :)

hbernaciak avatar Nov 10 '22 20:11 hbernaciak

Here's what I've found so far.

tl;dr

The issue seems to be caused by our usage of clone(2) (well, actually we call the kernel system call directly and not the glibc wrapper).

Gory details

This all applies to Linux (maybe others...)

Some background

The APCu module uses an mmap(2)'d file as a shared memory cache between processes. At the start of this memory is a header, and in that header is a lock structure for whatever locking mechanism is in use, which on Linux will most likely be POSIX read/write locks, pthread_rwlock_*(), which use a pthread_rwlock_t structure to store state.

One of the members of the pthread_rwlock_t structure is __cur_writer; that will be important.
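For illustration, a minimal sketch (not APCu's actual code) of the general setup: a process-shared pthread_rwlock_t living in mmap(2)'d shared memory, which is what makes the same lock visible to every worker process.

#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Anonymous shared mapping; APCu uses a file-backed one, same idea. */
    pthread_rwlock_t *lock = mmap(NULL, sizeof(*lock),
                                  PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (lock == MAP_FAILED)
        return 1;

    pthread_rwlockattr_t attr;
    pthread_rwlockattr_init(&attr);
    /* PTHREAD_PROCESS_SHARED lets every process that maps this memory
       (e.g. forked workers) take the same lock. */
    pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_rwlock_init(lock, &attr);

    pthread_rwlock_rdlock(lock);
    puts("read lock acquired");
    pthread_rwlock_unlock(lock);

    pthread_rwlock_destroy(lock);
    munmap(lock, sizeof(*lock));
    return 0;
}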

Test setup

Latest(ish) unit (1.28.0-30-ga0327445) compiled from source, latest APCu (5.1.22) compiled from source

Unit configuration

{
    "listeners": {
        "[::]:8080": {
            "pass": "applications/php"
        }
    },
    "applications": {
        "php": {
            "type": "php",
            "root": "/var/tmp/unit-php",
            "processes": 2
        }
    }
}

PHP test

<?php

$item = "Hello World";

apcu_store('foo', $item, 600);

var_dump(apcu_fetch('foo'));

?>

Then hit it with ApacheBench

$ ab -n 200 -c 10 http://localhost:8080/apcu.php

The problem

We get errors like

'apcu_fetch(): Failed to acquire read lock'

due to glibc (well, I've only been debugging this on Fedora) detecting a potential deadlock situation and failing the lock acquisition, via checks like

/* Make sure we are not holding the rwlock as a writer.  This is a deadlock
   situation we recognize and report.  */
if (__glibc_unlikely (atomic_load_relaxed (&rwlock->__data.__cur_writer)
                      == THREAD_GETMEM (THREAD_SELF, tid)))
   return EDEADLK;

Which is kind of weird. There are no threads involved, and a process will acquire a read or write lock, do its thing, then release it.

Testing the same exact thing under php-fpm showed no issues.

... some time later ...

Having another look with gdb after we have acquired a lock (either one) showed the pthread_rwlock_t structure as follows (this is for unit)

(gdb) p *lock
$2 = {
  __data = {
    __readers = 3,
    __writers = 0,
    __wrphase_futex = 1,
    __writers_futex = 1,
    __pad3 = 0,
    __pad4 = 0,
    __cur_writer = 95104,
    __shared = 1,
    __rwelision = 0 '\000',
    __pad1 = "\000\000\000\000\000\000",
    __pad2 = 0,
    __flags = 0
  },
  __size = "\003\000\000\000\000\000\000\000\001\000\000\000\001", '\000' <repeats 11 times>, "\200s\001\000\001", '\000' <repeats 26 times>,
  __align = 3
}
(gdb)

At first glance it all looks sane.

However, remember that __cur_writer member? That's supposed to store the PID of the process which currently holds the write lock.

Let's look at what processes we have

andrew     95104   11765  0 18:56 pts/3    00:00:00 unit: main v1.29.0 [/opt/uni
andrew     95106   95104  0 18:56 pts/3    00:00:00 unit: controller
andrew     95107   95104  0 18:56 pts/3    00:00:00 unit: router
andrew     95108   95104  0 18:56 pts/3    00:00:00 unit: "php" prototype
andrew     95109   95108  0 18:56 pts/3    00:00:00 unit: "php" application
...

Here __cur_writer == pid of the main unit process. Hmm, that's not right! (It should be the PID of one of the "php" application processes.)

Just for a sanity check, I looked at the same thing with php-fpm

(gdb) p *lock
$3 = {
  __data = {
    __readers = 3,
    __writers = 0,
    __wrphase_futex = 1,
    __writers_futex = 1,
    __pad3 = 0,
    __pad4 = 0,
    __cur_writer = 95200,
    __shared = 1,
    __rwelision = 0 '\000',
    __pad1 = "\000\000\000\000\000\000",
    __pad2 = 0,
    __flags = 0
  },
  __size = "\003\000\000\000\000\000\000\000\001\000\000\000\001", '\000' <repeats 11 times>, "\340s\001\000\001", '\000' <repeats 26 times>,
  __align = 3
}
root       95198       1  2 19:02 ?        00:00:00 php-fpm: master process (/et
apache     95200   95198  0 19:02 ?        00:00:00 php-fpm: pool www
...

Here __cur_writer == pid of the php-fpm worker process. Good, as expected.

I started suspecting it could have something to do with how we create the processes, using clone rather than fork(2).

Indeed, forcing unit to use fork(2) (which it will do if there is no clone available) made the problem go away. I also replicated this issue with a simple test program to verify.
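A standalone sketch of the failure mode (not the actual test program, and assuming the x86_64 raw clone(2) argument order): one child created via SYS_clone with only SIGCHLD takes the write lock, and a second such child then fails pthread_rwlock_rdlock() because both children still carry the parent's cached TID.

#define _GNU_SOURCE
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Raw clone(2) with only SIGCHLD, as unit does; unlike fork(), nothing
   updates the child's cached TID in the glibc thread control block. */
static pid_t raw_clone(void)
{
    /* x86_64 argument order: flags, stack, parent_tid, child_tid, tls. */
    return syscall(SYS_clone, SIGCHLD, NULL, NULL, NULL, NULL);
}

int main(void)
{
    pthread_rwlock_t *lock = mmap(NULL, sizeof(*lock), PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    pthread_rwlockattr_t attr;

    pthread_rwlockattr_init(&attr);
    pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_rwlock_init(lock, &attr);

    if (raw_clone() == 0) {            /* child 1: the writer */
        pthread_rwlock_wrlock(lock);   /* __cur_writer := stale (parent) TID */
        sleep(2);
        pthread_rwlock_unlock(lock);
        _exit(0);
    }

    sleep(1);                          /* let child 1 take the write lock */

    if (raw_clone() == 0) {            /* child 2: the reader */
        int err = pthread_rwlock_rdlock(lock);
        /* glibc reports EDEADLK here: child 2's cached TID equals the parent's
           PID, which is also what child 1 stored in __cur_writer. */
        printf("rdlock: %s\n", err ? strerror(err) : "ok");
        if (err == 0)
            pthread_rwlock_unlock(lock);
        _exit(0);
    }

    wait(NULL);
    wait(NULL);
    return 0;
}

With the glibc check quoted above, this should print "rdlock: Resource deadlock avoided"; swapping raw_clone() for fork() should make it print "rdlock: ok".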

Solution

So to recap: the problem is that when glibc checks the current thread ID against what's stored in __cur_writer, they both seem to be set to the PID of the main unit process, so glibc thinks the process is trying to acquire a lock (either one) while it already holds the write lock, but it doesn't really.

At the moment I'm not entirely sure how to fix this.

Currently our processes are created something like

clone(child_stack=NULL, flags=SIGCHLD)  = 5754

whereas if you use fork(2), which is implemented with clone, you get

clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f317e668a10) = 5857

I believe the key bits here are the CLONE_CHILD_SETTID flag and the setting of child_tidptr, which in glibc's case points into some core thread memory area, which I assume updates THREAD_SELF->tid as used in the above glibc check.
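As an aside, a minimal sketch (again assuming the x86_64 argument order) of what CLONE_CHILD_SETTID does with child_tidptr; glibc's fork() points it at the TID slot of the thread control block, whereas this sketch just uses a local variable.

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t stored_tid = 0;

    /* With CLONE_CHILD_SETTID the kernel writes the new TID to child_tidptr
       in the child's memory before the child runs. */
    pid_t pid = syscall(SYS_clone, CLONE_CHILD_SETTID | SIGCHLD,
                        NULL, NULL, &stored_tid, NULL);

    if (pid == 0) {
        printf("child: stored tid %d, gettid() %ld\n",
               (int) stored_tid, syscall(SYS_gettid));
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    return 0;
}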

More investigation required.

ac000 avatar Nov 14 '22 03:11 ac000

@tippexs @ac000 is there anybody on your team who can push this forward?

hbernaciak avatar Nov 14 '22 15:11 hbernaciak

@hbernaciak

It's a known issue with glibc (musl libc seems to be OK, and perhaps glibc < 2.25 too), but the glibc developers don't seem to be in a hurry to fix it; in fact, there seems to have been some disagreement about whether they even should.

At the moment it's looking like the way forward would be to drop our use of SYS_clone; we use it so that we can make use of namespaces(7), but we can achieve the same with a combination of fork(2) and unshare(2).
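For the curious, a minimal sketch of that direction (not the actual patch): fork(2) keeps the child's cached TID correct, and unshare(2) can then move the child into new namespaces after the fact. Note that unprivileged use typically needs a user namespace or CAP_SYS_ADMIN, and the flag choice below is only an example.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();    /* fork() updates the child's cached TID correctly */

    if (pid == 0) {
        /* Enter new namespaces after the fact instead of passing CLONE_NEW*
           flags to a raw clone(2). */
        if (unshare(CLONE_NEWNS | CLONE_NEWUTS) == -1) {
            perror("unshare");
            _exit(1);
        }
        /* ... continue as the application process ... */
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    return 0;
}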

I will keep this issue updated along the way.

ac000 avatar Nov 14 '22 16:11 ac000

@razvanphp @arku31 @hbernaciak

Hi folks,

I have put up a branch that should resolve this issue (my simple test case now works 100%), you can grab it here.

If anyone is able to test this, that would be good (feel free to ping me here if you need help setting it up); please report success or failure. If it fails, please let me know your Linux distribution and version (this only affects Linux) and your libc version - whether that's glibc, musl libc, or something else - if you know it; no problem if not.

Cheers, Andrew

ac000 avatar Nov 21 '22 21:11 ac000

Updated the branch with an improved version.

ac000 avatar Dec 02 '22 01:12 ac000