HTTP server occasionally returns response status “Resource temporarily unavailable”

Open changweige opened this issue 3 years ago • 8 comments

After scanning the nydus API server code, I found that it does not return the status “Resource temporarily unavailable”. Is this returned by the HTTP server crate?

        dist.put_multiple_files(20, Size(8, Unit.KB))
    
        image.set_backend(Backend.BACKEND_PROXY).create_image()
>       nc.pseudo_fs_mount(image.bootstrap_path, f"/pseudo{suffix}", conf.path(), None)

functional-test/test_nydus.py:505: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<nydusd_client.NydusAPIClient object at 0x7f03d1558520>, '/home/runner/work/image-service/image-service/contrib/nydus-test/bs2', '/pseudo2', '/tmp/tmpixjok_airafs.config', None)
resp = <Response [500]>

    def wrapped(*args):
        resp = func(*args)
>       assert resp.status_code < 400 or resp.status_code == 501, resp.content.decode(
            "utf-8"
        )
E       AssertionError: Resource temporarily unavailable (os error 11)

nydusd_client.py:53: AssertionError

changweige avatar Aug 09 '22 04:08 changweige

It seems the nydusd API server never even gets a chance to process this failed API request.

changweige avatar Aug 09 '22 05:08 changweige

Maybe the Python HTTP client has opened more connections for its connection pool than the MAX_CONNECTIONS defined here.

imeoer avatar Aug 09 '22 06:08 imeoer

dbs-uhttp limits the maximum number of concurrent connections to MAX_CONNECTIONS (10). Maybe we should raise the limit. The failure is probably caused by the code below:

    fn handle_new_connection(&mut self) -> Result<()> {
        if self.connections.len() == MAX_CONNECTIONS {
            // If we want a replacement policy for connections
            // this is where we will have it.
            return Err(ServerError::ServerFull);
        }
    }

jiangliu avatar Aug 09 '22 06:08 jiangliu

This may not be related to MAX_CONNECTIONS: after I changed it to 1000, I still get the "Resource temporarily unavailable (os error 11)" error.

sctb512 avatar Sep 02 '22 08:09 sctb512

@sctb512 Are you using the Go or the Python HTTP client? Maybe we can first use netstat to inspect the connections on the client side.
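As a rough illustration of that kind of inspection, the sketch below counts the entries for one socket path in the text of `/proc/net/unix`, grouped by the `St` (state) column. It assumes a Linux host and assumes the API server listens on a Unix domain socket; the function name and the socket path are hypothetical and do not come from nydus.

```python
from collections import Counter


def count_unix_socket_states(proc_net_unix_text, socket_path):
    """Count /proc/net/unix rows bound to socket_path, keyed by the St column."""
    states = Counter()
    # The first row is the header: Num RefCount Protocol Flags Type St Inode Path
    for line in proc_net_unix_text.splitlines()[1:]:
        # Split into at most 8 fields so a path containing spaces stays whole.
        fields = line.strip().split(None, 7)
        # Unnamed sockets have no Path column and are skipped here.
        if len(fields) == 8 and fields[7] == socket_path:
            states[fields[5]] += 1
    return states


if __name__ == "__main__":
    # Hypothetical socket path; replace it with the one nydusd was started with.
    with open("/proc/net/unix") as f:
        print(count_unix_socket_states(f.read(), "/tmp/nydusd-api.sock"))
```

Running it repeatedly while the test is in progress gives a quick view of how many sockets bound to that path exist in each state.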

imeoer avatar Sep 02 '22 08:09 imeoer

OK, I will try it.

sctb512 avatar Sep 02 '22 08:09 sctb512

I dug into the uHttp code a bit. It does not return this error. I think the root cause is that the Unix socket listener file is set to non-blocking. The fix could be to make the client retry until it connects, or to replace mio.

changweige avatar Sep 02 '22 09:09 changweige

The Rust UnixListener configures the listener queue depth as 128. (Screenshot, 2022-09-07 5:23 PM)

The Go Unix dialer connects to the server in non-blocking mode, which causes EAGAIN when the listener queue in the Linux kernel is full. We have no way to change the kernel behavior, and it's not easy to enlarge the queue depth.

It seems the way to solve it is "client retry".
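A minimal sketch of what such a client retry could look like, written in Python for illustration only. It is not the fix that was applied to dbs-uhttp or to the nydus clients. The helper name `connect_with_retry` and its parameters are hypothetical. It relies on Python raising `BlockingIOError` when a non-blocking `connect()` fails with EAGAIN.

```python
import socket
import time


def connect_with_retry(path, attempts=5, base_delay=0.01):
    """Connect a non-blocking Unix stream socket, retrying on EAGAIN.

    All names here are illustrative; `attempts` must be at least 1.
    """
    for attempt in range(attempts):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.setblocking(False)
        try:
            sock.connect(path)
            return sock
        except BlockingIOError:
            # EAGAIN: on Linux this means the listener's queue is full,
            # so back off and try again with a fresh socket.
            sock.close()
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
        except OSError:
            # Any other failure (missing socket file, refused) is not retried.
            sock.close()
            raise
```

The backoff gives the server time to accept pending connections and drain the queue. A caller that gives up after the last attempt still sees the original EAGAIN error.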

jiangliu avatar Sep 07 '22 09:09 jiangliu

This will be solved by https://github.com/openanolis/dbs-uhttp/pull/18

changweige avatar Nov 12 '22 10:11 changweige

A new version of dbs-uhttp has been published; we need to update the Cargo.lock file.

jiangliu avatar Nov 18 '22 09:11 jiangliu

The dependency on dbs-uhttp has already been updated. This problem is addressed.

changweige avatar Nov 21 '22 06:11 changweige