goofys

Client.Timeout a concern?

Open phil-grayson opened this issue 4 years ago • 4 comments

Hi,

I'm working with many (>10) large (>1 GB) mounted files on an AWS EC2 instance. Intermittently, I get this message in my logs when certain programs access these files:

2021/06/13 04:33:39.461498 fuse.ERROR *fuseops.ReadFileOp error: net/http: request canceled (Client.Timeout exceeded while reading body)

Sometimes it appears just once or twice, but I have one program that frequently produces hundreds of lines like this in a 2.5-hour run (64 threads). Is this something I should be concerned about? That is, given the intermittent nature of this error, should I expect different results if I run the same job multiple times?
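To put a number on how often the error shows up, a small script along these lines could tally the timeout lines per hour. This is only a sketch: it assumes each log line starts with the timestamp format shown in the message above (a log with a syslog prefix would need different slicing), and `count_timeouts_per_hour` is a name made up for illustration.

```python
from collections import Counter

# Substring taken from the error message quoted above.
MARKER = "Client.Timeout exceeded"

def count_timeouts_per_hour(lines):
    """Count timeout errors, keyed by the 'YYYY/MM/DD HH' prefix of each line.

    Assumes every line begins with a timestamp like '2021/06/13 04:33:39.461498'.
    """
    counts = Counter()
    for line in lines:
        if MARKER in line:
            # First 13 characters cover the date, a space, and the hour.
            counts[line[:13]] += 1
    return counts
```

Passing an open log file to it (for example `count_timeouts_per_hour(open(path))`, with `path` pointing at wherever the goofys output is written) should show whether the errors cluster in bursts or are spread evenly over the run.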

Thanks!

phil-grayson avatar Jun 14 '21 23:06 phil-grayson

goofys version is: 0.24.0-45b8d78375af1b24604439d2e60c567654bcdf88

phil-grayson avatar Jun 15 '21 14:06 phil-grayson

If you see this, then some read requests have failed. Depending on your application, it may or may not have retried them. It doesn't seem like this workload should have triggered the error, though. Do you have more debug logs? What kind of instance are you running this on?
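As a rough illustration of what retrying at the application level could look like, here is a minimal sketch. It assumes a failed read on the mount surfaces to the application as an `OSError`; `read_with_retry` and its parameters are hypothetical and not part of goofys.

```python
import time

def read_with_retry(path, offset, size, attempts=3, backoff_s=0.5):
    """Read `size` bytes at `offset`, retrying when the read raises OSError.

    Assumption: a failed read on the mounted file is reported as an OSError.
    """
    last_err = None
    for attempt in range(attempts):
        try:
            # Reopen on every attempt so a bad handle is not reused.
            with open(path, "rb") as f:
                f.seek(offset)
                return f.read(size)
        except OSError as err:
            last_err = err
            if attempt < attempts - 1:
                # Exponential backoff between attempts.
                time.sleep(backoff_s * (2 ** attempt))
    raise last_err
```

An application that does not wrap its reads in something like this would see the error once and either fail or carry on, depending on how it handles I/O errors.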

kahing avatar Jul 12 '21 00:07 kahing

Hello,

I have this same error.

I have goofys mounted to an S3 bucket, and on my server NGINX uses the goofys mount to serve .jpg and .mp4 files. The syslog has tons of these errors, so I'm not sure whether this is just the visitor closing the connection on purpose, or whether the visitor is viewing an .mp4 and it suddenly closes. If I can help troubleshoot this, please let me know what to do. Thanks much

stevyn81 avatar Jul 28 '21 08:07 stevyn81

@kahing Following up for @phil-grayson here.

We most frequently see this error when running a program called VEP, which is used to annotate variants (operating on VCF files) and requires mounting ~400 GB of files (using goofys). We are using a very large, compute-optimized server in the AWS Cloud (c5n.18xlarge) with ~80 GB of EBS storage. This machine has 72 threads, 64 of which we dedicate to our job, leaving the rest for goofys and system usage. We see anywhere from 80 to 400 of these error messages during these jobs. We've compared results from several runs with and without goofys (in the latter case we download the needed files directly instead of mounting them). We don't see any meaningful difference in the results with or without the mount, even in the presence of these errors.
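For anyone wanting to repeat this kind of comparison, one simple approach is to hash the output of each run and compare the digests. This is a sketch only: `sha256_of` and `same_content` are illustrative helper names, and a byte-for-byte match assumes the outputs carry no run-specific metadata (if they embed timestamps or similar, those lines would need to be filtered out first).

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_content(path_a, path_b):
    """True if both files have identical bytes (by SHA-256 digest)."""
    return sha256_of(path_a) == sha256_of(path_b)
```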

I've attached a file with the full output from one of our runs (fuse_error_log.txt). Since we are getting no errors from VEP and our analysis is identical across a few runs, do you suspect that this error has no effect on the results?

willronchetti avatar Sep 23 '21 17:09 willronchetti