Taking up gigs of memory
Any reason why this would be taking up so much memory after a few days of running? It ends up eating 3.5 gigs of mem and exhausting 1.5 gigs of swap before I kill the process.
This seems to be an inherent flaw with the original s3fs wrapper. Depending on how much I/O it has to deal with on a day-to-day basis, it can grow fairly quickly; I've seen s3fs consume over 7 GB of memory. The only way I've found to get around it right now is to put a daily crontab in place that does a umount -l /mnt/s3/xyz and re-mounts it. You get an interruption of up to 5 seconds or so, but since installing the cron job I haven't had an out-of-memory segfault.
This appears to be a memory leak. A quick way to reproduce:
- Mount an S3 bucket with s3fs-c:
s3fs some_bucket S3
- Exercise the mount (e.g., copy files (cp, rsync) to/from S3 via the mount, walk the mount with "find", etc.):
cp /foo/* S3
- Watch the memory usage of s3fs-c increase (e.g., via "top"):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27046 bar 20 0 304m 49m 2372 S 18.0 8.4 2:03.93 s3fs
Two minutes later:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27046 bar 20 0 318m 66m 2524 S 6.0 11.2 3:54.21 s3fs
Eventually s3fs-c runs out of memory and is killed by the kernel. From syslog:
Jun 19 18:22:38 domU-12-31-39-0F-24-1E s3fs: init $Rev: 355 $
...
Jun 19 19:07:17 domU kernel: [203302.500062] Out of memory: Kill process 26021 (s3fs) score 739 or sacrifice child
Jun 19 19:07:17 domU kernel: [203302.500086] Killed process 26021 (s3fs) total-vm:682772kB, anon-rss:446960kB, file-rss:0kB
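To make the growth easier to watch, a small loop like the following keeps the mount busy (the mount point "S3" is from the steps above; /foo is just any local directory with files in it):

# hypothetical stress loop to accelerate the leak: repeatedly walk the mount and copy files onto it
for i in $(seq 1 100); do
    find S3 > /dev/null
    cp -r /foo "S3/run_$i"
done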
Thanks for the quick response, Dan. I've been using both s3fs and s3fs-c (I prefer s3fs-c because of the compatibility with other clients it is trying to address). I agree that the s3fs code has a memory leak in it as well, but in my tests s3fs-c has much larger leaks. Browsing through the issues, it looks like the s3fs project has tried to address some of the memory leaks (although clearly not all of them), and maybe some of those fixes haven't been merged into s3fs-c? For example:
http://code.google.com/p/s3fs/issues/detail?id=216
Any idea if this will be merged with s3fs-c? Anything stopping s3fs-c developers from going after these bugs independently of s3fs (since they are arguably showstoppers)?
I intend to do some more experimentation and report the issue to the s3fs project as well, as most of the memory leak issues reported have been (perhaps erroneously) closed.
You've proposed an interesting workaround. The problem is what if data on an S3 mount is in use when the cron job is run? I think "umount -l" will gracefully deal with this (won't unmount until there are no pointers to the data on the mount), but the "killall" would immediately kill the mount even if data was in use, wouldn't it? I'm not sure what happens if "umount -l" is immediately followed by an attempt to remount, since the mount point might not be unmounted by the time the attempt to mount again occurs. I'd rather not depend on running a cron job at a time when no one is using the mount, because this is something I cannot guarantee.
I'd be curious to see the script you've written, if you don't mind sharing.
Well, I actually have two scripts that run on cron. One runs daily and does a forced remount of all the s3fs-mounted filesystems: umount -l "lazily" unmounts the filesystem, basically ignoring the fact that there might be a pointer/handle to a file open on it. The killall I execute straight after umount returns is there because I've sometimes had s3fs processes persist across remounts, so unmounting and remounting didn't solve the memory issue; it just mildly inconvenienced anything that had a file open at the time.
The second script runs on cron every 5 minutes; it basically ls's the mount directory and looks for something specific in the response. This is mainly to check whether s3fs has crashed during the day (you end up with the fuse error "transport endpoint is not connected"). If it detects that error, or can't find what it's looking for, it also does a forced remount. The daily script is straightforward. This is my 5-minutely script (run as root, obviously):
#!/bin/bash
# Check that the s3fs mount is still alive; remount it if the expected entry is missing.
MOUNT_DIR=/mnt/s3/bucketName
CHECK_DIR_WITHIN_MOUNT=some_file_inside_mount_dir

# If the mount has crashed ("transport endpoint is not connected"), the ls fails
# and the expected entry won't be found.
RESULT=$(ls "$MOUNT_DIR" 2>/dev/null | grep "$CHECK_DIR_WITHIN_MOUNT")
if [ "$RESULT" != "$CHECK_DIR_WITHIN_MOUNT" ]
then
    echo "Mount down - remounting"
    umount -l "$MOUNT_DIR"
    mount "$MOUNT_DIR"
fi
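For reference, the cron entry driving this check could look something like the following (the script path is just an example):

# hypothetical /etc/cron.d entry: run the mount check every 5 minutes as root
*/5 * * * * root /usr/local/sbin/s3fs_check_mount.sh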
Simple as that. The issue you mention about crons (or any jobs) having files open on the filesystem during the dis/remount is a perfectly valid one, and it's something I spent a great deal of time trying to work around. The reason we do it the way we do is simple: if we wait (perhaps forever) for a process to close a file, s3fs will probably run out of memory and crash anyway, so it will lose its handle/error out regardless. The dis/remount is actually a very quick process (both umount and mount return in a second or two), and I've had no reports of failed crons due to this action so far. You could arguably put the s3fs_remount script in cron.daily, let run-parts execute it, and name it something like "000-s3fs_remount" so that run-parts is guaranteed to execute the remount before any other cron job (which makes the concern about crons holding open file handles much less relevant).
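For completeness, a minimal sketch of what such a daily remount script could look like (the mount point is an example, and it assumes an /etc/fstab entry for the mount):

#!/bin/bash
# Sketch of a daily forced remount, e.g. saved as /etc/cron.daily/000-s3fs_remount.
MOUNT_DIR=/mnt/s3/bucketName

umount -l "$MOUNT_DIR"        # lazy unmount, even if files are still open
killall -9 s3fs 2>/dev/null   # note: this kills every s3fs instance on the box
mount "$MOUNT_DIR"            # re-mount using the /etc/fstab entry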
Hope this helps.
Side note: I deleted my response here to your first message, because after I posted it, I realised that what I said I already said in my first comment!
I've been throwing valgrind at the problem; there are quite a few memory leaks. I've got a branch in which I'm addressing them (see the memory-fix branch in https://github.com/franc-carter/s3fs-c.git). I've got the memory leaks down a fair bit, but I have struck a tricky one involving a call to fdopen().
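If anyone wants to reproduce the analysis, running s3fs in the foreground under valgrind looks roughly like this (bucket and mount point are placeholders; -f is the FUSE foreground flag, so valgrind can print its leak report when the process exits):

# run the FUSE process in the foreground under valgrind and dump a full leak report on exit
valgrind --leak-check=full ./s3fs some_bucket /mnt/s3/some_bucket -f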
Memory fixes have been merged into master; post if you are still finding memory problems and I'll dig into it again.
Hi @franc-carter ,
I am using s3fs v1.82. The issue we are facing is that s3fs's memory usage keeps increasing continuously (right now it is at 4 GB).
I have not set any of the following cache-related options:
- stat_cache_expire (not set)
- max_stat_cache_size (not set)
- use_cache (not set)
I am not able to figure out where s3fs is using this much memory or how to handle it.
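Setting these options explicitly at mount time would look like this (the bucket name, mount point, and values are just examples):

# example: cap the stat cache at 10000 entries and expire entries after 15 minutes
s3fs my-bucket /mnt/s3/my-bucket -o max_stat_cache_size=10000 -o stat_cache_expire=900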
Thanks, Vikash
Do we have any fix for this? I can't believe it's been there since 2012. We are currently facing this on EC2 instances and it is crashing our application once in a while.
Edit: Tagging @franc-carter, @franc-carter-sirca @dannosaur @PVikash to see if anyone found any solution/workaround for this.
@anilkumardesai I left the company where I implemented this many years ago. I never found a solution beyond force killing the s3fs process and re-mounting daily on a cron.
As far as possible solutions go, it depends why you're using S3FS.
If you're using it to share files between load-balanced services, Amazon now has Elastic File System (EFS), which is NFSv4-based, can be mounted with your system's NFS utilities, and is much more reliable.
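For example, mounting an EFS filesystem over NFSv4.1 looks roughly like this (the filesystem ID, region, and mount point are placeholders; the options follow what the EFS documentation suggests):

# mount an EFS filesystem over NFSv4.1 using the standard NFS client
sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
    fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs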
If you're using S3FS to put files on S3 by simply saving them to your file system, I'd seriously consider engineering your application to talk to S3's API directly with PutObject.
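As a rough illustration, a single PutObject call through the AWS CLI looks like this (the bucket, key, and file are placeholders):

# upload one object directly with PutObject instead of writing through an s3fs mount
aws s3api put-object --bucket my-bucket --key backups/app.log --body /var/log/app.log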
I think S3FS filled a gap in AWS's services that was either clunky or non-existent back in the days that this was written. They've come a long way since.
@dannosaur Thanks Dan. We fixed the issue by changing the s3fs URL from https (the default) to http. To my surprise, I have not seen the issue for the last 3 weeks. Thanks for the S3 PutObject API recommendation; I'll look into it if we start to see more issues with s3fs.
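For anyone else trying this, the endpoint is switched with the url mount option (bucket and mount point are placeholders):

# mount using the plain-HTTP S3 endpoint instead of the default HTTPS one
s3fs my-bucket /mnt/s3/my-bucket -o url=http://s3.amazonaws.com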