
cat fast path (splice on linux) does not work for 0 size files

Open jendap opened this issue 2 weeks ago • 2 comments

cat has a fast path that uses the splice system call. Unfortunately, it does not work on many "file systems" that report the wrong file size.

In particular, I have a FUSE file system that returns an incorrect size (zero or otherwise). That should be fine, since on Linux one should treat the file size as a hint and read until EOF. That is what GNU cat does. It is also what uutils does in the slow path! The fast path, which calls splice, stops at whatever file size stat returns. This is inconsistent and breaks things. Can we disable it, please?
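To make "read until EOF" concrete, here is a minimal sketch of such a loop in C. It is only an illustration, not the actual GNU cat or uutils code; the function name copy_until_eof and the buffer size are my own choices.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd.
 * The loop stops only when read() returns 0 (EOF); st_size is never consulted.
 * Returns 0 on success, -1 on error with errno set. */
int copy_until_eof(int in_fd, int out_fd) {
    char buf[65536];
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return 0;               /* real end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        /* write() may accept fewer bytes than requested, so loop */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
    }
}
```

With a loop like this, a file whose stat reports size 0 (or 3) is still copied in full, because the size never enters the decision to stop.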

BTW: It works for cat /proc/self/stat, which also reports a size of 0. The reason is probably somewhere in splice.c. But we're not going to change the syscall's behavior anyway. The way the Rust code calls the syscall looks fine, right?

BTW: I can see a workaround if you really want to keep splice around. But the splice path is already quite a lot of code. Is it worth even more hacks?

jendap avatar Dec 09 '25 10:12 jendap

I can't reproduce it myself now. Something spooky with my kernel? :) Next time I'll make sure to open an issue only after reproducing it on at least two machines. Sorry to bother you. Thanks for uutils 👍

jendap avatar Dec 09 '25 18:12 jendap

Here is code that reproduces it, at least on Ubuntu 25.10 with the 6.17 kernel (even on a fresh virtual machine):

#define FUSE_USE_VERSION 31

#include <fuse.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <errno.h>

static void *bugfs_init(struct fuse_conn_info *conn, struct fuse_config *cfg) {
    cfg->direct_io = 1;
    return NULL;
}

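static int bugfs_getattr(const char *path, struct stat *stbuf, struct fuse_file_info *fi) {
    /* Start from a zeroed struct so unset fields are well defined. */
    memset(stbuf, 0, sizeof(*stbuf));
    if (strcmp(path, "/") == 0) {
        stbuf->st_mode = S_IFDIR | 0755;
        stbuf->st_nlink = 2;
    } else {
        /* Every other path looks like a read-only regular file. */
        stbuf->st_mode = S_IFREG | 0444;
        stbuf->st_nlink = 1;
        /* Deliberately wrong: read() actually returns 11 bytes. */
        stbuf->st_size = 3;
    }
    return 0;
}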

static int bugfs_read(const char *path, char *buf, size_t size, off_t offset, struct fuse_file_info *fi) {
    const char *data = "foo bar baz";
    size_t len = strlen(data);
    /* Serve the real content regardless of the size reported by getattr. */
    if ((size_t)offset < len) {
        if (offset + size > len)
            size = len - offset;
        memcpy(buf, data + offset, size);
        return size;
    } else {
        return 0; /* EOF */
    }
}

static struct fuse_operations bugfs_operations = {
    .init = bugfs_init,
    .getattr = bugfs_getattr,
    .read = bugfs_read,
};

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <mount_point>\n", argv[0]);
        return 1;
    }

    return fuse_main(argc, argv, &bugfs_operations, NULL);
}

Compile and run:

sudo apt install gcc pkgconf libfuse3-dev

gcc -o bugfs bugfs.c `pkg-config fuse3 --cflags --libs`

mkdir mnt_test
./bugfs mnt_test

See the problem:

cat mnt_test/file
# 'foo'

# cat has a number of options that print something extra; any of them will trigger the slow path
cat -e mnt_test/file
# 'foo bar baz'

gnucat mnt_test/file
# 'foo bar baz'

The problem is with 'direct_io = 1'. As far as I know, it is the only way to create something like the /proc filesystem. Such a filesystem typically returns st_size = 0 because the size is expensive to compute and/or different every time. read() does the real work (rather than returning a fixed string as in this toy example).

The slow path works (read() in a loop until EOF). The fast path on Linux, which uses splice, does not. It is probably a kernel bug. But uutils should be consistent with itself and with the GNU utilities, right?
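For what it's worth, one possible shape of a workaround, purely as a sketch and not how uutils implements its fast path: when splice() stops making progress, do not trust it, and finish with plain read() calls until read() itself reports EOF. The function names below are hypothetical, and I am assuming that splice() returns 0 at the reported st_size on such a filesystem while read() at the same offset still returns data.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer, retrying on partial writes. */
static int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: fast path with splice(), then confirm EOF with read().
 * splice() needs a pipe on one side; if neither fd is a pipe it fails with
 * EINVAL and everything goes through the read() loop instead.
 * Returns 0 on success, -1 on error with errno set. */
int splice_then_confirm_eof(int in_fd, int out_fd) {
    /* Fast path: let the kernel move data for as long as it makes progress. */
    for (;;) {
        ssize_t n = splice(in_fd, NULL, out_fd, NULL, 65536, 0);
        if (n > 0)
            continue;
        if (n < 0 && errno == EINTR)
            continue;
        break; /* 0 (apparent EOF) or an error: fall through to read() */
    }

    /* Slow path: the file offset already sits where splice() stopped, so
     * read() picks up any data splice() did not deliver. */
    char buf[65536];
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return 0; /* read() agrees that this is EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(out_fd, buf, (size_t)n) < 0)
            return -1;
    }
}
```

For a normal file the extra cost is one read() that returns 0. Whether that is acceptable compared to dropping the fast path is exactly the "more hacks" question.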

jendap avatar Dec 10 '25 11:12 jendap