mio
Large memory-mapped files, POSIX, mmap64(). I/O stream interface ?
Does mio support large memory-mapped files? See https://www.gnu.org/software/libc/manual/html_node/Memory_002dmapped-I_002fO.html
I just browsed the source code and found that it uses the mmap() call instead of mmap64() for mapping on POSIX systems.
The other issue is accessing data through a C++ I/O stream interface. It looks like mio does not provide this kind of access. Would it be possible to do this by developing some sort of std::streambuf-like interface over the memory-mapped data? Could you provide some hints on how to implement this?
On the usage of mmap64: maybe mmap64 could be selected by default on 64-bit platforms at compile time (sizeof(void*) == 8), with mmap used otherwise.
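A rough sketch of what such a compile-time choice could look like. Note that it keys on the C library's feature macros rather than on sizeof(void*), which is a substitution on my part, and that map_readonly and map_roundtrip_ok are hypothetical helpers for illustration, not part of mio.

```cpp
#include <sys/mman.h>
#include <sys/types.h>
#include <stdio.h>    // tmpfile, fileno (POSIX)
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of mio): map `length` bytes of `fd` read-only,
// starting at a page-aligned, non-negative 64-bit `offset`.
inline void* map_readonly(int fd, std::int64_t offset, std::size_t length) {
    assert(offset >= 0);
#if defined(__GLIBC__) && defined(_LARGEFILE64_SOURCE)
    // glibc declares mmap64, whose offset parameter is always off64_t.
    return ::mmap64(nullptr, length, PROT_READ, MAP_SHARED, fd,
                    static_cast<off64_t>(offset));
#else
    // Elsewhere, require off_t itself to be 64 bits wide (the case on 64-bit
    // targets, or with -D_FILE_OFFSET_BITS=64 on 32-bit glibc).
    static_assert(sizeof(off_t) >= 8, "off_t is too narrow for large files");
    return ::mmap(nullptr, length, PROT_READ, MAP_SHARED, fd,
                  static_cast<off_t>(offset));
#endif
}

// Small self-check: write a few bytes to a temporary file and map them back.
inline bool map_roundtrip_ok() {
    FILE* f = tmpfile();
    if (!f) return false;
    const char msg[] = "mio";
    fwrite(msg, 1, sizeof msg, f);
    fflush(f);  // make sure the bytes reach the file before mapping
    void* p = map_readonly(fileno(f), 0, sizeof msg);
    bool ok = (p != MAP_FAILED) && std::memcmp(p, msg, sizeof msg) == 0;
    if (p != MAP_FAILED) ::munmap(p, sizeof msg);
    fclose(f);
    return ok;
}
```

On a 64-bit Linux build both branches should behave the same, since off_t is already 64 bits wide there.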
mmap has a 2GB limit
Yep. The stream interface could be implemented by using trivial stream buffers for read and write. I will try to write a simple test program for both mmap64 and stream interface.
struct ism_buf : std::streambuf {
    ism_buf(char* base, std::ptrdiff_t n) { this->setg(base, base, base + n); }
};

ism_buf input(mio_map.data(), mio_map.size());
std::istream is(&input);
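One caveat with a buffer that only calls setg: seekg and tellg will fail, because the default std::streambuf::seekoff reports an error. Below is a minimal sketch of a read-only buffer that also supports seeking. The class name mem_istreambuf is made up for this example, and it works on any contiguous char range, so it assumes nothing about mio beyond a data pointer and a size.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical read-only stream buffer over an existing memory range.
class mem_istreambuf : public std::streambuf {
public:
    mem_istreambuf(char* base, std::size_t n) { setg(base, base, base + n); }

protected:
    // Reposition the get pointer; needed for seekg() and tellg().
    pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                     std::ios_base::openmode which) override {
        if (!(which & std::ios_base::in)) return pos_type(off_type(-1));
        const off_type size = egptr() - eback();
        off_type target;
        if (dir == std::ios_base::beg)      target = off;
        else if (dir == std::ios_base::cur) target = (gptr() - eback()) + off;
        else                                target = size + off;  // ios_base::end
        if (target < 0 || target > size) return pos_type(off_type(-1));
        setg(eback(), eback() + target, egptr());
        return pos_type(target);
    }

    pos_type seekpos(pos_type pos, std::ios_base::openmode which) override {
        return seekoff(off_type(pos), std::ios_base::beg, which);
    }
};

// Example use: read the whitespace-delimited word that starts at `offset`.
inline std::string word_at(char* data, std::size_t size, std::size_t offset) {
    mem_istreambuf buf(data, size);
    std::istream is(&buf);
    is.seekg(static_cast<std::streamoff>(offset));
    std::string w;
    is >> w;
    return w;
}
```

With a mapping, the call would look something like word_at(mio_map.data(), mio_map.size(), 0), provided data() returns a non-const char pointer.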
It looks like mio already supports large files. Attached you will find a test program that I made. I created an 8GB file and then mapped it. I also made a simple stream interface for writing to and reading from the mapped file.
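For readers who cannot open the attachment, here is a sketch of what the writing side of such a stream interface might look like. This is not the code from the linked main.cpp; mem_ostreambuf is a hypothetical name, and the buffer simply hands a fixed memory range to std::ostream via setp.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical write-only stream buffer over a fixed-size memory range,
// such as a writable mapping. It never grows: the inherited overflow()
// reports failure once the range is full, which puts the stream in an
// error state.
class mem_ostreambuf : public std::streambuf {
public:
    mem_ostreambuf(char* base, std::size_t n) { setp(base, base + n); }

    // Number of bytes written so far.
    std::size_t written() const {
        return static_cast<std::size_t>(pptr() - pbase());
    }

    // Copy of the bytes written so far (handy for small test buffers).
    std::string str() const { return std::string(pbase(), pptr()); }
};
```

Usage would be along the lines of mem_ostreambuf out(map.data(), map.size()); std::ostream os(&out); where map is assumed to be a writable mapping.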
https://keybase.pub/pulmark/random/main.cpp
It looks like mio already supports large files.
After doing some light research, I'm finding that the standard mmap function on 64-bit Linux already supports large file mappings. The 64 in mmap64 only means that its last parameter, the offset, is always 64 bits wide, which is useful if you want to be explicit rather than rely on your build environment. off_t itself is not rigidly defined either, since its width depends on your _FILE_OFFSET_BITS definition.
From the GNU C Library reference:
off_t
This is a signed integer type used to represent file sizes.
In the GNU C Library, this type is no narrower than int.
If the source is compiled with _FILE_OFFSET_BITS == 64 this
type is transparently replaced by off64_t.
off64_t
This type is used similar to off_t. The difference is that
even on 32 bit machines, where the off_t type would have 32 bits,
off64_t has 64 bits and so is able to address files up to 2^63 bytes
in length. When compiling with _FILE_OFFSET_BITS == 64 this type
is available under the name off_t.
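To see which case a given build falls into, the width of off_t can be inspected directly. A small sketch follows; off_t_is_64bit and offset_fits are names invented for this example.

```cpp
#include <sys/types.h>
#include <cassert>
#include <cstdint>
#include <limits>

// True when off_t can address files beyond 2GB in this build, e.g. on 64-bit
// Linux or when compiling with -D_FILE_OFFSET_BITS=64 on 32-bit glibc.
constexpr bool off_t_is_64bit = sizeof(off_t) >= 8;

// Hypothetical helper: can `offset` be passed to plain mmap/lseek/fstat
// without being truncated?
constexpr bool offset_fits(std::uint64_t offset) {
    return offset <=
           static_cast<std::uint64_t>(std::numeric_limits<off_t>::max());
}
```

A library could turn this into a hard error with static_assert(off_t_is_64bit, "...") instead of silently truncating offsets.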
These definitions alone, though, imply that mio should use explicit overloads of mmap and mmap64 to support a large 64-bit offset parameter, rather than relying on the configuration of the build environment to handle it for the user.
https://github.com/mandreyel/mio/blob/76251b8dde16bdac44acf2547be2470fd75703e1/include/mio/detail/mmap.ipp#L190-L196
This also goes for functions like fstat and fstat64 which provide 64-bit variants of their individual structures.
https://www.mkssoftware.com/docs/man5/struct_stat.5.asp
The stat64 structure is similar to the stat structure, except that it is capable of returning information about files that are larger than 2 gigabytes. The stat64 structure is returned by the fstat64() and stat64() functions, which are a part of the large file extensions.
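As a sketch of the size query involved: when off_t is 64 bits wide, plain fstat already reports sizes above 2GB through st_size, so a wrapper can return a fixed-width integer either way. file_size here is a hypothetical helper, not mio's actual function.

```cpp
#include <sys/stat.h>
#include <sys/types.h>
#include <cassert>
#include <cstdint>

// Hypothetical helper: size in bytes of the file behind `fd`, or -1 on error.
// Assumes off_t is 64 bits wide; otherwise fstat fails with EOVERFLOW for
// files whose size does not fit, and this returns -1.
inline std::int64_t file_size(int fd) {
    struct stat st;
    if (::fstat(fd, &st) != 0) return -1;
    return static_cast<std::int64_t>(st.st_size);
}
```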
Thanks for your input, guys. Unfortunately, I'm currently unable to work on mio or any other open source project, but I'm not against adding the mentioned features. So if anyone wants to take a stab at this :)