exrs
Read the input file in parallel
What can be improved or is missing?
The OpenEXR format allows the file to be read in parallel. Parallel reading is considerably faster than a single-threaded read all by itself, see https://github.com/johannesvollmer/exrs/pull/183#issuecomment-1374250749
It will also let us initialize buffers in parallel (until #186 becomes feasible) and allow performing pixel format conversions in worker threads more naturally (see #182) - AFAIK doing it with the current architecture would require a memcpy().
Implementation Approach
Memory-mapping is tempting, but is going to result in lots of very spooky UB the moment someone modifies the file we're reading. So we'll probably have to open a file descriptor per thread and seek it.
Would it be faster to not open multiple file handles, given no memory mapping is used? What would it even look like, I thought File cannot be shared across threads?
Right now, memory mapping is supported by the library implicitly by allowing the user to pass any Read trait. Would memory mapping also benefit from parallel reading?
Would it be faster to not open multiple file handles,
I don't think so. If you have a single file descriptor for everything, that means you cannot issue parallel reads. If nothing else, there is only one offset value for a given file descriptor, so you cannot issue reads from several places at once. Beyond that, a file descriptor is not safe to access in parallel, you'd need a mutex which defeats the point - see https://stackoverflow.com/a/823525
We'll just need to open several file descriptors, and Files, pointing at the same underlying file in the filesystem, and then independently seek and read from each descriptor. open() is cheap, doesn't matter if we do it once or 32 times.
I thought so too. I just thought it sounded like there was an alternative to opening multiple handles. Let's try that then :)
Right now, memory mapping is supported by the library implicitly by allowing the user to pass any Read trait. Would memory mapping also benefit from parallel reading?
Yes, but it would be horrifically unsafe. Memory-mapped data changes in memory whenever the file is changed. That means that if you create an &str from a memory-mapped &[u8], the string is checked to be valid UTF-8 on creation, but the file may change later and turn into invalid UTF-8, violating the safety invariant. In fact, it's illegal to even have a &[u8] pointing to something that may change - the compiler assumes that &[u8] is immutable and generates code based on that. If that's violated, the very fabric of the compiler's vision of reality breaks down and anything is possible.
Okay, but if you use the mapped memory only through the interface of the Read trait, then this can't happen, right? Because in that case, data is always copied and never referenced? Or does it mean that the Rust compiler simply can't handle memory mapping?
Rust just doesn't allow creating a &[u8] over a memory map, which is required for the Read trait.
You can take a raw pointer to the memory map and ptr::read_volatile into a buffer. However, mmap is only useful when you want to avoid copying the data - in all other cases you can just read from a file descriptor into a buffer, and the performance will be exactly the same.
So to actually take advantage of mmap, you need to either adapt basically all your code to use read_volatile on raw pointers instead of creating slices, or find some platform-specific way to make sure nothing else modifies your file while it's memory-mapped.
thanks for that explanation :) good to know
I'm wondering how this would happen. From my docs perusing, it seems the crate wraps a BufferedReader from which it gets its bytes; they don't implement Clone, and the same one can't be sent to multiple threads. Creating multiple bufreaders over the same underlying reader is explicitly mentioned in the docs to cause data loss.
When reading from a file, you'll need to open multiple File handles (each backed by its own file descriptor) and wrap each one in its own BufReader.
For reading from an in-memory buffer, you can just send the &[u8] references to different threads - under the hood those are just pointers.
AFAIK there's no way to implement this for a generic buffered reader. There is, for example, no reasonable way to read a stream from the network in parallel, let alone seek the network stream. So if that's a use case you'd like to support, reading needs to be single-threaded in this case.
Metadata should be read with buffered readers, since a lot of tiny read requests are made. The pixel data, which will be read in parallel, will not benefit from any buffering.
The cloning of the byte source will have to be a custom abstraction. The File type has a function try_clone, but there is no trait with that function. We'll have to add our own simple trait for cloning a byte source that we implement for File and for other readers.
Or alternatively we abstract over the ability to create a new reader from scratch, by remembering the path of the file and calling File::open repeatedly.
Those individual readers can then be buffered where it makes sense.
And what about data passed as a slice? Won't that remove parallelism from it?
Ideally what you want is a trait, I'll call it ParallelReader, implemented for File, BufferedFile, and slices (or anything that derefs to a slice), with a method that allows wholesale cloning.
For a file, it would create a new file handle (we can't use try_clone because a seek on one handle affects all other handles to the same file); for data behind a cursor, it would create a new cursor over the same referenced data and return it.
Also the trait should support independent seeking, i.e. each independent clone should allow seeking without affecting the others.
I'm not sure if this is possible - I've never really done it, I'm just describing what I think may work.
yes! Though note that File::try_clone creates a new handle that shares its seek position with the original, so it won't give us independent seeking. Manually creating new handles will be easier anyways I think