
Providing more parameters for open/fopen

hmaarrfk opened this issue 1 month ago

https://github.com/aous72/OpenJPH/blob/edc3210f7c9640a76b9d6150cfad48bb157f4549/src/core/others/ojph_file.cpp#L244

It would be nice to be able to specify flags here.

Specifically, the O_DIRECT flag is nice for large images.

Would you consider patches for this?

I think we would have to move to the lower-level functions here: https://man7.org/linux/man-pages/man2/open.2.html
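
For concreteness, a minimal sketch of the idea. The `open_with_flags` helper and its `extra_flags` parameter are hypothetical, not existing OpenJPH API; they only illustrate forwarding caller-supplied flags to the low-level open(2):

```cpp
// Hypothetical helper illustrating the request; not existing OpenJPH
// API. O_DIRECT is Linux-specific (g++ defines _GNU_SOURCE by default,
// which exposes it in <fcntl.h>).
#include <fcntl.h>  // ::open, O_RDONLY, O_DIRECT

// Today the file is opened roughly as:  FILE *f = fopen(name, "rb");
// The request: let the caller pass extra open(2)-style flags through.
int open_with_flags(const char *name, int extra_flags)
{
  // e.g. extra_flags = O_DIRECT to bypass the OS page cache
  return ::open(name, O_RDONLY | extra_flags);
}
```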

hmaarrfk avatar Dec 15 '25 04:12 hmaarrfk

Hi Mark,

This is the C/C++ version of fopen, but your question is about the low-level open function. Of course open is faster, but it is OS-specific; Windows has a similar function, but not exactly the same one. As you have seen, adding aligned_alloc was problematic because I did not want to require C++17; it took a few iterations. My main worry is compatibility.

Have you tested on your side, how much faster open is?

Cheers, Aous.

aous72 avatar Dec 15 '25 10:12 aous72

As you have seen, adding aligned_alloc was problematic because I did not want to require C++17; it took a few iterations.

Yes, thank you for iterating through those.

I noticed your CI has really grown since I started following the project! I know that takes work.

My main worry is compatibility.

I understand. I think I can likely try to do this myself by subclassing jph_infile or similar. I'll report back.
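
For reference, a rough sketch of that subclassing approach. The class name is hypothetical, and the virtual-method signatures are recalled from src/core/others/ojph_file.h, so they should be checked against the actual infile_base declaration before use:

```cpp
#include <fcntl.h>   // ::open, O_RDONLY, O_DIRECT
#include <unistd.h>  // ::read, ::lseek, ::close
#include "ojph_file.h"

// Hypothetical subclass that reads through a raw O_DIRECT descriptor.
class direct_infile : public ojph::infile_base
{
public:
  direct_infile() : fd(-1), at_eof(false) {}
  ~direct_infile() { close(); }

  bool open(const char *filename)
  {
    fd = ::open(filename, O_RDONLY | O_DIRECT);
    return fd >= 0;
  }
  size_t read(void *ptr, size_t size) override
  {
    // With O_DIRECT, ptr, size, and the file offset must all respect
    // the device's alignment granularity.
    ssize_t r = ::read(fd, ptr, size);
    if (r <= 0) { at_eof = true; return 0; }
    return (size_t)r;
  }
  int seek(ojph::si64 offset, enum infile_base::seek origin) override
  {
    at_eof = false;
    return ::lseek(fd, offset, (int)origin) < 0 ? -1 : 0;
  }
  ojph::si64 tell() override { return ::lseek(fd, 0, SEEK_CUR); }
  bool eof() override { return at_eof; }
  void close() override { if (fd >= 0) { ::close(fd); fd = -1; } }

private:
  int fd;
  bool at_eof;
};
```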

Have you tested on your side, how much faster open is?

"open" itself isn't faster, but O_DIRECT effectively tells the OS to "get out of the way". When you are in a pattern of "read once", and reading "100s GB", you really want to "skip the OS Cache" and just get it into userland. You must have the data aligned, which is why the aligned memory allocator are helpful (what we have now helps me save in an aligned way).

I've seen great results with raw data.

With compressed streams it is hard to ensure that the full pipeline is optimized (as our interactions have highlighted).

I'm pretty "focused" on linux support, which means that these low level optimizations don't always benefit cross platform things.

hmaarrfk avatar Dec 15 '25 13:12 hmaarrfk

Let me run some experiments by implementing it myself and get back to you.

hmaarrfk avatar Dec 15 '25 13:12 hmaarrfk

open is faster because it does not use the C/C++ read/write buffering mechanism. The C library creates a buffer of some size, which it then uses to buffer read and write operations; for large files, this buffer becomes a bottleneck. open and low-level I/O skip all of that. In my investigation, I looked at the hdparm tool, which measures hard-disk speed; this tool uses low-level read and write throughout. I think this is because using the normal buffered fread and fwrite would negatively impact the speed results.
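
A small sketch of the buffer being described; setvbuf lets you disable or resize it, which is one way to observe its effect without dropping down to open(2):

```cpp
#include <cstdio>

int main()
{
  FILE *f = fopen("large_image.j2c", "rb");
  if (!f) return 1;

  // Disable stdio's internal buffer (_IONBF), so each fread turns into
  // a direct read(2) call; alternatively pass _IOFBF with a larger buffer.
  setvbuf(f, nullptr, _IONBF, 0);

  char chunk[1 << 16];
  size_t total = 0, n;
  while ((n = fread(chunk, 1, sizeof chunk, f)) > 0)
    total += n;
  printf("read %zu bytes\n", total);

  fclose(f);
  return 0;
}
```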

aous72 avatar Dec 15 '25 22:12 aous72

the C/C++ read/write buffering mechanism.

Thank you for the clarification.

I am not a big C++ user, so I was ignoring the possibility of C++ having its own buffering mechanism and was only considering OS-level buffering.

I guess for "large data", I've found it to be a constant struggle with:

  1. OS-level cache <-- hard to fully turn off, especially in a cross-platform way (see the sketch after this list).
  2. Language-level file caching (C++) <-- I was ignoring this.
  3. Library-level file caching (OpenJPH) <-- we can engineer this.
  4. Application-level file caching <-- I can engineer this.
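
A sketch of one partial workaround for layer 1 on Linux: posix_fadvise with POSIX_FADV_DONTNEED asks the kernel to drop a file's cached pages after a one-shot read, without requiring O_DIRECT's alignment rules (the hint is advisory and its effect is OS-dependent):

```cpp
#include <fcntl.h>
#include <unistd.h>

int main()
{
  int fd = open("large_image.raw", O_RDONLY);
  if (fd < 0) return 1;

  char buf[1 << 16];
  while (read(fd, buf, sizeof buf) > 0)
    ;  // consume the file once

  // Advisory hint: we will not need these pages again; on Linux this
  // drops the file's clean pages from the page cache.
  posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

  close(fd);
  return 0;
}
```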

Re: hdparm, I should go back to that tool. It does have an O_DIRECT option too ;) https://github.com/Distrotech/hdparm/blob/master/hdparm.c#L1490

hmaarrfk avatar Dec 15 '25 23:12 hmaarrfk