Hdf5 swmr mode

Open Thalos12 opened this issue 4 years ago • 10 comments

The SWMR (single-writer/multiple-reader) feature of HDF5 has been implemented in the HDFBackend and is activated by setting the swmr argument to True. The process reading the file should not only open it in SWMR mode but also set the environment variable HDF5_USE_FILE_LOCKING="FALSE".
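For the reading side, a minimal sketch of what this could look like with plain h5py is shown here. It is an illustration under assumptions, not the code from this pull request: the helper name `read_dataset_shape` and its arguments are hypothetical, and the dataset path depends on how the backend lays out the file.

```python
import os

# Set the variable before the HDF5 library opens the file. Doing it before
# h5py is imported is the conservative choice.
os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"


def read_dataset_shape(path, name):
    """Return the current shape of dataset `name` in `path`, opened in SWMR mode.

    `read_dataset_shape` is a hypothetical helper used only for illustration.
    """
    import h5py  # imported late so the environment variable above is already set

    # mode "r" with swmr=True opens the file as an SWMR reader.
    with h5py.File(path, "r", swmr=True) as f:
        dset = f[name]
        # Pick up whatever the writer has flushed since the file was opened.
        dset.refresh()
        return dset.shape
```

A reader would call something like `read_dataset_shape("chain.h5", "mcmc/chain")`, where both the file name and the dataset path are assumptions about the layout of the output file.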

Let me know if I should change anything I have added.

Thalos12 avatar Jul 12 '21 19:07 Thalos12

I can see that the documentation build failed, but in the details I can only see a warning about swmr.ipynb not being in any toctree. I am not familiar with Sphinx, so I do not know how to fix this error...

Thalos12 avatar Jul 12 '21 19:07 Thalos12

Never mind, it was easy enough that I finally figured it out.

Thalos12 avatar Jul 12 '21 19:07 Thalos12

@Thalos12: Thanks for this!! I'll need a couple of days to get to this, but as a zeroth order comment: how hard do you think it would be to add a unit test for this feature? Also it would be good to put the version requirements for h5py somewhere - what were they again?

dfm avatar Jul 15 '21 12:07 dfm

@dfm: Regarding the unit test, I have to say that I am not very experienced in writing them, but I will try my best using those already present as examples. As for the version requirements of h5py, there are none explicitly: the requirement is on the underlying HDF5 library (>=1.10), and h5py already checks it internally, so I chose not to add duplicate code. I can mention these requirements in the tutorial notebook, if you think it would be a good idea! I can also add them to the HDFBackend docstring.

Thalos12 avatar Jul 15 '21 12:07 Thalos12

Awesome! Yes - please add some comments about versions to the docstring and tutorial notebook.

For the unit test: no stress! I think it'll be tricky to write exactly the one we want (we'll probably need to launch multiple processes), so I'm happy to do that as I'm reviewing. Do you want to write one that just checks that the backend at least works with SWMR mode enabled?

dfm avatar Jul 15 '21 12:07 dfm

Thank you, I will add the requirements in both places then! Also, I should be able to write the test for the backend. I'll add a commit as soon as I have implemented it.

Thalos12 avatar Jul 15 '21 13:07 Thalos12

Without enduring hardship, no treasure is attained.

kerzeleng avatar Jun 11 '22 21:06 kerzeleng

@dfm and @Thalos12 is there anything that can be done to help this? This feature looks super useful so I'd be glad to pitch in.

znicholls avatar Mar 15 '23 23:03 znicholls

Hi @znicholls, while writing tests I discovered that the change I proposed did not work as I wanted. Going by memory, the problem is that the output HDF5 file is opened (for writing) and closed at every step of the chain (see emcee/backends/hdf.py). Consequently, crashes can still occur if a reader opens the file just before the writer does, regardless of SWMR. I could not find a way to fix this at the time, and I forgot to close the issue (@dfm: apologies, I should have done that).

To make SWMR work, it would be necessary to modify HDFBackend to keep the file open, if at all possible, and also to implement some mechanism to safely close it when needed (chain ends, exception is raised, ...). @dfm: please correct me if I'm wrong.
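As a rough illustration of the "keep it open, close it safely" idea, here is a minimal sketch using a context manager. It is not how HDFBackend works today: `PersistentFile` and `run_steps` are hypothetical names, and a plain text file stands in for the HDF5 file so that the example needs only the standard library. With h5py, the opener would presumably be something like `lambda: h5py.File(path, "a", libver="latest")`, with `swmr_mode = True` set on the file once the datasets exist.

```python
class PersistentFile:
    """Keep one file handle open across many writes and close it exactly once.

    Hypothetical helper, for illustration only.
    """

    def __init__(self, opener):
        self._opener = opener  # zero-argument callable returning a file-like object
        self._handle = None

    def __enter__(self):
        self._handle = self._opener()
        return self._handle

    def __exit__(self, exc_type, exc, tb):
        # Runs both when the chain ends normally and when an exception propagates.
        handle, self._handle = self._handle, None
        if handle is not None:
            handle.close()
        return False  # never swallow the exception


def run_steps(path, n_steps, fail_at=None):
    """Write one line per step through a single handle that stays open throughout."""
    with PersistentFile(lambda: open(path, "a")) as f:
        for step in range(n_steps):
            if step == fail_at:
                raise RuntimeError("simulated failure in the middle of the chain")
            f.write(f"{step}\n")
            f.flush()  # make the step visible to readers without closing the file
```

The point of the design is that the file is opened once per run rather than once per step, and that `__exit__` closes it whether the loop finishes or raises.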

Thalos12 avatar Mar 16 '23 09:03 Thalos12

Huh, that's a shame. When I read your notebook, it seemed like the reader would sometimes crash but the writer never would, which still seemed like a win to me and something worth including. Maybe I misunderstood.

znicholls avatar Mar 16 '23 10:03 znicholls

I believe that I did find that my solution sometimes failed, but I can't be sure it wasn't just my fault. I might try to dig up the code I used and see if I can reproduce the errors I remember I was having.

Thalos12 avatar Mar 16 '23 13:03 Thalos12