ceph-rust

async/await compatible wrapper for librados AIO methods

Open jcsp opened this issue 4 years ago • 12 comments

This PR adds an async/await compatible interface for librados's aio methods.

A new Completion type wraps Ceph's completions into a Future, including the cancel-on-drop behaviour expected of rust futures.
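A minimal, std-only sketch of the cancel-on-drop idea follows. It is not the code from this PR: `MockOp`, its fields, and `poll_once` are hypothetical stand-ins so the example runs without librados. In a real wrapper the `Drop` impl would presumably call into librados (for example `rados_aio_cancel`) and wait for the callback before releasing anything.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// State shared between the future and the (simulated) completion callback.
#[derive(Default)]
struct OpState {
    result: Option<i32>,  // return code, set once the IO has completed
    waker: Option<Waker>, // task to wake when the IO completes
}

/// Hypothetical stand-in for one in-flight AIO operation.
#[derive(Default)]
struct MockOp {
    state: Mutex<OpState>,
    cancelled: AtomicBool,
}

impl MockOp {
    /// What the C completion callback would do: store the result, wake the task.
    fn complete(&self, rc: i32) {
        let mut s = self.state.lock().unwrap();
        s.result = Some(rc);
        if let Some(w) = s.waker.take() {
            w.wake();
        }
    }
}

/// Future that resolves to the operation's return code.
struct Completion {
    op: Arc<MockOp>,
}

impl Future for Completion {
    type Output = i32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i32> {
        let mut s = self.op.state.lock().unwrap();
        match s.result {
            Some(rc) => Poll::Ready(rc),
            None => {
                // Not done yet: remember who to wake on completion.
                s.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

impl Drop for Completion {
    fn drop(&mut self) {
        // Dropped before completion: request cancellation so the backend
        // stops touching any buffers that were lent to the operation.
        if self.op.state.lock().unwrap().result.is_none() {
            self.op.cancelled.store(true, Ordering::SeqCst);
        }
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future exactly once (helper for the sketch, not a real executor).
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Dropping a `Completion` that has already resolved leaves the `cancelled` flag untouched; dropping one that is still pending sets it.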

For reading and writing large objects, there are streaming variants that return Stream/Sink compatible objects, with configurable chunk sizes and numbers of IOs in flight.
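To illustrate how a chunk size and an in-flight limit interact, here is a rough synchronous sketch. It is an assumption-laden model, not the PR's Stream/Sink implementation: `ChunkedReader` and all of its fields are hypothetical, the "remote object" is just a byte slice, and a plain `Iterator` stands in for a `Stream`.

```rust
use std::collections::VecDeque;

/// Hypothetical model of a chunked reader with a bounded window of
/// outstanding reads. Chunks are yielded in offset order.
struct ChunkedReader<'a> {
    data: &'a [u8],                      // stand-in for the remote object
    chunk_size: usize,                   // bytes requested per read
    max_in_flight: usize,                // upper bound on outstanding reads
    next_offset: usize,                  // offset of the next read to issue
    in_flight: VecDeque<(usize, usize)>, // (offset, len) issued, not yet yielded
    peak_in_flight: usize,               // high-water mark, for inspection
}

impl<'a> ChunkedReader<'a> {
    fn new(data: &'a [u8], chunk_size: usize, max_in_flight: usize) -> Self {
        assert!(chunk_size > 0 && max_in_flight > 0);
        ChunkedReader {
            data,
            chunk_size,
            max_in_flight,
            next_offset: 0,
            in_flight: VecDeque::new(),
            peak_in_flight: 0,
        }
    }

    /// Issue reads until the window is full or the object is exhausted.
    fn fill_window(&mut self) {
        while self.in_flight.len() < self.max_in_flight
            && self.next_offset < self.data.len()
        {
            let len = self.chunk_size.min(self.data.len() - self.next_offset);
            self.in_flight.push_back((self.next_offset, len));
            self.next_offset += len;
        }
        self.peak_in_flight = self.peak_in_flight.max(self.in_flight.len());
    }
}

impl<'a> Iterator for ChunkedReader<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        self.fill_window();
        // Yield the oldest outstanding read so chunks arrive in order.
        let (off, len) = self.in_flight.pop_front()?;
        let data = self.data;
        Some(&data[off..off + len])
    }
}
```

In this model a larger window keeps more reads queued ahead of the consumer, while the chunk size sets how much memory each queued read holds.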

jcsp avatar Jun 15 '21 16:06 jcsp

Thanks for tackling this. I started on something like this a while back but didn't know where to go with it. I'm kinda leaning towards your first approach with passing around an Arc all over the place.

cholcombe973 avatar Jun 15 '21 18:06 cholcombe973

I've modified this to stop using Arcs for the Completion->IoCtx relationship. Originally I wanted Arcs to make it easier to detach Completions and move them around outside the original &IoCtx lifetime, but because we're returning async{}-generated futures in the public interface anyway, the public facing futures already had a lifetime bound from their self argument. Using plain references makes everything quite a bit cleaner.
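The lifetime point can be sketched with a toy type. This `IoCtx` is a hypothetical stand-in with an illustrative `pool` field and an `object_path` method that does no IO; it only shows that a future produced by an `async fn` taking `&self` already borrows the context, so an `Arc` is not needed to keep the context alive for the future's duration.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the crate's IoCtx; the field is illustrative only.
struct IoCtx {
    pool: String,
}

impl IoCtx {
    /// The returned future captures `&self`, so it cannot outlive the IoCtx.
    async fn object_path(&self, oid: &str) -> String {
        format!("{}/{}", self.pool, oid)
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future once and return its output if it is already ready.
fn poll_ready<F: Future>(fut: F) -> Option<F::Output> {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => Some(v),
        Poll::Pending => None,
    }
}

fn demo() -> Option<String> {
    let ctx = IoCtx { pool: "rbd".to_string() };
    let fut = ctx.object_path("obj1");
    // drop(ctx); // would not compile: `fut` still borrows `ctx`
    poll_ready(fut)
}
```

Uncommenting the `drop(ctx)` line makes the borrow checker reject the program, which is the compile-time guarantee that plain references provide here.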

The async-no-cancel branch was a dead end: we definitely need the cancellation to avoid use-after-free on buffers that have been passed to async rados calls.

jcsp avatar Jun 18 '21 11:06 jcsp

I've just pushed an updated version of this. I haven't worked on it for a few months, but I had some extra code lying around so took the time to clean it up a bit today. There are now streams for reads & writes of large objects, streams for object listing, and xattr methods.

I originally wrote this alongside a spare time project (https://gitlab.com/putget-io/capra/-/blob/main/src/server/backend/rados.rs) and I have a lot less spare time right now, so I'm not sure if it makes sense to merge this without a more real user, but hope this code is useful to the next Rustacean that wants to talk to rados from an async application!

jcsp avatar Aug 30 '21 14:08 jcsp

Thanks for this @jcsp

badone avatar Sep 01 '21 00:09 badone

I've been testing the July 29 version of this under fairly heavy load for a few weeks, and have been hitting the occasional segfault, and a few instances where a call to rados_async_object_read() returned successfully but with the buffer still in its initial zeroed state. Can't rule out Luminous as the cause, as I only have experience with it in blocking mode in the past. Once I find the time to update to a newer Ceph I'll see if the issue remains and report back.

dae avatar Sep 06 '21 12:09 dae

@dae thanks for the feedback - there were definitely some bugs in that earlier code (including segfaults) that I fixed in the intervening period while testing it under load with the read/write streams, so hopefully you should see better stability with the latest.

jcsp avatar Sep 06 '21 13:09 jcsp

Ah, thanks, that's good to hear - will give that a go first.

dae avatar Sep 06 '21 23:09 dae

Same issues after updating, unfortunately, so I'll still need to try updating ceph - getting segfaults with messages like "double free or corruption (!prev)" and "mismatching next->prev_size (unsorted)", and still the occasional invalid data returned from a read.

dae avatar Sep 08 '21 07:09 dae

Initial results from updating to a newer Ceph look promising - will let it bake for a week, but it looks like it may well have been a bug in the older librados.

Later edit: zero problems since the Ceph upgrade.

dae avatar Sep 11 '21 10:09 dae

Is there anything missing for this to merge? Thanks!

leseb avatar Feb 08 '22 15:02 leseb

I'm not super clear on what's blocking this at this point. Any insight?

Ten0 avatar Jul 29 '23 15:07 Ten0

I've been using this successfully in production since my previous post, FWIW.

dae avatar Jul 29 '23 22:07 dae