
Async Client

Open jonas32 opened this issue 2 years ago • 56 comments

This first commit includes most of the basic changes to the client to get it running on async-std and tokio. There are surely a few more spots that still need work or block threads. Feel free to have a look. The client is not really working yet; I haven't tested it, since the unit tests are still broken.

Tests are not working yet. There are still a few logical things that need to be changed for async, but that would be a bigger change in itself (thread pools etc.).

  • [x] Async Networking stack
  • [x] Async tend thread
  • [x] Async Client API
  • [x] Runtime on tokio
  • [x] Runtime on async-std
  • [x] Runtime on actix-rt (not required for the client scope since we don't test against actix)
  • [x] Sync Client API
  • [x] Individual tests for sync and async APIs
  • [x] Threadpool rework (optional, it works like it is, but a threadpool on async seems a little off goal)

jonas32 avatar Aug 10 '21 17:08 jonas32

~~In general, the async implementation seems to work. There is still one bug somewhere that causes the CI to be stuck (see https://github.com/aerospike/aerospike-client-rust/runs/3326575266). Please cancel this CI run, since it does not look like it will finish by itself and I'm not able to stop it.~~

~~I'm not sure yet where this comes from, since the CI does not give me much info about it and I'm not able to reproduce it locally. Maybe it's related to the CI setup. Please pull this version and try to run the async-std tests locally a few times, so I can see whether it's related to the client or to the CI env.~~

The async-std CI bug is fixed. It looks like a limitation on concurrent connections in the CI env. Since the lazy-static solution for the client instance no longer works due to async, every test needs to open its own instance of the Client. The problem was in the expression tests, where a new client was opened for every single expression.

The benchmark tool is still broken. Sadly, the bencher lib does not support the async setup. Testing against the sync API would be useless, since the sync setup is just a block_on wrapper around the async one; that only adds a lot of overhead and slows it down.
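
If the benchmark gets rewritten later, criterion's async support might be an option. A rough sketch of what that could look like (not code from this branch; the aerospike calls, macros and signatures below are assumptions about the async API):

```rust
// Sketch of an async benchmark using criterion's `async_tokio` feature instead of bencher.
// The aerospike API calls here are assumptions, not verified against this branch.
use aerospike::{Bins, Client, ClientPolicy, ReadPolicy};
use criterion::{criterion_group, criterion_main, Criterion};

fn bench_get(c: &mut Criterion) {
    let rt = tokio::runtime::Runtime::new().unwrap();
    // Open a single client up front so the loop below measures only the operation.
    let client = rt
        .block_on(Client::new(&ClientPolicy::default(), &"127.0.0.1:3000".to_string()))
        .expect("client");
    let key = aerospike::as_key!("test", "demo", "bench-key");
    let policy = ReadPolicy::default();

    c.bench_function("get", |b| {
        // `to_async` drives each iteration's future on the tokio runtime above.
        b.to_async(&rt)
            .iter(|| client.get(&policy, &key, Bins::All));
    });
}

criterion_group!(benches, bench_get);
criterion_main!(benches);
```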

Please also check the code and tell me if you find anything that I might have forgotten. I think the optimization is good, but surely there is more potential in it. (Ignore batch reads for now; I'm still working on that. The current implementation is still a more or less hacky workaround to get it working.)

The client API might still be inconsistent between a few functions. That's mainly related to moving values between threads: it removes some comfort features from the affected functions (operate, query and batch). I would like to keep it that way so this comfort can stay for the other functions, but that's up to you.

Right now I hardcoded an older Aerospike version (5.5.0.2) in the CI file to make it run without the required partitioning update. That needs to be removed later.

jonas32 avatar Aug 14 '21 04:08 jonas32

Ready to review from my side.

I'm a little unsure about the other open PRs. This one moved the whole client to the aerospike-core folder. In theory, git tracked the files as moved and not recreated, so I guess it should also be able to correlate the changes. If not, #100 and #107 need to be adapted to this; please don't merge them first, as I think a conflict in this PR would be much more work... #105 would be broken because the Cargo file was not moved and the updated dependencies were partly removed.

@khaf you said you are working on a rework of the partitions. If you already have anything running there, would you please share it as a draft on its own branch, for example, so I can check whether it fits in without problems, or even integrate it already?

I think unit tests and a new benchmark should be implemented, but I will do that later in another PR. I guess there is already enough to review here with 117 changed files.

jonas32 avatar Aug 15 '21 08:08 jonas32

attempting to try this stuff out with

aerospike = { git = "https://github.com/asynos/aerospike-client-rust/", branch = "async", features = ["rt-async-std"] }

but am getting

the package `...` depends on `aerospike`, with features: `aerospike-macro` but `aerospike` does not have these features.

databasedav avatar Nov 01 '21 07:11 databasedav

Can you check if there is something cached in your system? This looks like it tries to load the old, sync client.

jonas32 avatar Nov 02 '21 08:11 jonas32

@jonas32 clearing the cargo cache didn't do it

databasedav avatar Nov 02 '21 09:11 databasedav

it works when i do

aerospike-core = { git = "https://github.com/asynos/aerospike-client-rust/", branch = "async", features = ["rt-async-std"] }

(adding the -core) and then in the code i do

use aerospike_core::{self as aerospike};

databasedav avatar Nov 04 '21 07:11 databasedav

Fixed. It looks like cargo does not like it when I load member crates as a dev dependency. It's now always loading aerospike-macro. For me it's compiling correctly now.

jonas32 avatar Nov 04 '21 13:11 jonas32

@khaf do you have any time plan on when you are going to review this?

jonas32 avatar Nov 17 '21 12:11 jonas32

I've been mostly sleeping on it, since we are experimenting more with async implementations in our ecosystem and learning new things as we go. The latest thing we have learned is that differences in timeout quantization between transactions can lead to significant performance penalties. It appears that sharing the same state machine (event loop) between transactions with wildly different timeouts should be avoided. To be able to do that, the API should let the user somehow target different event loops. To my understanding, the current API in this PR lacks such support, and it's not immediately clear to me whether the underlying design and libraries include such a feature. Do you have some insights to share in this regard?

khaf avatar Nov 17 '21 13:11 khaf

Can you give me some more info on the case where you discovered that? I don't really understand what the point of having multiple runtimes is. Normally, that should only be a problem if you start blocking somewhere. You can see in multiple spots in this PR that I even removed the timeout parameters for operations, since timeouts are not really needed with an async runtime. Timeouts make sense when you need to cancel something in case it hangs, so you "free the space" for something new. Something that could block that space would be blocking, and therefore not async.

jonas32 avatar Nov 17 '21 16:11 jonas32

Sorry, I got distracted again. The issue is that when you use a single event loop (mapped to an epoll instance in the case of Linux) and mix commands with two significantly different timeout values (e.g. one with t and the other t*5), the event queue fills up with the entries that have the longer timeout and reduces capacity for the transactions with the shorter one, increasing the 99th-percentile latency and decreasing throughput. From our limited benchmarks on EC2, the penalty can be up to 2x. As for removing timeouts: you will of course still need them to remove entries from the event queue and to precisely control when and how the flow goes back to the user, otherwise you'll leak entries, including file descriptors for connections and whatnot. Every entry (async request) takes memory and is looped over until it is somehow resolved behind the scenes.

khaf avatar Nov 23 '21 16:11 khaf

First, to go back to your original question about the loops: in general this is not a problem of the client. The client itself never starts any runtime; it just creates futures. Where and how they are executed is decided by the code that calls the client lib, so this is something the user has to take care of.
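
Roughly like this (a minimal sketch only; the exact Client::new/put signatures and macros are assumptions about the async API on this branch):

```rust
// Sketch: the application owns the runtime, the client only hands back futures.
// The aerospike types and signatures below are assumptions, not verified code.
use aerospike::{as_bin, as_key, Client, ClientPolicy, WritePolicy};

#[tokio::main] // or an async-std main with the rt-async-std feature
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new(&ClientPolicy::default(), &"127.0.0.1:3000".to_string()).await?;
    let key = as_key!("test", "demo", "user-1");
    // The future returned by put() is polled by whatever runtime the caller chose.
    client.put(&WritePolicy::default(), &key, &[as_bin!("name", "jonas")]).await?;
    Ok(())
}
```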

I started to investigate whether the client has similar problems, and it actually does (only under high load). ~~To be exact, strace says that a Mutex gets locked and never unlocked. I'm not 100% sure where and why, but it's probably the Mutex inside the connection pool's SharedQueue. I have searched the whole codebase multiple times now to see if anything blocks (see the last commit). This is extremely strange behavior. It's not like the queue just slows down; it's a full deadlock on all threads, even freezing the tend thread. There should really not be any reason why this is permanently locked. Have you seen such behavior anywhere, or any idea why this might happen?~~

The problem is exactly what you just described: filling up all the available threads of the runtime with long-running ops (TCP connection startup is blocking), so no futures get executed anymore. I'll have a look at how to solve that.

jonas32 avatar Nov 23 '21 18:11 jonas32

The obvious workaround is to give the user the ability to fine tune the client by being able to assign different commands to different event loops. The first step would be to find out if the underlying libraries allow such tweaking. If they do, we can then discuss different mechanisms for the API for convenience and flexibility. Otherwise, we will have to take our case to the library developers.

khaf avatar Nov 24 '21 06:11 khaf

In general you are right, but I would like to avoid doing that, since it would not really fit the Rust ecosystem.

A more exact description of the current problem: the TcpStream connection process appears to be blocking. After connecting it is async, non-blocking and fine, but while connecting it blocks the whole thread. Tokio starts with one event-loop thread per core of the host system; in my case that's 6. As soon as the client opens the 6th connection, all threads are busy setting up a TCP stream and are fully blocked. Starting more runtimes would not really solve that, since I would have to start 256 runtimes per node, and that overhead is not acceptable in my opinion. On the other hand, this problem only exists if you run a single simple runtime on the user side. If you run this on top of actix, for example, you will probably never hit it. That's because the client library does not decide at all where its futures are scheduled; that is the decision of the runtime the user starts. You can control where and how the work is scheduled from the user side, depending on the context in which the operation is executed. For example, actix has worker threads with independent runtimes: even if the Client was initially created in the main runtime, the client interaction happens on the runtime of each worker. Since our problem here is the start of new connections under load, this would also always happen under the worker runtime.

Starting any runtimes in the client itself would break a lot of compatibility (for example with actix) and add a lot of complexity to the client. I would prefer finding a way to open the connections without blocking the thread. As a workaround you can set the client policy to max_conns = cpu_num - 1, and even with that "downgrade" the client is still extremely fast; I wrote more than 10k records in under 1s.
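
To illustrate the user-side control I mean, here is a rough sketch with plain tokio APIs (nothing from this PR): the application can already pin certain work to a dedicated runtime, which is essentially what per-worker runtimes like actix's do.

```rust
// Sketch only (plain tokio): steering some work onto a dedicated runtime so that
// long-running operations don't share an event loop with latency-sensitive ones.
// The commented-out client calls are placeholders.
use tokio::runtime::Builder;

fn main() {
    // Dedicated runtime for long-running / long-timeout operations.
    let slow_rt = Builder::new_multi_thread()
        .worker_threads(2)
        .enable_all()
        .build()
        .unwrap();
    let slow = slow_rt.handle().clone();

    // Main runtime for latency-sensitive traffic.
    let main_rt = Builder::new_multi_thread().enable_all().build().unwrap();
    main_rt.block_on(async move {
        // Fast path stays on the current runtime, e.g. client.get(...).await.
        // Slow path is pushed onto the other runtime:
        let handle = slow.spawn(async {
            // e.g. client.scan(...).await
            "done"
        });
        let _ = handle.await;
    });
}
```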

jonas32 avatar Nov 24 '21 12:11 jonas32

Found the problem. Now the issue is that the client's retries are too fast: it tries to get a connection from the pool 4 times, and after that the operation fails. Since the pool is non-locking now, the 256 connections are started almost instantly, and a few operations will exhaust their 4 retries and fail. I could only reproduce it with more than 8k concurrent writes, and it depends on how fast the server is. The simplest solution would be to stop retrying on an iteration basis and just use timeouts (see the sketch below). What do you think about that?
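
A sketch of what I mean (plain tokio time APIs; `pool_get_connection` and the types are hypothetical placeholders, not the actual pool code):

```rust
// Sketch: retry until a deadline instead of a fixed number of iterations.
use std::time::Duration;
use tokio::time::{sleep, timeout};

struct Connection; // placeholder type

// Hypothetical placeholder for the pool lookup; returns None when the pool is empty.
async fn pool_get_connection() -> Option<Connection> {
    None
}

async fn connection_with_deadline(deadline: Duration) -> Result<Connection, &'static str> {
    timeout(deadline, async {
        loop {
            if let Some(conn) = pool_get_connection().await {
                return conn;
            }
            // Back off briefly instead of hammering the pool.
            sleep(Duration::from_millis(5)).await;
        }
    })
    .await
    .map_err(|_| "no connection available before the deadline")
}
```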

jonas32 avatar Nov 25 '21 15:11 jonas32

any update on this?

databasedav avatar Feb 18 '22 06:02 databasedav

From my side, it depends on whether the client should handle runtimes itself or not. In many cases there is probably already a runtime handler in place, which would make that unnecessary overhead. I have a modified version that is more optimized for single-runtime and external-runtime-handler setups; I just don't want to push it yet, in case the client should handle that itself. The version here would probably fail often under load, since it does not yet have the modifications from my last message.

Generally, everything seems to work fine from a performance and stability perspective. I already use this modified version in staging and had one test service deployed to production with that client version. It just probably needs a lot more testing and review time, since this version changed minor details in most client internals.

jonas32 avatar Feb 19 '22 01:02 jonas32

@jonas32 all sounds good to me, i've been using it for a few months now too, all my unit tests pass and everything works fine but my deployment isn't under any load or anything

what sort of tests still need to be written? i can help out with that

databasedav avatar Feb 19 '22 07:02 databasedav

I'm not talking about written tests... that part should be OK. I mean testing it in different situations, under load etc., something that is not a totally scripted lab env. Problems like a runtime deadlock under load will not show up in scripted tests; that one, for example, only showed up when I first tried to benchmark it.

jonas32 avatar Feb 19 '22 16:02 jonas32

From my part, I think I have exhausted all the feedback I could give. I don't want to let the perfect be the enemy of the good. If we have a better design that addresses those issues, we can begin looking at it (maybe in a different PR), but if everyone thinks that the current concept is good enough, we can begin reviewing it and plan to merge.

khaf avatar Feb 24 '22 17:02 khaf

cargo check produces an error

error: Please select a runtime from ['rt-tokio', 'rt-async-std']
 --> aerospike-rt/src/lib.rs:2:1
  |
2 | compile_error!("Please select a runtime from ['rt-tokio', 'rt-async-std']");
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

petar-dambovaliev avatar Mar 02 '22 16:03 petar-dambovaliev

Thanks for the review @petar-dambovaliev. I will take a look at your suggestions.

> cargo check produces an error

I know about that and this is more or less wanted behavior. The imports change based on the selected runtime and there is no default selection. If you have any idea on how to make this better, I'm happy to hear your suggestions.
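
For reference, the gate is essentially the usual feature-selection pattern; a sketch (the re-exported items below are illustrative, not the exact contents of aerospike-rt):

```rust
// Illustrative sketch of the runtime feature gate; the actual aerospike-rt
// re-exports may differ.
#[cfg(all(not(feature = "rt-tokio"), not(feature = "rt-async-std")))]
compile_error!("Please select a runtime from ['rt-tokio', 'rt-async-std']");

#[cfg(feature = "rt-tokio")]
pub use tokio::{net::TcpStream, time::sleep};

#[cfg(feature = "rt-async-std")]
pub use async_std::{net::TcpStream, task::sleep};
```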

jonas32 avatar Mar 02 '22 19:03 jonas32

> Thanks for the review @petar-dambovaliev. I will take a look at your suggestions.
>
> cargo check produces an error
>
> I know about that and this is more or less wanted behavior. The imports change based on the selected runtime and there is no default selection. If you have any idea on how to make this better, I'm happy to hear your suggestions.

IMO, I haven't seen anybody use async-std, so coupling with tokio is not an issue in that sense. And my opinion is that sync and async should be entirely separate projects.

petar-dambovaliev avatar Mar 03 '22 10:03 petar-dambovaliev

> IMO, I haven't seen anybody use async-std, so coupling with tokio is not an issue in that sense. And my opinion is that sync and async should be entirely separate projects.

I agree. But one of the goals was to keep the runtime as modular as possible (see #102). I also only see tokio or actix-rt (basically tokio) as relevant, but supporting async-std can't hurt, since it's just a swap of imports (see aerospike-rt). And since these three runtimes are the major players, it makes sense to be compatible; most other database libraries do that too. The reference model for the async implementation was the way sqlx did it. I would be fine with adding rt-tokio to the default features, but not with removing async-std entirely.

The sync version is more or less meant to be deprecated. It's only a compatibility wrapper on top of the async implementation so it won't break the whole API, and there are still use cases where that might be relevant. That's also the reason why there are no tests for the sync client: it's just the async one with a built-in await, roughly the pattern sketched below.
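
A generic sketch of that wrapper pattern (not the actual code in this PR; the executor choice and the aerospike-core paths/signatures below are assumptions):

```rust
// Sketch of a sync facade that just blocks on the async client. `block_on` here
// comes from the `futures` crate; the real wrapper may use the selected runtime's
// executor instead, and the aerospike-core signatures are assumed, not verified.
use futures::executor::block_on;

pub struct SyncClient {
    inner: aerospike_core::Client,
}

impl SyncClient {
    pub fn get(
        &self,
        policy: &aerospike_core::ReadPolicy,
        key: &aerospike_core::Key,
        bins: aerospike_core::Bins,
    ) -> aerospike_core::errors::Result<aerospike_core::Record> {
        // The sync API is literally the async call with a built-in await.
        block_on(self.inner.get(policy, key, bins))
    }
}
```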

jonas32 avatar Mar 04 '22 04:03 jonas32

@petar-dambovaliev i use async-std :)

databasedav avatar Mar 04 '22 08:03 databasedav

Nice work @jonas32, any chance it will be released?

twobiers avatar Apr 23 '22 19:04 twobiers

@jonas32 very excited by this patch, it seems tokio is like an even better Rust version of libev, which is where the C client has gone.

It seems to have stalled a little bit, wondering what the future holds.

CMoore-Darwinium avatar Sep 28 '22 12:09 CMoore-Darwinium

OK I believe I have neglected this PR long enough. Time to begin merging it, I'll apply my own changes on top. I think it will still take a while to iron everything out, but it's time to get it moving. The following are the necessary steps:

  1. error_chain needs to go, preferably replaced with thiserror. This does not mean we don't need a kind of chained error type (any command that goes to multiple nodes will require chained errors), but the crate is deprecated and clippy will forever complain. A rough sketch of the direction follows this list.
  2. We need to replace most &str fields and arguments with String. (Or maybe we should state this goal as an all-encompassing lifetime simplification.)
  3. Batch needs to be updated to the latest iteration, since the BatchRead itself is updated and will be a breaking change if released without the update. Good news: 90% done. Not so good news: 90% remaining.
  4. I need to document this whole change-set.
  5. We need to go through this process together, which will be time consuming. Rust is very complex with diverse features, so I welcome everyone to review the upcoming updates. If everyone agrees, I'll start the discussions. The two pressing topics are "error_chain" and "deserialization".
  6. Benchmarks are needed.
  7. A load generator is needed for stability testing.
  8. Github actions for testing.
  9. A server compatibility table is needed. Some older server/client features are missing that need to go in before we can publish this crate, but most of them are half implemented on my side. I just need to get them in which may take a minute.
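
Regarding item 1, a rough sketch of the direction (the variants below are illustrative only, not a finalized design):

```rust
// Illustrative only: a thiserror-based error type with room for per-node
// sub-errors, replacing the error_chain-generated types.
use thiserror::Error;

#[derive(Debug, Error)]
pub enum ClientError {
    #[error("connection error: {0}")]
    Connection(#[from] std::io::Error),

    #[error("server error: result code {0}")]
    ServerError(u8),

    #[error("operation timed out")]
    Timeout,

    // Commands that fan out to multiple nodes can still carry one error per node.
    #[error("command failed on {} node(s)", .0.len())]
    Multiple(Vec<ClientError>),
}

pub type Result<T> = std::result::Result<T, ClientError>;
```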

I'll document items as I go through the issues. There is another associated PR that implements value serialization/deserialization, which I think should also go with this one. We need some discussion regarding implicit type conversion during deserialization, for the case where a user wants an integer but the field type is a string (maybe behind a feature flag). Thank you everyone for your patience. The Rust client is one of our most important clients and as a result has been subjected to unusual scrutiny from my side. That has in turn stopped the code base from moving forward, stalling the incremental updates. The async feature will open a new era for high-performance data access to Aerospike, so we want to get it mostly right on the first try. Please keep your ideas coming, we appreciate your feedback.

khaf avatar Sep 28 '22 14:09 khaf

Sounds great. I like the idea of removing error_chain; that one is a pain... Let me know if I can help somewhere.

jonas32 avatar Sep 28 '22 15:09 jonas32

I'm also very happy to do some work to help this happen since we definitely need this patch. @khaf if you were to push some of your in-progress work to a branch, it might be easier for others to jump in.

Regarding (de)serialization: it would be nice to see something capable of going directly from the on-wire msgpack representation to the desired structure and back, as the current system based around the Value enum spends a lot of time building structures that will be read once and thrown away.

For instance, if the Value was a struct containing the on-wire bytes (or even better, a Cow<[u8]>), you could use various from() methods to build it from other types and try_from to deserialize it into the type you want if what's inside can fit into the target.

You could also add fallible iterators to iterate the values or key-value pairs of what's inside, in place.

It would be a breaking change, but if there is interest, I could give it a shot?
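
If it helps the discussion, here is a very rough sketch of that direction (all names and the stubbed msgpack codec are placeholders, not a worked-out design):

```rust
// Very rough sketch of a Value that wraps the on-wire bytes; the msgpack codec is
// stubbed out and every name here is a placeholder.
use std::borrow::Cow;
use std::convert::TryFrom;

pub struct Value<'a> {
    // On-wire msgpack bytes, borrowed from the read buffer when possible.
    raw: Cow<'a, [u8]>,
}

impl<'a> From<i64> for Value<'a> {
    fn from(v: i64) -> Self {
        // Encode the integer as msgpack (stubbed).
        Value { raw: Cow::Owned(encode_int(v)) }
    }
}

impl<'a, 'b> TryFrom<&'b Value<'a>> for i64 {
    type Error = DecodeError;

    fn try_from(v: &'b Value<'a>) -> Result<Self, Self::Error> {
        // Decode in place; fails if the wire type does not fit into an i64.
        decode_int(&v.raw)
    }
}

#[derive(Debug)]
pub struct DecodeError;

// Placeholder codec functions standing in for a real msgpack implementation.
fn encode_int(_v: i64) -> Vec<u8> { unimplemented!() }
fn decode_int(_raw: &[u8]) -> Result<i64, DecodeError> { unimplemented!() }
```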

CMoore-Darwinium avatar Sep 28 '22 23:09 CMoore-Darwinium