
RFC: API to run DDlog instance in a separate process.

Open ryzhyk opened this issue 3 years ago • 13 comments

@mbudiu-vmw , @blp, @Kixiron , your feedback on this RFC would be appreciated.

We would like to support programs that start multiple instances of the same or different DDlog programs in separate processes on the same or remote hosts and interact with them using IPC (same host) or TCP.

Motivation

  • Run multiple instances of the same program on behalf of multiple tenants while enforcing resource isolation between tenants.
  • Run different DDlog programs for the same or multiple tenants (DDlog-generated '.so' libraries export the same set of symbols; therefore multiple libraries cannot be easily loaded in the same process).

Design

Architecture

We will add a mode to run the DDlog-compiled executable (<program>_cli) as a server on a ZeroMQ socket (IPC or TCP). The client links against a stub library that speaks to the server. The stub library is generic, i.e., it does not depend on a specific DDlog program and can be used to interact with multiple instances of potentially different programs. The library exposes the trait DDlog interface to the client, so that the same client code can work with DDlog instances running in the same process, a separate process on the same machine, or a remote server. Both dynamically typed (Record-based) and statically typed (DDValue-based) APIs will be supported, although the statically typed API will require the client to additionally link against a DDlog-generated library.

API

The new DDlog API will consist of three parts.

  1. Starting a DDlog instance. This API starts a DDlog executable on a specified host and socket, with a specified number of worker threads. Since there are many ways to do this and the best solution depends on the specific customer environment, this API is technically not part of DDlog, although we may provide some convenience methods, e.g., to start DDlog as a local process.

  2. Connecting to a DDlog instance. This API takes a DDlog server address, establishes a connection to it over the DDlog protocol (see below) and returns the set of initial facts on success. It also returns an instance of trait DDlog that the client can use to interact with the server.

  3. Interacting with a DDlog instance. We already have a trait (trait DDlog) that is meant to represent a DDlog instance. It currently has a single implementation, struct HDDlog. With some refactoring this trait can be generalized to work with both local and remote instances, namely:

    • Remove the run method. This method encapsulates the instantiation and connection APIs (see above) for the special case of an in-process DDlog instance and should not be part of trait DDlog.
    • For historical reasons, trait DDlog is missing some essential methods that are defined and implemented in HDDlog (get_table_id, get_table_name, clear_relation, etc.). These methods will be moved to trait DDlog, and their signatures will be changed to remove dependencies on the program-specific Relation and Index types.
    • Several other methods must be added to trait DDlog or a separate extended API:
      • record_commands - do we want command recording to be implemented by the stub library or the server?
      • dump_input_snapshot
      • dump_table - actually, this might be the time to eliminate this API (along with the do_store argument to run), superseded by indexes.
      • enable_cpu_profiling, profile
  • Split trait DDlog into statically and dynamically typed APIs. trait DDlog currently supports both Record-based and DDValue-based APIs. We would like to be able to use it both with clients that only work with the dynamic representation and do not even link against program-specific type declarations, and with type-safe clients. When working with a type-safe client, we want to convert records to the more space-efficient DDValue representation before sending them over the wire. When working with a dynamically typed client, only the record-based API is supported, with records serialized as is, without converting them to DDValue. trait DDlog currently encapsulates the record-to-ddvalue conversion in the type Convert: DDlogConvert member. We will convert that into converter: Option<&dyn DDlogConvert>, supplied by the client at runtime. When set, the stub library will use converter to perform the record-to-ddvalue conversion; otherwise it will send raw records to the server.

We will also refactor the C API to work on top of trait DDlog instead of HDDlog.

Protocol

The DDlog protocol, implemented over the ZeroMQ protocol, will consist of pairs of request/reply messages that match methods in trait DDlog, plus several configuration messages:

  • Handshake messages that establish a client/server connection and check protocol version compatibility.
  • Heartbeat messages, sent when no client request has been received for more than N seconds.

Most request/response pairs will be synchronous, as dictated by the current API; however, the protocol will support totally ordered asynchronous messages: the client can send multiple requests without waiting for the server's responses. The server is expected to process requests in order and send a response message for each request. This is particularly useful for the apply_updates method, which is allowed to complete asynchronously.

Error handling: we rely on ZeroMQ to restore connection after an intermittent failure. This can lead to lost messages (but not out-of-order messages as far as I can tell). We can handle this failure mode by assigning sequence numbers to messages, having the server reply to out-of-order requests by sending an error message with the last in-order message number back, and having the client retransmit unacknowledged messages, up to some threshold.

We will treat all other network-related errors as catastrophic failures. On the client side, the API handle is placed in an error state where all subsequent calls fail; on the server side, the server process exits. The server process also exits if no requests have been received for M seconds, where M > N.

Data serialization

Protocol messages will carry serialized representations of the Record and DDValue types. It helps that both types already implement Serialize and Deserialize. We may want to serialize both to something like bincode or even make the format configurable. However, this may also be a good opportunity to properly implement Abomonation for DDlog types.

Not covered by this RFC

  • Multiple clients talking to the same server
  • Distributed DDlog (i.e., multiple servers working together)

ryzhyk avatar Oct 30 '20 07:10 ryzhyk

It may be awkward in the actual implementation, but I think basing the api off of an in-process model is the best route for having an api that feels good to work with. Ideally you could totally swap backends from ddlog to d3log with little to no changes in code, even though that's a slightly unrealistic goal to have

Kixiron avatar Dec 29 '20 01:12 Kixiron

statically typed API will require the client to additionally link against a DDlog-generated library.

Do you intend to support the case of a client that loads multiple such libraries at once?

mihaibudiu avatar Dec 30 '20 01:12 mihaibudiu

Starting a DDlog instance.

Maybe use ssh for this? Are there any requirements for the environment of a ddlog process? I can't really think of any myself - if you can run a rust executable you should be able to run ddlog.

mihaibudiu avatar Dec 30 '20 01:12 mihaibudiu

Split trait DDlog into statically and dynamically typed APIs.

You should probably start with this piece, since it has nothing to do with the distributed execution.

mihaibudiu avatar Dec 30 '20 01:12 mihaibudiu

The DDlog protocol, implemented over the ZeroMQ protocol

ZeroMQ seems to offer much more than you need. For example, I think you will never need multicast.

heartbeat messages

If you use an existing framework it should take care of this. You don't want to implement this by hand.

mihaibudiu avatar Dec 30 '20 01:12 mihaibudiu

statically typed API will require the client to additionally link against a DDlog-generated library.

Do you intend to support the case of a client that loads multiple such libraries at once?

Not at the moment, but this is in any case orthogonal to this RFC.

ryzhyk avatar Dec 30 '20 01:12 ryzhyk

How about having two separate traits for producers (who supply deltas) and consumers (who implement the callbacks)? This way the consumers don't have to be in the same process and you can make a chain of processes.

mihaibudiu avatar Dec 30 '20 01:12 mihaibudiu

Starting a DDlog instance.

Maybe use ssh for this? Are there any requirements for the environment of a ddlog process? I can't really think of any myself - if you can run a rust executable you should be able to run ddlog.

Right, this would be one option. One could also use some form of container technology, a dedicated factory server, or any other method of running a program. I don't think we need to prescribe anything here.

ryzhyk avatar Dec 30 '20 02:12 ryzhyk

Split trait DDlog into statically and dynamically typed APIs.

You should probably start with this piece, since it has nothing to do with the distributed execution.

yep

ryzhyk avatar Dec 30 '20 02:12 ryzhyk

The DDlog protocol, implemented over the ZeroMQ protocol

ZeroMQ seems to offer much more than you need. For example, I think you will never need multicast.

It does, but I don't need to use all of ZeroMQ, and there are many useful parts I actually want to use.

heartbeat messages

If you use an existing framework it should take care of this. You don't want to implement this by hand.

ZeroMQ makes it relatively easy to do this, but I don't think there's a built-in primitive.

ryzhyk avatar Dec 30 '20 02:12 ryzhyk

You will have to think about the semantics of the various primitives. What does the return result from apply_updates mean? That the updates have been queued, received by the server, or that the computation is completed and changes have been already sent to the callback?

mihaibudiu avatar Dec 30 '20 02:12 mihaibudiu

You will have to think about the semantics of the various primitives. What does the return result from apply_updates mean? That the updates have been queued, received by the server,

it means that updates have been queued somewhere. It does not matter where as long as they will be applied in the next commit. The only guarantee we provide is that, if the commit succeeded then all preceding updates have been applied in order (i.e., they don't get lost or reordered). If DDlog cannot enforce this, it will go into an error state.

or that the computation is completed and changes have been already sent to the callback?

apply_updates never meant that.

ryzhyk avatar Dec 30 '20 02:12 ryzhyk

How about having two separate traits for producers (who supply deltas) and consumers (who implement the callbacks)? This way the consumers don't have to be in the same process and you can make a chain of processes.

Definitely not another producer/consumer pattern :) You can totally chain things using trait DDlog (or whatever we evolve it into).

ryzhyk avatar Dec 30 '20 02:12 ryzhyk