RPC Docs + Reasoning

Open 205g0 opened this issue 3 years ago • 13 comments

I assume you know it yourself that the second section about RPC in your docs is broken/not yet finished, so no worries and I think you just doubled the navigation elements there or whatever.

I rather wonder and would like to hear your take on RPC. Nowadays everyone sticks to ancient, unmaintainable REST or jumps to convoluted GraphQL APIs, defining types multiple times... So, I am happy that somebody just uses RPC haha. But what are the disadvantages from your perspective and why did you still opt for RPC?

205g0 avatar Jul 01 '21 12:07 205g0

Yeah, deepkit/rpc is not yet documented (only for the framework part). Is that what you mean with broken?

In my experience, the library deepkit/rpc made my life so much easier. RPC in general is not entirely different from REST, in the sense that both execute code on the server and return results. One uses method names, the other HTTP verbs (and sees everything as resources). Both, the way they are implemented these days in TS, have the same overhead, are equally type-unsafe, and have high maintenance costs. GraphQL is inherently RPC as well, only much more complex on top, and its only advantage is that you can choose what to return, which is built into the protocol itself. It has massive disadvantages though, like the n+1 problem, being slow, and being inherently type-unsafe. The main feature of selecting what you need could be built into a REST API as well, but is not baked into the protocol. All of them share the issue that they are not type-safe per se. That doesn't mean you can't make them type-safe, it just means you have to use additional stuff, usually less convenient and more code, to make them type-safe.

In our RPC implementation it's different: it is inherently type-safe. You define a controller and its interface, and consume that TypeScript interface directly in your client code. It's perfectly type-safe, completely automatically. Refactoring, adding new methods, error handling, validation, etc. are a piece of cake because it all happens automatically. It's much faster to build network APIs. It's mind-boggling to me that nobody built something like that before.
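The idea of consuming a shared interface on the client can be sketched in plain TypeScript. This is not Deepkit's implementation or API: `UserApi`, `Transport`, `inMemoryTransport` and `createClient` are hypothetical names, and the transport is an in-memory stand-in for a real network connection. It only shows how one interface can type both sides.

```typescript
// Shared between client and server: the single source of truth for the API.
interface UserApi {
    hello(name: string): Promise<string>;
    add(a: number, b: number): Promise<number>;
}

// Server side: a class implementing the shared interface.
class UserController implements UserApi {
    async hello(name: string): Promise<string> {
        return 'Hello ' + name;
    }
    async add(a: number, b: number): Promise<number> {
        return a + b;
    }
}

// Hypothetical transport: takes a method name and arguments, returns the result.
// A real one would serialize the call and send it over a socket.
type Transport = (method: string, args: unknown[]) => Promise<unknown>;

// Hypothetical in-memory transport that dispatches straight to a controller instance.
function inMemoryTransport(controller: object): Transport {
    return async (method, args) => {
        const fn = (controller as Record<string, unknown>)[method];
        if (typeof fn !== 'function') throw new Error(`Unknown action: ${method}`);
        return fn.apply(controller, args);
    };
}

// Client side: a Proxy that turns method calls into transport calls,
// typed as the shared interface T.
function createClient<T extends object>(transport: Transport): T {
    return new Proxy({} as T, {
        get: (_target, method) => (...args: unknown[]) => transport(String(method), args),
    });
}

const api = createClient<UserApi>(inMemoryTransport(new UserController()));
```

With this in place, `api.hello(42)` or `api.doesNotExist()` is rejected by the compiler, and renaming a method in `UserApi` surfaces every affected call site. Note that the sketch checks types only at compile time; it does no runtime validation or serialization.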

It's designed in a way that it could completely replace GraphQL. For example, we could integrate a query layer on top of it where you can execute queries against your database via deepkit/orm remotely via RPC. For example, a regular database query looks like this on the server:

class User {
  //...
}

const database = new Database(...);

const users = await database.query(User).select('username', 'created', 'image').find();

To get access to the User data source on a remote client is much harder. How does this look now with current approaches?

REST/HTTP

You could build a REST API, define a route with a unique name, and make sure everything is serialized as JSON. On the frontend you need to know the magic string of your route URL, query it, and deserialize the JSON so you have the correct data type. Zero type-safety. You can additionally add a validator that makes sure data is correctly received to make it type-safe, but you have to do it manually, per route.

To select only certain fields, you need to add a new parameter to the route, something like select which allows to pass field names, maybe comma separated.
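A minimal sketch of such a `select` parameter, assuming a comma-separated value like `username,created`. The `User` shape and the helper names `parseSelect` and `pickFields` are made up for illustration; a real route would also have to validate the input.

```typescript
interface User {
    username: string;
    created: string;
    image: string;
    email: string;
}

// Parse a comma-separated `select` value (e.g. "username,created") into
// the list of keys that actually exist on the given record.
// Unknown field names are silently ignored.
function parseSelect<T extends object>(record: T, select: string | undefined): (keyof T)[] {
    const all = Object.keys(record) as (keyof T)[];
    if (!select) return all;
    const requested = select.split(',').map(s => s.trim());
    return all.filter(key => requested.indexOf(String(key)) !== -1);
}

// Copy only the selected fields into a new object.
function pickFields<T extends object>(record: T, select: string | undefined): Partial<T> {
    const result: Partial<T> = {};
    for (const key of parseSelect(record, select)) {
        result[key] = record[key];
    }
    return result;
}

const user: User = { username: 'peter', created: '2021-07-01', image: 'a.png', email: 'p@example.com' };
const slim = pickFields(user, 'username, created');
```

The return type is only `Partial<User>`: because the selection arrives as a runtime string, the compiler cannot know which fields are present, which is exactly the loss of type-safety described above.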

GraphQL

GraphQL is another beast. While it serializes automatically and allows you to add "selects", it's inherently type-unsafe. The query you write is not even TypeScript anymore, and thus it's not easily parseable by TypeScript to extract its type information. On top of that you need to learn a new query language and the limitations of its protocol, and it's slow and complex to set up. You need a lot of boilerplate and generators and whatnot to get the User list from a network request. Not a good solution if you ask me.

RPC

RPC has been around for a long time. But all implementations I found are unhandy and unsafe, don't serialize/deserialize correctly, and have no validation or error forwarding. You can easily build a method to receive a user array, but it faces the same problems as with REST. You have to deserialize it manually, add validations, and make it type-safe yourself.

Deepkit RPC

Deepkit RPC on the other hand is inherently type-safe, much faster than GraphQL, serializes/deserializes and validates data, and makes sure errors are correctly forwarded (with the nominal error class).

To get now in Deepkit RPC the same (single good) feature of GraphQL to limit the result of network call, you could easily just create a separate method that returns what you need. Methods in Deepkit RPC are very quickly written.

An alternative, and this is what I was thinking for quite some time, is to add a query layer on top of Deepkit RPC. Completely type-safe with all the goodies of course. Something like

const client = new RpcClient('localhost:888');

const users = await client.query(User).select('username', 'created').find(); 

What it could do under the hood is serialize the query and its information (which entity, which filters, selects, joins, etc.) and send it to the server, where it is resolved. For example, the ORM could take that information directly and build a query out of it, since the API is essentially the same. Would be pretty dope if you ask me.
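How such a query layer might serialize a query can be sketched as follows. Everything here is hypothetical and not part of Deepkit: the `SerializedQuery` wire format, the `RemoteQuery` builder and `resolveQuery` are assumptions, and an in-memory array stands in for the ORM on the server.

```typescript
// Hypothetical wire format: plain data describing what the client asked for.
interface SerializedQuery {
    entity: string;
    select: string[];
    filter: Record<string, unknown>;
}

// Client-side builder. It only records the query; `find()` sends it away.
class RemoteQuery<T extends object, K extends keyof T & string = keyof T & string> {
    constructor(
        private readonly entity: string,
        private readonly send: (query: SerializedQuery) => Promise<unknown[]>,
        private readonly fields: K[] = [],
        private readonly where: Record<string, unknown> = {},
    ) {}

    select<S extends keyof T & string>(...fields: S[]): RemoteQuery<T, S> {
        return new RemoteQuery<T, S>(this.entity, this.send, fields, this.where);
    }

    filter<F extends keyof T & string>(field: F, value: T[F]): RemoteQuery<T, K> {
        return new RemoteQuery<T, K>(this.entity, this.send, this.fields, { ...this.where, [field]: value });
    }

    serialize(): SerializedQuery {
        return { entity: this.entity, select: this.fields, filter: this.where };
    }

    // The result type contains only the selected fields.
    async find(): Promise<Pick<T, K>[]> {
        return (await this.send(this.serialize())) as Pick<T, K>[];
    }
}

// Server side: resolve the serialized query against an in-memory "database".
function resolveQuery(query: SerializedQuery, tables: Record<string, Record<string, unknown>[]>): Record<string, unknown>[] {
    const rows = tables[query.entity] || [];
    return rows
        .filter(row => Object.keys(query.filter).every(key => row[key] === query.filter[key]))
        .map(row => {
            const fields = query.select.length ? query.select : Object.keys(row);
            const picked: Record<string, unknown> = {};
            for (const field of fields) picked[field] = row[field];
            return picked;
        });
}

interface User { id: number; username: string; created: string; image: string; }

const tables: Record<string, Record<string, unknown>[]> = {
    User: [
        { id: 1, username: 'peter', created: '2021-07-01', image: 'a.png' },
        { id: 2, username: 'marie', created: '2021-07-02', image: 'b.png' },
    ],
};

// The JSON round trip simulates sending the query over the network.
const send = async (query: SerializedQuery) => resolveQuery(JSON.parse(JSON.stringify(query)), tables);
const query = new RemoteQuery<User>('User', send).select('username', 'created').filter('username', 'peter');
```

Here `await query.find()` is typed as `Pick<User, 'username' | 'created'>[]`, so accessing `image` on a result is a compile error. A real version would need authorization on the server before resolving anything a client sends.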

marcj avatar Jul 01 '21 12:07 marcj

Hey! Thanks for your extensive reply, gives good insights!

Yeah, deepkit/rpc is not yet documented (only for the framework part). Is that what you mean with broken?

Yes and all good, the framework part provides already a great outlook...

How does this look now with current approaches?

ATM, I have the data models as pure classes in isolated modules which both BE and FE import. So, both benefit from type-safety at CT. For RT validations, these classes are decorated with class-validator, also for both BE and FE. So, here too there is just one source of truth.

As access layer, I use type-graphql because it lets you use a shim on FE and hence gets stripped away without the need for code generation on file save.

Despite type-graphql, which def helps to keep all things together and is kind of DRY: man, it's really tedious. Then again, it's required because the client access layer is just a subset of what the BE can query from the DB. Or you would just take a DB with a built-in GraphQL API such as DGraph, k8ssandra or Fauna, but AFAIK their restriction capabilities for the FE are also limited...

So, long story short: I think all this mess is required in the end, and I was just wondering if RPC gives me a bit of relief. What I understood is that I still have to define the subset of the access layer. And even if I do not like GraphQL, there are many solutions in the GraphQL space and I can switch any part. That GraphQL vanishes in 5 years and I am left with unmaintained code can happen, but is probably unlikely, but yeah...

To get now in Deepkit RPC the same (single good) feature of GraphQL to limit the result of network client,

So, that's the deciding point: Which solution creates less work? Should we roll our own solution or just take cumbersome GraphQL. Coming up with good and consistent abstractions that scale over all edge cases and not leading to the typical REST version mess after being one year in production is another endeavor. But you can still take a leaner approach than GraphQL took, just their query language is 🙄 and as you wrote, I find it super hard to work directly in it because it cannot be parsed, so going the decorator path is the only option.

Whatever, you are solving a major pain and have the right opinions on it. This 'having one single source of truth' is not as widespread in the community as one might think haha.

205g0 avatar Jul 01 '21 13:07 205g0

Honestly, I had never in my entire career the need for GraphQL. The costs of low performance and of making type-safety work were just too damn high. I see the need for companies like Facebook & co. though, where they provide users an API to a huge database. So selecting what you need is perfect here. But for products that don't even have a public API? Or only a frontend that uses < 100 API query routes? I don't think so. If my frontend/client needs less information than the route provides and if that is really a bottleneck, then I can either add a parameter to limit it or, in the worst case, implement a new route/method.

However, I see the conceptual advantage of having an API that is more dynamic than what we currently have, especially for querying data. It's cumbersome to always think in terms of the various network abstractions like REST/GraphQL/RPC/HTTP when you just wanna get data from your database. Thus I think having an Query API like I described for Deepkit RPC would make perfect sense. With hooks for authentication/authorization and custom resolvers (for fields/relations that have a different data source, for example) it could be pretty powerful, without the performance, architectural, maintenance and learning costs of GraphQL.

marcj avatar Jul 01 '21 13:07 marcj

Honestly, I had never in my entire career the need for GraphQL.

Me neither haha. I once only was consumer of the Instagram API and yeah it was nice but as you said, they play in a different league. And all their clients are also not the fastest. I bet that TikTok is never ever using GraphQL.

Thus I think having an Query API like I described for Deepkit RPC

My problem is, I know that you are right, I'm just too lazy to rewrite my codebase. When will you have added more docs about your RPC client and btw, were really the existing RPC clients that bad? And a bonus q: Was it worth to rewrite Mongo's native driver, is your version that much faster?

205g0 avatar Jul 01 '21 14:07 205g0

were really the existing RPC clients that bad?

Yes, for me definitely. My driver in all my libraries was constantly: high performance in terms of execution speed and high performance in the way I write code. I absolutely hate to write stuff twice, thrice, ... gRPC for example uses protobuf, which requires you to define schemas in .proto format and compile them. Can you imagine how much overhead that produces when you build an application like deepkit.ai, where you have dozens and dozens of models? Unbelievably inefficient. I'd probably still be writing code and would never have released it with other libraries. Then there are countless other libraries that use simple strings as method names, don't support parameter validation or serialization, have no stream support via RxJS, no error forwarding, no download/upload tracking, and usually are incredibly slow. With deepkit/rpc I've built exactly what I needed and resolved all the issues others currently have, at least for my use case and my approach to quickly building complex web apps.

And a bonus q: Was it worth to rewrite Mongo's native driver, is your version that much faster?

Definitely. Not only did I learn the ins and outs of MongoDB, but because of it I also built deepkit/bson, which is the encoding for deepkit/rpc and one of the reasons why it's so fast. The client itself is also much faster. The query benchmark indicates only about 30% because it compares the full-fledged ORM against the raw MongoDB client (not the deepkit mongo client vs the official mongodb client), but inserting is 300% faster, and other operations as well, all from within the ORM. Using the deepkit mongodb client directly is even faster. Besides that, it fixes serious limitations that the official mongodb driver currently has, namely:

  1. bigint support.
  2. correct error stack traces. If you use the mongodb client and the server responds with an error, you don't know what code led to that error. The whole error stack trace is lost, since they still work with callbacks and partially Promises. When an error occurs you don't know: is it from a REST route, from an RPC call, from a scheduled task? An absolute no-go for anything serious.

marcj avatar Jul 01 '21 14:07 marcj

I think one issue with RPC is that it's inherently stateful whereas REST can simply round-robin between a 100 containers, with any number of them shutting down for upgrade at any time.

REST is also better for exposing an application's API outside. Deepkit's client code, while awesome, is still limited to typescript whereas with REST any programming language can be used.

I remain a big fan of REST but I do plan on using Deepkit's RPC for the inner workings of the app.

Rush avatar Jul 23 '21 00:07 Rush

I think one issue with RPC is that it's inherently stateful whereas REST can simply round-robin between a 100 containers, with any number of them shutting down for upgrade at any time.

That's a big difference currently. We could however make it stateless, by adding support for HTTP requests as transport layer (for non-streaming methods).

Deepkit's client code, while awesome, is still limited to typescript whereas with REST any programming language can be used.

This is also something I see as a disadvantage. While it's great for isomorphic TypeScript apps, it doesn't have the same advantages when different languages are involved. Then frameworks like gRPC/GraphQL/REST are better, as they are cross-technology and have support for many languages. We could try to make an official spec for Deepkit RPC and an official C++ implementation, which would make it easier for other languages to support the protocol.

marcj avatar Jul 25 '21 16:07 marcj

Hmm, it's a great discussion, but there are some considerations to be taken into account that are missing. This thread only describes backend <-> backend in detail, but not frontend <-> backend, where RPC (HTTP/2) isn't natively supported yet. Of course, one could use Envoy & gRPC-Web, but this doesn't really matter when Deepkit RPC isn't gRPC compatible (correct me if I'm mistaken). This is exactly where the need for GraphQL comes in as opposed to what REST has to offer.

We hopefully all know the advantages of GraphQL, but here's some anyway:

  • Big ecosystem
  • Easily cacheable
  • Offline support
  • Query language
  • State management

I would never in my life use Apollo Federation on the server side because of the poor performance.

marcus-sa avatar Jul 29 '21 09:07 marcus-sa

@marcus-sa Not sure what you mean, but Deepkit RPC works fine in frontend <-> backend communication as it uses WebSockets.

marcj avatar Jul 29 '21 09:07 marcj

@marcus-sa Not sure what you mean, but Deepkit RPC works fine in frontend <-> backend communication as it uses WebSockets.

Yeah, figured that out after commenting. Thought Deepkit RPC was built on HTTP/2. My bad.

marcus-sa avatar Jul 29 '21 09:07 marcus-sa

Coming to this discussion a bit late, but one thing confuses me with the statement GraphQL is "inherently type-unsafe". AFAIK, you can't do anything in GraphQL without it being typed. Typed (return) Objects, typed args, typed input objects, etc. So, what is meant by it being type-unsafe?

Scott

smolinari avatar Aug 26 '21 09:08 smolinari

I'm sorry if I was unclear with that statement. I was talking about type-safety in TypeScript land. Yes, GraphQL has types, but only its own types. Those GraphQL types have nothing to do with TypeScript types and cannot be inferred as TypeScript types from GraphQL directly (as they are declared as strings). You have to use additional tools with boilerplate, and sometimes a code generator and whatnot, to get actual TypeScript types. I should have said "GraphQL itself is inherently type-unsafe in TypeScript" to make that clear.

Deepkit RPC on the other hand works directly on the real TypeScript types. No additional tools, no code generation, etc. needed. It is inherently type-safe in TypeScript, as both the client and server use the exact same TypeScript types and even share their interface to make it perfectly type-safe.

marcj avatar Aug 26 '21 12:08 marcj

Ok. Thanks for the clarification @marcj . Yes, I've been looking at the tooling necessary to have as little code written as possible and yes, it requires code generation at two points. Creating the GraphQL schema via NestJS (the framework I'm using) and Nest's code-first approach, to then taking the schema and generating "use" functions for Vue (the frontend framework I'm using). I don't see any large discrepancies or potential issues with this process. The tools are already built in fact. I haven't gotten it all down to working yet, so it's theoretical. Maybe I'll run into some smaller issues, I don't doubt I will. I'm working on the backend currently and am looking to replace class-transformer with your type library (thanks for that work btw). 😁 That's how I ended up here to begin with.

Scott

smolinari avatar Aug 26 '21 13:08 smolinari

I've been using GRPC quite a bit with JS, but it's a very clunky library (it still uses callbacks!), and there are many things in it that make it quite inflexible. We've been using it in quite advanced ways (peer to peer, proxied over UDP network, custom TLS verification logic, conversion of streams to async iterators, in-band error management).

I'm evaluating deepkit rpc. And I have some questions.

  1. What kind of streaming does this support? We're using client streaming, duplex streaming and server streaming. In particular, duplex streaming is useful for creating sub-protocols on top of a general RPC system. We sometimes have "back and forth interactions" between peers that require statefulness for a short amount of time. That is "encapsulated" within a single "transaction".
  2. Is the general model transport agnostic? We need the RPC system to work on top of a reliable UDP stream. And if it is transport agnostic, how does it handle concurrency, specifically muxing and demuxing, and how does it identify streams in flight?
  3. JSON/binary serialisation/deserialisation. In some cases we have structured objects that are like JSON. In other cases we have actual binary content. If we used JSON solely, we would have to use a base64 encoding which blows up sizes by 33% (unless we use a custom base encoding to reduce this). Does this support a flexible serialisation/deserialisation for binary data?
  4. Error handling - how do errors propagate from client to server, or server to client? GRPC only supports server to client errors, but not client to server errors. Are the errors managed through "leading" or "trailing" headers?
  5. AsyncIterables vs Observables vs Promises - how do you foresee backpressure being managed and push vs pull APIs here when dealing with streams?
  6. What do you think of JSON-RPC?
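
The 33% figure in point 3 follows from how base64 works: every group of up to 3 input bytes becomes 4 output characters. A small worked calculation, with helper names chosen here for illustration:

```typescript
// Length of the padded base64 string for a payload of `byteLength` bytes:
// every group of up to 3 bytes becomes 4 characters.
function base64Length(byteLength: number): number {
    return 4 * Math.ceil(byteLength / 3);
}

// Relative size increase compared with sending the raw bytes.
function base64Overhead(byteLength: number): number {
    return base64Length(byteLength) / byteLength - 1;
}

// 3000 raw bytes turn into 4000 base64 characters, about a third more.
const encodedSize = base64Length(3000);
const overhead = base64Overhead(3000);
```

For very small payloads the padding makes the relative overhead larger, for example a single byte still takes 4 characters.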

CMCDragonkai avatar Oct 13 '22 02:10 CMCDragonkai

@CMCDragonkai

  1. RxJS streaming via Observables/Subjects which can be from server->client (server controller) or vice-versa (client controller). Since connections are stateful you can control the stream with an arbitrary RPC action.
  2. yes, there is websocket and tcp implemented. as long as you make it connection-aware it should work fine. https://github.com/deepkit/deepkit-framework/blob/master/packages/rpc-tcp/src/server.ts#L91
  3. it uses a binary protocol and uses BSON for message serialization so works perfectly fine with binary data (typed arrays or ArrayBuffer work out of the box)
  4. when using a server-controller on the client side, thrown errors on the server will be redirected to the client. this can be disabled https://docs.deepkit.io/english/rpc.html#_transform_error
  5. back-pressure is checked on a per-connection basis, no matter how many streams you have open. there is no measure in place to prevent back-pressure, so you either have to trust the client or implement it on your own on the transport layer (e.g. kick the client when bufferedAmount is too big).
  6. don't like it since I prefer fast binary protocols
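
The "kick the client when bufferedAmount is too big" idea from point 5 could look roughly like this. It is a sketch, not Deepkit code: the `BufferedSocket` interface and `sendOrKick` are hypothetical, modelled on the `bufferedAmount`, `send` and `close` members of the WebSocket API, and the threshold and close code are arbitrary choices for a server-side socket.

```typescript
// Minimal shape of a socket for this sketch (the interface name is made up).
interface BufferedSocket {
    readonly bufferedAmount: number; // bytes queued but not yet sent
    send(data: Uint8Array): void;
    close(code?: number, reason?: string): void;
}

// Send only while the outgoing buffer is below the limit.
// Otherwise close the connection and report that nothing was sent.
function sendOrKick(socket: BufferedSocket, data: Uint8Array, maxBufferedBytes: number): boolean {
    if (socket.bufferedAmount > maxBufferedBytes) {
        // 1008 is the WebSocket "policy violation" close code.
        socket.close(1008, 'Outgoing buffer limit exceeded');
        return false;
    }
    socket.send(data);
    return true;
}
```

A gentler variant would pause the producer until the buffer drains instead of closing, at the cost of having to track and resume every open stream.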

marcj avatar Oct 13 '22 02:10 marcj

How hard is it to call Deepkit RPC service from other languages (ex. with a WebSocket library)?

NexZhu avatar Oct 13 '22 04:10 NexZhu

@NexZhu rather hard since you'd have to write a client implementing the protocol yourself, at least the header: https://github.com/deepkit/deepkit-framework/blob/master/packages/rpc/src/protocol.ts#L37-L74 the body is then regular BSON which can be parsed in almost any language

marcj avatar Oct 13 '22 16:10 marcj

Do you have any sequence diagrams that demonstrate the RPC protocol logic? I'm interested in any negotiation logic and error handling that occurs.

CMCDragonkai avatar Oct 14 '22 01:10 CMCDragonkai

@marcj regarding peer to peer usage. I noticed you have created concepts like "client" and "server".

If I am writing a client, and I connect to the server, can the server simultaneously call back methods on the client without having to establish a separate connection back to the client?

In our current P2P architecture, all nodes are both simultaneously clients and servers. In most RPC systems, this means each side has to set up a client connection to each other. However it should be possible that within a single connection, that connection could expose the client's API to the server at the same time as exposing the server's API to the client.

How difficult would it be to adapt your protocol to achieve such a thing?

CMCDragonkai avatar Oct 14 '22 01:10 CMCDragonkai

Client to server, server to client, peer to peer, everything happens through the same connection. So yes this is already possible.
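
To illustrate what "both directions through the same connection" means in general, here is a toy sketch. It does not reflect Deepkit's protocol or API: `Peer`, `Message` and `connect` are made-up names, and the connection is simulated in memory. Each side registers its own handlers and can call the other side over the one shared link.

```typescript
// Two message kinds travel over the same link in both directions.
type Message =
    | { type: 'call'; id: number; method: string; args: unknown[] }
    | { type: 'result'; id: number; value: unknown };

class Peer {
    private nextId = 0;
    private pending = new Map<number, (value: unknown) => void>();
    remote?: Peer; // in-memory stand-in for the single shared connection

    constructor(private handlers: Record<string, (...args: any[]) => unknown>) {}

    // Call a method that the other side exposes.
    call(method: string, ...args: unknown[]): Promise<unknown> {
        const id = this.nextId++;
        return new Promise(resolve => {
            this.pending.set(id, resolve);
            this.transmit({ type: 'call', id, method, args });
        });
    }

    private transmit(message: Message): void {
        const remote = this.remote;
        if (!remote) throw new Error('Not connected');
        // Deliver asynchronously, roughly as a real socket would.
        Promise.resolve().then(() => remote.receive(message));
    }

    receive(message: Message): void {
        if (message.type === 'call') {
            const handler = this.handlers[message.method];
            if (!handler) throw new Error(`Unknown method: ${message.method}`);
            this.transmit({ type: 'result', id: message.id, value: handler(...message.args) });
        } else {
            const resolve = this.pending.get(message.id);
            this.pending.delete(message.id);
            if (resolve) resolve(message.value);
        }
    }
}

function connect(a: Peer, b: Peer): void {
    a.remote = b;
    b.remote = a;
}

const server = new Peer({ greet: (name: string) => 'Hello ' + name });
const client = new Peer({ whoAreYou: () => 'client-side value' });
connect(client, server);
```

After `connect`, `client.call('greet', 'world')` and `server.call('whoAreYou')` both work without a second connection. Error forwarding, streaming and serialization are left out of the sketch.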

marcj avatar Oct 14 '22 03:10 marcj

Hi @marcj I've been trying your examples for the deepkit rpc in:

  • https://deepkit.io/library/rpc
  • https://deepkit.io/documentation/framework/rpc/client
  • https://docs.deepkit.io/english/rpc.html

And none of the examples work. The server runs, but the client examples do not work.

I think maybe that's what the OP was referring to.

For example this doesn't work:

import { rpc, RpcKernel } from '@deepkit/rpc';
import { RpcWebSocketServer } from '@deepkit/rpc-tcp';

@rpc.controller('myController')
class Controller {
    @rpc.action()
    hello(title: string): string {
        return 'Hello ' + title;
    }

    @rpc.action()
    async getUser(): Promise<string> {
      return 'this is a user';
    }
}

async function main () {

  const kernel = new RpcKernel();
  kernel.registerController(Controller);

  const server = new RpcWebSocketServer(kernel, 'localhost:8081');

  // @ts-ignore
  server.start();

  console.log('STARTED');
  // server.close();
}

main();

And

import { rpc, RpcKernel } from '@deepkit/rpc';
import { RpcWebSocketClient } from '@deepkit/rpc';

interface ControllerI {
  hello(title: string): string;
  getUser(): Promise<string>;
}

async function main () {

  const client = new RpcWebSocketClient('ws://localhost:8081');
  const controller = client.controller<ControllerI>('myController');

  const result1 = await controller.hello('world');
  const result2 = await controller.getUser();

  console.log(result1);
  console.log(result2);

  client.disconnect();
}

main();

Also I saw that in your docs, you said that it's possible for the client side to register a controller, but there are no docs on how to do this.

CMCDragonkai avatar Nov 15 '22 04:11 CMCDragonkai

You forgot to implement the controller interface & symbol.

Client

import {RpcWebSocketClient} from '@deepkit/rpc';
import {MyControllerInterface, User} from 'common';

//connect via WebSockets/TCP, or in-memory for SSR
const client = new RpcWebSocketClient('localhost');
const ctrl = client.controller(MyControllerInterface);

const user = await ctrl.getUser(42);
console.log(user.username);
console.log(user instanceof User); //true

Server

import {App} from '@deepkit/app';
import {FrameworkModule} from '@deepkit/framework';
import {rpc} from '@deepkit/rpc';
import {MyControllerInterface, User} from 'common';

@rpc.controller(MyControllerInterface)
class MyController implements MyControllerInterface {
    @rpc.action()
    async getUser(id: number): Promise<User> {
        return new User(id);
    }
}

new App({ imports: [new FrameworkModule()], controllers: [MyController] }).run();

Common

import {ControllerSymbol} from '@deepkit/rpc';
import {entity} from '@deepkit/type';

@entity.name('user')
export class User {
    username: string = '';
    constructor(public id: number) {}
}

export const MyControllerInterface = ControllerSymbol<MyControllerInterface>('my', [User]);
export interface MyControllerInterface {
    getUser(id: number): Promise<User>;
}

marcus-sa avatar Nov 15 '22 09:11 marcus-sa

And none of the examples work

That's a bold claim. While it may be that you didn't set things up correctly, it is very unlikely that all these examples do not work, as we have hundreds of tests that prove otherwise. You probably didn't install deepkit/type correctly.

I'd recommend posting an actual error message with a stack trace instead of just an "it doesn't work". A reproduction repo is even better, so we can point exactly to the mistake in your code instead of guessing randomly.

Client side controller can be seen here: packages/rpc/tests/back-controller.spec.ts

marcj avatar Nov 15 '22 11:11 marcj