Query Server Protocol v2

Open wohali opened this issue 5 years ago • 10 comments

@janl:

We need to revamp the Query Server. It is hardcoded to an out-of-date version of SpiderMonkey and we are stuck with C-bindings that barely anyone dares to look at, let alone iterate on.

I believe the way forward is to revamp the query server protocol to use streaming IO instead of the blocking batches we use now, and to use a JS-native implementation of the JS side instead of C-bindings.

I’m partial to doing this straight in Node, because there is a ton of support for things we need already, and I believe we’ve solved the isolation issues required for secure MapReduce, but I’m happy to use any other thing as well, if it helps.

Other benefits would be support for emerging JS features that devs will want to use.

And we can have two modes: standalone QS like now, and embedded QS where, say, V8 is compiled into the Erlang VM. Not everybody will want to run this, but it’ll be neat for those who do.

@davisp:

I'd be quite happy to see this change made as well for all of the reasons listed and more (busy servers with lots of CouchJS processes sucking up RAM come to mind). And I know others would really like to see better integration with other languages (ie, Lua embedded in the VM has been mentioned a number of times).

However I think the protocol is probably the wrong place to look at for this. I'd instead like us to take a look at our "language" field on design docs and see if we can't come up with either an extension or a more abstract field that we can use.

For instance, your specific scenario where we use Node as an external process vs. Node embedded in the Erlang VM is a great case. Both are JavaScript, but they differ fairly significantly in their implications.

The first thing that comes to mind is some sort of content type extension where we use some specific names, ie, "javascript+erlv8" or similar. Or perhaps add the ability to tag things with required features maybe? That way an admin could determine whether there's a node server running or that V8 is embedded and most design docs wouldn't care where their code ran.
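
To make the "language" extension idea concrete, here is a minimal sketch of how a tag such as "javascript+erlv8" could be split and matched against what an admin has made available. Everything beyond the "javascript+erlv8" string itself is an assumption for illustration: the `parse_language` and `pick_runtime` helpers, the `AVAILABLE` table, and the runtime names "node" and "embedded" are hypothetical and not part of CouchDB.

```python
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str            # e.g. "javascript"
    runtime: Optional[str]   # e.g. "erlv8", or None if the doc does not care


def parse_language(tag: str) -> LanguageTag:
    """Split a design-doc "language" value such as "javascript+erlv8"."""
    language, _, runtime = tag.strip().lower().partition("+")
    return LanguageTag(language, runtime or None)


# Hypothetical admin configuration: runtimes available per language,
# listed in order of preference.
AVAILABLE = {
    "javascript": ["erlv8", "node"],
    "lua": ["embedded"],
}


def pick_runtime(tag: str, available=AVAILABLE) -> str:
    """Return the runtime a design doc would run on, or raise if none fits."""
    language, runtime = parse_language(tag)
    runtimes = available.get(language, [])
    if not runtimes:
        raise LookupError(f"no query server configured for {language!r}")
    if runtime is None:
        return runtimes[0]  # the doc does not care where its code runs
    if runtime not in runtimes:
        raise LookupError(f"{language!r} is configured, but not runtime {runtime!r}")
    return runtime
```

With this shape, a design doc that only says "javascript" runs wherever the admin prefers, while one that says "javascript+node" fails loudly if no Node server is configured.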

However, I think once we explore this aspect, then each of our "language" implementations can write their own protocols and do whatever they want. Thus V8 doesn't have to do anything hacky around not reading/writing data to stdio of a port program, and new Node query servers could do some fancy dance. And someone could come along and write a strange BF view server that used shared memory to transfer JSON docs in and out of the view if they so desired.

Or maybe I'm talking crazy talk.

@wohali:

With the discussion to move ChakraCore into the system, do we really want to bother with continuing to support a query server protocol at all? I'm not so sure.

Also see #1334.

wohali avatar Aug 07 '18 15:08 wohali

With the discussion to move ChakraCore into the system, do we really want to bother with continuing to support a query server protocol at all? I'm not so sure.

The hook up is still via the same protocol, just not streaming over stdio

janl avatar Aug 08 '18 12:08 janl

I'd like to see the protocol support the option to run a query server as a separate network service.

There's been significant adoption of gRPC in the cloud-native computing world recently. I think this could provide a lot of interesting benefits including the network service (HTTP/2 is used for transport) and the streaming piece mentioned above.

kocolosk avatar Aug 13 '18 19:08 kocolosk

If we're talking about reworking the protocol, it is worth considering the implications of how we enforce sandboxing. As soon as you move to a network interface or similar, there's nothing that stops that process from having crazy side effects, like storing the document locally, making calls back to CouchDB while a document is being processed, etc. Our current JS implementation is very strict about not allowing those things, and it's always bugged me that people used to write their own "query servers" that did all of these things that violate the contract between Couch and a query server.

@kocolosk wouldn't it make more sense to keep everything inside of Erlang to enforce the above, and just redo rexi so it uses gRPC?

wohali avatar Aug 13 '18 20:08 wohali

Hmm, I guess I thought the sandboxing was a rather separate issue from the communication protocol. I'm not sure I see how they're related here.

I'm certainly open to more innovation that reduces the need for custom code execution.

kocolosk avatar Aug 13 '18 20:08 kocolosk

@kocolosk @davisp My argument is that the query server needs to go, entirely, so people don't mistake it for an external interface. If it's an internal Erlang thing (such as a gRPC rexi implementation or whatever) then it's immaterial, that's as good as a write-only language for many. ;)

wohali avatar Aug 13 '18 21:08 wohali

I'm all for having internal erlang functions to do most common functions (for performance reasons), but please keep in mind that if the query server goes away, and we have to write our functions in erlang, anyone wanting to do anything remotely advanced will likely have to learn a whole new language specifically for the database.

Since couchdb is designed to empower the client to connect directly, and skip the server, your user base, I believe, would primarily be front-end javascript developers, who likely don't know erlang, so it would be nice to keep it consistent.

If you were to use a language more common in web development for the query server (js, php, ruby, etc.), it would be fine, because even if the developer didn't know the language before, the skills will be useful, but that's where the abilities of a query server shine: the fact that it can be any language.

Also, given how long it is taking to port the current javascript implementation to a new version (it is now 7 years old), I'm a little hesitant to give up the ability to do it myself.

That said, the query protocol could use an overhaul. While designing a TypeScript, Node-based query server, I was unable to create enhancements like processing documents in parallel (bear with me a minute). This is where (I think) the network-based query protocol could be helpful (though I'd stick to unix sockets for performance reasons), because things like views' map functions don't need to be synchronous. Given a protocol that supported this, we could use all cores on a machine to process documents in parallel either through multiple connections, or by tagging commands with IDs, and then let couchdb sort them in the b-tree as they come in (like it does now).
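
A rough sketch of the "tagging commands with IDs" idea, under stated assumptions: the `[id, "map_doc", doc]` request shape is hypothetical (the existing protocol sends `["map_doc", doc]` with no ID and expects replies in order), and `map_fun`, `handle`, and `run` are illustrative names. A thread pool stands in for real parallelism here; an actual server that wanted to use every core would more likely use separate processes or connections.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed


def map_fun(doc):
    """Stand-in for a user-supplied map function: emits [key, value] pairs."""
    return [[doc["type"], 1]]


def handle(line: str) -> str:
    """Process one tagged request and echo its ID back in the reply."""
    req_id, command, doc = json.loads(line)
    assert command == "map_doc"
    return json.dumps([req_id, map_fun(doc)])


def run(requests, workers=4):
    """Process requests concurrently; replies may complete in any order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(handle, line) for line in requests]
        return [f.result() for f in as_completed(futures)]


# Hypothetical tagged requests: [id, "map_doc", doc]
requests = [
    json.dumps([i, "map_doc", {"_id": f"doc{i}", "type": "post"}])
    for i in range(5)
]

# The caller matches replies to requests by ID, so arrival order is irrelevant.
replies = {req_id: rows for req_id, rows in map(json.loads, run(requests))}
```

Because each reply carries the ID of its request, the database side can hand out documents without waiting for earlier ones to finish and still know which rows belong to which document.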

sploders101 avatar Oct 01 '19 23:10 sploders101

Hi @sploders101 thanks for chiming in on this one! There’s been a lot of work on a next-generation query server network protocol outside the visibility of this ticket so I’d invite @davisp or maybe @garrensmith or @jiangphcn to summarize the progress. I believe the whole thing is gRPC-based, not sure if we’ve developed a variant that communicates over Unix sockets yet.

We’re certainly planning on allowing users to define JS functions for views for the foreseeable future.

kocolosk avatar Oct 01 '19 23:10 kocolosk

@sploders101 Ah, sorry! I think I was too terse on my explanation previously. This change is purely to open up more possibilities for improvement and evolution of query server communication. Previously we were fairly tied to a stdio approach due to the level of abstraction in the couch_query_servers and couch_os_process modules. This work is just to try and abstract our basic function calls so that developers are able to experiment more without requiring the Erlang process communication.

To be slightly more concrete, the goal here is to be able to allow for a NIF based query server (i.e., having SpiderMonkey or V8 linked directly to the Erlang VM) while also allowing for network attached execution environments (i.e., gRPC [1]).
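
As an illustration only, the shape of such an abstraction might look like the sketch below: callers talk to one interface, and a linked-in engine or a network-attached service sits behind it. The class names, the `map_doc` method, and the `loopback` transport are all hypothetical; this is not the `couch_query_servers` API and not how ateles is implemented.

```python
import json
from abc import ABC, abstractmethod


class QueryServer(ABC):
    """Hypothetical backend-neutral interface the indexer would call."""

    @abstractmethod
    def map_doc(self, doc: dict) -> list:
        """Return one list of [key, value] rows per registered map function."""


class InProcessServer(QueryServer):
    """Stands in for an engine linked into the VM (the NIF-style case)."""

    def __init__(self, map_funs):
        self.map_funs = list(map_funs)

    def map_doc(self, doc):
        return [fun(doc) for fun in self.map_funs]


class RemoteServer(QueryServer):
    """Stands in for a network-attached engine; `transport` sends bytes and returns the reply."""

    def __init__(self, transport):
        self.transport = transport

    def map_doc(self, doc):
        reply = self.transport(json.dumps(["map_doc", doc]).encode())
        return json.loads(reply)


def index(server: QueryServer, docs):
    """Indexing code depends only on the interface, not on where code runs."""
    return [server.map_doc(doc) for doc in docs]


local = InProcessServer([lambda doc: [[doc["_id"], None]]])


def loopback(request: bytes) -> bytes:
    """Fake transport for the example: decode, run locally, encode the result."""
    _command, doc = json.loads(request)
    return json.dumps(local.map_doc(doc)).encode()


remote = RemoteServer(loopback)
docs = [{"_id": "a"}, {"_id": "b"}]
```

Here `index(local, docs)` and `index(remote, docs)` return the same rows, which is the point of the abstraction: the caller cannot tell which execution environment served it.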

Or to put that all in a whole new and different light, the original goal of having a new protocol was a bit limiting. Having a more abstract API that would let people execute COBOL on the moon is, I think, more in line with what was intended. Also, if someone has COBOL servers on the moon, I'd like to have a chat because that would be awesome.

[1] https://github.com/cloudant-labs/ateles

davisp avatar Oct 02 '19 02:10 davisp

@kocolosk That's good to know, thanks!

@davisp Thank you for the clarification! I was just submitting an issue I'm having with the current query server protocol and found this (seeing that it was open and had activity). Wohali's comment is what concerned me, talking about getting rid of the query server entirely, but it seems I misunderstood.

Using a well-defined protocol for query servers should be more than fine. If you are doing something network-based though, unix sockets may be worth looking into. I'm not sure how they are done in erlang, but the C/C++ api is almost identical to that of TCP. I believe the only difference is in construction, due to the fact that it uses a file path instead of a network address.
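
The same point can be sketched in Python, whose `socket` module mirrors the C API closely: only the address family and the address passed to `bind` differ between the two transports. This assumes a POSIX system where `socket.AF_UNIX` is available; the `make_listener` and `round_trip` helpers and the `qs.sock` file name are made up for the example.

```python
import os
import socket
import tempfile


def make_listener(address):
    """Listen on a Unix socket if given a path, on TCP if given (host, port)."""
    family = socket.AF_UNIX if isinstance(address, str) else socket.AF_INET
    listener = socket.socket(family, socket.SOCK_STREAM)
    listener.bind(address)  # a file path for Unix sockets, (host, port) for TCP
    listener.listen(1)
    return listener


def round_trip(listener, payload: bytes) -> bytes:
    """Connect to the listener, send payload, and return what the server side read."""
    client = socket.socket(listener.family, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    conn, _ = listener.accept()
    client.sendall(payload)
    client.close()
    data = conn.recv(1024)
    conn.close()
    return data


with tempfile.TemporaryDirectory() as tmp:
    unix = make_listener(os.path.join(tmp, "qs.sock"))
    tcp = make_listener(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    unix_reply = round_trip(unix, b'["reset"]\n')
    tcp_reply = round_trip(tcp, b'["reset"]\n')
    unix.close()
    tcp.close()
```

Everything after construction (`listen`, `accept`, `sendall`, `recv`) is the same code for both transports.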

I'm excited to see where this goes!

Thanks!

sploders101 avatar Oct 02 '19 15:10 sploders101

This sounds really exciting, has progress been made on this idea behind the scenes? Also, does this mean that the indexing process would be radically faster than it is now if a JS engine is directly linked into the Erlang VM?

I am a happy couchdb user, the only worry I have is about the duration of indexing at scale (for a database with billions of documents). I wonder if indexing will be able to catch up if 10s of millions of documents are added on a daily basis. I am also thinking about scenarios where, for some reason, an index must be built from scratch on such a large database. Because of the network overhead with the Query Server (if I understand correctly), I am worried that indexing may never converge.

Boosting the performance of indexing and enabling the process to saturate bare metal would give users peace of mind when planning such large systems. So I am curious to know if the Query Server Protocol v2 would help with that or if this is out of scope.

nkosi23 avatar May 29 '22 12:05 nkosi23