Use typing in server/client

Open lambdaofgod opened this issue 2 years ago • 3 comments

TL;DR

As of now, the server code is pretty hard to comprehend.

I propose adding type annotations. I can rewrite the server using FastAPI: in addition to specifying the request/response schema, it serves a /docs endpoint with examples, so there is no need to use Postman/curl. It would also be easier to maintain, because someone using an incompatible version would get a ValidationError right away instead of some unrelated runtime error somewhere else.
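To illustrate the "fail early with a clear error" point, here is a minimal stdlib-only sketch of validating a request against a typed schema. FastAPI with Pydantic would do this automatically from the type hints. The names `GenerateRequest`, `parse_request`, and `SchemaError`, and the fields `prompt` and `max_tokens`, are made up for this example and are not taken from the LMQL server.

```python
from dataclasses import MISSING, dataclass, fields


class SchemaError(ValueError):
    """Raised when a payload does not match the expected schema."""


@dataclass(frozen=True)
class GenerateRequest:
    # Hypothetical request schema; the field names are assumptions.
    prompt: str
    max_tokens: int = 16


def parse_request(payload: dict) -> GenerateRequest:
    """Validate a decoded JSON payload before any server logic runs."""
    spec = {f.name: f for f in fields(GenerateRequest)}

    # Reject fields the schema does not know about (e.g. from a newer client).
    unknown = sorted(set(payload) - set(spec))
    if unknown:
        raise SchemaError(f"unexpected fields: {unknown}")

    for name, f in spec.items():
        if name not in payload:
            # Fields without a default are required.
            if f.default is MISSING:
                raise SchemaError(f"missing required field: {name}")
            continue
        # f.type is the annotated class (str, int) for this simple schema.
        if not isinstance(payload[name], f.type):
            raise SchemaError(f"field {name!r} must be {f.type.__name__}")

    return GenerateRequest(**payload)
```

With this, a client sending `{"prompt": "hi", "max_tokens": "many"}` gets a `SchemaError` naming the offending field at the boundary, rather than a failure deep inside the decoding code.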

Example where it would help - adding a new model interface

I've started working on adding RWKV integration, but I've run into the following problems

  • both server and client code need to be changed; I think I've updated the server successfully, but on the client side it doesn't work
  • RWKV uses tokenizers.Tokenizer, which has a different interface than the tokenizer classes actually instantiated in transformers
  • I have problems with dclib (actually, what's up with that? Is dclib a kind of abstraction over different models?)
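For the tokenizer mismatch, one possible workaround is a thin adapter. This is only a sketch: `HFStyleTokenizer` is a hypothetical name, and which methods the LMQL client actually calls on the tokenizer is an assumption here. It relies on `tokenizers.Tokenizer.encode(text)` returning an object with an `.ids` attribute, whereas transformers tokenizers return plain ID lists from `encode` and a mapping with `input_ids` when called directly.

```python
from typing import Any, Dict, List, Sequence


class HFStyleTokenizer:
    """Wrap a tokenizers.Tokenizer-like object in a transformers-like interface.

    Hypothetical adapter: it covers only encode, decode and __call__, which
    is an assumption about what the calling code needs.
    """

    def __init__(self, backend: Any) -> None:
        # backend.encode(text) must return an object with an .ids attribute,
        # and backend.decode(ids) must return a string.
        self._backend = backend

    def encode(self, text: str) -> List[int]:
        # Unwrap the Encoding object into a plain list of token IDs.
        return list(self._backend.encode(text).ids)

    def decode(self, ids: Sequence[int]) -> str:
        return self._backend.decode(list(ids))

    def __call__(self, text: str) -> Dict[str, List[int]]:
        # Mimic the dict-like result of calling a transformers tokenizer.
        return {"input_ids": self.encode(text)}
```

Usage would look like `tok = HFStyleTokenizer(Tokenizer.from_file(path))`, where `path` points at the RWKV tokenizer JSON; the rest of the code could then treat `tok` like the transformers tokenizers it already handles.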

@lbeurerkellner could you point me in the right direction?

Please let me know what you think.

Anyway, happy Easter!

lambdaofgod avatar Apr 08 '23 09:04 lambdaofgod

Wow, thanks for getting right into it. I won't have much time to look into it over the weekend, but I will answer more concretely next week. Happy Easter to you as well.

dclib is an array-based library for implementing decoding algorithms independently from model backends and e.g. control flow logic. More on this, a bit more publicly, soon.

lbeurerkellner avatar Apr 08 '23 10:04 lbeurerkellner

I've written a synchronous version (I had problems making the queues work) that can be used to illustrate the point.

lambdaofgod avatar Apr 08 '23 12:04 lambdaofgod

Thanks for the work. Can you comment on how FastAPI compares to e.g. gRPC with respect to throughput and latency? We are currently planning to optimise the LMQL Inference API or even switch to an established solution altogether.

lbeurerkellner avatar Apr 14 '23 08:04 lbeurerkellner

With the updated inference infrastructure, the API has been replaced by a custom socket-based protocol, LMTP: https://github.com/eth-sri/lmql/tree/main/src/lmql/models/lmtp

lbeurerkellner avatar Jul 19 '23 15:07 lbeurerkellner