[Prototype] Self-hosted CodeLlama LLM for code autocompletion
Summary
This PR provides a solution for using CodeLlama as a self-hosted code autocompletion backend.
- A standalone LLM needs to run separately and expose a RESTful API. The most straightforward way is to use llama.cpp, which lets you host the LLM either on a Mac laptop or on a machine with a GPU.
- Please follow the llama.cpp instructions to run the LLM locally or in the cloud; feel free to let me know if a detailed tutorial is needed.
- A Sidebar setting is added to enable/disable the copilot
- tRPC does not seem to support multiple providers in React very well (ref), so I put the route in api/src/server; initially we had talked about creating a standalone server service for the copilot-related functionality. A minimal sketch of such a route is shown right after this list.
- There are a couple of follow-up patches to add, e.g., a Vite bug fix, monitoring the copilot service connection status (similar to the sync status at the top of the Sidebar), and adding an infilling mode in addition to the autocomplete mode. This PR also aims to collect some early feedback on the overall design and architecture.
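For orientation, here is a minimal, hypothetical sketch of what such a route could look like: a tRPC procedure in api/src/server that forwards the prompt to the llama.cpp server's /completion endpoint. The router and procedure names, the environment variables, and the use of Node 18+ global fetch are assumptions for illustration, not the actual code in this PR.

```ts
import { initTRPC } from "@trpc/server";
import { z } from "zod";

const t = initTRPC.create();

// Hypothetical copilot router; the real route in api/src/server may differ.
export const copilotRouter = t.router({
  complete: t.procedure
    .input(z.object({ prompt: z.string() }))
    .mutation(async ({ input }) => {
      // These correspond to the --copilotIP / --copilotPort flags used in the
      // Test section; reading them from the environment is a simplification.
      const ip = process.env.COPILOT_IP ?? "127.0.0.1";
      const port = process.env.COPILOT_PORT ?? "9090";
      // llama.cpp's server example accepts a JSON body with `prompt` and
      // `n_predict`, and returns the generated text in the `content` field.
      const res = await fetch(`http://${ip}:${port}/completion`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ prompt: input.prompt, n_predict: 128 }),
      });
      const data = (await res.json()) as { content: string };
      return data.content;
    }),
});
```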
Test
- First, enable the RESTful API on llama.cpp, assuming the IP address is x.x.x.x and the port is 9090.
- On the local machine, open a terminal and run `cd codepod/api/`, then `pnpm dev --copilotIP x.x.x.x --copilotPort 9090`.
- Note that the screenshot below is intended to demonstrate the functionality; the quality of the auto-completion might be low due to the 4-bit quantized llama-7b model.
> tRPC seems not very well support multiple providers in React, ref,
I also came across the multiple-providers question the other day, and it is well supported: trpc#3049. I have implemented multiple providers in https://github.com/codepod-io/codepod-cloud/pull/11. Related code:
https://github.com/codepod-io/codepod-cloud/blob/113f4f7ca3656d6db2296bb32a64dc8ae3ae3342/ui/src/lib/trpc.ts#L9-L16
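For readers who do not follow the links, the general pattern from trpc#3049 is roughly the sketch below: create one createTRPCReact instance per backend router and nest their providers at the app root. The router types and import paths here are placeholders, not the code in the linked file.

```ts
import { createTRPCReact } from "@trpc/react-query";
// Placeholder router types; the real ones are exported by the respective servers.
import type { AppRouter } from "./api-router-type";
import type { CopilotRouter } from "./copilot-router-type";

// One independent tRPC React instance per backend.
export const trpc = createTRPCReact<AppRouter>();
export const copilotTrpc = createTRPCReact<CopilotRouter>();

// At the app root the providers are simply nested, each with its own client:
//
//   <trpc.Provider client={apiClient} queryClient={queryClient}>
//     <copilotTrpc.Provider client={copilotClient} queryClient={queryClient}>
//       <QueryClientProvider client={queryClient}>{children}</QueryClientProvider>
//     </copilotTrpc.Provider>
//   </trpc.Provider>
```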
With that said, it could actually be better and simpler to leave it in the api/ routers, so that the frontend always has a single API to talk to. We can let api/ forward the request to the actual LLM service internally through tRPC or gRPC.
Also, there is an uncaught exception in the console for canceled API calls. I'd like to catch it and show a cancellation message instead, to keep the console clean.
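As a hypothetical sketch (assuming the completion request is issued with fetch and canceled through an AbortController; the endpoint path is a placeholder), the cancellation could be handled like this:

```ts
const controller = new AbortController();

async function requestCompletion(prompt: string): Promise<string | null> {
  try {
    const res = await fetch("/copilot/completion", { // placeholder endpoint
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt }),
      signal: controller.signal,
    });
    const data = (await res.json()) as { content: string };
    return data.content;
  } catch (err) {
    // An aborted fetch rejects with a DOMException named "AbortError";
    // swallow it and log a short message instead of letting it surface
    // as an uncaught exception in the console.
    if (err instanceof DOMException && err.name === "AbortError") {
      console.log("copilot request canceled");
      return null;
    }
    throw err;
  }
}

// Elsewhere, e.g. when the user keeps typing: controller.abort();
```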
> monitoring the copilot service connection status
This isn't that critical. We can assume that the service is up.
> adding infilling mode in addition to autocomplete mode
This is quite important. We quite often edit code in the middle, not just at the end.
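For reference, a rough sketch of what an infill request could look like, assuming a llama.cpp server build that exposes an /infill endpoint taking the text before and after the cursor separately (matching CodeLlama's fill-in-the-middle training). The host, port, and function name are placeholders:

```ts
// Hypothetical infill helper; x.x.x.x:9090 stands in for the llama.cpp server.
async function infill(prefix: string, suffix: string): Promise<string> {
  const res = await fetch("http://x.x.x.x:9090/infill", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      input_prefix: prefix, // code before the cursor
      input_suffix: suffix, // code after the cursor
      n_predict: 64,
    }),
  });
  const data = (await res.json()) as { content: string };
  return data.content; // the generated middle segment
}
```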
After the discussion, we decided to leave this PR as a reference for integrating the self-hosted copilot and to address the comments in the codepod-cloud repo.