Open-Assistant
Basic implementation of a plugin system for OA
Hi, this is my first PR here, but I was somewhat active on other fronts of OA development.
This PR brings some basic plugin functionality to Open-Assistant. As discussed with @yk and @andreaskoepf, there are quite a few directions in which something like this could be integrated with OA, but it should serve as an initial proof of concept and exploratory feature for third-party integrations with OA.
I also included a small calculator plugin as a possible candidate for an OA-internal plugin, which would be something like the default one, both for people to try out and as an example of how one could implement their own plugins.
If we decide to include this plugin, a deployment/hosting mechanism for it should be added.
I will push a separate branch in the next couple of days that will serve as an alternative to this approach, so we can A/B test it along with the new models (SFTs/RLHFs).
I also tried to comment on every weird quirk or decision in the code, so one can easily understand and change it, but there are quite a few places where a simple newline or quote character in specific strings could affect LLM performance in the plugin usage scenario.
I will also try to push some documentation regarding plugin development, but the calculator plugin already contains some useful comments on what to pay attention to.
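To give a rough idea of the shape of a plugin, here is a minimal sketch using FastAPI, which can be served with uvicorn the same way the calculator plugin is; the endpoint and the manifest fields below are illustrative, not the exact ones in this PR:

# main.py - minimal plugin sketch, illustrative only
from fastapi import FastAPI

app = FastAPI()

@app.get("/ai-plugin.json")
def manifest():
    # The inference server fetches this manifest to learn about the plugin.
    # The field names here are an assumption, shown only for illustration.
    return {
        "name_for_model": "calculator",
        "name_for_human": "Calculator",
        "description_for_model": "Evaluate simple arithmetic expressions.",
        "api": {"type": "openapi", "url": "http://localhost:8085/openapi.json"},
    }

@app.get("/calculate")
def calculate(expression: str):
    # Toy evaluator for illustration only; a real plugin should parse input safely.
    if not set(expression) <= set("0123456789+-*/(). "):
        return {"error": "unsupported characters"}
    return {"result": eval(expression)}

You would then serve it with something like uvicorn main:app --port 8085 --host 0.0.0.0.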
Here are some of the current UI changes introduced with this PR.
Plugin chooser component
(screenshot)

Plugin execution details component
(screenshot)

Plugin-assisted answer
(screenshot)

Verified plugin usage UI look
(screenshot)

Some plugin usage examples
(screenshot)

Mixed usage example where the model chooses not to use the plugin on its own
(screenshot)
So cool! If possible, can you add some docstrings to some of the important functions, to make it easier to just jump in and understand them? I'm not 100% sure whether we have any standards around this in the repo yet, but when thinking about docs, starting there might be the least painful for now, given how big this PR is.
What would be a good way to help test this out?
Would it be feasible for me to try this "locally" in some way?
I can tell you how I ran this "locally" if it helps.
I used Docker for the complete OA stack on my local server, like this:
docker compose --profile frontend-dev --profile inference up --build --attach-dependencies
and the inference server ran on a remote machine (4x A100 80GB), sharded with https://github.com/huggingface/text-generation-inference.
You can then point the worker to that inference server by editing worker_full_main.sh. I am not sure that is the easiest way, but that is how I did it.
You can easily fit it on a single A100, or even a weaker GPU with half precision, but if you want to do prompt engineering, which requires re-running generations over and over again, you will soon give up because of the slowness... at least I did, lol.
As for the model requirements, there are some details in the README.md files inside the inference and worker folders.
I used Docker for the complete OA stack on my local server, like this:
docker compose --profile frontend-dev --profile inference up --build --attach-dependencies
Strangely I keep encountering
Error response from daemon: network 102b7c8cf4e45cd48696f81aaef73b33e88df2d979cdad8ca372c911c9753b7b not found
with the frontend. Can you also elaborate more on how to run it using just the inference text client, reducing the number of container services we need? For instance, what Docker command line do you use to start the calculator plugin service?
EDIT: I cleaned up the issue there, but the above is more about ways to set up a service like the calculator and possibly reduce the service dependencies to just a text client. In both cases we will need to start the plugin service, as it won't be available from a GUI frontend either.
Hi,
The plugin is not included in the Docker containers, as it's just an example for now; there is no internal agreement yet on how we would go about default 'out-of-the-box' plugins.
You can for now just run this command from the plugin folder: uvicorn main:app --reload --port 8085 --host 0.0.0.0
And update the IP of the plugin in inference/server/oasst_inference_server/routes/configs.py according to your network environment.
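The relevant bit in that file looks roughly like this (the IP is just my local machine; point it at wherever your plugin is running):

# inference/server/oasst_inference_server/routes/configs.py (excerpt)
# NOTE: Replace this with plugins that we will provide out of the box
DUMMY_PLUGINS = [
    inference.PluginEntry(
        url="http://192.168.0.35:8085/ai-plugin.json",  # swap in your plugin host's IP
        enabled=False,
        trusted=True,
    ),
]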
Hi,
The plugin is not included in the Docker containers, as it's just an example for now; there is no internal agreement yet on how we would go about default 'out-of-the-box' plugins.
That's OK. The idea, however, is to spin it up so that we can experiment with a plugin in general and with the changes here.
You can for now just run this command from the plugin folder:
uvicorn main:app --reload --port 8085 --host 0.0.0.0
And update the IP of the plugin in inference/server/oasst_inference_server/routes/configs.py according to your network environment.
Not sure if all of this is enough:
I did
pip install uvicorn starlette
/mnt/local/notebooks/Open-Assistant/inference/worker/plugins/calculator> uvicorn main:app --reload --port 8085 --host 0.0.0.0
INFO: Will watch for changes in these directories: ['/mnt/local/notebooks/Open-Assistant/inference/worker/plugins/calculator']
INFO: Uvicorn running on http://0.0.0.0:8085 (Press CTRL+C to quit)
INFO: Started reloader process [29640] using statreload
INFO: Started server process [29642]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: 172.20.0.12:42990 - "GET /ai-plugin.json HTTP/1.1" 200 OK
INFO: 172.20.0.10:44318 - "GET /ai-plugin.json HTTP/1.1" 200 OK
INFO: 192.168.40.1:46356 - "GET / HTTP/1.1" 404 Not Found
INFO: 172.20.0.10:51110 - "GET /ai-plugin.json HTTP/1.1" 200 OK
and adapted the IP in
diff --git a/inference/server/oasst_inference_server/routes/configs.py b/inference/server/oasst_inference_server/routes/configs.py
index 46a8ff99..7c0f2a0e 100644
--- a/inference/server/oasst_inference_server/routes/configs.py
+++ b/inference/server/oasst_inference_server/routes/configs.py
@@ -12,7 +12,7 @@ from oasst_shared.schemas import inference
# NOTE: Replace this with plugins that we will provide out of the box
DUMMY_PLUGINS = [
inference.PluginEntry(
- url="http://192.168.0.35:8085/ai-plugin.json",
+ url="http://192.168.0.31:8085/ai-plugin.json",
enabled=False,
trusted=True,
),
then tested that the port is open via telnet, and it is there. The web UI shows the borked menu you can see above, and the logs don't show the plugins list being populated with everything.
Everything you have done seems about right. I am not sure why it's not working for you; do you have errors in the Docker logs?
It would be cool if you or someone could make some automated "dev" chat sessions, like this: I, or any dev, provide input messages as JSON files, drag them over the chat UI, and it automatically sends one message, waits for the answer, then continues with the second message, etc. For now, I do that manually.
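Something roughly along these lines is what I mean (the endpoint paths and payload shapes below are hypothetical placeholders, not the real inference server API):

# dev_chat_session.py - hypothetical sketch of an automated dev chat session
import json
import sys

import requests

SERVER = "http://localhost:8000"  # assumed address of the inference server

def run_session(path: str) -> None:
    # The JSON file is assumed to be a plain list of user messages to send in order.
    with open(path) as f:
        messages = json.load(f)
    chat_id = requests.post(f"{SERVER}/chats").json()["id"]  # hypothetical endpoint
    for msg in messages:
        # hypothetical endpoint; assume the response body carries the assistant's reply
        reply = requests.post(f"{SERVER}/chats/{chat_id}/messages", json={"content": msg}).json()
        print(reply)

if __name__ == "__main__":
    run_session(sys.argv[1])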
Everything you have done seems about right. I am not sure why it's not working for you; do you have errors in the Docker logs?
Unfortunately no error of any kind (also no Failed to fetch plugin config logging messages from the server). Do the logs contain useful debugging information on whether the plugin was detected and loaded in some way? I can see the initial GET request for the service:
INFO: 172.20.0.10:37900 - "GET /ai-plugin.json HTTP/1.1" 200 OK
and I can see the minimal plugin being listed in the work request:
plugins=[PluginEntry(url='Test', enabled=True, plugin_config=None, trusted=False)]
Perhaps the plugin_config should be populated and values here should be different?
Yes, there are some logs, like the ones in routes/configs.py.
Wondering, for a big PR like this, is there a way (or does it make sense) to deploy it as a feature branch to its own testing env, so that people reviewing can easily play with it on a specific dev env?
Not sure how feasible that would be with the current infra and DevOps side of things, but I was just wondering whether it could be useful if it were possible.
Not sure how feasible that would be with the current infra and DevOps side of things, but I was just wondering whether it could be useful if it were possible.
Good idea, but it's going to be pretty hard currently :D I think we'll go the route of just getting it onto dev/staging and then, if things don't work out, fix or revert.
I think the last commit deleted website/package-lock.json when it should have deleted package-lock.json in the repo root.
Hmm, it seems like the merge of main into the PR branch has caused some mess in the PR history. I reverted it and will try a rebase to fix the conflicts.
:x: pre-commit failed.
Please run pre-commit run --all-files locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md