WIP: local LLM prototype (would like feedback)

Open hanselke opened this issue 2 years ago • 3 comments

Added a docker-compose file that launches NATS + Falcon-7B.
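For reference, the compose file is roughly of this shape (a hypothetical reconstruction, not the actual file; the service names, build path, and env var are my assumptions):

```yaml
services:
  nats:
    image: nats:latest
    ports:
      - "4222:4222"
  falcon7b:
    build: ./services/falcon7b   # assumed location of the worker
    depends_on:
      - nats
    environment:
      NATS_URL: nats://nats:4222  # worker connects over the compose network
```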

It's currently pretty hacky, as NATS requires async.

  1. Tried to use the async LLM `_call` functions and `agenerate`, but that gives a weird error.
  2. Used asyncclick instead to get async CLI functions.
  3. Ignored lint rule "INP001" (`services/falcon7b/nats_falcon7b.py is part of an implicit namespace package. Add an __init__.py.`); TODO: not sure how to deal with that one.
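For context on point 2: asyncclick is a drop-in replacement for click whose command bodies can be `async def`. The bridge it provides under the hood looks roughly like this stdlib-only sketch (names are hypothetical; the real code would `await` a NATS request instead of sleeping):

```python
import asyncio
import functools

def coro_command(func):
    """Bridge an async handler into a synchronous CLI entry point.

    asyncclick does this transparently for click commands; this is
    just the bare pattern it implements.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return asyncio.run(func(*args, **kwargs))
    return wrapper

@coro_command
async def ask(prompt: str) -> str:
    # Stand-in for the NATS round trip (e.g. `await nc.request(...)`).
    await asyncio.sleep(0)
    return f"echo: {prompt}"
```

Calling `ask("hi")` from synchronous code then just works, which is the whole point of the bridge.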

Plus, Falcon-7B doesn't work with the current prompts; it pretty much just returns the full prompt back for now.
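A common workaround for that echo behavior is to strip the prompt prefix from the decoded output before returning it, since causal LMs decode prompt + completion as one sequence. A minimal sketch (the function name is mine, not from the codebase):

```python
def strip_prompt(prompt: str, generated: str) -> str:
    """Remove the echoed prompt prefix from a causal LM's decoded output."""
    if generated.startswith(prompt):
        return generated[len(prompt):].lstrip()
    return generated
```

This only helps when the model echoes the prompt verbatim; prompt-template mismatches still need per-model handling.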

TODOs pending approach review:

  1. Clean up NATS server routing so it can be run outside the Docker network.
  2. Build a universal-ish transformers pipeline for different models? Probably a few categories like bfloat16 plus various quantization options. Maybe find a way to rip off https://github.com/go-skynet/LocalAI.
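The "few categories" idea in point 2 could be a simple dispatch from category name to the kwargs passed to transformers' `from_pretrained()`. A hypothetical sketch (the category names are my invention; `load_in_8bit`/`load_in_4bit` and `device_map` are real transformers/bitsandbytes loading options):

```python
# Map a coarse per-model "load profile" to from_pretrained() kwargs.
LOAD_PROFILES = {
    "bfloat16": {"torch_dtype": "bfloat16"},
    "float16":  {"torch_dtype": "float16"},
    "8bit":     {"load_in_8bit": True, "device_map": "auto"},
    "4bit":     {"load_in_4bit": True, "device_map": "auto"},
}

def load_kwargs(category: str) -> dict:
    """Return a copy of the kwargs for one load profile."""
    try:
        return dict(LOAD_PROFILES[category])
    except KeyError:
        raise ValueError(f"unknown load profile: {category}")
```

Usage would then be something like `AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs("8bit"))`, keeping per-model config down to a single string.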

Bottlenecks:

  1. Need a mechanism for different prompts for different LLMs.
  2. Need to learn how to use pytest properly, and ideally have a code-generation test so we can actually compare the different LLMs. Need to close the loop by running generated code against unit tests.
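The closed loop in point 2 can be sketched in a few lines: execute the model's generated source, then the unit tests, in one namespace, and treat any exception as a failure. This is a toy harness, not the project's actual test setup; a real version would need to sandbox the `exec`:

```python
def passes_tests(generated_source: str, test_source: str) -> bool:
    """Run model-generated code, then its unit tests, in a shared namespace.

    WARNING: exec() of untrusted model output is unsafe; a real harness
    should run this in a subprocess or container.
    """
    namespace: dict = {}
    try:
        exec(generated_source, namespace)  # define the generated functions
        exec(test_source, namespace)       # run assertions against them
    except Exception:
        return False
    return True
```

With something like this, comparing models becomes "percentage of prompts whose output passes its tests," which scales better than eyeballing outputs one model at a time.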

hanselke avatar Jul 28 '23 06:07 hanselke

Working on this now

TechNickAI avatar Aug 02 '23 14:08 TechNickAI

I got basic Hugging Face Hub "working" (with bad results)

https://github.com/gorillamania/AICodeBot/commit/cb604ca970146d72d4dc836ba8a6888528a61c6c

But I think what we actually need is local LLMs, and the direction I'm going is this Docker image from Hugging Face that runs highly optimized local models:

https://github.com/huggingface/text-generation-inference#using-a-private-or-gated-model
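For anyone trying it, the TGI image is launched roughly like this (a sketch based on the text-generation-inference README; the model id and port are examples, and the token is only needed for private/gated models):

```shell
# Serve Falcon-7B with text-generation-inference (requires a GPU).
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $PWD/data:/data \
  -e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id tiiuae/falcon-7b

# Then query the server over HTTP:
curl 127.0.0.1:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "def fib(n):", "parameters": {"max_new_tokens": 64}}'
```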

TechNickAI avatar Aug 04 '23 10:08 TechNickAI

Cool, I didn't know you could self-host models through Hugging Face Hub.

So from my research, it seems like we're going to need >20B params for it to be of any use.

I think the best way really is to close the loop by running the output code against unit tests. Then we'll be able to just run it through all the models out there. It probably won't be straightforward due to prompt differences, but I feel dumb testing them manually one by one, knowing that there are going to be more released over time.

hanselke avatar Aug 07 '23 02:08 hanselke

Thank you for your contribution. The code has long since diverged from this approach.

TechNickAI avatar Jul 01 '24 17:07 TechNickAI