
Local API or Gradio Client Support focus.

emangamer opened this issue 3 months ago • 5 comments

Gradio clients that run local language models, such as OobaBooga, and expose an API should be a major consideration for the roadmap. Usable model swapping with a caching layer is feasible. Months ago, when I saw the potential of the min-p greedy sampling that Kalomaze worked on, I made an example chart; min-p's token accuracy could be helpful for memory-driven task recall. [chart attached]
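For context, min-p sampling keeps only tokens whose probability is at least some fraction (`min_p`) of the top token's probability, then renormalizes over the survivors. A minimal pure-Python sketch of that filtering step (the function name and example logits are illustrative, not from any particular implementation):

```python
import math

def min_p_filter(logits, min_p=0.1):
    """Keep only tokens whose probability is >= min_p * (top token's
    probability), then renormalize. Returns (index, prob) pairs."""
    # Numerically stable softmax over the raw logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The cutoff scales with the confidence of the top token:
    # a peaked distribution prunes aggressively, a flat one keeps more.
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]

    # Renormalize the surviving tokens so they sum to 1.
    z = sum(p for _, p in kept)
    return [(i, p / z) for i, p in kept]

# A peaked distribution keeps only the dominant token;
# a flat distribution keeps everything.
peaked = min_p_filter([5.0, 1.0, 0.5, 0.1], min_p=0.1)
flat = min_p_filter([1.0, 1.0, 1.0, 1.0], min_p=0.1)
```

This confidence-scaled cutoff is why min-p behaves well for greedy-leaning decoding: when the model is sure, the filtered set collapses toward the top token.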

Please note that current projects like MemoryGPT allow API usage, but no widespread application supports effective model swapping or multi-system offloading. It is also worth noting that a side-server "chain" of cheaper machines, or a GGML-focused network solution, could enable more garage labs.
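The model-swapping-with-cache idea above can be sketched as an LRU cache keyed by model name. Everything here is hypothetical: `load_fn` stands in for whatever backend (e.g. a Gradio/OobaBooga API wrapper or a GGML loader) actually loads weights.

```python
from collections import OrderedDict

class ModelCache:
    """Hypothetical LRU cache for swapping local models in and out of
    memory. Holds at most `max_loaded` models; the least recently used
    one is evicted when a new model is requested."""

    def __init__(self, load_fn, max_loaded=2):
        self.load_fn = load_fn          # backend loader (assumption)
        self.max_loaded = max_loaded
        self._loaded = OrderedDict()    # model name -> loaded handle

    def get(self, name):
        if name in self._loaded:
            # Cache hit: mark this model as most recently used.
            self._loaded.move_to_end(name)
        else:
            if len(self._loaded) >= self.max_loaded:
                # Evict the least recently used model to free memory.
                self._loaded.popitem(last=False)
            self._loaded[name] = self.load_fn(name)
        return self._loaded[name]

# Usage sketch: two slots, three models -> the coldest model is evicted.
cache = ModelCache(load_fn=lambda n: f"weights:{n}", max_loaded=2)
cache.get("mistral")
cache.get("llama")
cache.get("mistral")  # touch mistral so llama becomes least recently used
cache.get("phi")      # evicts llama
```

In a real system the eviction step would also need to release VRAM/RAM explicitly, and `max_loaded` would be derived from available memory rather than a fixed count.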

The current roadblocks are memory management, non-useful hallucinations (effective hallucinations could generate better idea tokens in an agent context), and the lack of open-source inter-model conversation solutions suitable for system-prompting-style implementation.

The most feasible multi-model solution is to allow most components to be CPU-offloaded, while features like live-training one model with another model performing RLHF would be a "drop-in" option that requires a GPU with enough VRAM for training, unless a traditional RAM-based training solution is usable with a current model base such as Mistral.
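The CPU-offload idea reduces to a placement decision: fill the GPU with as many consecutive layers as fit in the VRAM budget, and keep the rest in system RAM. A minimal sketch of that greedy split (layer sizes and budget are illustrative numbers, not measurements of any real model):

```python
def plan_offload(layer_sizes_mb, vram_budget_mb):
    """Greedy device-placement sketch: assign consecutive layers to the
    GPU until the VRAM budget is exhausted, then offload the remainder
    to CPU RAM. Returns {layer index: "gpu" or "cpu"}."""
    placement = {}
    used = 0
    gpu_full = False
    for i, size in enumerate(layer_sizes_mb):
        if not gpu_full and used + size <= vram_budget_mb:
            placement[i] = "gpu"
            used += size
        else:
            # Once a layer fails to fit, keep all later layers on CPU
            # so the GPU/CPU split stays contiguous.
            gpu_full = True
            placement[i] = "cpu"
    return placement

# Usage sketch: five 100 MB layers against a 250 MB VRAM budget
# puts the first two layers on GPU and offloads the rest.
plan = plan_offload([100, 100, 100, 100, 100], vram_budget_mb=250)
```

A contiguous split matters in practice because interleaving GPU and CPU layers forces activations to cross the PCIe bus on every boundary.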

To summarize, focusing on API solutions such as ChatGPT or Claude will stagnate research on local language model feasibility. Building a feasible framework for agent structures, plus LoRA-based live tuning for memory-retention elements on a version-based task list, is most likely the best course.

emangamer · Mar 13 '24 06:03