
[Feature] Using an open-source LLM instead of OpenAI

Open mahimairaja opened this issue 2 years ago • 4 comments

mahimairaja, Jul 03 '23 06:07

STEPS:

  1. (optional) if you haven't tried it, download PygmalionAI Metharme 1.3B, a 2.9GB high-context autotasking engine, and add the Metharme template to the bottom of your prompt: <|system|><|user|>Input<|model|>

  2. create prelaunchers

  3. launch textgen, confirm api address

  4. launch gpt-engineer

  5. you're done, have fun
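For step 1, here is a minimal sketch of wrapping a prompt in the Metharme template quoted above. The function name `build_metharme_prompt` and the optional `system` argument are illustrative assumptions, not part of gpt-engineer or textgen; only the three role tokens come from the step itself.

```python
def build_metharme_prompt(user_input: str, system: str = "") -> str:
    """Wrap text in the Metharme role tokens quoted in step 1.

    The helper name and the `system` parameter are hypothetical; the
    token layout <|system|><|user|>Input<|model|> is taken from the post.
    """
    return f"<|system|>{system}<|user|>{user_input}<|model|>"


if __name__ == "__main__":
    # The model is expected to continue generating after the <|model|> token.
    print(build_metharme_prompt("Write a function that reverses a string."))
```

Leaving `system` empty reproduces the bare template from the step; any system instruction would go between `<|system|>` and `<|user|>`.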

prelauncher script i use for textgen (modify args and gpu settings to meet your individual needs and hardware)

create a fake/dummy/spoofed key, doesn't matter what it is, as long as it matches what you put in gpt-engineer

launch(textgen).sh

#!/bin/bash
export HOST_PORT=7861
export CUDA_VISIBLE_DEVICES=0
export TORCH_USE_CUDA_DSA=1
export TORCH_CUDA_ARCH_LIST=8.6+PTX
# max_split_size_mb is documented as a plain number of megabytes, so no unit suffix
export PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.25,max_split_size_mb:128
# any string works as the key, including the default textgen key "dummy"
export OPENAI_API_KEY="dummy"

python3 ./server.py --model TheBloke_orca_mini_7B-GPTQ --loader exllama --listen --listen-port "$HOST_PORT" --api --gpu-split 8_20 --trust-remote-code --extensions openai

prelauncher i use for gpt-engineer

launch(gpte).sh

#!/bin/bash
# must match the key you export/set in the textgen launcher
export OPENAI_API_KEY="dummy"
# The base URL depends on your textgen settings and on which ports were free at launch,
# so it may differ slightly if 5001 is in use. Confirm it in the output textgen prints
# after launching, before you start gpt-engineer.
export OPENAI_API_BASE=http://0.0.0.0:5001/v1

python3 gpt_engineer/main.py projects/MasterCodeBase/ --temperature 1 --steps tdd+
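Step 3 ("confirm api address") can also be checked from a script before launching gpt-engineer. The sketch below assumes the textgen openai extension answers the standard OpenAI-style `GET /models` route under the API base; the helper names `models_url` and `api_is_reachable` are made up for illustration.

```python
import json
import os
import urllib.error
import urllib.request


def models_url(api_base: str) -> str:
    """Return the model-listing URL for an OpenAI-style API base."""
    return api_base.rstrip("/") + "/models"


def api_is_reachable(api_base: str, api_key: str = "dummy", timeout: float = 5.0) -> bool:
    """Return True if GET <api_base>/models answers with parseable JSON.

    Assumption: the server exposes the OpenAI-style /models route.
    Any connection, HTTP, or parsing failure is reported as False.
    """
    try:
        request = urllib.request.Request(
            models_url(api_base),
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            json.load(response)
        return True
    except (urllib.error.URLError, OSError, ValueError):
        return False


if __name__ == "__main__":
    # Reads the same variables the launcher scripts export.
    base = os.environ.get("OPENAI_API_BASE", "http://0.0.0.0:5001/v1")
    key = os.environ.get("OPENAI_API_KEY", "dummy")
    print(f"{base} reachable: {api_is_reachable(base, key)}")
```

If this prints `False`, compare the port in `OPENAI_API_BASE` with the address textgen reported at startup.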

noxiouscardiumdimidium, Jul 03 '23 08:07

One thing that could be useful is to use the LangChain framework. But keep in mind that the current open-source models that can run on a standard local machine, like tiiuae/falcon-7b, are far from powerful enough to be used for this project.

I think that the OpenAI API will remain, in the short term, the only valid option for applications of this type.

ctrimborn-fr, Jul 03 '23 14:07

Yes, Falcon-7b does a poor job at coding. However, there are other open LLMs trained specifically for coding that can run on an RTX 3090 or better, or users can rent a Runpod.io VPS with 48GB-160GB of GPU VRAM and run the 30b/40b models.

These models were trained specifically for coding and appear to do pretty well; adding support for them would be great:

  1. WizardCoder-15B-GPTQ (4-bit) - https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GPTQ
  2. FalCoder 7b (Fine-tuned with CodeAlpaca 20k Instructions Dataset): https://huggingface.co/mrm8488/falcoder-7b
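As a rough check on the hardware figures above, the memory needed for the weights alone is the parameter count times bits per weight, divided by 8. The sketch below uses nominal parameter counts (15B, 40B) as assumptions and ignores the KV cache, activations, and loader overhead, so real usage will be higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Memory for the model weights alone, in GB (10^9 bytes).

    params_billions * 1e9 weights * bits / 8 bytes each, divided by 1e9.
    Excludes KV cache, activations, and framework overhead.
    """
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    # Nominal sizes for illustration only, not measured values.
    for name, params, bits in [
        ("15B model, 4-bit", 15, 4),
        ("40B model, 4-bit", 40, 4),
        ("40B model, 16-bit", 40, 16),
    ]:
        print(f"{name}: ~{weight_memory_gb(params, bits):.1f} GB for weights")
```

By this estimate a 4-bit 15B model needs about 7.5 GB for weights, which leaves headroom on a 24 GB RTX 3090, while a 40B model at 16-bit needs about 80 GB and lands in the rented-GPU range mentioned above.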

mindwellsolutions, Jul 06 '23 20:07

Has anyone confirmed that the steps posted above by noxiouscardiumdimidium work?

aldoyh, Jul 09 '23 12:07

This has been addressed in PRs #639 and #644.

ATheorell, Sep 04 '23 09:09