gpt-engineer
[Feature] Using an open-source LLM instead of OpenAI
STEPS:
- (optional) if you haven't tried it, download PYGMALION-AI Metharme 1.3B, a 2.9GB high-context AUTOTASKING ENGINE, and add the metharme template to the bottom of your prompt:
  <|system|><|user|>Input<|model|>
- create prelaunchers
- launch textgen, confirm api address
- launch gpt-engineer
- you're done, have fun
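For the optional Metharme step, here is a minimal sketch of appending the template to a prompt in bash. The helper name `metharme_prompt` and the optional system-text argument are my own assumptions for illustration; only the tag layout `<|system|><|user|>Input<|model|>` comes from the step above.

```shell
#!/bin/bash
# Hypothetical helper: wrap user input in the Metharme template from the step above.
# $1 = user input, $2 = optional system text (assumed to go after <|system|>).
metharme_prompt() {
  local user_input="$1"
  local system_text="${2:-}"
  # %s placeholders keep the input from being interpreted as a printf format.
  printf '<|system|>%s<|user|>%s<|model|>' "$system_text" "$user_input"
}
```

Calling `metharme_prompt "Input"` reproduces the template exactly as written in the step, so it can be pasted at the bottom of a prompt file.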
Prelauncher script I use for textgen (modify the args and GPU settings to suit your own needs and hardware).
Create a fake/dummy/spoofed key; it doesn't matter what it is, as long as it matches what you put in gpt-engineer.
launch(textgen).sh
#!/bin/bash
export HOST_PORT=7861
export CUDA_VISIBLE_DEVICES=0
export TORCH_USE_CUDA_DSA=1
export TORCH_CUDA_ARCH_LIST=8.6+PTX
export PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.25,max_split_size_mb:128
export OPENAI_API_KEY="dummy" # any string works in this field, including the default textgen key "dummy"
python3 ./server.py --model TheBloke_orca_mini_7B-GPTQ --loader exllama --listen --listen-port 7861 --api --gpu-split 8_20 --trust-remote-code --extensions openai
Prelauncher I use for gpt-engineer:
launch(gpte).sh
#!/bin/bash
export OPENAI_API_KEY="dummy" # make sure it matches what you export/set in textgen
export OPENAI_API_BASE=http://0.0.0.0:5001/v1 # may differ depending on your settings, or if port 5001 was already in use when textgen launched; confirm the base URL in the textgen launch output before launching gpt-engineer
python3 gpt_engineer/main.py projects/MasterCodeBase/ --temperature 1 --steps tdd+
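Since the API base can shift if port 5001 is taken, it may help to check the endpoint before launching gpt-engineer. This is a rough sketch, not part of either project: the function names are hypothetical, and it assumes the textgen openai extension answers `GET <base>/models` the way the OpenAI API does.

```shell
#!/bin/bash
# Hypothetical helper: build the models URL from an API base,
# tolerating a trailing slash on the base.
models_url() {
  local base="${1%/}"
  printf '%s/models' "$base"
}

# Hypothetical helper: return 0 if the endpoint answers with an HTTP success
# status within 5 seconds, non-zero otherwise.
# Assumption: the server exposes GET <base>/models.
api_is_up() {
  curl --silent --fail --max-time 5 \
    -H "Authorization: Bearer ${OPENAI_API_KEY:-dummy}" \
    "$(models_url "$1")" > /dev/null
}

# Usage (not run automatically):
#   api_is_up "$OPENAI_API_BASE" || { echo "textgen API not reachable" >&2; exit 1; }
```

If the check fails, compare `OPENAI_API_BASE` with the address textgen prints at launch.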
One thing that could be useful is the LangChain framework. But keep in mind that current open-source models that can run on a standard local machine, like tiiuae/falcon-7b, are far from powerful enough for this project.
I think that the OpenAI API will remain, in the short term, the only valid option for applications of this type.
Yes, Falcon-7b does a poor job at coding. However, there are other open LLMs trained specifically for coding that can run on an RTX 3090 or better, or users can rent a Runpod.io VPS with 48GB-160GB of GPU VRAM and run the 30b/40b models.
These models were trained specifically for coding and appear to do pretty well; adding support for them would be great:
- WizardCoder-15B-GPTQ (4-bit) - https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GPTQ
- FalCoder 7b (Fine-tuned with CodeAlpaca 20k Instructions Dataset): https://huggingface.co/mrm8488/falcoder-7b
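As a back-of-the-envelope check on which card a model fits, the weights alone take roughly (parameters in billions) x (bits per weight) / 8 gigabytes. The helper below is only a sketch with a hypothetical name; it ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```shell
#!/bin/bash
# Hypothetical helper: weight-only memory estimate in GB.
# $1 = parameter count in billions, $2 = bits per weight.
weights_gb() {
  # LC_ALL=C keeps the decimal point a "." regardless of locale.
  LC_ALL=C awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f", p * b / 8 }'
}

# Usage:
#   weights_gb 15 4   # 15B model at 4-bit, e.g. a GPTQ quantization
#   weights_gb 40 4   # 40B model at 4-bit
```

By this estimate a 15B model at 4-bit needs about 7.5 GB for weights and a 40B model at 4-bit about 20 GB, before any runtime overhead.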
Has anyone confirmed that this works?
This has been addressed in PRs #639 and #644.