MiniGPT-4
Sharing a template for running from the command line (instead of Gradio). Also, MiniGPT-4 can be run on Windows 10.
Here's a verified template script for running MiniGPT-4 from the command line, without any dependency on Gradio. I've also verified that MiniGPT-4 can be run on Windows 10; you just need to install a Windows build of bitsandbytes.
import argparse

from minigpt4.common.config import Config
from minigpt4.common.dist_utils import get_rank
from minigpt4.common.registry import registry
from minigpt4.conversation.conversation import Chat, CONV_VISION

# imports modules for registration
from minigpt4.datasets.builders import *
from minigpt4.models import *
from minigpt4.processors import *
from minigpt4.runners import *
from minigpt4.tasks import *


def parse_args():
    parser = argparse.ArgumentParser(description="Demo")
    parser.add_argument("--cfg_path", required=True, help="path to configuration file.")
    parser.add_argument("--img_path", required=True, help="path to an input image.")
    parser.add_argument("--gpu_id", type=int, default=0, help="specify the gpu to load the model.")
    parser.add_argument("--num_beams", type=int, default=1)
    # float rather than int, so that fractional temperatures such as 0.7 are accepted
    parser.add_argument("--temperature", type=float, default=1.0)
    parser.add_argument(
        "--options",
        nargs="+",
        help="override some settings in the used config, the key-value pair "
        "in xxx=yyy format will be merged into config file (deprecate), "
        "change to --cfg-options instead.",
    )
    args = parser.parse_args()
    return args


def main():
    # ========================================
    #             Model Initialization
    # ========================================
    print('Initializing model')
    args = parse_args()
    cfg = Config(args)

    model_config = cfg.model_cfg
    model_config.device_8bit = args.gpu_id
    model_cls = registry.get_model_class(model_config.arch)
    model = model_cls.from_config(model_config).to('cuda:{}'.format(args.gpu_id))

    vis_processor_cfg = cfg.datasets_cfg.cc_sbu_align.vis_processor.train
    vis_processor = registry.get_processor_class(vis_processor_cfg.name).from_config(vis_processor_cfg)
    chat = Chat(model, vis_processor, device='cuda:{}'.format(args.gpu_id))
    print('Model Initialization Finished')

    # upload image
    chat_state = CONV_VISION.copy()
    img_list = []
    llm_message = chat.upload_img(args.img_path, chat_state, img_list)
    print(llm_message)

    # ask a question
    user_message = "what is this image about?"
    chat.ask(user_message, chat_state)

    # get answer
    llm_message = chat.answer(conv=chat_state,
                              img_list=img_list,
                              num_beams=args.num_beams,
                              temperature=args.temperature,
                              max_new_tokens=300,
                              max_length=2000)[0]
    print(llm_message)


if __name__ == "__main__":
    main()
I got this error
Initializing model
usage: ipykernel_launcher.py [-h] --cfg_path CFG_PATH --img_path IMG_PATH [--gpu_id GPU_ID] [--num_beams NUM_BEAMS] [--temperature TEMPERATURE] [--options OPTIONS [OPTIONS ...]]
ipykernel_launcher.py: error: the following arguments are required: --cfg_path, --img_path
An exception has occurred, use %tb to see the full traceback.
SystemExit: 2
Any help?
Please read your error log. You didn't provide the cfg_path and img_path arguments.
I appreciate your response. Yes, I did provide them, and I made several attempts to add the proper paths, but I always get the same error. I am a beginner; what is wrong with my paths?
Initializing model
usage: ipykernel_launcher.py [-h] --eval_configs/minigpt4_eval.yaml EVAL_CONFIGS/MINIGPT4_EVAL.YAML --image.py IMAGE.PY [--gpu_id GPU_ID] [--num_beams NUM_BEAMS] [--temperature TEMPERATURE] [--options OPTIONS [OPTIONS ...]]
ipykernel_launcher.py: error: the following arguments are required: --eval_configs/minigpt4_eval.yaml, --image.py
An exception has occurred, use %tb to see the full traceback.
SystemExit: 2
(url)
So this program is supposed to be invoked from the command line. I was assuming you were using a command line, in which case you would run the script with the following command:
python program_name.py --cfg_path path_to_cfg --img_path path_to_img
Now, you are using IPython or something similar that I'm not familiar with. In that case, as you can see, the system is running ipykernel_launcher.py without providing any of the arguments.
There are 2 ways to fix this:
Easy way: remove the required=True statements and assign default values to --cfg_path and --img_path. Argparse will then use those defaults, so you don't need to provide them on the command line.
More proper way: find out how IPython takes arguments and provide them.
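One way to approach the "more proper way" in a notebook is to hand argparse an explicit list instead of letting it read sys.argv. The following is a minimal sketch, not part of the original script: build_parser and the argv parameter are names introduced here for illustration, the image path is hypothetical, and only a subset of the original arguments is shown.

```python
import argparse


def build_parser():
    # Same flags as the template script above (subset, for brevity).
    parser = argparse.ArgumentParser(description="Demo")
    parser.add_argument("--cfg_path", required=True, help="path to configuration file.")
    parser.add_argument("--img_path", required=True, help="path to an input image.")
    parser.add_argument("--gpu_id", type=int, default=0)
    return parser


def parse_args(argv=None):
    # argv=None makes argparse read sys.argv[1:], as on a normal command line.
    # In a notebook, pass the argument list explicitly instead.
    return build_parser().parse_args(argv)


# Hypothetical paths; replace them with your own config and image files.
args = parse_args(["--cfg_path", "eval_configs/minigpt4_eval.yaml",
                   "--img_path", "example.jpg"])
print(args.cfg_path, args.img_path, args.gpu_id)
```

Note that the flag names (--cfg_path, --img_path) stay unchanged; only the values after them are your paths.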
Thanks for your helpful code. I have a follow-up question: how can I input several images in a multi-turn dialogue?
In the newest code, you need to change CONV_VISION to CONV_VISION_LLama2 (or CONV_VISION_Vicuna0), and add "chat.encode_img(img_list)" after "chat.ask(user_message, chat_state)".
Your work is nice. I get this error:
/nfs/volume-512-1/wangchang/MiniGPT-4-2/MiniGPT-4/minigpt4/models/minigpt_base.py:69 in get_context_emb
   66         self.visual_encoder.float()
   67
   68     def get_context_emb(self, prompt, img_list):
 ❱ 69         device = img_list[0].device
   70         prompt_segs = prompt.split('<ImageHere>')
   71         assert len(prompt_segs) == len(img_list) + 1, "Unmatched numbers of image placeh
   72         seg_tokens = [
AttributeError: 'str' object has no attribute 'device'
thanks for your helpful code, and I have an extended question: how to input several images in a multi-turn dialogue?
Did you manage to implement this?
CONV_VISION_LLama2
A quick fix would be, in your parse_args() function, replacing
args = parser.parse_args()
by
args, unknown = parser.parse_known_args()
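As a rough illustration of what parse_known_args() changes, the sketch below feeds the parser a simulated argument list containing an extra "-f <file>" pair, similar to what a Jupyter kernel launcher typically passes. The file name and the default paths are made up for the example. As far as I can tell, parse_known_args() only tolerates undeclared arguments; arguments marked required=True must still be supplied, so this sketch uses defaults instead.

```python
import argparse

parser = argparse.ArgumentParser(description="Demo")
# Defaults instead of required=True, so the parser works without these flags.
parser.add_argument("--cfg_path", default="eval_configs/minigpt4_eval.yaml")
parser.add_argument("--img_path", default="example.jpg")  # hypothetical image path

# Simulated notebook-style arguments: a flag the script never declared.
simulated_argv = ["-f", "kernel-1234.json"]

# parse_args() would exit with "unrecognized arguments" here;
# parse_known_args() returns the leftovers as a list instead.
args, unknown = parser.parse_known_args(simulated_argv)
print(args.cfg_path)
print(args.img_path)
print(unknown)
```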