
[Feature] Support llava1.5 w4a16 model? The model is much slower than the original fp16 model?

Open ganliqiang opened this issue 1 year ago • 2 comments

Motivation

When I run this:

pipe = pipeline('liuhaotian/llava-v1.5-13b', chat_template_config=ChatTemplateConfig(model_name='vicuna'), cache_max_entry_count=0.1)

some of the answers are different from what the original model gives; the original model does not have this problem. My prompt is task + choices + format:

task = "your task is to find out what actions and events are included in the given image?"
choices = '''\nA. Someone are fighting\nB. Climbing the tree\nC. Climbing the wall \nD. occupying roads to management and sell things\nE. Hanging clothes along the street\nF. Someone fell down, lay or sit on the ground \nG. Climbing over a guardrail on the street\nH. Haphazard piles of materials \nI. Talking on phone\nJ. Someone is smoking \nK. None of the above'''
format = "\nAnswer with one or more option's letters from the given choices directly."

Is there a bug in this version, or am I using it the wrong way? Also, the model is about 3.0x slower than the original fp16 model.
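For reference, a minimal sketch of how the task + choices + format prompt described above could be assembled. The helper name `build_prompt` and the list form of the options are assumptions made for illustration; the option strings are copied from the post, with trailing spaces before line breaks dropped.

```python
# Sketch only: build_prompt is a hypothetical helper, not part of lmdeploy.
TASK = "your task is to find out what actions and events are included in the given image?"

OPTIONS = [
    "A. Someone are fighting",
    "B. Climbing the tree",
    "C. Climbing the wall",
    "D. occupying roads to management and sell things",
    "E. Hanging clothes along the street",
    "F. Someone fell down, lay or sit on the ground",
    "G. Climbing over a guardrail on the street",
    "H. Haphazard piles of materials",
    "I. Talking on phone",
    "J. Someone is smoking",
    "K. None of the above",
]

ANSWER_FORMAT = "\nAnswer with one or more option's letters from the given choices directly."


def build_prompt(task, options, answer_format):
    """Concatenate the task, one option per line, and the answer-format instruction."""
    choices = "".join(f"\n{option}" for option in options)
    return task + choices + answer_format


prompt = build_prompt(TASK, OPTIONS, ANSWER_FORMAT)
print(prompt)
```

Keeping the options in a list makes it easier to confirm that the same prompt text is sent to both the quantized model and the original model when comparing answers.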

Related resources

No response

Additional context

No response

ganliqiang avatar Mar 26 '24 02:03 ganliqiang

What GPU model are you using?

lzhangzz avatar Mar 26 '24 03:03 lzhangzz

What GPU model are you using?

A100 80G. The complete code is:

import time

from lmdeploy import pipeline, ChatTemplateConfig, GenerationConfig
from lmdeploy.vl import load_image

gen_config = GenerationConfig(top_k=1, temperature=0)
pipe = pipeline('liuhaotian/llava-v1.5-13b',
                chat_template_config=ChatTemplateConfig(model_name='vicuna'),
                cache_max_entry_count=0.1)

# img_path, prompt, and find_ans are not shown in this snippet
image = load_image(img_path)
vqa_time = time.time()
response = pipe((prompt, image), gen_config=gen_config)
print(f"vqa_time:{time.time() - vqa_time}")
print(response)
ans = find_ans(response.text)
print(ans)
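
The `find_ans` helper is not shown in the thread. A hypothetical version, assuming the reply contains standalone option letters A to K as the prompt requests, might look like the sketch below. It is a guess at the intent, not the poster's actual code, and it would misread a reply that uses "A" or "I" as an ordinary English word.

```python
import re


def find_ans(text):
    """Hypothetical parser: return distinct standalone letters A-K in order of appearance."""
    found = []
    for letter in re.findall(r"\b([A-K])\b", text):
        if letter not in found:
            found.append(letter)
    return found


print(find_ans("A, C"))
print(find_ans("F. Someone fell down, lay or sit on the ground"))
```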

I find that the GPU is idle most of the time, so I guess most of the time is spent preprocessing the image?
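
One way to check this guess is to time each stage separately. The sketch below is a generic timing helper with stand-in functions (`fake_load_image` and `fake_infer` are placeholders invented here so the example runs without a GPU); in the real script they would be replaced by `load_image(img_path)` and `pipe((prompt, image), gen_config=gen_config)`. Note that the timer in the posted code starts after `load_image`, so it already excludes image loading, and any preprocessing that happens inside the `pipe` call cannot be separated out this way.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, store):
    """Record the wall-clock duration of the enclosed block in store[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store[label] = time.perf_counter() - start


def fake_load_image(path):
    """Placeholder for load_image; sleeps briefly to simulate work."""
    time.sleep(0.01)
    return path


def fake_infer(prompt, image):
    """Placeholder for the pipe(...) call; sleeps briefly to simulate work."""
    time.sleep(0.02)
    return "A"


timings = {}
with timed("load_image", timings):
    image = fake_load_image("example.jpg")
with timed("inference", timings):
    reply = fake_infer("example prompt", image)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f}s")
```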

ganliqiang avatar Mar 26 '24 05:03 ganliqiang