lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
There are many different prompt styles for different LLMs, such as the openai/llama2 style (which notably supports a SYSTEM-role prompt), plain text style, ziya, etc. From api_server.py's parameters, we...
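For reference, a minimal sketch of the Llama-2 chat style this issue mentions, where the SYSTEM prompt is wrapped in `<<SYS>>` markers inside the first `[INST]` block (the helper name here is hypothetical):

```python
# Llama-2 chat format: system prompt inside <<SYS>> markers, user turn
# inside [INST] ... [/INST]. The BOS token is normally added by the tokenizer.
def llama2_prompt(system: str, user: str) -> str:
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

print(llama2_prompt("You are a helpful assistant.", "What is LightLLM?"))
```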
Hi, thanks for your great work. Is there any plan to support a generate() function like vllm or transformers? Without Docker, users could then also run generation code from a Python script. Like this: ```...
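For context, this is roughly what the requested offline interface looks like in vLLM itself (this is vLLM's real API; per this issue, lightllm does not expose an equivalent yet):

```python
# Offline generation with vLLM: load weights once, no HTTP server needed.
from vllm import LLM, SamplingParams

prompts = ["Hello, my name is", "The capital of France is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

llm = LLM(model="Linly-AI/Chinese-LLaMA-2-7B-hf")
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```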
lightllm commit id: 718e6d6dfffc75e7bbfd7ea80ba4afb77aa27726. Model link: https://huggingface.co/Linly-AI/Chinese-LLaMA-2-7B-hf. Server start command: python -m lightllm.server.api_server --model_dir Linly-AI/Chinese-LLaMA-2-7B-hf --host 0.0.0.0 --port 8100 --tp 1 --max_total_token_num 120000 --tokenizer_mode auto --trust_remote_code. Testing shows a very high first-token latency of about 3s. The issue can be reproduced with the model and start command above; could you please look into what is causing it?
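A minimal sketch for measuring time-to-first-token against the server started with the command above, assuming lightllm's TGI-style /generate_stream endpoint and payload schema (adjust both if your version differs):

```python
# Time how long the server takes to emit its first streamed chunk,
# which approximates first-token latency.
import time
import requests

url = "http://127.0.0.1:8100/generate_stream"  # endpoint name is an assumption
payload = {"inputs": "Please introduce yourself.",
           "parameters": {"max_new_tokens": 64}}

start = time.time()
with requests.post(url, json=payload, stream=True) as resp:
    resp.raise_for_status()
    for chunk in resp.iter_content(chunk_size=None):
        if chunk:  # first non-empty chunk ~= first token
            print(f"first token after {time.time() - start:.2f}s")
            break
```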
Hi, I am using baichuan13B to compare performance with vllm, and in this case lightllm shows no performance gain. The same test takes 240s with vllm, compared with lightllm's...
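For a fair side-by-side, one option is to time both servers under identical concurrent load; a rough sketch, with the endpoint URL and payload schema as assumptions:

```python
# Send the same batch of prompts concurrently and time the whole run,
# pointing URL at the lightllm and vllm servers in turn.
import time
import requests
from concurrent.futures import ThreadPoolExecutor

URL = "http://127.0.0.1:8100/generate"  # endpoint assumed

def one_request(prompt: str) -> str:
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": 128}}
    return requests.post(URL, json=payload).text

prompts = [f"Question {i}: explain attention." for i in range(256)]
start = time.time()
with ThreadPoolExecutor(max_workers=64) as pool:
    list(pool.map(one_request, prompts))
print(f"total wall time: {time.time() - start:.1f}s")
```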
Hi, I tried using lightllm with the baichuan13B model, but got the error below. I cannot find any TrainingArguments in the code, so is there anything else that needs to be configured?... The...
Process Process-8: Process Process-7: Traceback (most recent call last): File "<string>", line 21, in _rms_norm_fwd_fused KeyError: ('2-.-0-.-0-09caff3db89e80ddf0eb4f72675bc8f9-2b0c5161c53c71b37ae20a9996ee4bb8-c1f92808b4e4644c1732e8338187ac87-d962222789c30252d492a16cca3bf467-12f7ac1ca211e037f62a7c0c323d9990-5c5e32ff210f3b7f56c98ca29917c25e-06f0df2d61979d629033f4a22eff5198-0dd03b0bd512a184b3512b278d9dfa59-d35ab04ae841e2714a253c523530b071', (torch.float16, torch.float16, torch.float16, 'i32', 'i32', 'fp32'), (16384,), (True, True, True, (True, False), (True,...
Add support for MPT models, which are licensed under Apache 2.0 just like LightLLM. They come in 7B and 30B variants with different context lengths; here is one of them: https://huggingface.co/mosaicml/mpt-7b-8k
Questions about the output
1. How can the extra \n characters be removed from the output? 2. How can inference be run on multiple questions? 3. How can the batch size be changed?
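A minimal sketch addressing all three questions, assuming the HTTP /generate endpoint from the startup command above (the response field name is also an assumption; batching happens server-side and is bounded by flags such as --max_total_token_num rather than a client-side knob):

```python
import requests
from concurrent.futures import ThreadPoolExecutor

URL = "http://127.0.0.1:8100/generate"  # endpoint assumed

def ask(question: str) -> str:
    payload = {"inputs": question, "parameters": {"max_new_tokens": 128}}
    resp = requests.post(URL, json=payload).json()
    text = resp["generated_text"]   # field name assumed; check your version
    if isinstance(text, list):      # some versions may return a list
        text = text[0]
    return text.strip()             # 1. strip the extra leading/trailing "\n"

# 2. multiple questions: fan them out concurrently; the server batches
#    in-flight requests on its own (3. batch capacity is governed by
#    server flags such as --max_total_token_num, not by the client).
questions = ["What is the attention mechanism?", "Explain the KV cache."]
with ThreadPoolExecutor(max_workers=len(questions)) as pool:
    for q, a in zip(questions, pool.map(ask, questions)):
        print(q, "->", a)
```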
Which version is the load-test (benchmark) comparison data provided in the project based on?
Which TGI version, startup parameters, and hardware is the comparison data with TGI based on?
As mentioned in #20, lightllm performance degrades a lot without tokenizer.json. So for models without this file, would it be reasonable to add some automatic conversion...
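One possible auto-conversion, sketched with the standard transformers API: loading with the fast backend converts a slow tokenizer on the fly, and saving the result writes a tokenizer.json:

```python
# Convert a slow (SentencePiece-style) tokenizer to a fast one so that
# tokenizer.json exists on disk (requires the `tokenizers` package).
from transformers import AutoTokenizer

model_dir = "Linly-AI/Chinese-LLaMA-2-7B-hf"  # any model lacking tokenizer.json
tok = AutoTokenizer.from_pretrained(model_dir, use_fast=True)  # converts on load
tok.save_pretrained("./converted")  # ./converted/tokenizer.json now exists
```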