
Could inference acceleration for LLaMA and BLOOM be supported?

Open HuiResearch opened this issue 1 year ago • 4 comments

HuiResearch avatar Apr 18 '23 11:04 HuiResearch

It is not supported currently.

Taka152 avatar Apr 20 '23 10:04 Taka152

Support is planned for May, and we expect it to be deployable on a V100-32G.

hexisyztem avatar Apr 24 '23 09:04 hexisyztem

@hexisyztem Hi, can flash attention be used on V100?

frankxyy avatar Jun 12 '23 09:06 frankxyy

As you can see in https://github.com/HazyResearch/flash-attention, flash attention doesn't support V100.


hexisyztem avatar Jun 12 '23 09:06 hexisyztem