Could you add inference acceleration support for LLaMA and BLOOM?
It is not supported currently.
Support is planned for May, and it is expected to be deployable on a V100-32G.
@hexisyztem Hi, can flash attention be used on V100?
As you can see at https://github.com/HazyResearch/flash-attention, flash attention doesn't support V100.
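For anyone wanting to verify this on their own machine, one quick way is to inspect the GPU's compute capability: the V100 is Volta (sm_70), while the flash-attention repo lists Turing/Ampere-class GPUs (sm_75 and up) as the minimum. Below is a minimal sketch of such a check (my own illustration, not part of LightSeq or flash-attention), assuming PyTorch is installed:

```python
import torch

def flash_attention_supported() -> bool:
    """Rough check: does the current GPU meet FlashAttention's
    minimum compute capability? V100 reports (7, 0), which is
    below the Turing (7.5) floor; FlashAttention-2 additionally
    requires Ampere (8.0) or newer."""
    if not torch.cuda.is_available():
        return False
    major, minor = torch.cuda.get_device_capability()
    # Assumption: 7.5 (Turing) is used as the floor here, matching
    # the original FlashAttention; raise it to (8, 0) for FA2.
    return (major, minor) >= (7, 5)

if __name__ == "__main__":
    print("FlashAttention usable on this GPU:", flash_attention_supported())
```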