v2ray
Edit: There's a better implementation: https://huggingface.co/keyfan/grok-1-hf https://github.com/LagPixelLOL/grok-1-pytorch
@nivibilla Just tried loading it on 8x A100 80GB and it was using 20GB of VRAM on each GPU. For your case, maybe it's because device_map="auto" miscalculated some usage and...
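A minimal sketch of that loading path, assuming the keyfan/grok-1-hf checkpoint works through the standard transformers API; the 75GiB per-GPU cap and the `trust_remote_code=True` flag are assumptions, not confirmed in this thread. Capping `max_memory` explicitly is one way to work around `device_map="auto"` misestimating usage on one device:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "keyfan/grok-1-hf"

# Cap per-GPU memory explicitly so device_map="auto" can't overfill one
# device; 75GiB on 8x A100 80GB is an illustrative value, not a measurement.
max_memory = {i: "75GiB" for i in range(8)}

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # assumption: checkpoint is bf16-friendly
    device_map="auto",            # shard layers across the visible GPUs
    max_memory=max_memory,
    trust_remote_code=True,       # assumption: repo ships custom model code
)
```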
I ran into this problem too. On my computer it failed to load, and even after removing the extension I couldn't access it; it wouldn't load on my phone either, and only recovered after I changed my IP. Also, when this happens, logging into a different account in incognito mode fails to load as well, so I suspect refreshing too many times got the IP rate-limited.
:octocat:
When I load a LoRA for LLaMA 2 70B, I get the same error too (RuntimeError: CUDA error: an illegal memory access was encountered). But I can use LoRA with no...
> Well, if it works with 7b and 13b it's most likely related to GQA. Everything up until that 70b release has assumed that the number of heads is the...
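For context, a minimal sketch of the GQA shape mismatch being described, assuming the 70b model is LLaMA 2 70B (64 query heads but only 8 KV heads); the `repeat_kv` helper is illustrative, not taken from any particular codebase:

```python
import torch

def repeat_kv(kv: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Expand a (batch, num_kv_heads, seq, head_dim) K/V tensor so each
    KV head is shared by n_rep query heads."""
    if n_rep == 1:
        return kv
    b, h_kv, s, d = kv.shape
    return kv[:, :, None, :, :].expand(b, h_kv, n_rep, s, d).reshape(b, h_kv * n_rep, s, d)

# 70B-style shapes: code that assumes query and KV head counts match
# (as every model before the 70b release did) breaks on tensors like these.
q = torch.randn(1, 64, 16, 128)   # 64 query heads
k = torch.randn(1, 8, 16, 128)    # only 8 KV heads
k = repeat_kv(k, n_rep=64 // 8)   # expand KV heads to match the query heads
assert k.shape == q.shape
```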
> Well, it's not exactly a fix, because it should really work with fused attn, but I'll get to that. What I need, though, is an example 70b LoRA I...
Can you provide more info, like are there any error messages?
Hmmmm, I'll look into it, thanks.
# NOTICE: NO LONGER MAINTAINED