v2ray

21 comments by v2ray

Edit: There's a better implementation: https://huggingface.co/keyfan/grok-1-hf https://github.com/LagPixelLOL/grok-1-pytorch

@nivibilla Just tried loading it on 8x A100 80GB and it was using 20GB of VRAM on each GPU. In your case, maybe it's because device_map="auto" miscalculated some usage and...
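One common workaround when device_map="auto" misestimates per-GPU usage is to pass an explicit max_memory budget so accelerate/transformers never over-fills a device. A minimal sketch, assuming the 8x A100 80GB setup from the comment above; the 75GiB cap is an assumption to leave headroom, not a value from the original thread:

```python
# Hedged sketch: build an explicit per-GPU memory budget instead of
# letting device_map="auto" estimate it on its own.

def build_max_memory(num_gpus: int, per_gpu: str = "75GiB") -> dict:
    """Map each GPU index to a memory cap, in the format that
    transformers/accelerate accept for the max_memory argument."""
    return {i: per_gpu for i in range(num_gpus)}

max_memory = build_max_memory(8)
print(max_memory)
# This dict would then be passed to from_pretrained, e.g.:
# model = AutoModelForCausalLM.from_pretrained(
#     "keyfan/grok-1-hf", device_map="auto", max_memory=max_memory)
```

Capping memory this way still lets "auto" place layers, but within bounds you control, which avoids one GPU being assigned more than it can actually hold.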

I ran into this issue too, but on my computer it failed to load; even after removing the extension I still couldn't access it, and I couldn't access it from my phone either. It only recovered after I changed my IP. Also, once this happens, logging into another account in incognito mode fails to load as well, so I suspect refreshing too many times got the IP rate-limited.

When I load a LoRA for LLaMA 2 70B, I get the same error too (RuntimeError: CUDA error: an illegal memory access was encountered). But I can use LoRA with no...

> Well, if it works with 7b and 13b it's most likely related to GQA. Everything up until that 70b release has assumed that the number of heads is the...
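The GQA point above is that the 70B model uses fewer key/value heads than query heads (num_kv_heads=8 vs num_heads=64 for Llama 2 70B), so attention code written for 7B/13B, which assumes q, k, and v all have the same head count, breaks. A minimal sketch of the usual fix, repeating each KV head across its query group (the function name repeat_kv mirrors the common convention but is illustrative here, shown with NumPy for simplicity):

```python
import numpy as np

def repeat_kv(x: np.ndarray, n_rep: int) -> np.ndarray:
    """Expand (batch, num_kv_heads, seq, head_dim) to
    (batch, num_kv_heads * n_rep, seq, head_dim) so each KV head
    is shared by n_rep consecutive query heads."""
    return np.repeat(x, n_rep, axis=1)

num_heads, num_kv_heads = 64, 8   # Llama 2 70B GQA configuration
k = np.random.randn(1, num_kv_heads, 4, 128)
k_full = repeat_kv(k, num_heads // num_kv_heads)
print(k_full.shape)  # (1, 64, 4, 128)
```

After this expansion the keys/values line up head-for-head with the queries, which is why code paths (including LoRA-modified ones) that hard-code num_heads for k and v fail only on the 70B model.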

> Well, it's not exactly a fix, cause it should really work with fused attn, but I'll get to that. What I need though is an example 70b LoRA I...

Can you provide more info, e.g. are there any error messages?