hhhhpaa comments

Results: 9 comments by hhhhpaa

RuntimeError: CUDA error: no kernel image is available for execution on the device on 3xP40

I built a wheel in a `Python 3.10`, `PyTorch 2.1`, `CUDA 11.8` environment to support compute_60. For details, please refer to the [link](https://github.com/hhhhpaaa/Mamba-ssm). @SamsongB @DewEfresh
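For anyone wondering why a build targeting compute capability 6.0 helps on a P40: a rough sketch of the usual CUDA compatibility rule is below. It assumes the P40 reports compute capability 6.1, that an `sm_XY` binary runs on devices with the same major version and an equal or higher minor version, and that `compute_XY` PTX can be JIT-compiled on any device of capability X.Y or newer. The helper names are hypothetical. With PyTorch installed, `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability()` give the matching values for PyTorch's own build; the arch list of a separately compiled extension depends on how that extension was built.

```python
import re


def parse_arch(arch):
    """Split 'sm_61' or 'compute_60' into (kind, (major, minor)).

    Returns None for strings this sketch does not understand (e.g. 'sm_90a').
    """
    m = re.fullmatch(r"(sm|compute)_(\d{2,})", arch)
    if m is None:
        return None
    digits = m.group(2)
    # The last digit is the minor version, the rest is the major version.
    return m.group(1), (int(digits[:-1]), int(digits[-1]))


def build_supports_device(arch_list, capability):
    """Return True if any entry in arch_list can run on the given device."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue
        kind, (a_major, a_minor) = parsed
        # Binary code: same major version, device minor >= build minor.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX: forward compatible with any equal or newer capability.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


P40 = (6, 1)  # assumed compute capability of the Tesla P40
print(build_supports_device(["sm_70", "sm_80", "compute_80"], P40))
print(build_supports_device(["sm_60", "sm_70"], P40))
```

Under these assumptions the first arch list has nothing a 6.1 device can run, which is the situation that produces "no kernel image is available", while the second one does.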

can't find files in mamba/build/ when build from source

I have the same problem in WSL, but I didn't encounter this issue in Nvidia Docker, where it works normally.

LLVM ERROR for benchmark_generation_mamba_simple.py

> > Hi, I have the same issue as you. Could you tell me where you found that the GTX 1080 Ti does not support the architecture to...

Syntax error with ESP8266.doHttpPost()

> I am having this issue as well. Any fix? `def doHttpPost(self,host,path,content_type,content,user_agent='RPi-Pico',port=80):`

triton.runtime.autotuner.OutOfResources: out of resource: shared memory, Required: 254208, Hardware limit: 101376.

This is a Triton problem. Please uninstall triton and install triton-nightly. Reference: [issues/438](https://github.com/state-spaces/mamba/issues/438) @xypjq @zzzendurance

Small datasets

I'm also doing similar work, applying Mamba to small datasets. You can try adding gradient clipping or setting a weight decay on the optimizer, which worked for me. In addition,...
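Assuming the comment above refers to clipping gradients by their global norm, here is a minimal pure-Python sketch of what the two knobs do. The function names are hypothetical, the weight decay is the decoupled (AdamW-style) form applied to a plain SGD step, and the learning rate is an illustrative value. In PyTorch the usual tools would be `torch.nn.utils.clip_grad_norm_` and the `weight_decay` argument of an optimizer such as `torch.optim.AdamW`.

```python
import math


def clip_grad_norm(grads, max_norm):
    """Rescale grads so their global L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Scale is capped at 1.0 so small gradients are left untouched.
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [g * scale for g in grads], total_norm


def sgd_step_with_weight_decay(params, grads, lr, weight_decay):
    """One SGD step with decoupled weight decay: shrink each weight toward 0."""
    return [p - lr * g - lr * weight_decay * p for p, g in zip(params, grads)]


params = [1.0, -2.0]
grads = [3.0, 4.0]  # global norm is 5.0, well above the limit below

clipped, norm_before = clip_grad_norm(grads, max_norm=1.0)
params = sgd_step_with_weight_decay(params, clipped, lr=0.1, weight_decay=0.01)
print(norm_before, clipped, params)
```

Clipping bounds the size of any single update, while weight decay steadily pulls the weights toward zero; both act as regularizers, which may be why they help on small datasets.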

Small datasets

> Interesting, I have a weight decay set at 0.01 and max_grad_norm=1.0. So I found for the 20M model, having a learning rate of 1e-5 makes it overfit and it...

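processor.apply_chat_template cannot generate the prompt correctly

> Thanks, everyone. Yesterday I tried Qwen/Qwen2.5-VL-2B-Instruct and found that its processor can encode text. I noticed that the 3B model used by [@JJJYmmm](https://github.com/JJJYmmm) is also an instruct version, so I wonder whether something is wrong with the 2B base version. Also, I'm not sure whether I can later fine-tune with LoRA on top of the instruct version; the Qwen docs seem to recommend fine-tuning from the non-instruct model. > > Another question: after seeing the mixed video and text input issue raised by [@smallzhongfeng](https://github.com/smallzhongfeng), I'm wondering how to handle 3D images + text as input later on. Is it better to treat the 3D image as multiple 2D images, or as a video with a finite number of frames? Thanks a lot. I ran into this problem when fine-tuning Qwen2VL-2B; it's very strange.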

Mamba2 instance failed

> What is your causal_conv1d version? Is there a pytorch version? I still get an error: RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples...