hhhhpaa

Results: 9 comments by hhhhpaa

I built a wheel with `Python 3.10`, `PyTorch 2.1`, and `CUDA 11.8` that supports `compute_60`. For details, see the [link](https://github.com/hhhhpaaa/Mamba-ssm). @SamsongB @DewEfresh

I have the same problem in WSL, but not in NVIDIA Docker, where it works normally.

> > Hi, I have the same issue with you. Could you tell me where did you find the fact that GTX 1080 ti does not support the architecture to...

> I am having this issue as well. Any fix? `def doHttpPost(self, host, path, content_type, content, user_agent='RPi-Pico', port=80):`

This is a Triton problem. Please uninstall triton and install triton-nightly. Reference: [issues/438](https://github.com/state-spaces/mamba/issues/438) @xypjq @zzzendurance

I'm doing similar work, applying Mamba to small datasets. You can try adding gradient clipping or setting a weight decay on the optimizer; both worked for me. In addition,...
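For readers unfamiliar with the two techniques, here is a minimal pure-Python sketch, assuming the gradient technique meant here is clipping by global L2 norm. The helper names `clip_by_global_norm` and `sgd_step_with_weight_decay` and all numeric values are illustrative assumptions, not taken from the original thread. In PyTorch the usual tools are `torch.nn.utils.clip_grad_norm_` and the optimizer's `weight_decay` argument.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their combined L2 norm is at most max_norm.

    Hypothetical helper for illustration; returns the (possibly rescaled)
    gradients and the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

def sgd_step_with_weight_decay(params, grads, lr, weight_decay):
    """One plain SGD step with decoupled weight decay.

    Besides following the gradient, each weight shrinks toward zero
    by lr * weight_decay * weight.
    """
    return [p - lr * g - lr * weight_decay * p for p, g in zip(params, grads)]

# Illustrative values only.
params = [0.5, -1.0]
grads = [3.0, 4.0]  # global norm is 5.0, well above the limit

clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
params = sgd_step_with_weight_decay(params, clipped, lr=0.1, weight_decay=0.01)
```

Clipping bounds the size of each update, while weight decay keeps the weights themselves small; on small datasets both act as mild regularizers.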

> Interesting, I have a weight decay set at 0.01 and max_grad_norm=1.0. So I found for the 20M model, having a learning rate of 1e-5 makes it overfit and it...

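> Thanks, everyone. Yesterday I tried Qwen/Qwen2.5-VL-2B-Instruct and found that its processor can encode text. I noticed that the 3B model used by [@JJJYmmm](https://github.com/JJJYmmm) is also an instruct version, so I wonder whether something is wrong with the 2B base version. Also, I am not sure whether I can later fine-tune with LoRA on top of the instruct version; the Qwen docs seem to recommend fine-tuning from the non-instruct model.
>
> Another question: after reading what [@smallzhongfeng](https://github.com/smallzhongfeng) said about mixed video and text input, I am wondering how to handle 3D images plus text as input later on. Is it better to treat a 3D image as multiple 2D images, or as a video with a limited number of frames?

Thanks a lot. I ran into this problem too when fine-tuning Qwen2VL-2B, which is very strange.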

> What is your causal_conv1d version? Is there a pytorch version? I still get an error: RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples...