Tuo Chen

Results 18 issues of Tuo Chen

A year later, unimim still has not released the source code. A year ago, I spent a long time trying to replicate and analyze this paper, but was unable to...

Your code is
```
if use_bfloat16:
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```
I think it should be
```
if use_bfloat16:
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)
    scaler.step(optimizer)
    scaler.update()
```
Am I right?
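For context on the question above: PyTorch's documentation states that `GradScaler.step` unscales the gradients itself unless `unscale_` was already called for that optimizer in the current iteration. An explicit `unscale_` therefore mainly matters when gradients are inspected or clipped between `backward()` and `step()`. The sketch below is a pure-Python toy model of that behavior, not PyTorch's implementation; `ToyScaler`, `clip_`, and `run` are hypothetical names used only for illustration.

```python
class ToyScaler:
    """Toy stand-in for a loss scaler (assumption: NOT PyTorch's GradScaler)."""

    def __init__(self, scale=1024.0):
        self.scale = scale
        self._unscaled = False

    def scale_grads(self, grads):
        # Backward on a scaled loss yields gradients multiplied by the scale.
        self._unscaled = False
        return [g * self.scale for g in grads]

    def unscale_(self, grads):
        # Divide in place and remember it, so step() does not unscale twice.
        for i, g in enumerate(grads):
            grads[i] = g / self.scale
        self._unscaled = True

    def step(self, params, grads, lr=0.1):
        # Unscale here only if the caller has not already done so.
        if not self._unscaled:
            self.unscale_(grads)
        for i, g in enumerate(grads):
            params[i] -= lr * g
        self._unscaled = False


def clip_(grads, max_norm):
    # Rescale gradients in place so their L2 norm is at most max_norm.
    norm = sum(g * g for g in grads) ** 0.5
    if norm > max_norm:
        factor = max_norm / norm
        for i, g in enumerate(grads):
            grads[i] = g * factor


def run(explicit_unscale, clip=None):
    scaler = ToyScaler()
    params = [1.0, 2.0]
    grads = scaler.scale_grads([0.5, -0.25])
    if explicit_unscale:
        scaler.unscale_(grads)
    if clip is not None:
        clip_(grads, clip)  # clipping sees whatever scale the grads have now
    scaler.step(params, grads)
    return params
```

In this toy, `run(False)` and `run(True)` update the parameters identically, while the two diverge once clipping is added, because clipping without a prior `unscale_` operates on the still-scaled gradients.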

https://github.com/QwenLM/Qwen2-Math

feature request

Each "v2" Mamba block contains out_a and out_b, which is both forward and backward, but in the for loop [here](https://github.com/hustvl/Vim/blob/6143d07b3dd31f904c63840a19e22d95d1124493/vim/models_mamba.py#L483C13-L496C61), we process two Mamba blocks at the same time, each...

**Describe the solution you'd like** Let users choose the embedding model from the list of models available in Ollama. Currently only ollama-nomic-embed-text is available.

feature request

You provide a way of training called 'laodddptrain', but where is the file?

### What happened?
The plugin does not return a message and reports an error: net::ERR_INCOMPLETE_CHUNKED_ENCODING 200 (OK).

### Error Statement
![image](https://github.com/user-attachments/assets/a91dd80b-028a-4320-a582-dc5e4ad38d3c)
![image](https://github.com/user-attachments/assets/70a64dc9-c2e6-48cc-8efb-303bf3f8f56b)

### Steps to Reproduce
Just ask questions.

###...

bug

Error when running on 3090 and 2080 Ti. My version is:

> NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0

```
mindsearch-backend | Process Process-1:
mindsearch-backend | Traceback (most recent call...
```