Tuo Chen issues

Results: 18 issues of Tuo Chen

unimim is still unavailable after one year

A year later, unimim still has not released the source code. A year ago, I spent a long time trying to replicate and analyze this paper, but was unable to...

Why is there no unscale_ when you use AMP?

Your code is

```
if use_bfloat16:
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

I think it should be

```
if use_bfloat16:
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)
    scaler.step(optimizer)
    scaler.update()
```

Am I right?
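For reference, here is a minimal sketch of the PyTorch AMP step this question is about. It is not the repository's actual training loop. In PyTorch's documented pattern, `scaler.step(optimizer)` unscales the gradients itself when `unscale_` has not been called, so an explicit `scaler.unscale_(optimizer)` mainly matters when gradients are inspected or clipped between `backward()` and `step()`. The model, the data, the `train_step` helper, and the `max_norm` value are hypothetical and chosen only for illustration.

```python
import torch
from torch import nn


def train_step(model, optimizer, scaler, inputs, targets, device_type, max_norm=1.0):
    """One AMP step with gradient clipping (hypothetical helper, not from the repo)."""
    optimizer.zero_grad(set_to_none=True)
    # Autocast is only enabled when the scaler is, so this also runs on CPU.
    with torch.autocast(device_type=device_type, enabled=scaler.is_enabled()):
        loss = nn.functional.mse_loss(model(inputs), targets)
    scaler.scale(loss).backward()
    # Unscale explicitly so that clipping sees the true gradient magnitudes.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    # step() does not unscale a second time; it skips the update on inf/NaN grads.
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"

model = nn.Linear(4, 1).to(device)  # toy model, assumption for the sketch
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# With enabled=False (no GPU) every scaler call becomes a pass-through.
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

x = torch.randn(8, 4, device=device)
y = torch.randn(8, 1, device=device)
loss_value = train_step(model, optimizer, scaler, x, y, device_type=device)
```

Without the clipping line, dropping `scaler.unscale_(optimizer)` leaves the update unchanged, because `scaler.step` performs the unscaling internally.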

Need qwen2:math !!

https://github.com/QwenLM/Qwen2-Math

feature request

Why repeat the backward block?

Each "v2" Mamba block contains out_a and out_b, which are the forward and backward outputs, but in the for loop [here](https://github.com/hustvl/Vim/blob/6143d07b3dd31f904c63840a19e22d95d1124493/vim/models_mamba.py#L483C13-L496C61), we process two Mamba blocks at the same time, each...

Support the new ollama embedding model

**Describe the solution you'd like** Let users choose an embedding model from the list of models available in ollama. Currently only ollama-nomic-embed-text is available.

feature request

Where's ddp train?

You provide a way of training called 'laodddptrain', but where is the file?

error: net::ERR_INCOMPLETE_CHUNKED_ENCODING 200 (OK).

### What happened?
the plugin does not return a message and reports an error: net::ERR_INCOMPLETE_CHUNKED_ENCODING 200 (OK).

### Error Statement
![image](https://github.com/user-attachments/assets/a91dd80b-028a-4320-a582-dc5e4ad38d3c)
![image](https://github.com/user-attachments/assets/70a64dc9-c2e6-48cc-8efb-303bf3f8f56b)

### Steps to Reproduce
just ask questions

###...

bug

Not supported on old graphics cards?

Error when running on 3090 and 2080 Ti, and my version is

> NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0

```
mindsearch-backend | Process Process-1:
mindsearch-backend | Traceback (most recent call...
```