Tuo Chen
A year later, unimim still has not released the source code. A year ago, I spent a long time trying to replicate and analyze this paper, but was unable to...
Your code is

```
if use_bfloat16:
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

I think it should be

```
if use_bfloat16:
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)
    scaler.step(optimizer)
    scaler.update()
```

Am I right?
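For reference, here is a minimal sketch of the GradScaler pattern from the PyTorch AMP docs, where `unscale_` is called explicitly so the gradients can be clipped in unscaled form before the step. The model, optimizer, data, and clipping threshold below are placeholder assumptions, not the repo's actual training code:

```
import torch

# Placeholder model/optimizer/data (assumes a CUDA device is available);
# the real training script may look different.
model = torch.nn.Linear(16, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for x, y in [(torch.randn(8, 16).cuda(), torch.randn(8, 1).cuda())]:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()
    # unscale_ is needed here only because we touch the gradients (clipping)
    # before scaler.step(); step() performs the unscaling itself otherwise.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)
    scaler.update()
```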
Each "v2" Mamba block contains out_a and out_b, which is both forward and backward, but in the for loop [here](https://github.com/hustvl/Vim/blob/6143d07b3dd31f904c63840a19e22d95d1124493/vim/models_mamba.py#L483C13-L496C61), we process two Mamba blocks at the same time, each...
**Describe the solution you'd like** Let users choose the embedding model from the list of models available in Ollama. Currently only ollama-nomic-embed-text is available.
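As a sketch of what this could look like, the snippet below queries a local Ollama server for its installed models and then requests an embedding from whichever one the user picks. The host URL and the selection logic are assumptions for illustration, not the plugin's actual implementation:

```
import requests

OLLAMA_HOST = "http://localhost:11434"  # assumed default Ollama address

def list_ollama_models():
    # Ollama lists locally installed models via GET /api/tags.
    resp = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=10)
    resp.raise_for_status()
    return [m["name"] for m in resp.json().get("models", [])]

def embed(text, model):
    # Request an embedding from the user-selected model.
    resp = requests.post(
        f"{OLLAMA_HOST}/api/embeddings",
        json={"model": model, "prompt": text},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

if __name__ == "__main__":
    models = list_ollama_models()
    print("Available models:", models)
    # The user could pick nomic-embed-text or any other installed embedding model.
    vector = embed("hello world", models[0])
    print(len(vector))
```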
You provide a training mode called 'laodddptrain', but where is the corresponding file?
### What happened?
The plugin does not return a message and reports an error: net::ERR_INCOMPLETE_CHUNKED_ENCODING 200 (OK).

### Error Statement

### Steps to Reproduce
Just ask questions

###...
Error when running on 3090 and 2080 Ti, and my version is
> NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0

```
mindsearch-backend | Process Process-1:
mindsearch-backend | Traceback (most recent call...
```