SingL3

Results 25 comments of SingL3

As you can see here: https://github.com/mosaicml/composer/blob/f2a2dc820cb75023b9eb7c46fdfd25273712abd0/composer/datasets/in_context_learning_evaluation.py#L145 This means the dataset must be a local file, and it does not support other data formats like Parquet. The benefit of `datasets` may be that it...
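
To illustrate the point, a minimal sketch of what `datasets` could buy here; the local file path is hypothetical, and `hellaswag` is just an example Hub dataset:

```python
from datasets import load_dataset

# `datasets` can read Parquet (and CSV, JSON, etc.) directly,
# instead of being limited to a local JSONL file.
ds = load_dataset("parquet", data_files="data/eval.parquet", split="train")

# It can also stream a dataset straight from the Hub, no local copy needed.
remote = load_dataset("hellaswag", split="validation", streaming=True)
```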

I have tried using TEI to host bge-m3:

```bash
text-embeddings-router --model-id /model/bge-m3 --dtype float32 --pooling cls --max-batch-tokens 4194304 -p 40031 --max-client-batch-size 512 --max-batch-requests 512 --max-concurrent-requests 512 --max-input-length 8192 --tokenization-workers...
```
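
For reference, a sketch of how I send requests to it, assuming the router from the command above is reachable on port 40031 and serves TEI's standard `/embed` route:

```python
import requests

# TEI exposes POST /embed; "inputs" may be a single string or a list of strings.
resp = requests.post(
    "http://localhost:40031/embed",
    json={"inputs": ["what is bge-m3?", "a second sentence to embed"]},
    timeout=60,
)
resp.raise_for_status()
embeddings = resp.json()  # one dense vector per input string
print(len(embeddings), len(embeddings[0]))
```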

I checked via `watch -n 0.1 nvidia-smi`:

```
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr....
```
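
(As a side note, the same polling can be scripted instead of eyeballing `watch`; a rough sketch using `pynvml`, assuming the NVIDIA Python bindings are installed, not something from the original setup:)

```python
import time
import pynvml

# Poll GPU 0 roughly the way `watch -n 0.1 nvidia-smi` does.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"gpu={util.gpu}%  mem={mem.used / 2**20:.0f} MiB")
        time.sleep(0.1)
finally:
    pynvml.nvmlShutdown()
```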

The max batch tokens is set to 4194304, but I am actually running with a batch size of 50 and a max input length of 8192 (at most 50 × 8192 = 409,600 tokens per batch, well under the limit).

@OlivierDehaene

> The GPU is stuck at 100% util and the model is not answering?
>
> > I cannot reach this api any more
>
> Can you GET...
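
For anyone following along, a sketch of the kind of GET check being suggested, assuming the router from earlier on port 40031 and TEI's standard `/health` and `/info` routes:

```python
import requests

base = "http://localhost:40031"

# If the server is still responsive, /health returns 200,
# and /info reports the loaded model and its configured limits.
print(requests.get(f"{base}/health", timeout=5).status_code)
print(requests.get(f"{base}/info", timeout=5).json())
```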

OK, the max batch size is around 20 for an A100.

This should already be fixed: https://github.com/InternLM/InternLM/pull/419

In addition, where can I get the "super pretrain" model of llama?

I interrupted it, and the log output looked like this: ![image](https://user-images.githubusercontent.com/20473466/224238524-c36072d7-6e00-4756-9651-5ec91fff41f2.png)

Hello @victor-paltz, I am using 12 cores and testing on only 50k embeddings, which should not take that much time, so I think it is stuck. Actually, the...