SingL3

Results 25 comments of SingL3

As you can see here: https://github.com/mosaicml/composer/blob/f2a2dc820cb75023b9eb7c46fdfd25273712abd0/composer/datasets/in_context_learning_evaluation.py#L145 This means the dataset must be a local file, and it does not support other data formats like Parquet. The benefit of `datasets` may be that it...
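
To illustrate the point, a minimal sketch of what `datasets` could buy here; the local file path is hypothetical, and `hellaswag` is just an example Hub dataset:

```python
from datasets import load_dataset

# `datasets` can read Parquet (and CSV, JSON, etc.) directly,
# instead of being limited to a local JSONL file.
ds = load_dataset("parquet", data_files="data/eval.parquet", split="train")

# It can also stream a dataset straight from the Hub, no local copy needed.
remote = load_dataset("hellaswag", split="validation", streaming=True)
```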

I have tried using TEI to host bge-m3:

```bash
text-embeddings-router --model-id /model/bge-m3 --dtype float32 --pooling cls --max-batch-tokens 4194304 -p 40031 --max-client-batch-size 512 --max-batch-requests 512 --max-concurrent-requests 512 --max-input-length 8192 --tokenization-workers...
```
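
For reference, a sketch of how I send requests to it, assuming the router from the command above is reachable on port 40031 and serves TEI's standard `/embed` route:

```python
import requests

# TEI exposes POST /embed; "inputs" may be a single string or a list of strings.
resp = requests.post(
    "http://localhost:40031/embed",
    json={"inputs": ["what is bge-m3?", "a second sentence to embed"]},
    timeout=60,
)
resp.raise_for_status()
embeddings = resp.json()  # one dense vector per input string
print(len(embeddings), len(embeddings[0]))
```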

I checked via `watch -n 0.1 nvidia-smi`:

```
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr....
```
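
(As a side note, the same polling can be scripted instead of eyeballing `watch`; a rough sketch using `pynvml`, assuming the NVIDIA Python bindings are installed, not something from the original setup:)

```python
import time
import pynvml

# Poll GPU 0 roughly the way `watch -n 0.1 nvidia-smi` does.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"gpu={util.gpu}%  mem={mem.used / 2**20:.0f} MiB")
        time.sleep(0.1)
finally:
    pynvml.nvmlShutdown()
```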

The max batch tokens is set to 4194304, but I am actually running with a batch size of 50 and a max input length of 8192 (at most 50 × 8192 = 409,600 tokens per batch, well under the limit).

@OlivierDehaene

> The GPU is stuck at 100% util and the model is not answering?
>
> > I cannot reach this api any more
>
> Can you GET...
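
For anyone following along, a sketch of the kind of GET check being suggested, assuming the router from earlier on port 40031 and TEI's standard `/health` and `/info` routes:

```python
import requests

base = "http://localhost:40031"

# If the server is still responsive, /health returns 200,
# and /info reports the loaded model and its configured limits.
print(requests.get(f"{base}/health", timeout=5).status_code)
print(requests.get(f"{base}/info", timeout=5).json())
```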

OK, the max batch size is around 20 for an A100.

This should already be fixed: https://github.com/InternLM/InternLM/pull/419

In addition, where can I get the "super pretrain" model of llama?

I interrupted it, and the log output looked like this: ![image](https://user-images.githubusercontent.com/20473466/224238524-c36072d7-6e00-4756-9651-5ec91fff41f2.png)

Hello @victor-paltz, I am using 12 cores and testing on only 50k embeddings, which should not take that much time, so I think it is stuck. Actually, the...