Zerlion
Results: 2 comments by Zerlion
I believe I have seen multilingual encoders in the Universal Sentence Encoder collection (https://tfhub.dev/google/collections/universal-sentence-encoder/1). I'm in much the same situation as 1991Troy, and I'm going to try different packages such as these. There are...
Same issue here, with Llama 3.3 on an instance with 8x A10G GPUs and the following serving parameters: max_rolling_batch_prefill_tokens = 8192, max_model_len = 32000, enable_prefix_caching = True, enable_chunked_prefill = True, dtype...