zaobao

Results: 5 comments by zaobao

> 1. If your model is traced in fp16, DJL will use fp16 to run your model.
>
> 2. If your input is fp16, you can create fp16 NDArray....

On both 'cpu' and 'cuda', the .pt model takes 4-5 times as long as the source model.

@frankfliu I modified your script to run on CUDA, but got an error message:

```
import os.path
import time
import torch
from sentence_transformers import CrossEncoder
from transformers import...
```

I found that HNSW-SQ8 is now available on Zilliz Cloud. Is that true?