zaobao
I was too lazy to dig through the source code, so I'm not sure whether my understanding is correct :)
> 1. If your model is traced in fp16, DJL will use fp16 to run your model.
>
> 2. If your input is fp16, you can create fp16 NDArray....
On both 'cpu' and 'cuda', the .pt model takes 4-5 times as long as the source model.
@frankfliu I modified your script so that it runs on CUDA, but got an error message:
```
import os.path
import time
import torch
from sentence_transformers import CrossEncoder
from transformers import...
```
I heard that HNSW-SQ8 is now available on Zilliz Cloud. Is that true?