DJL support for FP16 model inference

Open yongyupei opened this issue 2 years ago • 2 comments

Description

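Currently, DJL's Image class and the related built-in Translators all default to FP32 precision. After looking through the relevant source code and API, I could not find any setting that supports inference with FP16 models. FP16 models can improve inference speed at the cost of some accuracy. I hope a future release will provide an API or configuration option that makes FP16 model inference easy to set up.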

yongyupei avatar Apr 27 '23 07:04 yongyupei

  1. If your model is traced in fp16, DJL will use fp16 to run your model.
  2. If your input is fp16, you can create an fp16 NDArray. You can use Float16Utils to help convert the data.
  3. The built-in ImageTranslator doesn't support fp16, but you can write your own Translator to convert the tensor to any data type.
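
To make point 2 more concrete, here is a minimal sketch of what converting input data to fp16 involves: each float is rounded to IEEE 754 half precision and stored in 2 bytes. It uses only Python's standard `struct` module rather than DJL, so it illustrates the idea and is not the Float16Utils implementation. The helper names `to_fp16_bytes` and `from_fp16_bytes` and the little-endian byte order are assumptions made for this example.

```python
import struct


def to_fp16_bytes(values):
    """Round each float to IEEE 754 half precision and pack it little-endian (2 bytes per value)."""
    return struct.pack(f"<{len(values)}e", *values)


def from_fp16_bytes(buf):
    """Unpack a little-endian fp16 buffer back into Python floats."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))


# Hypothetical normalized pixel values, as an image translator might produce.
pixels = [0.0, 0.1, 0.5, 1.0]

buf = to_fp16_bytes(pixels)
roundtrip = from_fp16_bytes(buf)

print(len(buf))    # 8 bytes: 4 values * 2 bytes each
print(roundtrip)   # [0.0, 0.0999755859375, 0.5, 1.0]
```

The round trip shows the accuracy cost mentioned in the issue description: 0.1 comes back as 0.0999755859375 because fp16 keeps only 10 mantissa bits, while 0.5 and 1.0 are exactly representable. The largest finite fp16 value is 65504, so inputs should be scaled into range before conversion.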

frankfliu avatar May 26 '23 07:05 frankfliu

  1. If your model is traced in fp16, DJL will use fp16 to run your model.
  2. If your input is fp16, you can create an fp16 NDArray. You can use Float16Utils to help convert the data.
  3. The built-in ImageTranslator doesn't support fp16, but you can write your own Translator to convert the tensor to any data type.

How can I convert Hugging Face models to fp16 with model_zoo_importer.py?

zaobao avatar May 11 '24 05:05 zaobao