ailia-models

added Japanese LLama elyza

Open YToleubay opened this issue 1 year ago • 7 comments

#1294

YToleubay avatar Nov 14 '23 05:11 YToleubay

I have uploaded the model. https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/decoder_model.onnx

kyakuno avatar Nov 18 '23 08:11 kyakuno

On macOS, processing does not finish in a realistic amount of time.

kyakuno avatar Nov 18 '23 11:11 kyakuno

@YToleubay How much time does inference take for you, with ONNX Runtime and with ailia?

kyakuno avatar Nov 18 '23 11:11 kyakuno

> @YToleubay How much time does inference take for you, with ONNX Runtime and with ailia?

I ran the following benchmark on an NVIDIA GeForce RTX 3090 with 32 GB of RAM. With ONNX Runtime I get the following output:

processing time 36854 ms
processing time 32836 ms
processing time 31787 ms
processing time 31776 ms
processing time 31774 ms
**Average ONNX Runtime time: 33005.4 ms**

With ailia I get the following numbers:

ailia processing time 1060661 ms
ailia processing time 1061135 ms
Average ailia time 1060898 ms

Inference appears to be about 32 times slower with ailia than with ONNX Runtime (1060898 ms vs. 33005.4 ms on average).
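
As a quick sanity check, the averages and the slowdown ratio can be recomputed from the per-run timings reported above. This is only a minimal sketch: the variable names are illustrative and are not taken from the actual benchmark script.

```python
from statistics import mean

# Per-run timings in milliseconds, copied from the benchmark output above.
onnx_ms = [36854, 32836, 31787, 31776, 31774]
ailia_ms = [1060661, 1061135]

# Arithmetic mean of each set of runs.
onnx_avg = mean(onnx_ms)
ailia_avg = mean(ailia_ms)

# How many times longer the ailia runs took compared to ONNX Runtime.
slowdown = ailia_avg / onnx_avg

print(f"ONNX Runtime average: {onnx_avg:.1f} ms")
print(f"ailia average:        {ailia_avg:.1f} ms")
print(f"Slowdown factor:      {slowdown:.1f}x")
```

The ratio comes out at roughly 32, which matches the figure quoted in this comment.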

YToleubay avatar Nov 18 '23 11:11 YToleubay

Thanks. I will investigate it.

kyakuno avatar Nov 19 '23 04:11 kyakuno

> Thanks. I will investigate it.

Is there anything I can do to help?

YToleubay avatar Nov 19 '23 05:11 YToleubay

Thank you. We will look into it together with the ailia SDK team, since this concerns the core implementation of the ailia SDK.

kyakuno avatar Nov 19 '23 12:11 kyakuno