dialog-nlu
Add unit tests
@MahmoudWahdan How can I do inference with the tflite compressed model?
Hi @deathsurgeon1, please refer to the example script.
First, you need to save the model in tflite format:
nlu.save(save_path, save_tflite=True, conversion_mode="hybrid_quantization")
where conversion_mode can be one of the following modes (see the sketch after this list):
normal
fp16_quantization
hybrid_quantization
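As a rough end-to-end sketch (the import path and the file paths here are my own placeholders based on the repository's examples, so adjust them to your setup):

# Minimal sketch: convert an already-trained model to tflite.
# The import path and directory names below are assumptions, not fixed API.
from dialognlu import TransformerNLU

nlu = TransformerNLU.load("saved_models/joint_trans_model")  # hypothetical trained model
save_path = "saved_models/joint_trans_model_tflite"          # hypothetical output dir

# "normal" converts without quantization, "fp16_quantization" stores
# weights as float16, and "hybrid_quantization" usually gives the
# smallest model at some accuracy cost.
nlu.save(save_path, save_tflite=True, conversion_mode="hybrid_quantization")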
Then, depending on the conversion mode and your environment, you may need to disable the GPU at the beginning of your script:
import os
# Hide all GPUs from TensorFlow so inference runs on CPU only
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
Then, load the model with quantized=True and num_process=1 (or any number of processes you want), and run prediction:
nlu = TransformerNLU.load(model_path, quantized=True, num_process=1)
utterance = "add sabrina salerno to the grime instrumentals playlist"
result = nlu.predict(utterance)
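If you want to inspect the output, you can simply print it; in the repository's examples, result is a plain dict holding the predicted intent and the slot tags, though the exact structure may differ between versions:

# Print the prediction; expect the intent and slots for the utterance
# (the exact keys are version-dependent, so check your own output).
print(result)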
I hope this helps you. I'm planning to provide more examples and notebooks. Documentation is also planned.
Kindly try to post your question in the relevant issue or open a new issue. Thanks.
Thanks a lot for such a detailed response :)
@deathsurgeon1 You are most welcome!