nebuly
APIs for non-Python programming languages
Is there a C++ API for the library?
We currently only support Python DL frameworks (TensorFlow and PyTorch).
We are considering extending the library to other programming languages such as Julia, Swift, and C++; however, it will take some time to realize nebullvm's full vision of a programming-language- and hardware-agnostic inference accelerator.
I have renamed this issue to "APIs for non-Python programming languages" so that other community members can specify their preferred APIs. That way, we will be able to prioritize development across programming languages.
My target environment is C++, and I don't think optimizing a model in C++ would have any value in my development cycle. Portability matters. Normally I export ONNX if I can and TorchScript otherwise.
I have the same issue as isgursoy. Is it possible to export the optimised network to ONNX?
Basically the idea is to be able to import the optimised model into C++ (onnxruntime)
Hi @bzisl, the optimised models are compiled and cannot be converted back to ONNX. However, it is possible to exclude all compilers except onnxruntime during the optimization (using the ignore_compilers parameter), so that the optimised model you get is in fact an ONNX model. Keep in mind that this way Speedster will only use onnxruntime and possibly quantization to speed up your model, so the results may not be as good as with all the compilers enabled. After optimizing the model, you just have to save the optimized_model using the save_model() function, and you will get an ONNX model.
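Roughly, the flow would look like this (a minimal sketch; the compiler names passed to ignore_compilers are illustrative, please check the Speedster docs for the identifiers your installed version accepts):

import speedster

optimized_model = speedster.optimize_model(
    onnx_path,
    input_data=input_data,
    optimization_time="unconstrained",
    # Illustrative list: skip every backend except onnxruntime so the
    # optimised model stays an ONNX model.
    ignore_compilers=["tensor_rt", "openvino", "tvm", "torchscript", "deepsparse"],
)

# Save the optimised model; with only onnxruntime enabled this is an ONNX file.
speedster.save_model(optimized_model, "optimized_model_dir")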
Thanks!
One more question, please.
optimized_model = speedster.optimize_model(onnx_path, input_data=input_data, optimization_time="unconstrained")
How do we force optimisation for the CPU?
Best regards!
You can use:
optimized_model = speedster.optimize_model(
    onnx_path,
    input_data=input_data,
    optimization_time="unconstrained",
    device="cpu"
)
I can see that you are optimizing an ONNX model; I would also suggest that you enable quantization by setting metric_drop_ths=0.1 in the function.
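For example (same variables as above; the 0.1 threshold simply means an accuracy drop of up to roughly 10% on the chosen metric is tolerated):

optimized_model = speedster.optimize_model(
    onnx_path,
    input_data=input_data,
    optimization_time="unconstrained",
    device="cpu",
    # Allow a bounded accuracy drop so quantized variants can be considered.
    metric_drop_ths=0.1,
)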
Thanks a lot!