Vito Plantamura

Results: 150 comments by Vito Plantamura

hi @vmobilis , I just committed support for batched inputs, i.e. a `--num` option that lets you specify the number of images to generate. On one of my computers,...

hi, OnnxStream is probably already capable of running the models you mentioned. The problem is converting the code that "calls" these models into C++ (for example, in the case of...

hi, currently the LLM sample application only supports "TinyLlama-1.1B-Chat-v0.3-fp16" and "Mistral-7B-Instruct-v0.2-fp16". Vito

Since TinyLlama adopts the same architecture and tokenizer as Llama 2, adding Llama 2 support to src/llm.cpp should be fairly simple. It involves exporting the onnx file, running "onnxsim_large_model" on...
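For reference, a rough sketch of what that export step can look like (the model id, sequence length, opset and file names below are just placeholders, not the exact script; KV-cache inputs and the fp16 handling discussed later are left out):

```
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model.eval()

class Wrapper(torch.nn.Module):
    # thin wrapper so the exported graph has a single input and a single output
    def __init__(self, m):
        super().__init__()
        self.m = m

    def forward(self, input_ids):
        return self.m(input_ids=input_ids, use_cache=False).logits

dummy = torch.ones((1, 8), dtype=torch.int64)
torch.onnx.export(
    Wrapper(model), (dummy,), "llama2.onnx",
    input_names=["input_ids"], output_names=["logits"],
    dynamic_axes={"input_ids": {1: "seq"}, "logits": {1: "seq"}},
    opset_version=17)
# then: run onnxsim_large_model on llama2.onnx, then onnx2txt to produce
# the text/weights files that src/llm.cpp reads
```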

I will try to reproduce the problem and let you know in the next few days. This problem is typically caused by the fact that the implementation of the HF...

I was able to run src/llm.cpp with Llama 2 exported using your script. The problem is that your script preserves the upcasts (float16->float32) and downcasts (float32->float16) needed in certain parts of...
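A quick way to see where the fp16/fp32 Cast nodes ended up in an exported graph is something like this (the file name is a placeholder; it just lists the relevant Cast ops):

```
import onnx
from onnx import TensorProto

# list the float16<->float32 Cast nodes to check how the up/downcasts were exported
model = onnx.load("llama2.onnx", load_external_data=False)
for node in model.graph.node:
    if node.op_type == "Cast":
        to = next(a.i for a in node.attribute if a.name == "to")
        if to in (TensorProto.FLOAT, TensorProto.FLOAT16):
            print(node.name or node.output[0], "->", TensorProto.DataType.Name(to))
```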

hi, I think orchestrating an inference job using a shell script might be possible, but it's not at all the ideal choice 😀 In any case, the type of parallelization...

no, absolutely nothing special: torch.onnx.export + onnxsim_large_model + onnx2txt (in this order). Can you share the model you are trying to convert and especially the code that calls torch.onnx.export? Vito

I found the code I originally used to export the VAE model:

```
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

class VAED(nn.Module):
    def __init__(self, vae):
        super(VAED, self).__init__()
        self.vae = vae

    def forward(self, latents):
        self.vae.enable_slicing()...
```
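The snippet above is cut off; the export call that typically follows a wrapper like that looks roughly like this (a hypothetical continuation, not the original code: the latent shape, opset and file name are assumptions):

```
import torch

# hypothetical continuation of the truncated snippet above
vaed = VAED(pipe.vae).eval()
latents = torch.randn(1, 4, 64, 64)  # SD 1.x latent layout for 512x512 images (assumption)
torch.onnx.export(
    vaed, (latents,), "vae_decoder.onnx",
    input_names=["latents"], output_names=["images"],
    opset_version=14)
```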

I think the first thing to do to understand the reason is to compare the two model.txt files... specifically looking for different, missing, or extra operations at the beginning...
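In practice that comparison can be as simple as diffing the first few hundred lines of each model.txt; a minimal sketch (paths and the number of lines compared are placeholders):

```
import difflib
import itertools

def head(path, n=200):
    # read only the first n lines, since the interesting differences are at the beginning
    with open(path) as f:
        return list(itertools.islice(f, n))

a = head("working/model.txt")
b = head("broken/model.txt")
for line in difflib.unified_diff(a, b, fromfile="working", tofile="broken"):
    print(line, end="")
```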