Lukas Kreussel

Results 114 comments of Lukas Kreussel

@abetlen Alright, I now have an action that builds me a `libllama.so` binary with and without `avx512`. But if I copy this binary into my Docker container and set the...

@gjmulder Alright, that's kind of my bad: the `lib` folder is created in a GitHub Action and contains the `llama.so` binary I added above. I have now also added a `lib`...

This seems to be an error with the tokenizer you are using. I encountered similar issues and decided to refactor and combine many of the conversion scripts for our `rustformers\llm`...

The included GGML tokenizer is very lossy. In `rustformers\llm` we support Hugging Face tokenizers, which aren't supported in the GGML implementation. Maybe give those a try; if that doesn't work, maybe review...

What will the developer experience be like? Is some sort of "interactive" mode planned where I can see the results an operation will have on a tensor while stepping throught...

> Our execution happens on the GPU and we generate a single set of commands to execute all ops in one go, so breakpoints are difficult to support. Also intermediate...

Hey, I finally got some time to take a look at these changes and I will start to play around a bit. Is there a list somewhere of all supported...

I converted a GPT-2 model to ONNX and visualized the operations [here](https://github.com/webonnx/wonnx/assets/65088241/676c145f-5c21-48d3-ab19-aec523fb628c). The main things still missing are the different shape-manipulating operations like "Squeeze, Unsqueeze, Concat, Gather" and stuff...

> Forgot to mention two things that might be useful:
>
> * You can use the `nnx` (`wonnx-cli`) tool to list operations used in an ONNX model (it is...

Got it, my plan was to somehow get a GPT-2 F16 GGML model loaded and executed and then work my way toward some more exotic quantization formats, which will require...