Max Braun

Results: 30 issues by Max Braun

This adds support for input images with different encodings. The classification version already does this: https://github.com/google-coral/tflite/blob/master/python/examples/classification/classify_image.py#L103
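One way to do this, mirroring the `convert('RGB')` call in the linked classification example, is to normalize every decoded image to RGB before preprocessing. A minimal sketch with Pillow (the sample image here is synthetic, not from the repo):

```python
from PIL import Image

def load_rgb(path):
    # Decode any supported encoding (PNG, JPEG, ...) and normalize to RGB,
    # as the linked classification example does before resizing.
    return Image.open(path).convert("RGB")

# Synthetic stand-in for a decoded RGBA input:
rgba = Image.new("RGBA", (8, 8), (255, 0, 0, 128))
rgb = rgba.convert("RGB")
```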

The [pre-trained Whisper models](https://github.com/openai/whisper#available-models-and-languages) don't work out-of-the-box with the [Google Coral Edge TPU](https://coral.ai/products/). They would need to meet [certain requirements](https://coral.ai/docs/edgetpu/models-intro/#model-requirements) so they can be converted to [TensorFlow Lite](https://www.tensorflow.org/lite), quantized to...
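One of those requirements is full 8-bit quantization. As a rough illustration of the affine scheme TFLite uses (`q = round(x / scale) + zero_point`, clamped to int8), here is a pure-Python sketch; the helpers and values are illustrative, not the actual conversion API:

```python
def quantize(values, scale, zero_point):
    # Affine int8 quantization: q = round(x / scale) + zero_point,
    # clamped to the int8 range [-128, 127].
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]

def dequantize(quantized, scale, zero_point):
    # Inverse mapping back to (approximate) floats.
    return [(q - zero_point) * scale for q in quantized]

weights = [1.0, -1.0, 0.05]
q = quantize(weights, scale=0.05, zero_point=0)
```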

Some ideas from [section 4.5 of the paper](https://cdn.openai.com/papers/whisper.pdf) and [this discussion](https://github.com/openai/whisper/discussions/117#discussioncomment-3727051):

- [ ] Use shorter chunks to reduce latency. (Try overlapping chunks by setting `options.prefix` to the transcription of...
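For the shorter-overlapping-chunks idea, the windowing itself is simple to sketch; a hypothetical helper that yields `(start, end)` spans in seconds, where each chunk's transcription would then seed `options.prefix` for the next:

```python
def chunk_spans(total_s, chunk_s=5.0, overlap_s=1.0):
    # Cover [0, total_s] with fixed-length windows, each overlapping
    # its predecessor by overlap_s seconds.
    spans = []
    start = 0.0
    while start < total_s:
        spans.append((start, min(start + chunk_s, total_s)))
        start += chunk_s - overlap_s
    return spans
```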

Official [JetPack](https://developer.nvidia.com/embedded/jetpack) support for the Jetson Nano ends at version [4.6.3](https://developer.nvidia.com/jetpack-sdk-463), which is on Python 3.6. It would prevent some [inelegant workarounds](https://github.com/maxbbraun/whisper-edge#hack) if Python 3.8 were supported with PyTorch and CUDA....

The current implementation works with the 15M parameter version of [`tinyllamas`](https://huggingface.co/karpathy/tinyllamas/tree/main). Just dropping in the next larger one (42M) flashes fine, but freezes at runtime. Would need to look into...
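A first thing to rule out would be that the 42M weights simply don't fit in memory. A back-of-the-envelope check (assuming float32 weights as in `llama2.c` and roughly 64 MB of RAM on the board; the helper is hypothetical):

```python
def fits_in_ram(n_params, bytes_per_param, ram_bytes):
    # Rough check: do the raw weights alone fit? Ignores activations,
    # the KV cache, and everything else resident in RAM.
    return n_params * bytes_per_param <= ram_bytes

RAM_BYTES = 64 * 1024 * 1024  # assumed ~64 MB
```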

enhancement
good first issue

One nice thing about microcontrollers is their low power consumption. We should measure it! While running inference and while idle/suspended. I assume it'll consume less power when driven with 3.3V...
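Once current draw is measured, turning the readings into a comparable figure is simple arithmetic; a hypothetical helper for joules per generated token:

```python
def joules_per_token(voltage_v, current_a, duration_s, tokens):
    # Average power (V * A, in watts) times duration gives energy in
    # joules; divide by the number of tokens generated in that window.
    return voltage_v * current_a * duration_s / tokens
```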

documentation

See if there is a way to run text to speech and read the generated text out loud. Ideally, this happens in parallel to the LLM generating tokens (using the...
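A sketch of the parallel part with a producer/consumer queue: the generation loop buffers tokens into sentences and hands them to a worker thread, with `speak()` standing in for a real TTS call (all names here are hypothetical):

```python
import queue
import threading

def speak(sentence, spoken):
    # Stand-in for a real text-to-speech call; just records what was "said".
    spoken.append(sentence)

def stream_tts(token_stream):
    # Producer/consumer: the LLM loop keeps generating while a worker
    # thread speaks completed sentences from the queue.
    sentences = queue.Queue()
    spoken = []

    def worker():
        while True:
            sentence = sentences.get()
            if sentence is None:  # sentinel: generation finished
                break
            speak(sentence, spoken)

    t = threading.Thread(target=worker)
    t.start()

    buffer = []
    for token in token_stream:  # stands in for the LLM's token generator
        buffer.append(token)
        if token.endswith((".", "!", "?")):
            sentences.put("".join(buffer))
            buffer = []
    if buffer:
        sentences.put("".join(buffer))
    sentences.put(None)
    t.join()
    return spoken
```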

enhancement

See if there is a way to use the built-in microphone to recognize speech for prompting the LLM (possibly from a very limited vocabulary). This could even make use of...
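Even with imperfect recognition, a very limited vocabulary makes the matching step easy; a stdlib sketch that snaps a noisy transcription onto a small command list (the commands are placeholders):

```python
import difflib

VOCABULARY = ["tell a story", "what do you see", "stop"]  # placeholder commands

def match_command(heard, vocabulary=VOCABULARY, cutoff=0.6):
    # Snap a (possibly noisy) transcription onto the closest known
    # command, or return None if nothing is close enough.
    matches = difflib.get_close_matches(heard.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```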

enhancement

Experimenting with compiler options in branch [`fast-opts`](https://github.com/maxbbraun/llama4micro/compare/main...fast-opts). Switching from `-Os` to `-O3` seems to have a significant impact on tokens per second. (`-Ofast` doesn't noticeably add on top.)

```diff
->>> Averaged...
```

enhancement
good first issue

The main reason I initially chose [`karpathy/llama2.c`](https://github.com/karpathy/llama2.c) over [`ggerganov/llama.cpp`](https://github.com/ggerganov/llama.cpp) was that the former comes out of the box with very small (15M) models. `llama.cpp`, and [`ggml`](https://github.com/ggerganov/ggml) more generally, is a...

enhancement