Ronsor
Since @josete89 said Docker didn't work, here's a list of what we need:
- [x] MMX support
- [ ] Linux cgroups
- [ ] Namespaces (CLONE_NEWNS, etc.)
As the title implies, make wax self-hosting by rewriting it in itself.
When decompiling obfuscated code, it'd be nice if the metadata inserted by the Kotlin compiler could be used to automatically rename classes, properties, etc.
### Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits

### What would your feature do?
It would...
Unlike 8-bit LLaMA, it seems that the slow `torch.load` function is used to load the entire model into CPU RAM before sending it to VRAM. While I'm not concerned about memory...
`dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in 84d9015 if...
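To illustrate the kind of availability check described above, here is a minimal sketch of a runtime probe on Linux. It is not the code from the PR, which may well use a compile-time check instead. It assumes the kernel lists the dot-product extension as the `asimddp` flag on a `Features` line of `/proc/cpuinfo`; the function name `has_dotprod` and the sample strings are hypothetical.

```python
def has_dotprod(cpuinfo_text: str) -> bool:
    """Return True if any 'Features' line lists the 'asimddp' flag.

    Assumption: on AArch64 Linux the dot-product extension is reported
    as 'asimddp' in /proc/cpuinfo.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            if "asimddp" in value.split():
                return True
    return False


# Hypothetical sample inputs, shaped like /proc/cpuinfo output.
SAMPLE_WITH_DOTPROD = (
    "processor\t: 0\n"
    "Features\t: fp asimd evtstrm aes crc32 atomics asimddp\n"
)
SAMPLE_WITHOUT_DOTPROD = (
    "processor\t: 0\n"
    "Features\t: fp asimd evtstrm crc32 cpuid\n"
)

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as f:
            print("dotprod available:", has_dotprod(f.read()))
    except OSError:
        print("/proc/cpuinfo not readable on this platform")
```

A caller would pick the `dotprod` code path only when the probe returns `True` and fall back to the generic path otherwise.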
I propose refactoring `main.cpp` into a library (`llama.cpp`, compiled to `llama.so`/`llama.a`/whatever) and making `main.cpp` a simple driver program. A simple C API should be exposed to access the model, and...
When converting the model + tokenizer, use the vocabulary size returned by the tokenizer rather than assuming 32000. There are ways that special tokens or other new tokens could be...
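The idea can be sketched as follows: ask the tokenizer for its vocabulary size at conversion time instead of hard-coding a constant. This is only an illustration, not the conversion script itself. `StubTokenizer` and `resolve_vocab_size` are hypothetical names, and the stub merely assumes the real tokenizer object exposes a `vocab_size()` method.

```python
ASSUMED_VOCAB_SIZE = 32000  # the previously hard-coded value


class StubTokenizer:
    """Hypothetical stand-in for a real tokenizer, for illustration only."""

    def __init__(self, pieces):
        self._pieces = list(pieces)

    def vocab_size(self) -> int:
        # Number of tokens the tokenizer actually knows about.
        return len(self._pieces)


def resolve_vocab_size(tokenizer) -> int:
    """Return the vocabulary size reported by the tokenizer itself."""
    return tokenizer.vocab_size()


# A base vocabulary plus one added special token no longer matches 32000.
base = [f"tok{i}" for i in range(ASSUMED_VOCAB_SIZE)]
extended = StubTokenizer(base + ["<pad>"])

if __name__ == "__main__":
    print(resolve_vocab_size(extended))
```

With the added token, the reported size differs from the hard-coded constant, which is the mismatch the change is meant to avoid.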
I'm currently trying to use just the operators defined in `fla.ops`; however, because of the `__init__.py` script for the main package, it's not possible to do this without importing things...
This PR will add backward computations for most operators once completed.
- [x] Tanh
- [x] Sigmoid
- [x] GELU + GELU (quick)
- [x] ELU
- [x] clamp
- ...
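As a rough illustration of what a backward computation for a few of the listed activations involves, here is a scalar sketch with a finite-difference check. It is not the PR's implementation: the function names are hypothetical, it works on single floats rather than tensors, and it only uses the standard derivative formulas for Tanh, Sigmoid, and ELU.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def elu(x: float, alpha: float = 1.0) -> float:
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def tanh_backward(x: float, grad_out: float) -> float:
    # d/dx tanh(x) = 1 - tanh(x)^2
    t = math.tanh(x)
    return grad_out * (1.0 - t * t)


def sigmoid_backward(x: float, grad_out: float) -> float:
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return grad_out * s * (1.0 - s)


def elu_backward(x: float, grad_out: float, alpha: float = 1.0) -> float:
    # d/dx elu(x) = 1 for x > 0, alpha * exp(x) otherwise
    return grad_out * (1.0 if x > 0 else alpha * math.exp(x))


def numeric_grad(f, x: float, eps: float = 1e-6) -> float:
    # Central finite difference, used only to sanity-check the formulas.
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    for x in (-1.5, 0.5, 2.0):
        print(x, tanh_backward(x, 1.0), numeric_grad(math.tanh, x))
```

Comparing each analytic gradient against `numeric_grad` is a common way to catch sign or indexing mistakes before wiring a backward pass into a larger graph.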