Philpax
In one of my test applications, I use an `InferenceSession` to load a prompt that I later reuse. However, while doing this I realised that you can't actually clone...
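A minimal sketch of the pattern being asked for here, using hypothetical names (`SessionSnapshot` and its fields are illustrative, not the crate's actual API): if the session's reusable state lived in a plain-data struct, it could derive `Clone`, and each request could start from a copy of the primed prompt state.

```rust
/// Hypothetical snapshot of an inference session's reusable state.
/// Plain owned data, so `Clone` can simply be derived.
#[derive(Clone, Debug, PartialEq)]
struct SessionSnapshot {
    /// Tokens already fed to the model (the preloaded prompt).
    tokens: Vec<u32>,
    /// Flattened key/value caches produced while feeding the prompt.
    memory_k: Vec<f32>,
    memory_v: Vec<f32>,
}

fn main() {
    // Prime once with the shared prompt...
    let primed = SessionSnapshot {
        tokens: vec![1, 42, 7],
        memory_k: vec![0.0; 8],
        memory_v: vec![0.0; 8],
    };

    // ...then hand each request its own independent copy.
    let per_request = primed.clone();
    assert_eq!(per_request, primed);
    println!("cloned {} prompt tokens", per_request.tokens.len());
}
```

The same idea works as a snapshot/restore pair (serialize the state out, deserialize it into a fresh session) if the real session holds non-clonable resources such as model handles.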
We should publish the crate and its associated applications to `crates.io` (potentially bringing llamacord etc. into a GitHub organization, too). Here's what I think is blocking: - [ ] Move `ggml-rs`...
There have been quite a few changes since our last major sync: https://github.com/ggerganov/llama.cpp/compare/904d2a8d6acd667c9633138d45a361d40fbf76d0..HEAD (There may be others we haven't accounted for in the inferencing code). Need to do a more...
This PR reformats the README (making it markdown-lint compliant), adds more instructions, and clarifies the differences between this and `llama.cpp`. It should address #23, #50 and #63.
We're a couple of weeks out of date with the current implementation of LLaMA in llama.cpp. There are quite a few changes (including always generating the BOS token at the start!) that we...
https://github.com/salesforce/CodeGen https://huggingface.co/docs/transformers/v4.28.1/en/model_doc/codegen#transformers.CodeGenForCausalLM Apparently one of the best models for code generation.
Looking at it now, `GptNeoX` is consistent with the other names - it should be used everywhere, instead of `NeoX`. I tried to be clever and it bit me 😅
The `llm` and `llm-base` crates are getting very top-heavy (there are dozens of types on the main page). We should reshape them so that it's much clearer where the entrypoints...
We're getting more developers who are adding new implementations, which is great, but transformers and LLMs are complicated, and it'll require more than Rust knowledge to get to the bottom...
We mention that `llm` should be built in release mode, but users are likely going to want to know how to build the dependency as release while keeping their own code...
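For reference, Cargo's profile overrides cover exactly this case: a sketch of what the documentation could suggest, assuming the user depends on `llm` from their own application's `Cargo.toml`.

```toml
# In the user's application Cargo.toml:
# optimise all dependencies (including llm) even in dev builds,
# while the user's own crate stays unoptimised for fast compiles
# and easy debugging.
[profile.dev.package."*"]
opt-level = 3
```

A narrower override such as `[profile.dev.package.llm]` would optimise only that one dependency and keep incremental builds of everything else fast.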