Gabe Goodhart
## Special Note

Since this PR bumps `llama.cpp` past the tip of `master` (`6026da52` as of writing this), it includes the recent changes to overhaul `sampling` and logging. I updated...
## Dependencies

This PR is part of a sequence in support of adding Granite Code. It depends on merging the following PRs:

- [x] Safetensors: #1255
- [x] Bias tensors:...
### 🚀 The feature, motivation and pitch

The request is to extend the [tokenizer](https://github.com/pytorch/torchchat/tree/main/tokenizer) module in `torchchat` to support tokenizers that use the Huggingface [tokenizers](https://github.com/huggingface/tokenizers) library. There are many models...
### 🚀 The feature, motivation and pitch

The `torchchat` framework provides an excellent platform for embedding models into many different edge-centric platforms. The [Granite Code models](https://huggingface.co/collections/ibm-granite/granite-code-models-6624c5cec322e4c148c8b330), specifically the [3B-128k](https://huggingface.co/ibm-granite/granite-3b-code-instruct-128k) and...
# What does this PR do?

This PR adds support for models using IBM's [GraniteForCausalLM](https://github.com/huggingface/transformers/blob/main/src/transformers/models/granite/modeling_granite.py#L1008) architecture when converting to ONNX. The key changes are:

* ~~Allow users to opt into...