Support Granite Code 3B/8B
🚀 The feature, motivation and pitch
The `torchchat` framework provides an excellent platform for embedding models into many different edge-centric platforms.
The Granite Code models, specifically the 3B-128k and 8B-128k variants, are a family of models from IBM that support a wide variety of code-related tasks. The models are released under the Apache 2.0 license and are therefore well suited to embedded use-cases where code intelligence is needed.
The request here is to extend `torchchat`'s model support to run the 3B and 8B long-context variants of Granite Code, enabling the use of these models across embedded use-cases.
Alternatives
Extending support to non-Llama models may or may not fall within the goals of the `torchchat` project. There are other embedded frameworks (notably `llama.cpp` and the many projects that wrap it) that can already run Granite Code in embedded environments. Our goal at IBM is to give users as many choices as possible for running all of our Granite family models, so our hope is that `torchchat` can be a strong piece of this story!
Additional context
The 3B and 8B models use the `llama` architecture in `transformers`, so they are close to fully supported as-is. A few crucial pieces present in the `transformers` implementation are still missing in `torchchat`:
- Safetensors support: https://github.com/pytorch/torchchat/issues/1249
- Tied word embeddings (see the sketch after this list): https://github.com/pytorch/torchchat/issues/1252
- Bias tensors: https://github.com/pytorch/torchchat/issues/1250
- Non-tiktoken/sentencepiece tokenizers: https://github.com/pytorch/torchchat/issues/1251
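To make the tied-word-embeddings item concrete, here is a minimal PyTorch sketch of the wiring involved. This is not `torchchat`'s actual implementation; the module names `tok_embeddings` and `output` are illustrative. The point is that checkpoints with tied embeddings ship a single weight tensor that both the input embedding and the LM head must share, which a loader that expects two separate tensors will trip over.

```python
import torch
import torch.nn as nn


class TinyLM(nn.Module):
    """Toy decoder skeleton showing weight tying between the input
    embedding and the output (LM head) projection."""

    def __init__(self, vocab_size: int, dim: int) -> None:
        super().__init__()
        self.tok_embeddings = nn.Embedding(vocab_size, dim)
        self.output = nn.Linear(dim, vocab_size, bias=False)
        # Tie the weights: both modules now point at the same storage,
        # matching checkpoints that serialize only one shared tensor.
        self.output.weight = self.tok_embeddings.weight

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        h = self.tok_embeddings(tokens)  # (batch, seq) -> (batch, seq, dim)
        # ... transformer blocks would run here ...
        return self.output(h)            # (batch, seq, vocab_size)
```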
RFC (Optional)
I've worked through the initial steps of solving each of these (see the linked issues). Once they are resolved, adding the Granite Code models should consist of the following steps:
- Adding new entries to `models.json` (see the sketch after this list)
- Adding the right set of model-specific params under `model_params`
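As a rough illustration of the first step, a `models.json` entry might look like the following. This is a hypothetical sketch: the schema keys (`aliases`, `distribution_channel`, `distribution_path`) and the alias names are assumed from the shape of existing entries rather than confirmed against the current file, though `ibm-granite/granite-3b-code-instruct-128k` is the published Hugging Face repository for the 3B long-context variant.

```json
{
  "ibm-granite/granite-3b-code-instruct-128k": {
    "aliases": ["granite-code-3b"],
    "distribution_channel": "HuggingFaceSnapshot",
    "distribution_path": "ibm-granite/granite-3b-code-instruct-128k"
  }
}
```

The `model_params` entry would then carry the architecture-specific values (dimensions, head counts, context length, and the bias/tied-embedding settings discussed above) for each variant.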