
Support Granite Code 3B/8B

Open gabe-l-hart opened this issue 4 months ago • 0 comments

🚀 The feature, motivation and pitch

The torchchat framework provides an excellent foundation for embedding models into many different edge-centric platforms.

The Granite Code models, specifically the 3B-128k and 8B-128k variants, are a family of models from IBM that support a wide variety of code-related tasks. The models are released under the Apache 2.0 license and are therefore well-suited to embedded use-cases where code intelligence is needed.

The request here is to extend torchchat's model support to cover the 3B and 8B long-context variants of Granite Code, so that these models can be used across embedded use-cases.

Alternatives

Extending support to non-llama models may or may not be a goal of the torchchat project. There are other embedded frameworks out there (notably llama.cpp and the many projects that wrap it), and these can already be used to run Granite Code in embedded environments. Our goal at IBM is to provide users with as many choices as possible on how to run all of our Granite family models, so our hope is that torchchat can be a strong piece of this story!

Additional context

The 3B and 8B models use the llama architecture in transformers, so they are close to fully supported as-is. There are a few crucial pieces that are present in the transformers implementation that are missing in torchchat:

  • Safetensors support: https://github.com/pytorch/torchchat/issues/1249
  • Tied word embeddings: https://github.com/pytorch/torchchat/issues/1252
  • Bias tensors: https://github.com/pytorch/torchchat/issues/1250
  • Non-tiktoken/sentencepiece tokenizers: https://github.com/pytorch/torchchat/issues/1251
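To make the four gaps above concrete, here is a minimal sketch of how one might detect which of them a given llama-style checkpoint depends on, using only its Hugging Face `config.json` contents and file listing. The config keys (`tie_word_embeddings`, `attention_bias`, `mlp_bias`) are the ones used by the transformers llama config. The function name, the feature labels, and the example config and file list are hypothetical and for illustration only; they are not part of torchchat and are not copied from a real Granite Code checkpoint.

```python
from typing import Dict, List


def required_features(config: Dict[str, object], filenames: List[str]) -> List[str]:
    """Return the loader features a llama-style checkpoint appears to need.

    Hypothetical helper: the feature labels are illustrative, not torchchat names.
    """
    needed: List[str] = []

    # Weights shipped only as *.safetensors (no pickled torch checkpoint).
    has_safetensors = any(name.endswith(".safetensors") for name in filenames)
    has_torch_ckpt = any(name.endswith((".bin", ".pth", ".pt")) for name in filenames)
    if has_safetensors and not has_torch_ckpt:
        needed.append("safetensors")

    # Output projection shares its weight with the input embedding.
    if config.get("tie_word_embeddings", False):
        needed.append("tied_word_embeddings")

    # Linear layers in attention and/or the MLP carry bias tensors.
    if config.get("attention_bias", False) or config.get("mlp_bias", False):
        needed.append("bias_tensors")

    # Only a Hugging Face tokenizers file, no sentencepiece model file.
    if "tokenizer.json" in filenames and "tokenizer.model" not in filenames:
        needed.append("hf_tokenizer")

    return needed


# Illustrative values only (assumed, not read from a published checkpoint).
example_config = {
    "model_type": "llama",
    "tie_word_embeddings": True,
    "attention_bias": True,
    "mlp_bias": True,
}
example_files = [
    "config.json",
    "model-00001-of-00002.safetensors",
    "model-00002-of-00002.safetensors",
    "tokenizer.json",
]

print(required_features(example_config, example_files))
```

A check like this could run before conversion so that an unsupported checkpoint fails with a clear message instead of silently dropping bias tensors or the tied output weights.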

RFC (Optional)

I've worked through the initial steps of solving all of these outstanding issues (see the corresponding issues). Once these are solved, the addition of these Granite Code models should consist of the following steps:

gabe-l-hart, Oct 03 '24 16:10