icortex
Add Support for Different Model Types
ONNX models are serialized versions of existing AI models. They are somewhat faster than the regular PyTorch or Hugging Face formats, so some users may want to use this model type.
We already have a converted model on Hugging Face: https://huggingface.co/TextCortex/codegen-350M-optimized
In total, we need to support the following model types:
- PyTorch: filename extension .pt
- Hugging Face: filename extension .bin
- ONNX: filename extension .onnx
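As a sketch, model-type dispatch could key off the filename extension. The `detect_model_type` helper below is hypothetical, not part of the existing codebase:

```python
from pathlib import Path

# Hypothetical helper: map a checkpoint's file extension to one of the
# three supported model types listed above.
def detect_model_type(path: str) -> str:
    suffix = Path(path).suffix.lower()
    types = {".pt": "pytorch", ".bin": "huggingface", ".onnx": "onnx"}
    if suffix not in types:
        raise ValueError(f"Unsupported model file extension: {suffix!r}")
    return types[suffix]
```

The loader for each type can then be selected from this single entry point.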
Here is a script that supports text generation for ONNX models with Hugging Face's optimum library:
```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

# Load the ONNX model and its tokenizer
model = ORTModelForCausalLM.from_pretrained("TextCortex/codegen-350M-optimized")
tokenizer = AutoTokenizer.from_pretrained("TextCortex/codegen-350M-optimized")

def generate_onnx(prompt, min_length=16, temperature=0.1, num_return_sequences=1):
    # Tokenize the prompt first; the original snippet used input_ids
    # without defining it
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    generated_ids = model.generate(
        input_ids,
        min_length=min_length,
        temperature=temperature,
        num_return_sequences=num_return_sequences,
        early_stopping=True,
    )
    return tokenizer.decode(generated_ids[0], skip_special_tokens=True)
```
For vanilla PyTorch models (.pt), you can directly use the AutoModel class from transformers, which also works for the Hugging Face .bin model type.
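A minimal sketch of that loading path, assuming the checkpoint is a local directory or Hub repo id with its config and tokenizer files alongside (the `load_torch_model` helper name is hypothetical; for causal generation the AutoModelForCausalLM variant is used here):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical helper: load a vanilla PyTorch (.pt) or Hugging Face (.bin)
# checkpoint through the transformers Auto classes.
def load_torch_model(model_path: str):
    model = AutoModelForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    return model, tokenizer
```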
This feature is mostly implemented; a few small tasks remain:
- [ ] HuggingFace model config handling