llm-foundry
Onnx Inference Script
Can you please provide an inference script for the ONNX model? I can't figure out a way to run it. After converting the model to ONNX format, its required inputs are:

Input Name: input_ids, Input Shape: [8, 2048], Input Type: 7 (int64)
Input Name: attention_mask, Input Shape: [8, 2048], Input Type: 9 (bool)

If [8, 2048] is batch size and max sequence length, the model should be flexible enough to accept batches of 1-8 and sequence lengths of 1-2048, but when I send batch 1 and sequence length 10 I get:

Got invalid dimensions for input: input_ids for the following indices index: 0 Got: 1 Expected: 8 index: 1 Got: 10 Expected: 2048

And if it's the other way around, then 8 would be the sequence length and 2048 the hidden dimension, but a sequence length of 8 makes no sense, and a 2048-dimensional input would mean the tokens have already been passed through an embedding layer.

So an inference script for this model would be very helpful.
@nik-mosaic Could you take a look at this please?
You may need to add dynamic_axes in the scripts/inference/convert_hf_to_onnx.py::export_to_onnx method. This will allow a variable batch_size/seq_len. Try something like:
input_names=['input_ids', 'attention_mask'],
output_names=['output'],
opset_version=16,
+ dynamic_axes={
+     'input_ids': {0: 'batch_size', 1: 'seq_len'},
+     'attention_mask': {0: 'batch_size', 1: 'seq_len'},
+     'output': {0: 'batch_size', 1: 'seq_len'},
+ },
See the torch.onnx.export docs for more information on dynamic_axes.
Closing as stale. Please open a new issue if you are still encountering problems.