
Onnx Inference Script


Can you please provide an inference script for the ONNX model as well? I can't figure out a way to run it. After converting the model to ONNX, the model's inputs are:

Input Name: input_ids, Input Shape: [8, 2048], Input Type: 7

Input Name: attention_mask, Input Shape: [8, 2048], Input Type: 9

If those dims are batch size and max seq len, then the model should be flexible enough to take batch 1-8 and seq 1-2048, but when I send in batch 1 and seq len 10 I get:

    Got invalid dimensions for input: input_ids for the following indices
    index: 0 Got: 1  Expected: 8
    index: 1 Got: 10 Expected: 2048

And if it's the other way round, then 8 would be the seq length and 2048 the hidden dim, but a seq len of 8 makes no sense, and a dim of 2048 would mean the input has already been passed through an embedding layer.

So can you please provide an inference script for it? It would be very helpful.

ShRajSh avatar Jun 20 '23 08:06 ShRajSh

@nik-mosaic Could you take a look at this please?

dskhudia avatar Jun 20 '23 15:06 dskhudia

You may need to add dynamic_axes in the scripts/inference/convert_hf_to_onnx.py::export_to_onnx method. This will allow you to set a variable batch_size/seq_len. Try something like:

         input_names=['input_ids', 'attention_mask'],
         output_names=['output'],
         opset_version=16,
+        dynamic_axes={
+            'input_ids' : {0 : 'batch_size', 1 : 'seq_len'},
+            'attention_mask' : {0 : 'batch_size', 1: 'seq_len'},
+            'output' : {0 : 'batch_size', 1: 'seq_len'},
+        }

See the torch.onnx.export docs for more information on dynamic_axes.

nik-mosaic avatar Jun 20 '23 23:06 nik-mosaic

Closing as stale. Please open a new issue if you are still encountering problems.

dakinggg avatar Sep 07 '23 02:09 dakinggg