Michael Royzen issues

Results: 10 issues of Michael Royzen

[BUG] ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2215158447

@jcwchen Optimizing large models fails in the latest release of onnx (1.12.0) even with `use_external_data_format=True`. In the latest version of onnxruntime, calling `OnnxModel.save(self.model, output_path, use_external_data_format, all_tensors_to_one_file)` fails with the following...

bug

[BUG] ONNX optimization fails when optimizing AlbertXXL despite the weights being under 2GB

### System Info Optimum 1.2.3[onnxruntime-gpu], PyTorch 1.12.0a0+bd13bc6, CUDA 11.6, Ubuntu 18.04, Transformers 4.19.0, Onnxruntime nightly build (ort-nightly-gpu 1.12.0.dev20220616003) because otherwise there's an error: `Traceback (most recent call last): File "run_qa.py",...

bug

t5_bf16 notebook fails with [ONNXRuntimeError] : 10 : INVALID_GRAPH

I'm running the t5_bf16 notebook with the T0_3B model. Everything works great until ``` enc_fp16_onnx = create_model_for_provider(encoder_model_path, "CUDAExecutionProvider", log_severity=3) enc_fp16_onnx_binding: IOBinding = enc_fp16_onnx.io_binding() dec_onnx = create_model_for_provider(dec_if_model_path, "CUDAExecutionProvider", log_severity=3) dec_onnx_binding: IOBinding...

t5 notebook broken with transformer-deploy 0.5.0

The t5.ipynb notebook is broken when run in the transformer-deploy 0.5.0 Docker container. Specifically, with ``` def get_random_input_encoder() -> Dict[str, torch.Tensor]: max_seq = 128 seq_len = random.randint(a=1, b=max_seq) batch =...

LLaMA

### Model description New model series from Facebook (7B, 13B, 33B, 65B) that is broadly competitive with Flan-PaLM-540B. https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/ ### Open source status - [X] The model implementation is available -...

New model

Llama Tokenizer uses incorrect indices for PAD

### System Info latest transformers main ### Who can help? @gante ### Information - [ ] The official example scripts - [ ] My own modified scripts ### Tasks -...

Llama support

Would it be possible to run llama using this? Is the gpt2 example hackable to run llama on tensorrt?

LLaMA support

Given existing support for GPT-J and its rotary embeddings, is LLaMA supported as well? Hugging Face just shipped their implementation: https://github.com/huggingface/transformers/commit/464d420775653885760e30d24d3703e14f4e8a14 @byshiue

enhancement

Cutlass missing from 3rdparty in new 5.2 release

### Branch/Tag/Commit v5.2 ### Docker Image Version nvcr.io/nvidia/pytorch:22.09-py3 ### GPU name A100 ### CUDA Driver 510.47.03 ### Reproduced Steps ```shell Trying to build v5.2 using the T5 guide. Seems that...

bug

Docker images throw error for inference

I tried running both Docker images as described and got the same error each time. From the conceptual-captions directory, I ran `python /conceptual-captions/generate_caption.py test_images.txt test_images/ my_output.txt`. My hardware is 2x...