End-to-End-LLM
This repository contains AI Bootcamp material consisting of an end-to-end workflow for LLMs.
- Issues with downloading the MegatronGPT 1.3B model from Google Drive, which cause delays when running the Lab Activity 2 notebook. Google Drive restricts access when it detects multiple downloads...
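A possible workaround, sketched below, is to fetch the checkpoint with `gdown` (which handles Google Drive's confirmation page) and back off and retry when the quota limit is hit. The file ID is a placeholder, not the real checkpoint ID; hosting the model on a mirror would be the more robust fix.

```python
# Retry a Google Drive download with gdown; the file ID is hypothetical.
import time
import gdown

FILE_ID = "<megatron-gpt-1.3b-file-id>"  # placeholder, not the real ID
OUTPUT = "megatron_gpt_1.3b.nemo"

for attempt in range(3):
    try:
        gdown.download(id=FILE_ID, output=OUTPUT, quiet=False)
        break
    except Exception as err:  # gdown raises on quota/permission errors
        print(f"Download attempt {attempt + 1} failed: {err}")
        time.sleep(60)  # back off before retrying
```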
In Nemo_primer.ipynb, running `import nemo.collections.asr as nemo_asr`, `import nemo.collections.nlp as nemo_nlp`, and `import nemo.collections.tts as nemo_tts` raises the following error: `ImportError: tokenizers>=0.11.1,!=0.11.3,...`
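A minimal pre-flight check for this, assuming the visible part of the constraint (`>=0.11.1,!=0.11.3`) is the one being violated (the full specifier is truncated in the report):

```python
# Check whether the installed tokenizers version satisfies the visible
# part of the constraint from the ImportError, before importing NeMo.
from importlib.metadata import version
from packaging.specifiers import SpecifierSet

installed = version("tokenizers")
required = SpecifierSet(">=0.11.1,!=0.11.3")  # visible part of the constraint

if installed in required:
    print(f"tokenizers {installed} satisfies {required}")
else:
    print(f"tokenizers {installed} conflicts with {required}; "
          "reinstall a compatible version with pip")
```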
An important aspect of deployment is that the model needs to be served to a wide range of users. Understanding throughput and latency, and comparing them with and without additional optimisations...
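A rough probe for such measurements could look like the sketch below. The endpoint URL and payload shape are assumptions; they would need to be adapted to whatever route the deployed server exposes.

```python
# Measure rough latency percentiles and request throughput for a
# generation endpoint. URL and payload are hypothetical placeholders.
import time
import statistics
import requests

URL = "http://localhost:8000/v2/models/llama/generate"  # assumed route
PAYLOAD = {"text_input": "Hello", "max_tokens": 64}

latencies = []
start = time.perf_counter()
for _ in range(20):
    t0 = time.perf_counter()
    requests.post(URL, json=PAYLOAD, timeout=60).raise_for_status()
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

latencies.sort()
print(f"p50 latency: {statistics.median(latencies) * 1000:.1f} ms")
# rough p95: second-largest of 20 sorted samples
print(f"p95 latency: {latencies[int(0.95 * len(latencies)) - 1] * 1000:.1f} ms")
print(f"throughput:  {len(latencies) / elapsed:.2f} req/s")
```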
TRT-LLM does a great job of optimising the supported set of models, but a notebook/section discussing the workflow and steps to integrate a custom model would be very helpful...
The current TRT-LLM materials cover the hands-on aspects of getting from a model to deployment on a Triton server. Given that TRT-LLM focuses on performance, we could have a section...
This feature request is about creating content that demonstrates how to connect NeMo Guardrails to a Llama-2-7b-chat TensorRT engine deployed on Triton Inference Server. This approach helps avoid the need...
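One plausible shape for this content, sketched under assumptions: wrap the Triton endpoint in a LangChain-style LLM class and register it with NeMo Guardrails via `register_llm_provider`. The endpoint URL and the `text_input`/`text_output` field names are assumptions about the deployed model's I/O, not confirmed details.

```python
# Sketch: expose a Triton-hosted TensorRT-LLM engine to NeMo Guardrails
# through a custom LangChain LLM provider. Endpoint and field names are
# assumptions; adapt them to the actual deployment.
from typing import Any, List, Optional

import requests
from langchain.llms.base import LLM
from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.llm.providers import register_llm_provider


class TritonLlama2(LLM):
    """Minimal wrapper that forwards prompts to a Triton generate route."""

    url: str = "http://localhost:8000/v2/models/llama2/generate"  # assumed

    @property
    def _llm_type(self) -> str:
        return "triton_llama2"

    def _call(self, prompt: str, stop: Optional[List[str]] = None,
              **kwargs: Any) -> str:
        resp = requests.post(self.url,
                             json={"text_input": prompt, "max_tokens": 256},
                             timeout=120)
        resp.raise_for_status()
        return resp.json()["text_output"]  # assumed output field name


register_llm_provider("triton_llama2", TritonLlama2)

# The ./config directory would set `models: [{type: main, engine: triton_llama2}]`.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)
print(rails.generate(prompt="Hello!"))
```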
- The feature request is based on the use of TRT-LLM to build a TensorRT engine from a fine-tuned llama-2-7b model (a rough sketch follows below).
- Expatiate on the build process.
- Exemplify...
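A hedged sketch of the two-step build flow (checkpoint conversion, then engine build). Script names and flags differ between TRT-LLM releases, and the model paths are placeholders for the fine-tuned checkpoint.

```python
# Two-step TRT-LLM build flow: convert the fine-tuned HF checkpoint,
# then compile the engine. Paths and flags are illustrative only.
import subprocess

subprocess.run(
    ["python", "examples/llama/convert_checkpoint.py",
     "--model_dir", "./llama-2-7b-finetuned",
     "--output_dir", "./ckpt",
     "--dtype", "float16"],
    check=True)

subprocess.run(
    ["trtllm-build",
     "--checkpoint_dir", "./ckpt",
     "--output_dir", "./engine",
     "--gemm_plugin", "float16"],
    check=True)
```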
This feature request is required as part of an end-to-end pipeline; a hedged sketch of the fine-tuning step follows the list. The process should include:
- dataset preprocessing
- use of a PEFT method to fine-tune llama-2-7b for a text generation task...
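A minimal LoRA fine-tuning sketch with Hugging Face PEFT; the dataset file, hyperparameters, and target modules are illustrative assumptions rather than the lab's actual configuration.

```python
# LoRA fine-tuning of llama-2-7b with PEFT; values are illustrative.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# LoRA adapters on attention projections (a common choice for LLaMA models)
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Dataset preprocessing: tokenize a plain-text corpus (placeholder file)
data = load_dataset("text", data_files={"train": "train.txt"})["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                     max_length=512), batched=True)

trainer = Trainer(
    model=model,
    train_dataset=data,
    args=TrainingArguments(output_dir="./lora-out",
                           per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("./lora-out")  # saves adapter weights only
```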
Many unnecessary files and folders are included within the NeMo Guardrails lab, making navigation within the lab difficult. The lab should not contain the entire cloned repository, but only a folder...
No versions are pinned for TRT-LLM and Triton, so version conflicts occur. Solved with #23.