End-to-End-LLM
This repository contains AI Bootcamp material consisting of an end-to-end workflow for LLMs.
- Issues with downloading the MegatronGPT 1.3B model from Google Drive, which cause delays when running the Lab Activity 2 notebook. Google Drive restricts access when it detects multiple downloads...
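A possible workaround, sketched below, is to fetch the checkpoint with `gdown` (which handles Google Drive's confirmation page) and back off and retry when the quota limit is hit. The file ID is a placeholder, not the real checkpoint ID; hosting the model on a mirror would be the more robust fix.

```python
# Retry a Google Drive download with gdown; the file ID is hypothetical.
import time
import gdown

FILE_ID = "<megatron-gpt-1.3b-file-id>"  # placeholder, not the real ID
OUTPUT = "megatron_gpt_1.3b.nemo"

for attempt in range(3):
    try:
        gdown.download(id=FILE_ID, output=OUTPUT, quiet=False)
        break
    except Exception as err:  # gdown raises on quota/permission errors
        print(f"Download attempt {attempt + 1} failed: {err}")
        time.sleep(60)  # back off before retrying
```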
In Nemo_primer.ipynb, running `import nemo.collections.asr as nemo_asr`, `import nemo.collections.nlp as nemo_nlp`, and `import nemo.collections.tts as nemo_tts` raises the following error: `ImportError: tokenizers>=0.11.1,!=0.11.3,...`
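A minimal pre-flight check for this, assuming the visible part of the constraint (`>=0.11.1,!=0.11.3`) is the one being violated (the full specifier is truncated in the report):

```python
# Check whether the installed tokenizers version satisfies the visible
# part of the constraint from the ImportError, before importing NeMo.
from importlib.metadata import version
from packaging.specifiers import SpecifierSet

installed = version("tokenizers")
required = SpecifierSet(">=0.11.1,!=0.11.3")  # visible part of the constraint

if installed in required:
    print(f"tokenizers {installed} satisfies {required}")
else:
    print(f"tokenizers {installed} conflicts with {required}; "
          "reinstall a compatible version with pip")
```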
An important aspect of deployment is that the model needs to be served to a wide range of users. Understanding throughput and latency, and comparing them with and without additional optimisations...
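A rough probe for such measurements could look like the sketch below. The endpoint URL and payload shape are assumptions; they would need to be adapted to whatever route the deployed server exposes.

```python
# Measure rough latency percentiles and request throughput for a
# generation endpoint. URL and payload are hypothetical placeholders.
import time
import statistics
import requests

URL = "http://localhost:8000/v2/models/llama/generate"  # assumed route
PAYLOAD = {"text_input": "Hello", "max_tokens": 64}

latencies = []
start = time.perf_counter()
for _ in range(20):
    t0 = time.perf_counter()
    requests.post(URL, json=PAYLOAD, timeout=60).raise_for_status()
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

latencies.sort()
print(f"p50 latency: {statistics.median(latencies) * 1000:.1f} ms")
# rough p95: second-largest of 20 sorted samples
print(f"p95 latency: {latencies[int(0.95 * len(latencies)) - 1] * 1000:.1f} ms")
print(f"throughput:  {len(latencies) / elapsed:.2f} req/s")
```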
TRT-LLM does a great job of optimising the supported set of models, but a notebook/section discussing the workflow and steps to integrate a custom model would be very helpful...
The current TRT-LLM materials cover the hands-on aspects of getting from a model to deployment on a Triton server. Given that TRT-LLM focuses on performance, we could have a section...
This feature request is about creating content that demonstrates how to connect NeMo Guardrails to a Llama-2-7b-chat TensorRT engine deployed on Triton Inference Server. This approach helps avoid the need...
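One plausible shape for this content, sketched under assumptions: wrap the Triton endpoint in a LangChain-style LLM class and register it with NeMo Guardrails via `register_llm_provider`. The endpoint URL and the `text_input`/`text_output` field names are assumptions about the deployed model's I/O, not confirmed details.

```python
# Sketch: expose a Triton-hosted TensorRT-LLM engine to NeMo Guardrails
# through a custom LangChain LLM provider. Endpoint and field names are
# assumptions; adapt them to the actual deployment.
from typing import Any, List, Optional

import requests
from langchain.llms.base import LLM
from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.llm.providers import register_llm_provider


class TritonLlama2(LLM):
    """Minimal wrapper that forwards prompts to a Triton generate route."""

    url: str = "http://localhost:8000/v2/models/llama2/generate"  # assumed

    @property
    def _llm_type(self) -> str:
        return "triton_llama2"

    def _call(self, prompt: str, stop: Optional[List[str]] = None,
              **kwargs: Any) -> str:
        resp = requests.post(self.url,
                             json={"text_input": prompt, "max_tokens": 256},
                             timeout=120)
        resp.raise_for_status()
        return resp.json()["text_output"]  # assumed output field name


register_llm_provider("triton_llama2", TritonLlama2)

# The ./config directory would set `models: [{type: main, engine: triton_llama2}]`.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)
print(rails.generate(prompt="Hello!"))
```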
- The feature request is based on the use of TRT-LLM to build a TensorRT engine from a fine-tuned llama-2-7b model (a rough sketch follows below).
- Expatiate on the build process.
- Exemplify...
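A hedged sketch of the two-step build flow (checkpoint conversion, then engine build). Script names and flags differ between TRT-LLM releases, and the model paths are placeholders for the fine-tuned checkpoint.

```python
# Two-step TRT-LLM build flow: convert the fine-tuned HF checkpoint,
# then compile the engine. Paths and flags are illustrative only.
import subprocess

subprocess.run(
    ["python", "examples/llama/convert_checkpoint.py",
     "--model_dir", "./llama-2-7b-finetuned",
     "--output_dir", "./ckpt",
     "--dtype", "float16"],
    check=True)

subprocess.run(
    ["trtllm-build",
     "--checkpoint_dir", "./ckpt",
     "--output_dir", "./engine",
     "--gemm_plugin", "float16"],
    check=True)
```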
This feature request is required as part of an end-to-end pipeline; a hedged sketch of the fine-tuning step follows the list. The process should include:
- dataset preprocessing
- use of a PEFT method to fine-tune llama-2-7b for a text generation task...
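A minimal LoRA fine-tuning sketch with Hugging Face PEFT; the dataset file, hyperparameters, and target modules are illustrative assumptions rather than the lab's actual configuration.

```python
# LoRA fine-tuning of llama-2-7b with PEFT; values are illustrative.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# LoRA adapters on attention projections (a common choice for LLaMA models)
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Dataset preprocessing: tokenize a plain-text corpus (placeholder file)
data = load_dataset("text", data_files={"train": "train.txt"})["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                     max_length=512), batched=True)

trainer = Trainer(
    model=model,
    train_dataset=data,
    args=TrainingArguments(output_dir="./lora-out",
                           per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("./lora-out")  # saves adapter weights only
```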
Many unnecessary files and folders are included within the NeMo Guardrails lab, making navigation within the lab difficult. The lab should not contain the entire cloned repository, but only a folder...
No versions are pinned for TRT-LLM and Triton, so version conflicts occur. Solved with #23.