
support save tensorrt_llm checkpoint

Open QiJune opened this issue 1 year ago • 0 comments

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR. To re-run CI remove and add the label again. To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • [ ] Make sure you read and followed Contributor guidelines
  • [ ] Did you write any new necessary tests?
  • [ ] Did you add or update any necessary documentation?
  • [ ] Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • [ ] Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • [ ] New Feature
  • [ ] Bugfix
  • [ ] Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed. The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

QiJune, Jun 27 '24 02:06

Nothing was retrieved, even with Self-RAG. Could you post the user question and the relevant content in the knowledge base so we can analyze why nothing was retrieved?

KevinHuSh, Jun 20 '24 01:06

Here are some additional detailed log entries from simply trying to index the data using RAG to summarize content (it appears none of the content is being passed back).

100.80.54.128 - - [20/Jun/2024:10:59:27 -0400] "POST /api/embeddings HTTP/1.1" 200 20585 "-" "ollama-python/0.1.8 (x86_64 linux) Python/3.10.12" "{\x22model\x22: \x22mxbai-embed-large\x22, \x22prompt\x22: \x22There are no paragraphs provided for me to summarize. Please provide the paragraphs you would like me to summarize, and I'll be happy to assist you!\x22, \x22options\x22: {}, \x22keep_alive\x22: null}"

100.80.54.128 - - [20/Jun/2024:10:59:27 -0400] "POST /api/embeddings HTTP/1.1" 200 20585 "-" "ollama-python/0.1.8 (x86_64 linux) Python/3.10.12" "{\x22model\x22: \x22mxbai-embed-large\x22, \x22prompt\x22: \x22There are no paragraphs provided for me to summarize. Please provide the paragraphs you would like me to summarize, and I'll be happy to assist you!\x22, \x22options\x22: {}, \x22keep_alive\x22: null}"

100.80.54.128 - - [20/Jun/2024:10:59:27 -0400] "POST /api/embeddings HTTP/1.1" 200 20684 "-" "ollama-python/0.1.8 (x86_64 linux) Python/3.10.12" "{\x22model\x22: \x22mxbai-embed-large\x22, \x22prompt\x22: \x22There are no paragraphs provided for me to summarize. Please provide the actual text, and I'll be happy to help!\x22, \x22options\x22: {}, \x22keep_alive\x22: null}"

100.80.54.128 - - [20/Jun/2024:11:00:15 -0400] "POST /api/chat HTTP/1.1" 200 745 "-" "ollama-python/0.1.8 (x86_64 linux) Python/3.10.12" "{\x22model\x22: \x22llama3:70b-instruct-fp16\x22, \x22messages\x22: [{\x22role\x22: \x22system\x22, \x22content\x22: \x22You're a helpful assistant.\x22}, {\x22role\x22: \x22user\x22, \x22content\x22: \x22Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\x5Cn There\x5CnThere\x5CnThe above is the content you need to summarize.\x22}], \x22stream\x22: false, \x22format\x22: \x22\x22, \x22options\x22: {\x22temperature\x22: 0.3, \x22num_predict\x22: 1000}, \x22keep_alive\x22: -1}"

100.80.54.128 - - [20/Jun/2024:11:00:24 -0400] "POST /api/chat HTTP/1.1" 200 708 "-" "ollama-python/0.1.8 (x86_64 linux) Python/3.10.12" "{\x22model\x22: \x22llama3:70b-instruct-fp16\x22, \x22messages\x22: [{\x22role\x22: \x22system\x22, \x22content\x22: \x22You're a helpful assistant.\x22}, {\x22role\x22: \x22user\x22, \x22content\x22: \x22Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\x5Cn There\x5CnThe above is the content you need to summarize.\x22}], \x22stream\x22: false, \x22format\x22: \x22\x22, \x22options\x22: {\x22temperature\x22: 0.3, \x22num_predict\x22: 1000}, \x22keep_alive\x22: -1}"

100.80.54.128 - - [20/Jun/2024:11:00:26 -0400] "POST /api/embeddings HTTP/1.1" 200 20627 "-" "ollama-python/0.1.8 (x86_64 linux) Python/3.10.12" "{\x22model\x22: \x22mxbai-embed-large\x22, \x22prompt\x22: \x22I apologize, but there is no content provided for me to summarize. The text only contains three instances of \x5C\x22There\x5C\x22 without any actual paragraphs or information. If you could provide the actual content, I'd be happy to help you with summarizing it!\x22, \x22options\x22: {}, \x22keep_alive\x22: null}"

100.80.54.128 - - [20/Jun/2024:11:00:26 -0400] "POST /api/embeddings HTTP/1.1" 200 20635 "-" "ollama-python/0.1.8 (x86_64 linux) Python/3.10.12" "{\x22model\x22: \x22mxbai-embed-large\x22, \x22prompt\x22: \x22I apologize, but there are no paragraphs provided for me to summarize. The text only says \x5C\x22There\x5C\x22 and then nothing else. If you could provide the actual paragraphs, I'd be happy to help you with summarizing them!\x22, \x22options\x22: {}, \x22keep_alive\x22: null}"
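
For anyone reading these entries: the request bodies are logged with \x22 in place of double quotes and \x5C in place of backslashes. Below is a minimal Python sketch for turning one logged body back into JSON. It assumes the proxy writes each escaped byte as \xHH (nginx's default access-log escaping behaves this way); decode_logged_body and the shortened sample string are illustrative names and data, not part of Ollama or any other library.

```python
import json
import re

# Matches a literal backslash, "x", and two hex digits, e.g. \x22 or \x5C.
_HEX_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def decode_logged_body(escaped: str) -> dict:
    """Undo \\xHH escaping in a logged request body and parse it as JSON.

    Assumption: every escaped byte appears as \\xHH and the remaining
    characters are logged unchanged.
    """
    # Work on bytes so that escaped multi-byte UTF-8 sequences are rebuilt correctly.
    raw = _HEX_ESCAPE.sub(
        lambda m: bytes([int(m.group(1), 16)]),
        escaped.encode("utf-8"),
    )
    return json.loads(raw.decode("utf-8"))


# Shortened, hypothetical body in the same style as the entries above.
sample = (
    r"{\x22model\x22: \x22mxbai-embed-large\x22, "
    r"\x22prompt\x22: \x22There\x5CnThere\x22, "
    r"\x22options\x22: {}, \x22keep_alive\x22: null}"
)

body = decode_logged_body(sample)
print(body["model"])
print(repr(body["prompt"]))
```

Decoding the /api/chat bodies this way makes it easier to see that the user message carries only "There" lines as the paragraphs to summarize.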

rhudock, Jun 20 '24 15:06