llama
Inference code for Llama models
## Describe the bug I was using meta-llama/Llama-2-7b-chat-hf from Hugging Face in a RAG model and it used to work perfectly, but then I suddenly received this error: ```...
How can I run inference in C?
I have requested access at https://llama.meta.com/llama-downloads/ and have waited over two weeks for access to the Llama models for my MS Thesis research, using both my university and personal Gmail...
In a multi-threaded situation, if the GPU server's resources are insufficient, will KV-cache preemption occur? For example, there are two conversations at the same time, both of which are...
What is the reason behind, and how do I fix, the error: ```shell RuntimeError: ProcessGroupNCCL is only supported with GPUs, no GPUs found! ``` ? I'm trying to run `example_text_completion.py` with:...
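The error above means the distributed process group was initialized with the NCCL backend, which requires CUDA GPUs; on a CPU-only machine the usual workaround is to fall back to the `gloo` backend. A minimal sketch, assuming a single-process run (the `MASTER_ADDR`/`MASTER_PORT`/`RANK`/`WORLD_SIZE` values are placeholder assumptions, not the repo's launch configuration):

```python
import os

def pick_backend(cuda_available: bool) -> str:
    """NCCL only supports GPUs; fall back to gloo on CPU-only hosts."""
    return "nccl" if cuda_available else "gloo"

if __name__ == "__main__":
    import torch
    import torch.distributed as dist

    # Single-process "cluster" so torchrun is not required for this sketch.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    os.environ.setdefault("RANK", "0")
    os.environ.setdefault("WORLD_SIZE", "1")

    dist.init_process_group(backend=pick_backend(torch.cuda.is_available()))
    print("initialized backend:", dist.get_backend())
    dist.destroy_process_group()
```

Note that `gloo` only sidesteps the initialization error; actually generating text from a 7B checkpoint on CPU will still be very slow, and the repo's examples assume at least one GPU per model-parallel shard.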
`bash download.sh` does not work.
**Before submitting a bug, please make sure the issue hasn't been already addressed by searching through the [FAQs](https://ai.meta.com/llama/faq/) and [existing/past issues](https://github.com/facebookresearch/llama/issues)** ## Describe the bug I only have 1 GPU,...
Fixed a small doc error "evaluation were also performed on third-party cloud compute --->> evaluation were also performed on third-party cloud comput**ing**"
## Describe the bug ### Minimal reproducible example ```python...
I use the code below to add about 100 tokens to the model and tokenizer, and the model doubles in size.
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("llama-7b-model")
model = AutoModel.from_pretrained("llama-7b-model")
...
```
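Adding ~100 tokens only grows the embedding (and LM-head) matrices by 100 rows, which is a negligible fraction of a 7B-parameter model, so a 2x jump on disk usually points to a dtype change instead, e.g. an fp16 checkpoint re-saved in fp32. A back-of-the-envelope sketch (the vocab/hidden/parameter counts are assumed Llama-7B-scale numbers, not measured from this user's checkpoint):

```python
def param_bytes(n_params: int, bytes_per_param: int) -> int:
    """Raw storage cost of n_params parameters at a given precision."""
    return n_params * bytes_per_param

VOCAB, HIDDEN, N_PARAMS = 32_000, 4_096, 7_000_000_000  # assumed 7B-scale shapes

# 100 extra embedding rows are tiny relative to the whole model:
extra_rows = param_bytes(100 * HIDDEN, 2)        # 100 new tokens in fp16
total_fp16 = param_bytes(N_PARAMS, 2)            # checkpoint stored in fp16
total_fp32 = param_bytes(N_PARAMS, 4)            # same weights upcast to fp32

print(f"new tokens add {extra_rows / total_fp16:.6%} of the fp16 size")
print(f"fp32 / fp16 size ratio: {total_fp32 / total_fp16:.1f}x")
```

If that is the cause here, passing `torch_dtype=torch.float16` to `from_pretrained` before resizing and re-saving should keep the checkpoint at its original size; this is a plausible diagnosis given the truncated snippet, not a confirmed one.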