John D. Kanu
LLaMA-2 models have a maximum input size of 4096 tokens [[original paper](https://arxiv.org/pdf/2307.09288.pdf), [meta llama github repo](https://github.com/meta-llama/llama/issues/267#issuecomment-1659440955)]. When prompting `meta/llama-2-70b` through Replicate, however, the maximum input size is, strangely,...
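One way to pin down where the limit actually sits is to count the prompt's tokens locally before sending it. This is a minimal sketch, assuming the Llama-2 tokenizer hosted on the Hugging Face hub (`meta-llama/Llama-2-70b-hf` is a gated repo that requires approved access); any Llama-2-compatible SentencePiece tokenizer should give the same counts.

```
# Sketch: count prompt tokens locally to compare against the
# documented 4096-token context window. The hub repo name is an
# assumption and requires gated access.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-hf")

prompt = "Q: What is 10*10? A: "  # any prompt about to be sent
n_tokens = len(tokenizer.encode(prompt))
print(f"prompt is {n_tokens} tokens; documented limit is 4096")
```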
I am getting an error that the prompt length exceeds the maximum input length when calling `meta/llama-2-70b` through the API. I have included the error log from the Replicate dashboard...
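If the prompt really is over the limit, one client-side workaround (a common pattern, not something Replicate's docs prescribe) is to clip the prompt to the model's context window before calling the API. The cutoff value and tokenizer below are the same assumptions as in the sketch above.

```
# Sketch: keep only the last MAX_INPUT_TOKENS tokens of an over-long
# prompt. Keeping the tail preserves the final question; the cutoff
# is an assumption based on the documented 4096-token limit.
from transformers import AutoTokenizer

MAX_INPUT_TOKENS = 4096

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-hf")

def truncate_prompt(prompt: str) -> str:
    ids = tokenizer.encode(prompt)
    if len(ids) <= MAX_INPUT_TOKENS:
        return prompt
    return tokenizer.decode(ids[-MAX_INPUT_TOKENS:], skip_special_tokens=True)
```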
Calls to `meta/llama-2-70b` are sometimes succeeding and sometimes failing; the behavior is very unreliable. This is the code:

```
output = replicate.run(
    "meta/llama-2-70b",
    input={
        "prompt": "Q: Would a pear sink in...
```
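For intermittent failures like this, a common client-side mitigation is a retry loop with exponential backoff. The sketch below is not an official Replicate pattern: the attempt count and delays are arbitrary, and it catches `Exception` broadly because the exception classes raised can vary by client version. A persistent error (for example, a genuinely over-long prompt) will still fail after the retries are exhausted.

```
import time

import replicate

def run_with_retries(model: str, model_input: dict, attempts: int = 3):
    """Retry replicate.run with exponential backoff between attempts."""
    for attempt in range(attempts):
        try:
            return replicate.run(model, input=model_input)
        except Exception as exc:  # exception classes vary by client version
            if attempt == attempts - 1:
                raise
            print(f"attempt {attempt + 1} failed ({exc}); retrying")
            time.sleep(2 ** attempt)  # 1s, 2s, ... between attempts

output = run_with_retries(
    "meta/llama-2-70b",
    {"prompt": "Q: What is 10*10? A: "},
)
```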
Running this code:

```
import os

import replicate
from dotenv import load_dotenv

load_dotenv()
# The replicate client reads REPLICATE_API_TOKEN from the environment.
REPLICATE_API_TOKEN = os.getenv("REPLICATE_API_TOKEN")

prompt = "Q: What is 10*10? A: "
output = replicate.run(
    "meta/llama-2-7b",
    input={"prompt": prompt},
)
```
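Assuming the call succeeds, note that `replicate.run` on the Llama-2 models returns an iterator of text chunks rather than a single string, so the pieces need to be joined (or printed as they stream) to see the whole completion:

```
# Continues from the snippet above: `output` is an iterator of strings.
print("".join(output))
```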