John D. Kanu
LLaMA-2 models have a maximum input size of 4096 tokens [[original paper](https://arxiv.org/pdf/2307.09288.pdf), [meta llama github repo](https://github.com/meta-llama/llama/issues/267#issuecomment-1659440955)]. When prompting `meta/llama-2-70b` through Replicate, however, the maximum input size is, strangely,...
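One way to pin down where the limit actually sits is to count the prompt's tokens locally before sending it. This is a minimal sketch, assuming the Llama-2 tokenizer hosted on the Hugging Face hub (`meta-llama/Llama-2-70b-hf` is a gated repo that requires approved access); any Llama-2-compatible SentencePiece tokenizer should give the same counts.

```
# Sketch: count prompt tokens locally to compare against the
# documented 4096-token context window. The hub repo name is an
# assumption and requires gated access.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-hf")

prompt = "Q: What is 10*10? A: "  # any prompt about to be sent
n_tokens = len(tokenizer.encode(prompt))
print(f"prompt is {n_tokens} tokens; documented limit is 4096")
```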
I am getting an error that the prompt length exceeds the maximum input length when calling `meta/llama-2-70b` through the API. I have included the error log from the Replicate dashboard...
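If the prompt really is over the limit, one client-side workaround (a common pattern, not something Replicate's docs prescribe) is to clip the prompt to the model's context window before calling the API. The cutoff value and tokenizer below are the same assumptions as in the sketch above.

```
# Sketch: keep only the last MAX_INPUT_TOKENS tokens of an over-long
# prompt. Keeping the tail preserves the final question; the cutoff
# is an assumption based on the documented 4096-token limit.
from transformers import AutoTokenizer

MAX_INPUT_TOKENS = 4096

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-hf")

def truncate_prompt(prompt: str) -> str:
    ids = tokenizer.encode(prompt)
    if len(ids) <= MAX_INPUT_TOKENS:
        return prompt
    return tokenizer.decode(ids[-MAX_INPUT_TOKENS:], skip_special_tokens=True)
```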
Calls to `meta/llama-2-70b` are sometimes succeeding and sometimes failing; the behavior is very unreliable. This is the code:

```
output = replicate.run(
    "meta/llama-2-70b",
    input={
        "prompt": "Q: Would a pear sink in...
```
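For intermittent failures like this, a common client-side mitigation is a retry loop with exponential backoff. The sketch below is not an official Replicate pattern: the attempt count and delays are arbitrary, and it catches `Exception` broadly because the exception classes raised can vary by client version. A persistent error (for example, a genuinely over-long prompt) will still fail after the retries are exhausted.

```
import time

import replicate

def run_with_retries(model: str, model_input: dict, attempts: int = 3):
    """Retry replicate.run with exponential backoff between attempts."""
    for attempt in range(attempts):
        try:
            return replicate.run(model, input=model_input)
        except Exception as exc:  # exception classes vary by client version
            if attempt == attempts - 1:
                raise
            print(f"attempt {attempt + 1} failed ({exc}); retrying")
            time.sleep(2 ** attempt)  # 1s, 2s, ... between attempts

output = run_with_retries(
    "meta/llama-2-70b",
    {"prompt": "Q: What is 10*10? A: "},
)
```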
Running this code:

```
import os

import replicate
from dotenv import load_dotenv

load_dotenv()
# The replicate client reads REPLICATE_API_TOKEN from the environment.
REPLICATE_API_TOKEN = os.getenv("REPLICATE_API_TOKEN")

prompt = "Q: What is 10*10? A: "
output = replicate.run(
    "meta/llama-2-7b",
    input={"prompt": prompt},
)
```
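Assuming the call succeeds, note that `replicate.run` on the Llama-2 models returns an iterator of text chunks rather than a single string, so the pieces need to be joined (or printed as they stream) to see the whole completion:

```
# Continues from the snippet above: `output` is an iterator of strings.
print("".join(output))
```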