beyondllm
Added Cerebras LLM for ~2000 TPS inference
Purpose
Cerebras, using its 3rd-generation Wafer-Scale Engine (WSE-3), can run inference at roughly 2000 tokens per second (TPS).
This PR adds support for the Cerebras API via the cerebras-cloud-sdk package.
Usage
import os
from beyondllm.llms.cerebras import CerebrasModel
os.environ['CEREBRAS_API_KEY'] = "YOUR_CEREBRAS_API_KEY"
llm = CerebrasModel()
# Generate a prediction
prompt = "Tell me about machine learning"
response = llm.predict(prompt)
print(response)
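Under the hood, predict() presumably maps the prompt onto a Cerebras chat-completions request via cerebras-cloud-sdk. As a rough, stdlib-only sketch of that request shape (the helper name and the default model name here are assumptions, not part of the actual wrapper):

```python
# Hedged sketch of the payload a Cerebras chat-completion call expects.
# build_chat_request is a hypothetical helper; "llama3.1-8b" is an assumed
# default model name, not necessarily what CerebrasModel uses.
def build_chat_request(prompt: str, model: str = "llama3.1-8b") -> dict:
    """Build a chat-completions payload from a single user prompt."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

request = build_chat_request("Tell me about machine learning")
print(request["messages"][0]["role"])
```

In the real wrapper, this payload would be passed to the SDK client and the assistant message extracted from the response.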
@lucifertrj