
Groq Supported?

Open AGI-Bingo opened this issue 1 year ago • 8 comments

Hi there! Was wondering if this supports the Groq API out of the box, or has anyone already forked and applied it?

Lemme know if it's already implemented, or if I should make a PR.

Thanks and all the best! Awesome repo!

AGI-Bingo avatar May 04 '24 18:05 AGI-Bingo

I tried to make it work with Groq (Mixtral) via docker-compose, but I could not get it to work.

OpenAI does not allow prepaid credit cards anymore for some reason. (It's a Microsoft thing...) So I am locked out of using OpenAI now.

Claude and Groq would be great alternatives. But most AI software uses OpenAI by default, unfortunately.

I would like to see other LLMs implemented.

Ollama CPU inference via Docker was unfortunately unsuccessful :-(

I would love to see you support more LLMs.

mmuyakwa avatar May 05 '24 00:05 mmuyakwa

Hey guys, if an API supports the OpenAI standard it should be easy to add. I'll look into making it configurable.
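
For example, something along these lines could work, with the base URL read from the environment (just a minimal sketch; the OPENAI_BASE_URL variable name is an assumption here, not something the repo supports yet):

    import os

    from openai import AsyncOpenAI

    # The client reads OPENAI_API_KEY from the environment on its own; when
    # the base URL is unset it falls back to the official OpenAI endpoint.
    openai = AsyncOpenAI(
        base_url=os.environ.get("OPENAI_BASE_URL"),  # e.g. "https://api.groq.com/openai/v1"
    )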

vanpelt avatar May 05 '24 09:05 vanpelt

Here is some info Groq provides on this topic: https://console.groq.com/docs/openai

mmuyakwa avatar May 07 '24 18:05 mmuyakwa

My not-so-elegant way to get it running with Groq

But it works for me.

I brute-forced the code to run with Groq instead of OpenAI.

File assistants.py in the OpenAI package

In .venv/lib/python3.12/site-packages/openai/resources/beta/assistants.py I removed all OpenAI models and entered the model I chose to use with Groq.

.venv/lib/python3.12/site-packages/openai/resources/beta/assistants.py

I prefer mixtral-8x7b-32768 because of its large context window.

    def create(
        self,
        *,
        model: Union[
            str,
            Literal[
                "mixtral-8x7b-32768", # Replaced all OpenAI models with the Groq-model
            ],
        ],
        description: Optional[str] | NotGiven = NOT_GIVEN,
        instructions: Optional[str] | NotGiven = NOT_GIVEN,
        metadata: Optional[object] | NotGiven = NOT_GIVEN,
        ...

File server.py in the folder backend/openui

I set the base_url to the Groq API, removed the OpenAI client, and replaced it with the AsyncOpenAI client.

URL: "https://api.groq.com/openai/v1"

openai = AsyncOpenAI(
    base_url="https://api.groq.com/openai/v1"  # Groq's OpenAI-compatible endpoint; the key is read from OPENAI_API_KEY
)  # alternative for local Ollama: AsyncOpenAI(base_url="http://127.0.0.1:11434/v1")
ollama = AsyncClient()
router = APIRouter()
session_store = DBSessionStore()
github_sso = GithubSSO(
    config.GITHUB_CLIENT_ID, config.GITHUB_CLIENT_SECRET, f"{config.HOST}/v1/callback"
)

Changes in the chat_completions function

backend/openui/server.py

Then I removed the if-statements checking whether the name contains "openai" and only left the model I chose to use with Groq.

async def chat_completions(
    request: Request,
    # chat_request: CompletionCreateParams,  # TODO: lots of weirdness here, just using raw JSON
    # ctx: Any = Depends(weave_context),
):
    if request.session.get("user_id") is None:
        raise HTTPException(status_code=401, detail="Login required to use OpenUI")
    user_id = request.session["user_id"]
    yesterday = datetime.now() - timedelta(days=1)
    tokens = Usage.tokens_since(user_id, yesterday.date())
    if config.ENV == config.Env.PROD and tokens > config.MAX_TOKENS:
        raise HTTPException(
            status_code=429,
            detail="You've exceeded our usage quota, come back tomorrow to generate more UI.",
        )
    try:
        data = await request.json()  # chat_request.model_dump(exclude_unset=True)
        input_tokens = count_tokens(data["messages"])
        # TODO: we always assume the model's full context window (random fudge factor here)
        data["max_tokens"] = 32768 - input_tokens - 20  # changed from 4096 to 32768 for Mixtral's context window
        if data.get("model").startswith("mixtral"):
            response: AsyncStream[ChatCompletionChunk] = (
                await openai.chat.completions.create(
                    **data,
                )
            )
            # gpt-4 tokens are 20x more expensive
            multiplier = 20  # changed from 1 to 20
            return StreamingResponse(
                openai_stream_generator(response, input_tokens, user_id, multiplier),
                media_type="text/event-stream",
            )
        raise HTTPException(status_code=404, detail="Invalid model")
    except (ResponseError, APIStatusError) as e:
        traceback.print_exc()
        logger.exception("Known Error: %s", str(e))
        msg = str(e)
        if hasattr(e, "message"):
            msg = e.message
        raise HTTPException(status_code=e.status_code, detail=msg)

Finally, I entered the Groq API key as "OPENAI_API_KEY":

export OPENAI_API_KEY="gsk_..."

mmuyakwa avatar May 07 '24 20:05 mmuyakwa

I was able to fork and add support for Groq and other OpenAI-compatible APIs via two new environment variables: GROQ_API_KEY and OPENAI_BASE_URL.

https://github.com/tlo9/openui
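
In spirit, the two variables are wired up roughly like this (a simplified sketch, not necessarily the fork's exact code):

    import os

    from openai import AsyncOpenAI

    # OPENAI_BASE_URL points the regular client at any OpenAI-compatible API;
    # when unset, the client uses the official OpenAI endpoint.
    openai = AsyncOpenAI(base_url=os.environ.get("OPENAI_BASE_URL"))

    # GROQ_API_KEY enables a second client for Groq's OpenAI-compatible endpoint.
    groq_api_key = os.environ.get("GROQ_API_KEY")
    groq = (
        AsyncOpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
        if groq_api_key
        else None
    )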

tlo9 avatar May 08 '24 06:05 tlo9

Seems like this is in the bag.

@tlo9, can you make https://api.groq.com/openai/v1 be used automatically if the model name contains "groq/***"?

(Or at least add the URL to the readme and docs. Ideally both.)
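
Roughly something like this, I imagine (a hypothetical sketch; pick_client and the two client objects are made-up names, assumed to be AsyncOpenAI instances):

    # Route "groq/..." model names to the Groq client, everything else to
    # the default OpenAI client; strip the prefix before the API call.
    def pick_client(model: str):
        if model.startswith("groq/"):
            return groq, model.removeprefix("groq/")
        return openai, model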

Other than that, I think you should make a pull request so it can be reviewed and merged quickly, unless @vanpelt, @mmuyakwa, or someone else has a more complete/configurable solution.

Awesome work, guys! Happy to see this is working. Hope to see this merged soon.

fire17 avatar May 09 '24 17:05 fire17

I forgot to mention that I included another environment variable, GROQ_BASE_URL, that defaults to https://api.groq.com/openai/v1. The Groq models will appear in the settings if a valid GROQ_API_KEY is set. I'll update the readme with this info.
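
For reference, populating that list boils down to something like this (an illustrative sketch using the variable names above; the fork's actual code may differ):

    import os

    from openai import AsyncOpenAI

    GROQ_BASE_URL = os.environ.get("GROQ_BASE_URL", "https://api.groq.com/openai/v1")

    async def groq_models() -> list[str]:
        api_key = os.environ.get("GROQ_API_KEY")
        if not api_key:
            return []  # no key -> no Groq models offered in the settings
        client = AsyncOpenAI(api_key=api_key, base_url=GROQ_BASE_URL)
        # Auto-paginate through the OpenAI-compatible /models endpoint.
        return [m.id async for m in client.models.list()]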

tlo9 avatar May 10 '24 23:05 tlo9

@tlo9 nice :) So now just waiting on a PR. Please reference this issue :)

AGI-Bingo avatar May 11 '24 13:05 AGI-Bingo