
I want to add NeMo Guardrails as a layer on top of a vLLM-hosted LLM.

SinhaPrateek opened this issue 11 months ago • 5 comments

I want to add NeMo Guardrails as a layer on top of a vLLM-hosted LLM. That means I need to replace the OpenAI model with my vLLM-hosted LLM. One approach seems to be to update config.yml, but is this possible, and what values should engine and model take?

models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct

SinhaPrateek avatar Mar 12 '24 10:03 SinhaPrateek

@SinhaPrateek : yes, it is possible. See this example: https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/examples/configs/llama_guard/config.yml. There, the Llama Guard model is configured through vLLM, but you can use a similar configuration for the main LLM as well.

models:
- type: main
  engine: vllm_openai
  parameters:
    openai_api_base: "http://localhost:5000/v1"
    model_name: "..."
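
Once config.yml has a models entry like this, the rails can be used as usual. A minimal sketch, assuming the directory containing this config.yml is ./config:

from nemoguardrails import LLMRails, RailsConfig

# Load the guardrails configuration that points the main LLM at the
# vLLM OpenAI-compatible endpoint defined in config.yml.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Completions for the main model are now served by the vLLM server.
response = rails.generate(messages=[{"role": "user", "content": "Hello!"}])
print(response["content"])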

drazvan avatar Mar 20 '24 21:03 drazvan

Hi @drazvan - is there a way to set the openai_api_base, model_name, and authorization header values via environment variables? Specifically when engine: vllm_openai

ChuckHend avatar Aug 07 '24 14:08 ChuckHend

The only way I can think of is to initialize the LLM manually and register it as a custom LLM. To achieve this, you need to add a config.py at the root of your config directory with something along the lines of:

import os
from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.llm.helpers import get_llm_instance_wrapper
from nemoguardrails.llm.providers import register_llm_provider
from langchain_community.llms import VLLM

def init(llm_rails: LLMRails):
    config = llm_rails.config

    # Initialization copied from https://python.langchain.com/v0.2/docs/integrations/llms/vllm/
    llm = VLLM(
        model=os.environ.get("VLLM_MODEL_NAME"),    # <-- Fetch from env var like this
        trust_remote_code=True,  # mandatory for hf models
        max_new_tokens=128,
        top_k=10,
        top_p=0.95,
        temperature=0.8,
    )
    provider = get_llm_instance_wrapper(
        llm_instance=llm, llm_type="vllm_custom"
    )
    register_llm_provider("vllm_custom", provider)

This was not tested, but you should get the idea. Let me know if this works for you.
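
If you go this route, config.yml would then reference the registered provider by its name rather than a built-in engine. A rough sketch (the engine value must match the name passed to register_llm_provider above):

models:
  - type: main
    engine: vllm_custom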

drazvan avatar Aug 07 '24 19:08 drazvan

Thanks! That gave me another idea that also seems to work. Any concerns with doing it this way?

import os
from nemoguardrails.rails.llm.config import Model
from nemoguardrails import LLMRails, RailsConfig

CHAT_MODEL = os.environ["CHAT_MODEL"]
API_KEY = os.environ["API_KEY"]
BASE_URL = os.environ["BASE_URL"]

RAILS_MODEL = Model(
    type="main",
    engine="vllm_openai",
    model=CHAT_MODEL,
    parameters={
        "openai_api_base": BASE_URL,
        "openai_api_key": API_KEY,
        "model_name": CHAT_MODEL,
    },
)

base_config = RailsConfig.from_path("/path/to/config")
base_config.models = [RAILS_MODEL]
rails = LLMRails(base_config)

ChuckHend avatar Aug 07 '24 19:08 ChuckHend

I should add that the config.yaml under /path/to/config, in my example above, does not contain any values for models, but it does seem to work the way I want it to.
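
For reference, a hypothetical minimal config.yml used this way could be as small as the following, since the model is injected from Python:

# No models section; the main model is set programmatically via RailsConfig.models.
instructions:
  - type: general
    content: |
      You are a helpful assistant.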

ChuckHend avatar Aug 07 '24 20:08 ChuckHend

Hi @ChuckHend,

I'm glad I came across your post. Could you answer a few questions for me?

  1. What is CHAT_MODEL here? The name of the HF model?
  2. Is the vLLM model created under the hood? Is there any way to reuse a vLLM model already defined in earlier code?

Thanks

snassimr avatar Aug 12 '24 13:08 snassimr

What is CHAT_MODEL here? The name of the HF model?

Yes, CHAT_MODEL is the name of the model, e.g. meta-llama/Meta-Llama-3-8B-Instruct.

Is the vLLM model created under the hood? Is there any way to reuse a vLLM model already defined in earlier code?

vLLM is just running in a separate Docker container. I think the previous definition of vLLM would work.
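
If it helps, a quick sanity check that the vLLM container's OpenAI-compatible endpoint is reachable before pointing NeMo Guardrails at it might look like this (the URL and model name are placeholders for your deployment):

from openai import OpenAI

# vLLM serves an OpenAI-compatible API; the key is not validated, so any value works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="abc123")

completion = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(completion.choices[0].message.content)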

ChuckHend avatar Aug 12 '24 13:08 ChuckHend

@ChuckHend : this is a good solution as well. Thanks for sharing!

drazvan avatar Aug 12 '24 19:08 drazvan

Hi @ChuckHend and @drazvan,

I tried both solutions, and at some step I get an error due to a missing OPENAI_API_KEY (each solution produces a similar, but not exactly the same, error message):

ValidationError: 1 validation error for VLLMOpenAI __root__ Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)

I don't see how to drop this dependency. I want to work with an HF model inside nemoguardrails.

Is vLLM supported only for OpenAI, and not for vLLM with an HF model? I saw that HF is supported, but I want to use an HF model through vLLM.

Thank you for your help

snassimr avatar Aug 12 '24 23:08 snassimr

@snassimr, vLLM is an alternative to OpenAI. You could run vLLM on your local machine or on a cloud VM, for example. vLLM doesn't require the API key, but I think the NVIDIA/NeMo library uses the OpenAI Python client under the hood, and that DOES require some value to be set for OPENAI_API_KEY, even though vLLM does not check it. What I do is just set a dummy value, OPENAI_API_KEY=abc123, when using vLLM.
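
In code, that can be as simple as setting a placeholder before the rails are created. A minimal sketch (the config path is illustrative):

import os

from nemoguardrails import LLMRails, RailsConfig

# vLLM ignores the key, but the OpenAI client used under the hood refuses to
# run without one, so a dummy value is enough.
os.environ.setdefault("OPENAI_API_KEY", "abc123")

config = RailsConfig.from_path("/path/to/config")
rails = LLMRails(config)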

ChuckHend avatar Aug 12 '24 23:08 ChuckHend

@snassimr, did the previous definition of vLLM work for you? If yes, can you tell me how?

omarrayyman avatar Sep 16 '24 09:09 omarrayyman

Hi @omarrayyman,

I didn't succeed in dropping the dependency on OpenAI as ChuckHend proposed; I guess it's due to LangChain. Eventually, I switched to OpenAI for some other reasons. I didn't keep the working version with vLLM, but it did work, along with the OpenAI dependency as I described above.

snassimr avatar Sep 18 '24 21:09 snassimr