NeMo-Guardrails
I want to add NeMo Guardrails as a layer on top of a vLLM-hosted LLM.
I want to add NeMo Guardrails as a layer on top of a vLLM-hosted LLM. That means I need to replace the OpenAI model with my vLLM-hosted LLM. One approach seems to be to update the config.yml, but is it possible, and what values should `engine` and `model` take?

```yaml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
```
@SinhaPrateek : yes, it is possible. See this example: https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/examples/configs/llama_guard/config.yml. There, the Llama Guard model is configured through vLLM, but you can use a similar configuration for the main LLM as well.
```yaml
models:
  - type: main
    engine: vllm_openai
    parameters:
      openai_api_base: "http://localhost:5000/v1"
      model_name: "..."
```
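For reference, a config like this is loaded the same way as any other; a rough sketch, with placeholder path and prompt:

```python
from nemoguardrails import LLMRails, RailsConfig

# Load the config directory containing the config.yml above (placeholder path).
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Calls to the main LLM now go to the vLLM OpenAI-compatible endpoint.
response = rails.generate(messages=[{"role": "user", "content": "Hello!"}])
print(response["content"])
```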
Hi @drazvan - is there a way to set the `openai_api_base`, `model_name`, and authorization header values via environment variables? Specifically when `engine: vllm_openai`.
The only way I can think of is to initialize the LLM manually and register it as a custom LLM.
To achieve this, you need to add a `config.py` at the root of your config directory with something along these lines:
```python
import os

from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.llm.helpers import get_llm_instance_wrapper
from nemoguardrails.llm.providers import register_llm_provider
from langchain_community.llms import VLLM


def init(llm_rails: LLMRails):
    config = llm_rails.config

    # Initialization copied from https://python.langchain.com/v0.2/docs/integrations/llms/vllm/
    llm = VLLM(
        model=os.environ.get("VLLM_MODEL_NAME"),  # <-- Fetch from env var like this
        trust_remote_code=True,  # mandatory for hf models
        max_new_tokens=128,
        top_k=10,
        top_p=0.95,
        temperature=0.8,
    )

    provider = get_llm_instance_wrapper(llm_instance=llm, llm_type="vllm_custom")
    register_llm_provider("vllm_custom", provider)
```
This was not tested, but you should get the idea. Let me know if this works for you.
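Presumably, the `config.yml` would then need to reference the registered provider name, something along these lines (also untested):

```yaml
models:
  - type: main
    engine: vllm_custom
```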
Thanks! That gave me another idea which ended up working too, it seems. Any concerns doing it this way?
```python
import os

from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.rails.llm.config import Model

CHAT_MODEL = os.environ["CHAT_MODEL"]
API_KEY = os.environ["API_KEY"]
BASE_URL = os.environ["BASE_URL"]

RAILS_MODEL = Model(
    type="main",
    engine="vllm_openai",
    model=CHAT_MODEL,
    parameters={
        "openai_api_base": BASE_URL,
        "openai_api_key": API_KEY,
        "model_name": CHAT_MODEL,
    },
)

base_config = RailsConfig.from_path("/path/to/config")
base_config.models = [RAILS_MODEL]
rails = LLMRails(base_config)
```
I should add that `/path/to/config.yaml`, in my example above, does not contain any values for `models`, but it does seem to work the way I want it to.
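For illustration, such a config might contain only general instructions and rails, with no `models` entry at all; a hypothetical minimal example:

```yaml
# Hypothetical minimal config.yml without a `models` section;
# the model is supplied programmatically as shown above.
instructions:
  - type: general
    content: |
      You are a helpful assistant answering user questions.
```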
> Thanks! That gave me another idea which ended up working too, it seems. Any concerns doing it this way?
Hi @ChuckHend,
Great that I came across your post. Could you answer some questions below?
- What is `CHAT_MODEL` here? The name of the HF model?
- Is the vLLM model created under the hood? Is there any way to use a vLLM model already defined in previous code?
Thanks
> What is `CHAT_MODEL` here? The name of the HF model?

Yes, `CHAT_MODEL` is the name of the model, e.g. `meta-llama/Meta-Llama-3-8B-Instruct`.
> Is the vLLM model created under the hood? Is there any way to use a vLLM model already defined in previous code?

vLLM is just running in a separate Docker container. I think the previous definition of vLLM would work.
> Thanks! That gave me another idea which ended up working too, it seems.
@ChuckHend: this is a good solution as well. Thanks for sharing!
Hi @ChuckHend and @drazvan,
I tried both solutions, and at some step I get an error due to a missing OPENAI_API_KEY (each solution gives a similar, but not exactly identical, error message):

```
ValidationError: 1 validation error for VLLMOpenAI
__root__
  Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)
```

I don't see a way to drop this dependency. I want to work with an HF model inside nemoguardrails.
Is vLLM supported only through OpenAI, and not vLLM with an HF model? I saw that HF is supported, but I want to use an HF model through vLLM.
Thank you for your help
@snassimr, vLLM is an alternative to OpenAI. You could run vLLM on your local machine or on a cloud VM, for example. vLLM itself doesn't require an API key, but I think the NVIDIA/NeMo library uses the OpenAI Python client under the hood, and that DOES require some value to be set for OPENAI_API_KEY. What I do is just set a dummy value, e.g. `OPENAI_API_KEY=abc123`, when using vLLM.
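For example, one way to do that from Python before initializing the rails (the value itself is arbitrary):

```python
import os

# vLLM's OpenAI-compatible server ignores the API key, but the OpenAI
# client used under the hood still expects some value to be present.
os.environ.setdefault("OPENAI_API_KEY", "abc123")  # arbitrary dummy value
```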
@snassimr, did the previous definition of vLLM work for you? If yes, can you tell me how?
Hi @omarrayyman,
I didn't succeed in dropping the dependency on OpenAI as ChuckHend proposed; I guess it's due to LangChain. Eventually, I switched to OpenAI for some other reasons. I didn't keep a working version with vLLM, but it did work, along with the OpenAI dependency as I described above.