NeMo-Guardrails
Error during LLMRails initialization
Hello, I am facing an error when initializing LLMRails.
Here is the error: srun: error: task 0: Illegal instruction (core dumped).
I think the error is related to asyncio. After a little debugging, it seems to occur in the code preceded by the comment # NOTE: this should be very fast, otherwise needs to be moved to separate thread. in the file generation.py. The lines of code all start with await.
The code fails on a GPU node of a supercomputer, whereas it works on my local machine (which has no GPU).
Both the LLM and the SentenceTransformers model are loaded from local paths.
Here is my code:
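# Imports were not shown in the original post; these are assumed to match
# the NeMo Guardrails HuggingFace pipeline example.
import asyncio
from functools import lru_cache

from torch.cuda import device_count

from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.llm.helpers import get_llm_instance_wrapper
from nemoguardrails.llm.providers import (
    HuggingFacePipelineCompatible,
    register_llm_provider,
)


@lru_cache
def get_model():
    repo_id = "/local_llm/dolly-v2-3b"
    params = {
        "temperature": 0,
        "max_length": 500
    }
    # Use the first CUDA-enabled GPU, if any
    device = 0 if device_count() else -1
    llm = HuggingFacePipelineCompatible.from_model_id(
        model_id=repo_id,
        device=device,
        task="text-generation",
        model_kwargs=params,
    )
    return llm


HFPipeline = get_llm_instance_wrapper(
    llm_instance=get_model(), llm_type="hf_pipeline"
)
register_llm_provider("hf_pipeline", HFPipeline)

config = RailsConfig.from_path("config")  # up to this line, everything works
rails = LLMRails(config)


async def get_res():
    res = await rails.generate_async(prompt="Hello")
    print(res)


asyncio.run(get_res())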
- config/config.yml:
models:
  - type: main
    engine: hf_pipeline
  - type: embeddings
    engine: SentenceTransformers
    model: /local_llm/all-MiniLM-L6-v2
- config/rails.co:
define user express greeting
  "Hello"
  "Hi"

define user ask capabilities
  "What can you do?"
  "What can you help me with?"
  "tell me what you can do"
  "tell me about you"

define flow
  user express greeting
  bot express greeting

define flow
  user ask capabilities
  bot inform capabilities

define bot inform capabilities
  "I am an AI assistant and I'm here to help."
Hi @thomasbtnfr ! We've seen this before related to the installation of annoy. Try updating your deployment to install annoy explicitly after installing nemoguardrails:
pip install --force --no-binary :all: annoy==1.17.1
Let me know if this works.
Thank you @drazvan but it still doesn't work. I tested on 2 different GPUs:
- A100: no matter the version of annoy, I get the error Illegal instruction (core dumped)
- V100: it works regardless of the annoy version (I didn't test with it before)
However, I would need it to work on A100...
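Since the crash only shows up on one node type, one way to narrow it down is to exercise each native dependency in its own child process, so that an illegal instruction kills only the child and the signal can be read from its return code. The following is a minimal diagnostic sketch, not part of NeMo Guardrails: the helper names run_isolated, describe and check are made up for illustration, the snippets in SNIPPETS are only examples, and it assumes a POSIX system where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys
from typing import Dict, Tuple

# Illustrative snippets; adjust to the packages installed on the node.
SNIPPETS: Dict[str, str] = {
    "annoy import": "import annoy",
    "annoy build": (
        "from annoy import AnnoyIndex\n"
        "idx = AnnoyIndex(8, 'angular')\n"
        "idx.add_item(0, [0.0] * 8)\n"
        "idx.build(1)\n"
    ),
    "torch import": "import torch",
    "sentence_transformers import": "import sentence_transformers",
}


def run_isolated(snippet: str, timeout: float = 300.0) -> Tuple[int, str]:
    """Run a Python snippet in a child process and return (returncode, stderr).

    -X faulthandler makes the child dump its Python traceback to stderr
    when it receives a fatal signal such as SIGILL or SIGSEGV.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe(returncode: int) -> str:
    """Translate a subprocess return code into a readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code is the number of the killing signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def check(snippets: Dict[str, str]) -> Dict[str, str]:
    """Run every snippet in isolation and print the status of each one."""
    results = {}
    for name, snippet in snippets.items():
        code, stderr = run_isolated(snippet)
        results[name] = describe(code)
        print(f"{name}: {results[name]}")
        if code != 0:
            print(stderr.strip())
    return results
```

Running check(SNIPPETS) inside the same srun allocation on both the A100 and the V100 nodes would show which step, if any, is "killed by SIGILL". If the annoy steps pass on the A100 node, the crash probably comes from a different compiled extension.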
Hi @thomasbtnfr! Did you make any progress on this? I can guide you to use a mock EmbeddingSearchProvider (which won't use annoy) to check if that's the issue.
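To illustrate what such a mock could look like, here is a rough, self-contained sketch of an index that ranks the example utterances by plain word overlap, so neither annoy nor an embedding model is ever loaded. The names IndexItem, MockEmbeddingsIndex, add_item, add_items, build and search are placeholders chosen for this sketch and are not confirmed NeMo Guardrails API; plugging a class like this in depends on the embedding search provider interface of the installed version.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IndexItem:
    """Hypothetical stand-in for an indexed example: its text plus metadata."""

    text: str
    meta: Dict = field(default_factory=dict)


class MockEmbeddingsIndex:
    """Hypothetical index that ranks items by the number of shared words.

    It uses only the standard library, so it cannot trigger a crash in a
    compiled extension such as annoy.
    """

    def __init__(self) -> None:
        self._items: List[IndexItem] = []

    def add_item(self, item: IndexItem) -> None:
        self._items.append(item)

    def add_items(self, items: List[IndexItem]) -> None:
        self._items.extend(items)

    def build(self) -> None:
        # Nothing to precompute: search scans the stored items directly.
        pass

    def search(self, text: str, max_results: int) -> List[IndexItem]:
        query = set(text.lower().split())
        scored = [
            (len(query & set(item.text.lower().split())), item)
            for item in self._items
        ]
        # Keep only items sharing at least one word; sorted() is stable,
        # so ties stay in insertion order.
        ranked = sorted(
            (pair for pair in scored if pair[0] > 0),
            key=lambda pair: pair[0],
            reverse=True,
        )
        return [item for _, item in ranked[:max_results]]
```

If initialization succeeds on the A100 node with a stand-in like this in place of the default index, that would point at annoy (or the embedding model) as the source of the illegal instruction.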