
How to use on GPU?

phiweger opened this issue 2 years ago · 2 comments

Very interesting library @r2d4 !

I am trying to use the example in the README but with the model being on the GPU (as is required for many of the recent larger LLMs):

import regex
from transformers import AutoModelForCausalLM, AutoTokenizer

from rellm import complete_re

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "ReLLM, the best way to get structured data out of LLMs, is an acronym for "
pattern = regex.compile(r'Re[a-z]+ L[a-z]+ L[a-z]+ M[a-z]+')

# THIS IS WHAT I'D LIKE TO DO
device = "cuda:0"
model.to(device)

output = complete_re(tokenizer=tokenizer, 
                     model=model, 
                     prompt=prompt,
                     pattern=pattern,
                     do_sample=True,
                     max_new_tokens=80)
print(output)

fails with

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

Is it possible to use ReLLM with the model living on the GPU?
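For what it's worth, the traceback points at PyTorch's general same-device rule rather than anything ReLLM-specific: the model's weights live on `cuda:0`, while (presumably) the input ids tokenized inside `complete_re` stay on the CPU. A minimal sketch of the same mismatch, using plain PyTorch with no rellm involved:

```python
import torch

# Minimal reproduction of the same-device rule behind the traceback above.
emb = torch.nn.Embedding(10, 4)   # stand-in for the model's token embedding
ids = torch.tensor([1, 2, 3])     # stand-in for tokenized prompt ids (on CPU)

out = emb(ids)                    # fine: weights and ids are both on the CPU

if torch.cuda.is_available():
    emb = emb.to("cuda:0")        # weights move to the GPU...
    try:
        emb(ids)                  # ...but the ids are still on the CPU
    except RuntimeError as e:
        print(e)                  # "Expected all tensors to be on the same device..."
    out = emb(ids.to("cuda:0"))   # fix: move the inputs to the model's device too

print(out.shape)
```

So moving the model alone isn't enough; a proper fix would need rellm to place the tensors it creates on `model.device` before calling the model.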

phiweger avatar Jul 15 '23 17:07 phiweger

related to #6 I guess

phiweger avatar Jul 15 '23 18:07 phiweger

I am having this same issue. Any help please?

Emekaborisama avatar Dec 21 '23 01:12 Emekaborisama