Torch + CUDA on Windows 11 with the latest NVIDIA drivers and the latest CUDA toolkit.
FYI:
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
Change the device_map to cuda in the text_model.py file.
From cpu:
class TextModel:
    def __init__(self, model_path: str = "model") -> None:
        super().__init__()
        self.tokenizer = Tokenizer.from_pretrained(f"{model_path}/tokenizer")
        phi_config = PhiConfig.from_pretrained(f"{model_path}/text_model_cfg.json")
        with init_empty_weights():
            self.model = PhiForCausalLM(phi_config)
        self.model = load_checkpoint_and_dispatch(
            self.model,
            f"{model_path}/text_model.pt",
            device_map={"": "cpu"},
        )
To cuda:
class TextModel:
    def __init__(self, model_path: str = "model") -> None:
        super().__init__()
        self.tokenizer = Tokenizer.from_pretrained(f"{model_path}/tokenizer")
        phi_config = PhiConfig.from_pretrained(f"{model_path}/text_model_cfg.json")
        with init_empty_weights():
            self.model = PhiForCausalLM(phi_config)
        self.model = load_checkpoint_and_dispatch(
            self.model,
            f"{model_path}/text_model.pt",
            device_map={"": "cuda"},
        )
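As a middle ground, rather than hard-coding "cuda", the mapping can be picked at runtime so the same text_model.py still works on machines without a GPU. A minimal sketch (this is my suggestion, not code from the repo):

```python
# Sketch: build the device_map at runtime instead of hard-coding "cuda",
# falling back to CPU when no GPU (or no torch) is available.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    # torch missing entirely; default to the CPU mapping
    device = "cpu"

device_map = {"": device}
print(device_map)
```

The resulting `device_map` can then be passed to `load_checkpoint_and_dispatch` in place of the hard-coded dict above.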
Thanks for sharing your amazing work!
Thank you for trying it out! Is the ask here to use the GPU when available, or are you seeing a failure when you try this?
I'm not sure myself, but I followed the instructions in the repo and mine defaults to CPU, while on Pinokio it works with the GPU. So I tried this, and it is a solution for me.
On Win10, after installing requirements.txt, I installed this torch build:
pip3 install -U torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
I also changed float32 to float16 in vision_encoder.py, in addition to this change in text_model.py:
self.model = load_checkpoint_and_dispatch(
    self.model,
    f"{model_path}/text_model.pt",
    device_map={"": "cuda:0"},
    dtype=torch.float16,
)
Now it runs on about 4-5 GB of VRAM, but for some reason I now have to downscale images to fit into the Gradio demo. On CPU I could use ~2 MB images; now only ~800 KB.
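The VRAM drop is what you'd expect from halving the weight precision. A back-of-envelope check (the ~1.6B parameter count here is an assumption for illustration, not a figure from the repo):

```python
# Rough VRAM estimate for the model weights alone; activations and the
# KV cache come on top of this. Parameter count is assumed, for illustration.
params = 1.6e9
fp32_gb = params * 4 / 1024**3  # 4 bytes per float32 weight
fp16_gb = params * 2 / 1024**3  # 2 bytes per float16 weight
print(f"fp32: {fp32_gb:.1f} GB, fp16: {fp16_gb:.1f} GB")
# prints "fp32: 6.0 GB, fp16: 3.0 GB"
```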
I have to downscale images to fit into gradio demo after doing this
Interesting, we downscale the image pretty early in the pipeline so I'm not sure what's causing it. Will dig in later!
https://github.com/vikhyat/moondream/blob/main/moondream/vision_encoder.py#L22
I wasn't able to get this working on Windows 10. Here's the error I get:
C:\Users\Tristan\Documents\!Hugging Face\moondream\env\lib\site-packages\transformers\utils\generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
C:\Users\Tristan\Documents\!Hugging Face\moondream\env\lib\site-packages\transformers\utils\generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
C:\Users\Tristan\Documents\!Hugging Face\moondream\env\lib\site-packages\transformers\utils\generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
Using device: cuda
If you run into issues, pass the --cpu flag to this script.
Traceback (most recent call last):
File "C:\Users\Tristan\Documents\!Hugging Face\moondream\sample.py", line 35, in <module>
moondream = Moondream.from_pretrained(model_id).to(device=device, dtype=dtype)
File "C:\Users\Tristan\Documents\!Hugging Face\moondream\env\lib\site-packages\transformers\modeling_utils.py", line 3462, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "C:\Users\Tristan\Documents\!Hugging Face\moondream\moondream\moondream.py", line 15, in __init__
self.text_model = TextModel(config)
File "C:\Users\Tristan\Documents\!Hugging Face\moondream\moondream\text_model.py", line 12, in __init__
self.tokenizer = Tokenizer.from_pretrained(f"{model_path}/tokenizer")
NameError: name 'Tokenizer' is not defined
(env) C:\Users\Tristan\Documents\!Hugging Face\moondream>
Here's my pip list:
(env) C:\Users\Tristan\Documents\!Hugging Face\moondream>pip list
Package Version
------------------------- ------------------------
accelerate 0.25.0
aiofiles 23.2.1
altair 5.2.0
annotated-types 0.6.0
anyio 4.2.0
attrs 23.2.0
certifi 2023.11.17
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
contourpy 1.2.0
cycler 0.12.1
einops 0.7.0
exceptiongroup 1.2.0
fastapi 0.109.0
ffmpy 0.3.1
filelock 3.13.1
fonttools 4.47.2
fsspec 2023.12.2
gradio 4.15.0
gradio_client 0.8.1
h11 0.14.0
httpcore 1.0.2
httpx 0.26.0
huggingface-hub 0.20.1
idna 3.6
importlib-resources 6.1.1
Jinja2 3.1.3
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
markdown-it-py 3.0.0
MarkupSafe 2.1.4
matplotlib 3.8.2
mdurl 0.1.2
mpmath 1.3.0
networkx 3.2.1
numpy 1.26.3
orjson 3.9.12
packaging 23.2
pandas 2.2.0
Pillow 10.1.0
pip 23.3.2
psutil 5.9.8
pydantic 2.6.0
pydantic_core 2.16.1
pydub 0.25.1
Pygments 2.17.2
pyparsing 3.1.1
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2023.4
PyYAML 6.0.1
referencing 0.33.0
regex 2023.12.25
requests 2.31.0
rich 13.7.0
rpds-py 0.17.1
ruff 0.1.15
safetensors 0.4.2
semantic-version 2.10.0
setuptools 63.2.0
shellingham 1.5.4
six 1.16.0
sniffio 1.3.0
starlette 0.35.1
sympy 1.12
timm 0.9.12
tokenizer 3.4.3
tomlkit 0.12.0
toolz 0.12.1
torch 2.3.0.dev20240122+cu121
torchaudio 2.2.0.dev20240123+cu121
torchvision 0.18.0.dev20240123+cu121
tqdm 4.66.1
transformers 4.36.2
typer 0.9.0
typing_extensions 4.9.0
tzdata 2023.4
urllib3 2.2.0
uvicorn 0.27.0.post1
websockets 11.0.3
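One possible cause of the NameError, if I'm reading the repo correctly: moondream's text_model.py loads `Tokenizer` from the Hugging Face `tokenizers` library, but the pip list above shows the unrelated `tokenizer` package (3.4.3) instead, which would leave that import failing. A quick check:

```python
# `NameError: name 'Tokenizer' is not defined` means the name was never
# imported. moondream needs the Hugging Face `tokenizers` package (note
# the trailing "s"); the PyPI package `tokenizer` is a different project.
try:
    from tokenizers import Tokenizer
    status = "ok"
except ImportError:
    status = "missing: run `pip install tokenizers`"
print(status)
```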
Did you change your imports in text_model.py to get this working?
@Trimad, I didn't change the imports, as "import torch" is already there. Note that installing torch with "--pre" means it installs the latest development build, so that might not work for you, since things there change daily. You also missed the part about changing fp32 to fp16; read above in this topic if you want it.