I've never tried Llama 70B, but this is running in fp16 without any quantization. That might be part of it?
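If you want to reproduce that setup locally, here's a minimal sketch with transformers, assuming the vikhyatk/moondream2 checkpoint on Hugging Face and a CUDA GPU (adjust the model id and device to your setup):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vikhyatk/moondream2"  # assumed checkpoint; swap in whichever revision you're testing

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,      # the model definition lives in the repo
    torch_dtype=torch.float16,   # fp16 weights, no quantization
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)
```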
This will be coming in the next release, around Aug 19.
Updated here for compatibility with the latest version of transformers: https://github.com/vikhyat/moondream/commit/22565c070cc1bcbfca5a2f758d3e120b882a6e4b Haven't pushed to HF yet - will do next week.
The change we made to support higher resolution images hasn't been ported to llama.cpp/ollama yet - https://github.com/vikhyat/moondream/commit/ffbf8228aca7138fb55cee2119237d433f8431e2
Haven't seen it before, looks like it's coming from the transformers library. Can you share the image/prompt so I can try to reproduce?
FYI, we're also very close to shipping llama.cpp-based inference code that will run a lot faster on CPU than the PyTorch implementation. Development on that is going on in...
Are you referring to the demo on the Hugging Face space? (asking because we have a few different demos)
Looks like this was fixed here: https://github.com/huggingface/transformers/pull/31695
I don’t think HF transformers supports Flash Attention 1.0, so you would have to edit the attention classes in the model definition.
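For a sense of what that edit looks like, here's a rough sketch of a patched attention forward, with illustrative class and tensor names rather than moondream's actual modules. It uses torch.nn.functional.scaled_dot_product_attention as a stand-in fused kernel; a Flash Attention 1.0 call would slot into the same place.

```python
import torch
import torch.nn.functional as F

class PatchedAttention(torch.nn.Module):
    """Illustrative multi-head attention with a fused kernel in forward()."""

    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = dim // n_heads
        self.qkv = torch.nn.Linear(dim, 3 * dim)
        self.proj = torch.nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (b, t, d) -> (b, n_heads, t, head_dim)
        q, k, v = (
            y.view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
            for y in (q, k, v)
        )
        # The fused kernel call replaces the explicit softmax(QK^T)V code path;
        # this is where a Flash Attention 1.0 function would be dropped in instead.
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        out = out.transpose(1, 2).reshape(b, t, d)
        return self.proj(out)
```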
Hard to comment in general; it depends on the fine-tuning task, dataset size, hyperparameters used, etc. Are you able to share any additional information about the fine-tuning?