vik
There's a max token limit of 128 here, maybe we should bump it up. https://github.com/vikhyat/moondream/blob/main/moondream/moondream.py#L55
Bumped it up to 256: https://github.com/vikhyat/moondream/blob/main/moondream/moondream.py#L98 Can go up further if it still keeps cutting off.
Hey, thank you for this PR and apologies for the late reply. I am leaning against maintaining this code as part of this repository since I would prefer to keep...
oh no! looking into this
Apologies for this, lesson learned on my part! Will make sure to bump version if there's ever a non-backward compatible change.
Does it return a stack trace? Would help to know which operation is causing the failure.
Will dig in and get it sorted out later tonight!
Looks like the regression was introduced between transformers versions 4.37.2 and 4.38.0.
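To check whether an installed version falls after the last release reported as working, a small comparison like the one below may help. This is only a sketch: the helper names are hypothetical, and it assumes a plain `major.minor.patch` version string.

```python
import re

# Last transformers release reported as working in the comment above.
LAST_KNOWN_GOOD = (4, 37, 2)


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a version string such as '4.38.0'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_past_last_known_good(version: str) -> bool:
    """Return True if `version` is newer than the last known-good release."""
    return parse_version(version) > LAST_KNOWN_GOOD
```

With transformers installed, `is_past_last_known_good(transformers.__version__)` would indicate whether the environment is on a release newer than 4.37.2.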
Should be fixed now: https://github.com/vikhyat/moondream/commit/1061fbf9c7e301ca18b716651dc388e48c2390a8
You can load it in low precision by installing `bitsandbytes` and passing `load_in_4bit=True` when instantiating the model - this uses the quantization support built into the transformers library. But it's...
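A rough sketch of the 4-bit loading described above, assuming the model is loaded through `AutoModelForCausalLM.from_pretrained` with remote code enabled. The model id `vikhyatk/moondream1` and the helper names are assumptions for illustration. `bitsandbytes` must be installed, and newer transformers releases may expect a `BitsAndBytesConfig` passed as `quantization_config` instead of the bare flag.

```python
def quantized_load_kwargs(load_in_4bit: bool = True) -> dict:
    """Build keyword arguments for from_pretrained (hypothetical helper)."""
    return {
        "trust_remote_code": True,      # assumed: the model ships custom code
        "load_in_4bit": load_in_4bit,   # quantization via bitsandbytes
    }


def load_model_4bit(model_id: str = "vikhyatk/moondream1"):
    """Load the model in 4-bit precision. The model id is an assumption."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id, **quantized_load_kwargs()
    )
```

Calling `load_model_4bit()` downloads the weights on first use and typically needs a CUDA-capable GPU, since `bitsandbytes` quantization generally targets GPU execution.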