
Results 24 comments of Zhuosheng Zhang

The hang may also be expected: the main process could still be processing data after loading the model (there is no signal indicating that model loading has completed).

Please try the latest version. It should have fixed the problem.

Not sure about that. However, we did see that with a T5-style encoder-decoder backbone, a larger model achieves better performance. Due to resource limits, we did not scale...

That is odd. The code is adapted from https://github.com/salesforce/LAVIS/tree/main/projects/instructblip. You can check the instructions there.

This issue may be caused by an update to the transformers library. The solution above appears to be effective.

It is the initial T5 model. I did not observe obvious performance gains from using the fine-tuned first-stage T5 model.

Hi, thanks for your interest! An efficient way would be to train your framework in just two steps, as in MM-CoT: (i) rationale generation; (ii) answer inference, regardless of the backbone modules...
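To illustrate the wiring of that two-step setup, here is a minimal sketch. The function names and the placeholder "generation" logic are hypothetical stand-ins, not the actual MM-CoT code: in practice each stage would be a fine-tuned seq2seq model, and stage (ii) consumes the rationale produced by stage (i) as part of its input.

```python
# Hypothetical sketch of a two-stage MM-CoT-style pipeline.
# generate_rationale / infer_answer stand in for two fine-tuned models;
# any backbone could implement them.

def generate_rationale(question: str, context: str) -> str:
    """Stage (i): produce a free-text rationale for the question."""
    # Placeholder for a model's decoded output.
    return f"Rationale for: {question}"

def infer_answer(question: str, context: str, rationale: str) -> str:
    """Stage (ii): append the rationale to the input and infer the answer."""
    prompt = (
        f"{context}\n"
        f"Question: {question}\n"
        f"Rationale: {rationale}\n"
        f"Answer:"
    )
    # A real model would decode an answer conditioned on this prompt;
    # here we just return the assembled prompt to show the data flow.
    return prompt

def two_stage_pipeline(question: str, context: str) -> str:
    rationale = generate_rationale(question, context)  # step (i)
    return infer_answer(question, context, rationale)  # step (ii)
```

The key design point is that the two stages are trained and run separately, so the rationale is an explicit intermediate text rather than a hidden state, and either stage's backbone can be swapped independently.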

Please try the latest version. It should work well.

Thanks for the revision! That's cool.