Macaw-LLM
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Hi, I have loaded your pre-trained weights and tried some instructions. However, the model responds with the same answer no matter which image I give it.

```python
model = ...
```
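A quick way to confirm whether the image input influences generation at all is to run the same instruction against two clearly different images and compare the outputs. The sketch below is generic: `run_inference` is a hypothetical callable standing in for the project's actual inference entry point (which is not shown in the truncated snippet above), and the image paths are assumed local test files.

```python
from PIL import Image

def check_image_sensitivity(model, run_inference, instruction):
    """Compare responses for two different images under the same instruction.

    `run_inference(model, image, instruction)` is a hypothetical wrapper;
    substitute the repository's real inference function.
    """
    img_a = Image.open("examples/cat.jpg")   # assumed local test images
    img_b = Image.open("examples/city.jpg")
    resp_a = run_inference(model, img_a, instruction)
    resp_b = run_inference(model, img_b, instruction)
    print("Image A:", resp_a)
    print("Image B:", resp_b)
    if resp_a == resp_b:
        print("Identical responses: image features may not be reaching the LM.")
```

If the two responses are byte-identical, the visual features are likely being dropped (or zeroed) somewhere before the language model, which narrows the search considerably.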
Hi, when I run `preprocess_data_supervised.py` with the llama-7b-hf tokenizer, it shows "Using pad_token, but it is not set yet" and "Truncation was not explicitly activated but `max_length` is provided a...
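Both warnings are expected with a stock LLaMA tokenizer: it ships without a pad token, and `max_length` alone does not turn truncation on. A common workaround is sketched below; the checkpoint name is the `llama-7b-hf` path mentioned in the report, so adjust it to your local copy.

```python
from transformers import AutoTokenizer

# Checkpoint name taken from the report above; adjust to your local path.
tokenizer = AutoTokenizer.from_pretrained("llama-7b-hf")

# LLaMA tokenizers define no pad token; reusing EOS is a common workaround.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Passing truncation=True explicitly silences the max_length warning.
batch = tokenizer("some instruction text", max_length=512,
                  truncation=True, padding="max_length")
```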
When processing the dataset, there is a filter criterion: `if 'caption' in e['instruction'] or 'caption' in e['response'] or ' no ' in e['response'] or 'not' in e['response']: continue`. Why do we...
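For reference, here is that filter written out with a comment on what each clause drops. This is a paraphrase of the snippet quoted above, not the repository's exact code, and the sample entries are made up to show the effect.

```python
def keep_example(e):
    """Return False for examples the preprocessing step skips.

    Paraphrased from the filter quoted above; `e` is one dataset entry
    with 'instruction' and 'response' fields.
    """
    if 'caption' in e['instruction']:   # drop caption-style instructions
        return False
    if 'caption' in e['response']:      # drop caption-style responses
        return False
    if ' no ' in e['response']:         # drop negative answers
        return False
    if 'not' in e['response']:          # drop negated answers (also matches 'note', 'nothing', ...)
        return False
    return True

dataset = [
    {"instruction": "Describe the scene.", "response": "A dog is not visible."},
    {"instruction": "What color is the car?", "response": "Red."},
]
kept = [e for e in dataset if keep_example(e)]
print(kept)  # only the second example survives the filter
```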
How can I get those 2 JSON files?
Could you add some pointers on how to fine-tune your trained model on a custom dataset? Also, which model weights need to be stored where for the inferencing...
If you install version 0.19.0 of accelerate as listed in the current requirements.txt, you will encounter the following error. Therefore, I have updated the accelerate version to the latest one. ImportError:...
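Before rerunning, it is worth checking which accelerate version is actually installed. The sketch below assumes only what the report states: 0.19.0 (the version pinned in requirements.txt) triggers the ImportError, and a newer release fixes it.

```python
from importlib.metadata import version
from packaging.version import Version

installed = Version(version("accelerate"))
# 0.19.0 is the pinned version that triggers the ImportError reported above;
# anything newer is assumed to work, per the report.
if installed <= Version("0.19.0"):
    raise SystemExit(f"accelerate {installed} is too old; upgrade with "
                     f"`pip install -U accelerate` and retry.")
print(f"accelerate {installed} looks fine.")
```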
Which tokenizer are you using? `tokenizer = AutoTokenizer.from_pretrained('trained_models/llama_tokenizer')` does not seem to work.
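One way to narrow this down is to check whether the directory actually contains the tokenizer files before blaming the loader. The path below is the one quoted in the question; the fallback checkpoint name is an assumption based on the `llama-7b-hf` tokenizer mentioned in another issue above, so substitute whatever LLaMA conversion you are using.

```python
import os
from transformers import AutoTokenizer

path = "trained_models/llama_tokenizer"  # path quoted in the question above

# AutoTokenizer needs tokenizer files (e.g. tokenizer_config.json,
# tokenizer.model) in this directory; list what is actually there.
if os.path.isdir(path):
    print(sorted(os.listdir(path)))
else:
    print(f"{path} does not exist; download or export the tokenizer first.")

try:
    tokenizer = AutoTokenizer.from_pretrained(path)
except Exception as err:
    print(f"AutoTokenizer failed: {err}")
    # Fallback: load a base LLaMA tokenizer directly; the checkpoint name is
    # an assumption, substitute your own LLaMA-7B conversion.
    from transformers import LlamaTokenizer
    tokenizer = LlamaTokenizer.from_pretrained("llama-7b-hf")
```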
Can you share numbers on the model's performance? I am curious whether the audio signal improves Video VQA at all.