lsy1973 comments

Results: 3 comments of lsy1973

How to get 106 keypoints?

Same problem, any solution? Multiplying by the image width didn't work; the result still has some offset.

💡 [REQUEST] - Simultaneous multimodal inputs

Try this code. I modified get_video_chunk_content to get_video_audio_chunk_content, which now accepts video and audio inputs separately. I asked the model what animal was in the video, and it answered correctly....
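For readers without the full comment, here is a minimal sketch of what a function with that name might do. It is not the code from the comment: the signature, the one-frame-per-second pairing, the `"<unit>"` marker, and the `flatten` flag are all assumptions made for illustration. Frames and audio are passed in as plain Python sequences so the example runs without any video or audio library.

```python
import math


def get_video_audio_chunk_content(frames, audio, sample_rate=16000, flatten=True):
    """Pair each one-second video frame with its one-second audio slice.

    Hypothetical inputs (assumptions, not the original API):
      frames      -- list of frames, one per second of video
      audio       -- flat sequence of audio samples, separate from the video
      sample_rate -- samples per second of `audio`
    """
    # Only build units that have both a frame and at least some audio.
    audio_seconds = math.ceil(len(audio) / sample_rate)
    num_units = min(len(frames), audio_seconds)

    contents = []
    for i in range(num_units):
        # Slice out the i-th second of audio; the last slice may be shorter.
        chunk = audio[i * sample_rate:(i + 1) * sample_rate]
        unit = ["<unit>", frames[i], chunk]  # "<unit>" marker is an assumption
        if flatten:
            contents.extend(unit)
        else:
            contents.append(unit)
    return contents
```

Because the video and audio arrive as separate arguments, the audio track can carry the spoken question while the frames come from an unrelated clip, which matches the use case described above.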

💡 [REQUEST] - Simultaneous multimodal inputs

> Your question, "what animal was in the video", is in audio format; that is, the audio file is your question? Can I specify the question (prompt) in text format?...