video-language-model topics

MPP-LLaVA

501 Stars, 26 Forks, 501 Watchers

Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train...

Coobiw

deepspeed

fine-tuning

mllm

model-parallel

ST-LLM

151 Stars, 5 Forks, 151 Watchers

[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"

TencentARC

large-language-models

video-language-model

video-understanding

VideoHallucer

38 Stars, 0 Forks, 38 Watchers

VideoHallucer: the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)

patrick-tssn

hallucination-detection

multimodal-large-language-models

video-hallucination

video-language-model

SOP-LVM-ICL-Ensemble

23 Stars, 3 Forks, 23 Watchers

[NeurIPS VLM workshop 2024] In-Context Ensemble Learning from Pseudo Labels Improves Video-Language Models for Low-Level Workflow Understanding

moucheng2017

ensemble

in-context-ensemble

in-context-learning

multimodal-large-language-models

grove

25 Stars, 0 Forks, 25 Watchers

Code implementation for the paper "Large-scale Pre-training for Grounded Video Caption Generation" (ICCV 2025)

ekazakos

automatic-annotation

large-scale-pretraining

video-captioning

video-grounding