hmd78

How about getting multi-modal embeddings? Something like the output of the Q-Former in BLIP-2, which I think corresponds to the output of QLLaMA in your proposed work.