Yibo Zhao comments

Results: 6 comments of Yibo Zhao

Reproduce Issue on COCO 2017 Validation Set

I have a stupid question. When calculating the CLIP score, is it right to calculate the CLIP scores of all COCO 2017 val image-text pairs and then average them? And what are the...
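As a rough illustration of the per-pair averaging asked about above, here is a minimal sketch. It assumes the image and text embeddings have already been extracted, and it uses one common convention (cosine similarity clipped at zero and scaled by 100); the `scale` value, the function names, and the toy vectors are assumptions for illustration, not the repository's actual evaluation code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def mean_clip_score(image_embs, text_embs, scale=100.0):
    """Score each matched image-text pair, then average over all pairs.

    `scale` and the clipping at zero follow one common convention (assumed here).
    """
    scores = [
        scale * max(cosine(img, txt), 0.0)
        for img, txt in zip(image_embs, text_embs)
    ]
    return sum(scores) / len(scores)


# Toy embeddings (hypothetical, not real CLIP outputs):
# the first pair is perfectly aligned, the second pair is orthogonal.
image_embs = [[1.0, 0.0], [0.0, 1.0]]
text_embs = [[1.0, 0.0], [1.0, 0.0]]
print(mean_clip_score(image_embs, text_embs))
```

With real data, each image embedding would be paired with the embedding of its own caption, so the result is a mean over matched pairs rather than over all image-text combinations.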

when calculating clip score and FID ,what's the negative_prompt

> We set it to empty ‘’

Thank you. When calculating FID, do you resize the images in the two folders to the same size, or apply any other operations?

I changed your code, but similar issues still occur, whether I am retrieving video information, such as: yt-dlp --skip-unavailable-fragments -F https://www.bilibili.com/video/BV1Dh411r7et --cookies test_cookie.txt or downloading a video: yt-dlp --skip-unavailable-fragments --merge-output-format...

bilibili 4k download

window._BiliGreyResult = { method: "direct", versionId: "64940", } Verification code_bilibili window._riskdata_ = { 'v_voucher': 'voucher_e0f54c76-74d9-4c84-ba5a-4cddff9afe54' }

Thank you for the advice. It looks like a verification code is required when making multiple requests.

Where exactly is the 3D full attention from the paper implemented?

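You can step through it in a debugger. Suppose the tensor shape is (b, f, h, w). Previous work usually applies spatial attention over (b * f, h * w) and temporal attention over (b * h * w, f). CogVideoX instead uses 3D full attention over (b, f * h * w). Taking 10 * 480 * 720 as an example, its attention map is (2, 48, 3 * 30 * 45 + 226, 3 * 30 * 45 + 226), where 2 is the batch size, 48 is the number of heads, 3 is the number of frames after 3D VAE compression, 30 and 45 are the height and width of the attention map, and 226 is the text embedding length. I wrote a visualization of CogVideoX's 3D...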
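The shape arithmetic in the comment above can be checked with a short sketch. The helper name `attention_map_shapes` is hypothetical, and appending the text tokens only in the full 3D case is an assumption made for illustration; this is not code from the CogVideoX repository.

```python
def attention_map_shapes(b, f, h, w, heads, text_len=0):
    """Return (batch, heads, seq, seq) attention-map shapes for three layouts.

    b: batch size, f: latent frames, h/w: latent height/width,
    heads: number of attention heads, text_len: text tokens that are
    concatenated to the video tokens (full 3D case only, by assumption).
    """
    spatial_seq = h * w                   # tokens within one frame
    temporal_seq = f                      # one token per frame at a fixed pixel
    full_seq = f * h * w + text_len       # every video token plus text tokens
    return {
        "spatial": (b * f, heads, spatial_seq, spatial_seq),
        "temporal": (b * h * w, heads, temporal_seq, temporal_seq),
        "full_3d": (b, heads, full_seq, full_seq),
    }


# Numbers taken from the comment: batch 2, 48 heads, 3 latent frames,
# a 30 x 45 latent grid, and 226 text tokens.
shapes = attention_map_shapes(b=2, f=3, h=30, w=45, heads=48, text_len=226)
for name, shape in shapes.items():
    print(name, shape)
# full_3d -> (2, 48, 4276, 4276), since 3 * 30 * 45 + 226 = 4276
```

The full 3D layout keeps the batch dimension at b and lets every token attend across frames and spatial positions at once, which is why its sequence length is much larger than in the factorized spatial and temporal layouts.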

Yibo Zhao

will you open the dataset?

Reproduce Issue on COCO 2017 Validation Set

when calculating clip score and FID ,what's the negative_prompt

bilibili 4k download

bilibili 4k download

Where exactly is the 3D full attention from the paper implemented?