About VideoLDM

Open suzhenghang opened this issue 1 year ago • 7 comments

Do you have any knowledge of VideoLDM, and is it possible to integrate its algorithms to further enhance the capabilities of current models, such as generating longer videos?

suzhenghang avatar Apr 22 '23 12:04 suzhenghang

ModelScope's implementation is very similar to theirs in the sense that they add a temporal dimension to the model. For long video generation, you could follow this PR which uses a similar idea https://github.com/ExponentialML/Text-To-Video-Finetuning/pull/27.

ExponentialML avatar May 06 '23 20:05 ExponentialML

Many thanks. Do you have any recommendations for AI video flicker removal?

suzhenghang avatar May 06 '23 23:05 suzhenghang

No problem! There was a paper released somewhat recently that supposedly tackles this problem, but I can't seem to find it.

For now, I think using a tool outside of the machine learning domain would suffice.

ExponentialML avatar May 07 '23 01:05 ExponentialML

Nice, have you tried any tools to alleviate the flickering?

suzhenghang avatar May 07 '23 03:05 suzhenghang

> ModelScope's implementation is very similar to theirs in the sense that they add a temporal dimension to the model. For long video generation, you could follow this PR which uses a similar idea #27.

Do you have plans to integrate this PR later on?

suzhenghang avatar May 07 '23 03:05 suzhenghang

> No problem! There was a paper released somewhat recently that supposedly tackles this problem, but I can't seem to find it. For now, I think using a tool outside of the machine learning domain would suffice.

Found it :wink: .

https://github.com/chenyanglei/all-in-one-deflicker

ExponentialML avatar May 16 '23 04:05 ExponentialML

This person is trying to implement it in Diffusers; the last commit was just yesterday.

https://github.com/srpkdyy/VideoLDM

kabachuha avatar May 22 '23 14:05 kabachuha