LLM-groundedVideoDiffusion

when will you release code?

Open jianlong-yuan opened this issue 1 year ago • 6 comments

jianlong-yuan avatar Mar 06 '24 05:03 jianlong-yuan

Working on the camera ready paper. We will release our code at about the same time.

TonyLianLong avatar Mar 06 '24 05:03 TonyLianLong

Following this thread.

Bailey-24 avatar Mar 12 '24 07:03 Bailey-24

Hi, is there any update?

guyuchao avatar Apr 08 '24 02:04 guyuchao

We are still organizing the code repo, which will include both the LLM text-to-DSL part and the DSL-to-video part based on cross-attention control. In the meantime, similar to LMD+, we offer a simple custom pipeline that uses video GLIGEN adapters we trained ourselves to condition ModelScope.

Here is a colab that uses the model: https://colab.research.google.com/drive/17He4bFAF8lXmT9Nfv-Sg29iKtPelDUNZ

We plan to also add the LLM part to colab soon.

The text-to-DSL part is straightforward: you can use the ChatGPT website for it. The prompt is in the paper appendix.

Example:

Prompt: An image of grassland with a dog walking from the left to the right.

[Attached: generated video]
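For a rough idea of how a dynamic scene layout (DSL) like the one this prompt elicits could be consumed downstream, here is a minimal sketch that interpolates per-frame bounding boxes from keyframe boxes. The schema (keyframe tuples, normalized `[x0, y0, x1, y1]` boxes) is an illustrative assumption, not the repo's actual format:

```python
# Hypothetical sketch: expand keyframe boxes from an LLM-produced scene
# layout into one box per video frame via linear interpolation.
# The (frame_index, box) schema is assumed for illustration only.

def interpolate_boxes(keyframes, num_frames):
    """keyframes: list of (frame_index, [x0, y0, x1, y1]), sorted by frame_index.
    Returns a list of num_frames boxes, one per frame."""
    boxes = []
    for f in range(num_frames):
        # Nearest keyframes at or before / at or after frame f
        # (fall back to the first/last keyframe at the clip edges).
        prev = max((k for k in keyframes if k[0] <= f),
                   key=lambda k: k[0], default=keyframes[0])
        nxt = min((k for k in keyframes if k[0] >= f),
                  key=lambda k: k[0], default=keyframes[-1])
        if prev[0] == nxt[0]:
            boxes.append(list(prev[1]))
        else:
            t = (f - prev[0]) / (nxt[0] - prev[0])
            boxes.append([a + t * (b - a) for a, b in zip(prev[1], nxt[1])])
    return boxes

# Example: a dog moving left to right over 8 frames.
dog_keyframes = [(0, [0.0, 0.4, 0.2, 0.7]), (7, [0.8, 0.4, 1.0, 0.7])]
per_frame_boxes = interpolate_boxes(dog_keyframes, 8)
```

The per-frame boxes would then be fed to the box-conditioned video model (the GLIGEN-style adapters mentioned above); see the Colab for the actual conditioning code.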

TonyLianLong avatar Apr 08 '24 07:04 TonyLianLong

Hi, the work is very impressive and I'm looking forward to seeing your code. Could you let me know when it will be released?

blE-lj avatar Apr 17 '24 16:04 blE-lj

@jianlong-yuan @Bailey-24 @guyuchao @blE-lj Thanks for your interest! The code and the benchmark have been released. Feel free to ask any questions.

TonyLianLong avatar May 04 '24 18:05 TonyLianLong