meshed-memory-transformer
How can I run inference on video with this model?
I have reproduced the results by following the instructions, but I want to use this model to run inference on real video.