ml-4m
4M: Massively Multimodal Masked Modeling
Hi everyone, thanks for the nice work. I am considering using your pretrained depth tokenizer to extract precomputed (features) tokens for further training. I have some questions. 1. I cloned...
Looks like there may be a small bug in the generation: https://github.com/apple/ml-4m/blob/2db01252093c45e7a58ebe4d1efb9361df8ca716/fourm/models/generate.py#L138 The input masks for text are determined by the position of the first batch eos only but subsequently...
Why is Apple using CUDA? I have Apple Silicon. Are you guys high?
Fix bug: attribute error when Demo4MSampler is called with fm_sr=None and mod=mods_list, where mods_list is only a subset of all modalities.
Many thanks for making this work publicly available. My question is whether it is possible to prompt the available models, and if so, where might I find some examples...
Human pose dependencies are not installed, hence poses will not be visualized. To visualize them (optional), you can do the following: 1) Install via `pip install timm yacs smplx pyrender...
Hi, I am attempting to run the model on my machine; however, the code keeps dying due to lack of memory, even though my machine has enough memory to load...
Hi, I would like to know if it is possible to fine-tune the model for a specific downstream task using LoRA. I noticed that there is a file related to...
First, thank you all for open sourcing this fantastic work. I want to ask whether object detection with captions is feasible with this model and, if yes, how can I...
Thank you, authors, for open sourcing your amazing work. What would be the best way to use the color palette for image generation and image retrieval, please?