Alaa El-Nouby comments

Results: 13 comments of Alaa El-Nouby

inception scores?

I am sorry, I didn't compute the IS for this implementation. If you have computed them using this implementation, please send a pull request with the IS you got. Thanks...

How do we test the model?

Since we are using the pre-computed text embeddings, this code does not support text as input. Currently you can only use the examples in the given datasets.

New dataset

Check out this repo: https://github.com/reedscot/icml2016 (I am not actually sure that everything you need is there, but the caption embeddings and the image data links seem to be there as well).

yolo_to_bbox function discards bounding boxes

Try changing this: `W, H = cfg.multi_scale_out_size[size_index]` to this: `W, H = cfg.out_size`

Thanks for your question. ImageBind learns a shared embedding space across modalities, so it allows retrieval across modalities. If by conversion you mean generation, ImageBind features can be fed to...
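
As a rough illustration of what retrieval in a shared embedding space means, here is a minimal sketch in plain Python. It does not use ImageBind itself: the 3-dimensional vectors, the file names, and the `retrieve` helper are made up for the example. The only assumption is that a query from one modality and candidates from another are already embedded in the same space.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, candidates):
    """Rank candidate names by cosine similarity to the query, best first."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy embeddings standing in for real model outputs (hypothetical values).
audio_query = [0.9, 0.1, 0.0]
image_gallery = {
    "dog.jpg": [1.0, 0.2, 0.1],
    "car.jpg": [0.0, 1.0, 0.3],
    "bird.jpg": [0.1, 0.0, 1.0],
}

ranking = retrieve(audio_query, image_gallery)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because both sides live in one space, the same ranking function works whichever modality supplies the query.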

Could not find a version that satisfies the requirement decord==0.6.0

@lahfir Do you happen to be using a Mac and Python 3.9? I think decord is built and published for Mac only up to Python 3.8, as detailed...
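
A quick way to check whether an environment matches the situation described above is sketched below. The helper name `decord_wheel_likely_missing` is hypothetical, and the rule it encodes (macOS with Python newer than 3.8) is only the guess stated in the comment, not a verified fact about how decord is packaged.

```python
import sys


def decord_wheel_likely_missing(platform_name, version):
    """Return True for macOS with Python newer than 3.8.

    This encodes the guess from the comment above, not a confirmed packaging rule.
    """
    return platform_name == "darwin" and tuple(version[:2]) > (3, 8)


# sys.platform is "darwin" on macOS; sys.version_info holds the interpreter version.
if decord_wheel_likely_missing(sys.platform, sys.version_info):
    print("This looks like macOS with Python > 3.8; the pinned decord may not install.")
else:
    print("This environment does not match the suspected problem case.")
```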

Added cache_dir argument for downloading model to a specific directory

Thanks for your contribution. Could you please set the default ``cache_dir=".checkpoints/imagebind_huge.pth"`` so that the current behaviour does not change?

Vision x Vision NOT what we want

Thanks for your question. Unlike other modalities, Vision logits are not scaled by a temperature (see https://github.com/facebookresearch/ImageBind/blob/0f8620b6678fd24c35f172721ea6046ab5780890/models/imagebind_model.py#L432). If we look at the cosine similarity for Vision x Vision (so dropping the...
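
To see why the presence or absence of a temperature matters when similarities are passed through a softmax, consider the small sketch below. The similarity values and the scale of 20.0 are made up for illustration and are not taken from the ImageBind code; the point is only that unscaled cosine similarities, which lie in [-1, 1], give a fairly flat softmax, while scaled ones give a sharply peaked one.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical cosine similarities between one query and three candidates.
cosine_sims = [0.9, 0.3, 0.1]

# Hypothetical logit scale (an inverse temperature); not ImageBind's actual value.
logit_scale = 20.0

unscaled = softmax(cosine_sims)
scaled = softmax([logit_scale * s for s in cosine_sims])

print("unscaled:", [round(p, 3) for p in unscaled])
print("scaled:  ", [round(p, 3) for p in scaled])
```

The ranking of the candidates is the same in both cases; only the sharpness of the probabilities changes.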

Any plans to release smaller checkpoints?

We will work on releasing smaller checkpoints in the coming couple of weeks.

3rd party dependencies.

Thanks for your question. Third-party dependencies refers to other Python packages that need to be installed to run the code (e.g. pytorchvideo, torchaudio, einops). The list of all required...
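
One way to check which of these packages are present in an environment is sketched below using only the standard library. The helper name `missing_packages` is made up for the example, and the three names are just the ones mentioned in the comment, not the full list of requirements.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Only the examples named in the comment above; the real list is longer.
examples = ["pytorchvideo", "torchaudio", "einops"]

missing = missing_packages(examples)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All example dependencies are importable.")
```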