big_vision issues

Add a pyproject.toml

Hi, I found it non-trivial to set up a vanilla python environment to work with `big_vision`. Would including a pyproject.toml in the root be valuable for the project? If so...

noctrog

siglip's [cls token]

1

Hello, I want to get a token from SigLIP like the "cls token" in CLIP. Does SigLIP have such a token that can be used to represent the main feature of the...

guoyanan1g

How can I edit the PaliGemma fine-tuning scripts to plot a validation loss curve and also convert my model to an HF model?

https://github.com/google-research/big_vision/blob/main/big_vision/configs/proj/paligemma/finetune_paligemma.ipynb Can someone please help?

Shaka42

Explanation of “resample_patchemb” function in flexiViT

```
def resample_patchemb(old, new_hw):
  """Resample the weights of the patch embedding kernel to target resolution.

  We resample the patch embedding kernel by approximately inverting the effect
  of patch resizing. Colab...
```

lsjAI
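
For the `resample_patchemb` question above, the idea stated in the docstring ("approximately inverting the effect of patch resizing") can be illustrated with a small NumPy sketch. This is not the repository's code: the helper names `linear_resize_matrix` and `resample_kernel_2d` are hypothetical, the resize is a simple linear interpolation built with `np.interp` (an assumption standing in for whatever resize the library uses), and only a single 2D kernel is handled, without channel dimensions. The point it shows is that if patches are resized by a linear map `B`, then mapping the kernel through the pseudo-inverse of `B.T` keeps the patch-kernel dot product unchanged when upsampling.

```python
import numpy as np


def linear_resize_matrix(old_n, new_n):
    """Return the (new_n x old_n) matrix of a 1D linear-interpolation resize."""
    # Centres of the new pixels, expressed in old-pixel coordinates.
    coords = (np.arange(new_n) + 0.5) * old_n / new_n - 0.5
    eye = np.eye(old_n)
    # Resizing each basis vector yields one column of the linear resize map.
    cols = [np.interp(coords, np.arange(old_n), eye[i]) for i in range(old_n)]
    return np.stack(cols, axis=1)


def resample_kernel_2d(kernel, new_hw):
    """Resample one 2D kernel so it acts on resized patches like the original."""
    old_h, old_w = kernel.shape
    new_h, new_w = new_hw
    # 2D resize matrix acting on row-major flattened patches (separable resize).
    resize_mat = np.kron(linear_resize_matrix(old_h, new_h),
                         linear_resize_matrix(old_w, new_w))
    # Pseudo-inverse of the transposed resize map, applied to the flat kernel.
    new_flat = np.linalg.pinv(resize_mat.T) @ kernel.reshape(-1)
    return new_flat.reshape(new_h, new_w)


# Toy check with random data: a 4x4 kernel resampled to 6x6.
rng = np.random.default_rng(0)
old_kernel = rng.normal(size=(4, 4))
patch = rng.normal(size=(4, 4))
new_kernel = resample_kernel_2d(old_kernel, (6, 6))

m = linear_resize_matrix(4, 6)
resized_patch = m @ patch @ m.T  # the same separable resize applied to a patch
old_response = float((patch * old_kernel).sum())
new_response = float((resized_patch * new_kernel).sum())
```

In this toy setting `old_response` and `new_response` agree up to floating-point error, which is the sense in which the resampled kernel "undoes" the patch resize. When downsampling, the resize map loses information, so the match is only approximate.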

Pretraining Code for PaliGemma 2

Thank you for your great work! Could you provide the pretraining code for the PaliGemma 2 series, which uses TPUs? It would be nice if we could train a model from...

conan1024hao

What are the supported languages for the multilingual SigLIP models?

Hi, just wondering whether the languages this model supports are documented anywhere. I see two papers, the SigLIP paper and the PaLI paper: https://arxiv.org/pdf/2303.15343 https://arxiv.org/abs/2209.06794 I can find reference to 109 languages...

jn2clark

Fine-tuning the PaliGemma model for a segmentation task is failing

Hello, I want to use PaliGemma to segment water in satellite images. However, I haven't been able to find any documentation on how to scale the points inside my mask...

anamabo

3B model not working even when using the snippet from the official Hugging Face page.

2

When running this snippet from [HuggingFace](https://huggingface.co/google/paligemma-3b-pt-224)

```
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image
import requests
import torch

model_id = "google/paligemma-3b-mix-224"
device = "cuda:0"
dtype = torch.bfloat16
url...
```

zinoubm

SigLIP 2: Poor Text-to-Image Retrieval Accuracy for Fine-Grained Attribute Queries

## Summary

We evaluated SigLIP 2 models (`siglip2-base-patch16-224` and `siglip2-giant-opt-patch16-384`) for text-based person re-identification (ReID) on standard benchmarks including **Market-1501** and **RSTPReid**. While image-to-image retrieval works reasonably well, **text-to-image retrieval...

zhengthomastang
