
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

17 big_vision issues, sorted by recently updated

Hello, `big_vision` team! Thanks for your work on the repository. Looking through the code, I noticed that ViT uses classical attention (see [line 91 of the ViT implementation](https://github.com/google-research/big_vision/blob/9b320079e81bfd8f4f6b30afc7c97f7c2f9eb063/big_vision/models/vit.py#L91C21-L91C52)). It seems...
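
For reference, here is a minimal sketch of what "classical" (fully materialized) scaled dot-product attention computes; the function name and shapes are illustrative, not big_vision's actual API. The point of contrast is that this variant materializes the full `[N, N]` attention matrix, which flash/memory-efficient attention avoids:

```python
import jax
import jax.numpy as jnp

def classical_attention(q, k, v):
  """Classical scaled dot-product attention: materializes the full [N, N]
  attention matrix (no flash / memory-efficient variant).

  q, k, v: [batch, num_heads, seq_len, head_dim] arrays.
  """
  scale = q.shape[-1] ** -0.5
  logits = jnp.einsum('bhqd,bhkd->bhqk', q * scale, k)  # full N x N matrix
  weights = jax.nn.softmax(logits, axis=-1)
  return jnp.einsum('bhqk,bhkd->bhqd', weights, v)

# Example: 2 images, 8 heads, 196 patch tokens, 64-dim heads.
q = k = v = jnp.ones((2, 8, 196, 64))
out = classical_attention(q, k, v)  # [2, 8, 196, 64]
```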

I noticed that there are [no optimizer choices other than `scale_by_adafactor()`](https://github.com/google-research/big_vision/blob/c62890a3e4487b1d6751794b090138b9da5d18e1/big_vision/optax.py#L157). This GitHub issue would serve as a placeholder for other optimizers such as `Adam`/`Lion` or others in the...
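
A hedged sketch of how such a choice could be wired up from standard optax transforms; `make_scaler` is a hypothetical helper (not big_vision's config machinery), and `scale_by_lion` is only available in recent optax releases:

```python
import optax

def make_scaler(name: str) -> optax.GradientTransformation:
  """Builds the gradient-scaling transform from a name instead of
  hard-coding scale_by_adafactor(). Uses only standard optax transforms."""
  if name == 'adafactor':
    return optax.scale_by_adafactor()
  if name == 'adam':
    return optax.scale_by_adam(b1=0.9, b2=0.999)
  if name == 'lion':  # requires a recent optax version
    return optax.scale_by_lion(b1=0.9, b2=0.99)
  raise ValueError(f'Unknown optimizer: {name}')

# Chained with a learning rate, as optax optimizers usually are:
tx = optax.chain(make_scaler('adam'), optax.scale_by_learning_rate(1e-3))
```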

Hi @andresusanopinto, I hope you are well. I tried using the colorization model on my images, but out of 4 it colorized only 1 image, and the result is also...

Hi, thanks for bringing us such great work! I have two questions regarding the paper. 1. The PI-resize method does not introduce any learnable parameters, so it should be compatible with...
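
For context, a minimal sketch of the pseudo-inverse (PI) resize idea, assuming bilinear resizing; the helper names are hypothetical and this is not big_vision's actual implementation. The new weights are a fixed linear function of the old ones, so no learnable parameters are introduced:

```python
import jax
import jax.numpy as jnp

def resize_matrix(old, new):
  """Matrix B with vec(resize(x)) == B @ vec(x), built by resizing basis images."""
  basis = jnp.eye(old * old).reshape(old * old, old, old)
  resized = jax.vmap(lambda x: jax.image.resize(x, (new, new), 'bilinear'))(basis)
  return resized.reshape(old * old, new * new).T  # [new*new, old*old]

def pi_resize_patch_embed(w, new_size):
  """Resamples patch-embedding weights w: [p, p, in_ch, out_ch] via pseudo-inverse.

  Chosen so <w_new, resize(patch)> matches <w, patch>: exactly when
  upsampling, in the least-squares sense otherwise.
  """
  p = w.shape[0]
  pinv = jnp.linalg.pinv(resize_matrix(p, new_size))  # [old*old, new*new]
  w_new = pinv.T @ w.reshape(p * p, -1)               # [new*new, in_ch*out_ch]
  return w_new.reshape(new_size, new_size, *w.shape[2:])

# Example: resample 16x16 patch-embedding weights to 32x32 patches.
w32 = pi_resize_patch_embed(jnp.ones((16, 16, 3, 768)), 32)
```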

Hi.

1. When running the Colab notebook [uvim_depth_task.ipynb](https://colab.research.google.com/github/google-research/big_vision/blob/master/big_vision/configs/proj/uvim/uvim_depth_task.ipynb), the line

   ```python
   oracle_params, oracle_state = vit.load(None, "depth_stageI_params.npz")
   ```

   raises

   ```
   AttributeError: module 'big_vision.utils' has no attribute 'load_checkpoint'
   ```

   ![image](https://github.com/google-research/big_vision/assets/36787333/8a0d3a65-806a-4184-8d01-e6def17b4a6c)

2. ...
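
For anyone hitting this, a minimal sanity check (a hypothetical workaround, not the notebook's intended loading path) is to open the `.npz` checkpoint directly with NumPy and confirm the file itself is readable while the `vit.load` / `utils.load_checkpoint` mismatch is investigated:

```python
import numpy as np

# Inspect the raw checkpoint archive directly, bypassing big_vision loaders.
ckpt = np.load('depth_stageI_params.npz')
print(sorted(ckpt.files)[:10])  # first few parameter names stored in the archive
```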

Thanks for open-sourcing the SigLIP models! Clarification question: in the demo IPython notebook, the image transform function has the form `pp_img = pp_builder.get_preprocess_fn(f'resize({RES})|value_range(-1, 1)')`. Looking at the code [here](https://github.com/google-research/big_vision/blob/main/big_vision/pp/ops_image.py#L64), this...
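
For reference, a sketch of what that preprocessing string amounts to, assuming uint8 inputs: `value_range(-1, 1)` linearly maps pixel values from [0, 255] into [-1, 1]. The `preprocess` helper below is illustrative, not big_vision's `pp` API:

```python
import jax
import jax.numpy as jnp

RES = 224  # resolution constant from the notebook

def preprocess(image_uint8):
  """Sketch of resize({RES})|value_range(-1, 1) for a [H, W, 3] uint8 image."""
  img = jax.image.resize(image_uint8.astype(jnp.float32), (RES, RES, 3), 'bilinear')
  return img / 127.5 - 1.0  # maps [0, 255] linearly onto [-1, 1]
```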

Hi! I was wondering why the implementation of mixup uses a single sampled $a$ per batch, as opposed to a different sample of $a$ for each batch element. Intuitively, it seems...
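
For comparison, a hedged sketch of the per-element variant being asked about; `mixup_per_example` is a hypothetical function, not big_vision's implementation, and pairs each example with its neighbor via a roll:

```python
import jax
import jax.numpy as jnp

def mixup_per_example(rng, images, labels, alpha=0.2):
  """Per-example mixup: one Beta(alpha, alpha) coefficient per batch element,
  instead of a single coefficient shared by the whole batch."""
  n = images.shape[0]
  a = jax.random.beta(rng, alpha, alpha, shape=(n,))  # one coefficient per example
  a_img = a.reshape((n,) + (1,) * (images.ndim - 1))  # broadcast over H, W, C
  mixed_images = a_img * images + (1 - a_img) * jnp.roll(images, 1, axis=0)
  a_lab = a[:, None]
  mixed_labels = a_lab * labels + (1 - a_lab) * jnp.roll(labels, 1, axis=0)
  return mixed_images, mixed_labels

# Example: mix a batch of 8 images with (dummy) one-hot labels.
rng = jax.random.PRNGKey(0)
imgs, labs = jnp.ones((8, 224, 224, 3)), jnp.eye(8, 1000)
mixed_imgs, mixed_labs = mixup_per_example(rng, imgs, labs)
```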