
Scenic: A Jax Library for Computer Vision Research and Beyond

Results 255 scenic issues

[bit_dataset] Allow passing the type of image interpolation to perform to the image resizing pre-processing functions.

Hey, I downloaded the pretrained ImageNet21k ViT b_16 model from the URLs mentioned in the configuration files and replaced the path in the config file, but for both scenic and...

projects

The dataset processing is unclear. The README only says "Additionally, pre-process the training dataset in the same way as done by the ViViT project [here](https://github.com/google-research/scenic/tree/main/scenic/projects/vivit/data/data.md)." And ViViT refers the pre-processing...

Hi, I've implemented OWL-ViT as a fork of [🤗 HuggingFace Transformers](https://github.com/huggingface/transformers.git), and we are planning to add it to the library soon (see https://github.com/huggingface/transformers/pull/17938). Here's a notebook that illustrates inference...

Thanks for your amazing work. Would it be possible to provide an estimate on when the training code would be released?

I tried to load the _vivit_base_fe_ model using _flax.training_ and found that the numbers of layers in SpatialTransformer and TemporalTransformer are **both 12.** However, when I checked [vivit_base_factorised_encoder](https://github.com/google-research/scenic/blob/7d1a639c969a7ba03d70af4ee571e65084fe1a2b/scenic/projects/vivit/configs/kinetics400/vivit_base_factorised_encoder.py), I found **config.model.temporal_transformer.num_layers =...
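
One way to double-check a layer count like the one reported above is to count the per-layer entries in the restored parameter tree directly. The snippet below is only a minimal sketch in plain Python. It assumes the checkpoint restores to a nested dict and that each transformer layer appears as a child key of the form `encoderblock_<n>`; that key pattern, the `count_layers` helper, and the toy tree with its arbitrary layer counts are all illustrative assumptions, not names or values confirmed by the issue.

```python
import re
from typing import Any, Mapping


def count_layers(params: Mapping[str, Any], prefix: str = "encoderblock_") -> int:
    """Count direct children of `params` named '<prefix><integer>'.

    The default prefix is an assumption about how layers are named;
    adjust it to match the keys actually present in your checkpoint.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}\d+$")
    return sum(1 for name in params if pattern.match(name))


# Hypothetical stand-in for a restored parameter tree. The layer counts
# (3 and 2) are arbitrary and do not reflect any real ViViT config.
toy_params = {
    "SpatialTransformer": {f"encoderblock_{i}": {} for i in range(3)},
    "TemporalTransformer": {f"encoderblock_{i}": {} for i in range(2)},
}

# Count layers separately for each top-level submodule.
layer_counts = {name: count_layers(sub) for name, sub in toy_params.items()}
print(layer_counts)
```

Comparing these counts against the `num_layers` values in the config can show whether the mismatch comes from the checkpoint itself or from how the model was constructed.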