
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT)...

Results: 196 pytorch-image-models issues

Add ViG models from the paper "Vision GNN: An Image is Worth Graph of Nodes" (NeurIPS 2022), https://arxiv.org/abs/2206.00272. Network architecture plays a key role in the deep learning-based computer vision system...

Hi, I noticed that the EMA used here is pretty slow, since it does not use in-place operations. Using in-place ops results in a ~50% faster EMA; however, it does...
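
For context, a minimal sketch of what an in-place EMA update could look like (hypothetical EmaInPlace helper shown for illustration, not timm's actual ModelEmaV2):

```python
import torch

class EmaInPlace:
    """Sketch of an exponential moving average of model weights using
    in-place ops (lerp_); assumes the EMA copy and the live model iterate
    their parameters in the same order."""

    def __init__(self, model: torch.nn.Module, decay: float = 0.9998):
        self.shadow = [p.detach().clone() for p in model.parameters()]
        self.decay = decay

    @torch.no_grad()
    def update(self, model: torch.nn.Module):
        for s, p in zip(self.shadow, model.parameters()):
            # s = decay * s + (1 - decay) * p, computed in place
            s.lerp_(p, 1.0 - self.decay)
```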

I'm trying to apply Swin V2 as a backbone for dense prediction tasks such as depth estimation or semantic segmentation. However, I found that the features_only option is unavailable on vision...

enhancement
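
For reference, a minimal sketch of how features_only behaves on a backbone that already supports it (resnet50 used as a stand-in here; Swin V2 support is what this issue asks for):

```python
import timm
import torch

# create a feature-extraction backbone that returns intermediate feature maps
backbone = timm.create_model('resnet50', pretrained=False, features_only=True,
                             out_indices=(1, 2, 3, 4))

feats = backbone(torch.randn(1, 3, 224, 224))
for f in feats:
    print(f.shape)                           # one pyramid level per out_index
print(backbone.feature_info.channels())      # channel counts of returned stages
print(backbone.feature_info.reduction())     # strides of returned stages
```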

Hello, how are you? Thanks for contributing to this project. Did you implement FocalNet in this repo? If not, could you support FocalNet in the repo ASAP? Thanks

enhancement

>>> import timm Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/ubuntu/.local/lib/python3.11/site-packages/timm/__init__.py", line 2, in <module> from .models import create_model, list_models, is_model, list_modules, model_entrypoint, \ File "/home/ubuntu/.local/lib/python3.11/site-packages/timm/models/__init__.py...

bug

Is there a way to convert timm models for 1D inputs? I realize that a 1D tensor with shape [B,C,S] can be reshaped to [B,C,1,S] or [B,C,S,1], but then the...

enhancement
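
A minimal sketch of one common workaround, assuming a CNN backbone that tolerates a height-1 input (hypothetical Timm1dWrapper, resnet18 chosen purely for illustration):

```python
import timm
import torch
import torch.nn as nn

class Timm1dWrapper(nn.Module):
    """Feed a [B, C, S] sequence into a 2D timm model by inserting a dummy
    spatial dimension, i.e. [B, C, 1, S], as the issue describes."""

    def __init__(self, model_name: str = 'resnet18', in_chans: int = 1, num_classes: int = 10):
        super().__init__()
        self.net = timm.create_model(model_name, in_chans=in_chans, num_classes=num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # [B, C, S] -> [B, C, 1, S]; models with square patch/window assumptions
        # (ViT, Swin, ...) will generally reject this shape
        return self.net(x.unsqueeze(2))

model = Timm1dWrapper()
print(model(torch.randn(2, 1, 224)).shape)   # torch.Size([2, 10])
```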

Adding the recipe used to train each model would be a step forward in the documentation.

enhancement

Will it be possible in the future to support variable input sizes for maxvit and coatnet? I am experimenting with adapting various timm models to self-supervised learning such as...

enhancement
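
A hedged sketch of the kind of override being asked about; whether maxvit/coatnet accept an img_size argument at creation time (and how pretrained weights adapt to it) is exactly what this issue raises, so treat the call below as an assumption to verify against your timm version:

```python
import timm
import torch

# assumed: create_model forwards img_size to the model constructor and the
# window / relative-position logic is rebuilt for the new resolution
model = timm.create_model('maxvit_tiny_tf_224', pretrained=False, img_size=288)
print(model(torch.randn(1, 3, 288, 288)).shape)   # expected: torch.Size([1, 1000])
```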

I see a huge discrepancy between HuggingFace and timm in terms of the initialization of ViT. Timm's implementation uses trunc_normal, whereas HuggingFace uses "module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)". I noticed this causes a...

enhancement
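
For context, a small sketch contrasting the two initialization schemes (shapes, std, and truncation bounds are illustrative, not the exact code of either library):

```python
import torch
import torch.nn as nn

std = 0.02

w_trunc = torch.empty(768, 768)
# truncated normal: values outside [a, b] are re-drawn (timm-style init)
nn.init.trunc_normal_(w_trunc, mean=0.0, std=std, a=-2 * std, b=2 * std)

w_plain = torch.empty(768, 768)
w_plain.normal_(mean=0.0, std=std)   # plain normal (HF-style initializer_range init)

# the truncated variant has no extreme outliers; the plain normal occasionally does
print(w_trunc.abs().max().item())    # <= 0.04
print(w_plain.abs().max().item())    # can exceed 0.04
```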