pytorch-image-models issues

[Bug] FSDP FULL_SHARD incorrectly rejects timm models with features_only=True (FeatureListNet) due to overly-strict nn.ModuleDict inheritance check

8

**Describe the bug** When `timm.create_model` is called with the `features_only=True` argument, it returns a `FeatureListNet` module. This module cannot be correctly wrapped by `torch.distributed.fsdp.FullyShardedDataParallel` when using the `FULL_SHARD` strategy. ```BASH...

wenwwww

bug

[FEATURE] [RFC] Support for interchangeable attention backends

1

**Is your feature request related to a problem? Please describe.** Many models currently rely on a standard multi-head self-attention operator. Timm allows the user to choose between 2 versions, an...

fffffgggg54

enhancement

[FEATURE] Support for DPT decoder

1

The DPT decoder is widely used across various models. Could you add support for a DPT decoder model?

dqj5182

enhancement

[FEATURE] Support for Sapiens

Sapiens is a visual foundation model designed for human-centric tasks, similar to the DINO family of models. Unlike DINO, however, Sapiens was trained without intentionally blurring human faces. Given the...

chenzhekl

enhancement

feat(train): add validation metrics and distributed support

This PR extends the validation metrics functionality (precision, recall, F1-score) to the `train.py` script. ### Changes: - The `validate` function within `train.py` now supports the `--metrics-avg` flag. - Implemented `torch.distributed.all_gather`...

ha405

[FEATURE] Unified `embed` Interfaces for Vision Transformer Models for MIM Pretrain Research

4

This feature request is related to challenges in Masked Image Modeling (MIM) pre-training using vision transformer models in `timm`. Currently, embedding and feature extraction are tightly coupled within `forward_features`, making...

ryan-minato

enhancement

[BUG] naflexvit_so400m_patch16_siglip has undocumented different default pos_embed_interp_mode of "bicubic" instead of "bilinear"

9

# Updates Per further discussion, the difference is intentional, but undocumented. It is a difference with the reference implementation from Google Big Vision. --- # Original Report Fix location: https://github.com/huggingface/pytorch-image-models/blob/a7c5368ba0c8713dc1c9a98cc83bf46ddd02b0a0/timm/models/naflexvit.py#L1767...

redhottensors

bug
