vit topics

PyTorch-Scratch-Vision-Transformer-ViT

147

Stars

23

Forks

147

Watchers

A simple, easy-to-understand PyTorch implementation of Vision Transformer (ViT) from scratch, with detailed steps. Tested on common datasets such as MNIST, CIFAR10, and more.

s-chh

pytorch-vit

scratch

simple

transformer

Persian-Image-Captioning

21

Stars

4

Forks

A Persian image captioning model based on the Vision Encoder Decoder models from 🤗 Transformers.

Hamtech-ai

bert

huggingface

image-captioning

persian-nlp

Code-Canvas

80

Stars

126

Forks

A hub for innovation through web development projects

ssitvit

css

gssoc23

html

js

ViTPose_pytorch

91

Stars

18

Forks

An unofficial implementation of ViTPose [Y. Xu et al., 2022]

jaehyunnn

computer-vision

human-pose

pose-estimation

transformers

i. A practical application of the Transformer (ViT) to 2-D physiological signal (EEG) classification tasks. It could also be tried with EMG, EOG, ECG, etc. ii. Includes attention over the spatial dimension (c...

eeyhsong

attention

attention-mechanism

common-spatial-pattern

deep-learning

TransformerX

52

Stars

10

Forks

Flexible Python library providing building blocks (layers) for reproducible Transformers research (TensorFlow ✅, PyTorch 🔜, and JAX 🔜)

tensorops

attention

attention-mechanism

deep-learning

multihead-attention

Vit-RGTS

127

Stars

13

Forks

Open source implementation of "Vision Transformers Need Registers"

kyegomez

attention-mechanism

gpt4

vision-api

vision-transformer

TubeViT

83

Stars

9

Forks

An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"

daniel-code

deep-learning

paper-implementations

pytorch

tube-vit

Facial-Attribute-Recognition-from-face-images

26

Stars

4

Forks

FacialAttributesExtractor is a Python library for precise facial attribute extraction, offering comprehensive insights into various features using OpenCV and Deep Learning. Enhance your image processi...

dsabarinathan

deeplearning

face

facenet

facial-attributes

RevCol

248

Stars

10

Forks

Official code for the paper "Reversible Column Networks" and for "RevColv2"

megvii-research

cnn

computer-vision

iclr2023

mae