sparse-autoencoder topic
tensorflow_stacked_denoising_autoencoder
Implementation of a stacked denoising autoencoder in TensorFlow
Autoencoders-Variants
PyTorch implementations of various types of autoencoders
keras-adversarial-autoencoders
Experiments with Adversarial Autoencoders using Keras
K-Sparse-AutoEncoder
Sparse autoencoder and regular MNIST classification with mini-batches
Awesome-Interpretability-in-Large-Language-Models
A collection of resources on interpretability in LLMs
ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
llama3_interpretability_sae
A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.