llama3_interpretability_sae

Published 8 months ago

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.

pytorch

open-research

feature-extraction

sparse-autoencoder

llama3

feature-steering

llm-interpretability

624

Stars

Forks

624

Watchers

Owner

PaulPauls
