Sparse Autoencoders — BabyLM GPT-2 Medium

This repository contains sparse autoencoders (SAEs) trained on the residual stream of IParraMartin/gpt2-medium-bLM100M using SAELens v6.

The base model was trained on BabyLM-2026-Strict (100M tokens).

Training configuration

Parameter	Value
Architecture	BatchTopK (saved as JumpReLU for inference)
d_in	1024
d_sae	16384 (×16 expansion)
k	64 active features per token on average (BatchTopK enforces k across the batch, not per token)
Training tokens	100M
Learning rate	2e-4 with 1000-step warmup
context_size	128
normalize_activations	expected_average_only_in
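
The "BatchTopK (saved as JumpReLU for inference)" entry can be illustrated with a toy sketch. This is not the SAELens implementation: the names batch_topk and jumprelu, the toy sizes, and the single-batch threshold estimate are assumptions made for illustration (in practice the threshold is typically estimated over many training batches).

```python
import torch


def batch_topk(pre_acts: torch.Tensor, k: int) -> torch.Tensor:
    """Keep the k * n_tokens largest activations across the whole batch."""
    n_tokens, d_sae = pre_acts.shape
    flat = torch.relu(pre_acts).flatten()
    top = torch.topk(flat, k * n_tokens)
    out = torch.zeros_like(flat)
    out[top.indices] = top.values
    return out.reshape(n_tokens, d_sae)


def jumprelu(pre_acts: torch.Tensor, threshold: torch.Tensor) -> torch.Tensor:
    """Keep a pre-activation only where it exceeds the threshold."""
    return pre_acts * (pre_acts > threshold)


torch.manual_seed(0)
k = 4                                  # toy value (the SAEs here use k = 64)
train_batch = torch.randn(8, 32)       # toy sizes: 8 tokens, 32 latents

# Training-style activation: exactly k active latents per token on average,
# but individual tokens may have more or fewer.
train_acts = batch_topk(train_batch, k)
per_token_l0 = (train_acts > 0).sum(-1)
print("per-token L0:", per_token_l0.tolist())
print("mean L0:", per_token_l0.float().mean().item())

# Inference-style activation: freeze a global threshold (here, the smallest
# activation that survived BatchTopK on one batch) and apply it to new data.
theta = train_acts[train_acts > 0].min()
new_batch = torch.randn(8, 32)
infer_acts = jumprelu(new_batch, theta)
print("inference mean L0:", (infer_acts > 0).float().sum(-1).mean().item())
```

Because the threshold is fixed after training, the number of active features per token at inference is only approximately k, which is why the usage example expects roughly 64 rather than exactly 64.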

Layers

layer_02/ — residual stream at transformer.h.2
layer_04/ — residual stream at transformer.h.4
layer_06/ — residual stream at transformer.h.6
layer_08/ — residual stream at transformer.h.8
layer_10/ — residual stream at transformer.h.10
layer_12/ — residual stream at transformer.h.12
layer_16/ — residual stream at transformer.h.16
layer_22/ — residual stream at transformer.h.22
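
The mapping between a layer index, its subfolder, and the hooked module can be captured in two small helpers. This is a minimal sketch: SAE_LAYERS, sae_subfolder, and hook_module_name are hypothetical names introduced here, based only on the list above and the zero-padded naming it uses.

```python
# Layers for which an SAE is listed in this repository.
SAE_LAYERS = (2, 4, 6, 8, 10, 12, 16, 22)


def sae_subfolder(layer: int) -> str:
    """Return the zero-padded subfolder name, e.g. 2 -> 'layer_02'."""
    if layer not in SAE_LAYERS:
        raise ValueError(f"No SAE for layer {layer}; choose from {SAE_LAYERS}")
    return f"layer_{layer:02d}"


def hook_module_name(layer: int) -> str:
    """Return the module path whose output the SAE was trained on."""
    sae_subfolder(layer)  # reuse the validation above
    return f"transformer.h.{layer}"


for layer in SAE_LAYERS:
    print(sae_subfolder(layer), "->", hook_module_name(layer))
```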

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sae_lens import SAE

LAYER  = 16
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load SAE
sae = SAE.load_from_pretrained(
    "whitepenguin/gpt2-medium-bLM100M-SAE",
    subfolder=f"layer_{LAYER:02d}",
    device=device,
)
sae.eval()

# Load the base model (always required: the SAE encodes its activations)
tokenizer = AutoTokenizer.from_pretrained("IParraMartin/gpt2-medium-bLM100M")
model = AutoModelForCausalLM.from_pretrained("IParraMartin/gpt2-medium-bLM100M")
model = model.to(device).eval()

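# Hook the residual stream at the output of block LAYER.
# A GPT-2 block normally returns a tuple whose first element is the hidden
# states; the isinstance check guards against a bare-tensor return value.
cache = {}
hook = model.transformer.h[LAYER].register_forward_hook(
    lambda m, i, o: cache.update(
        {"resid": (o[0] if isinstance(o, tuple) else o).detach()}
    )
)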

# Run one forward pass to populate the cache, then remove the hook
inputs = tokenizer("The child looked at the dog.", return_tensors="pt").to(device)
with torch.no_grad():
    model(**inputs)
hook.remove()

# Encode the cached activations and measure sparsity (L0)
feature_acts = sae.encode(cache["resid"])          # [1, seq_len, 16384]
l0 = (feature_acts > 0).float().sum(-1).mean().item()
print(f"Mean active features/token: {l0:.1f}")   # expect ~64
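
As a possible follow-up to the snippet above, the sketch below lists the strongest features per token and scores a reconstruction. The helper names top_features and fraction_variance_explained are hypothetical, and the toy tensors stand in for real activations so the block runs without downloading anything.

```python
import torch


def top_features(feature_acts: torch.Tensor, k: int = 5):
    """Return (values, indices) of the k strongest features for each token.

    feature_acts has shape [seq_len, d_sae].
    """
    return torch.topk(feature_acts, k, dim=-1)


def fraction_variance_explained(x: torch.Tensor, x_hat: torch.Tensor) -> float:
    """1 - (squared reconstruction error / variance around the per-dim mean)."""
    resid = (x - x_hat).pow(2).sum()
    total = (x - x.mean(dim=0, keepdim=True)).pow(2).sum()
    return 1.0 - (resid / total).item()


# Toy feature activations: 2 tokens, 4 features.
acts = torch.tensor([[0.0, 0.5, 0.0, 2.0],
                     [1.5, 0.0, 0.0, 0.1]])
values, indices = top_features(acts, k=2)
print(indices.tolist())   # strongest feature first for each token

# Toy activations: a perfect reconstruction scores 1.0, while predicting
# the mean activation for every token scores 0.0.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(fraction_variance_explained(x, x))
print(fraction_variance_explained(x, x.mean(dim=0, keepdim=True).expand_as(x)))
```

With the real objects from the usage snippet, one would presumably pass feature_acts[0] to top_features, and compare cache["resid"][0] against sae.decode(feature_acts)[0] for the reconstruction score.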

