Sparse Autoencoders: BabyLM GPT-2 Medium

Sparse autoencoders (SAEs) trained on the residual stream of IParraMartin/gpt2-medium-bLM100M using SAELens v6.

The base model was trained on BabyLM-2026-Strict (100M tokens).

Training configuration

Parameter               Value
Architecture            BatchTopK (saved as JumpReLU for inference)
d_in                    1024
d_sae                   16384 (×16 expansion)
k                       64 active features per token
Training tokens         100M
Learning rate           2e-4 with 1000-step warmup
context_size            128
normalize_activations   expected_average_only_in
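The "BatchTopK (saved as JumpReLU for inference)" entry can be illustrated with a toy example. As commonly described, BatchTopK keeps the k × batch-size largest activations across a whole batch rather than exactly k per token, so the per-token count only averages to k. At inference a fixed threshold (JumpReLU) replaces the batch-level selection. The sketch below is plain Python on made-up numbers, not the SAELens implementation; the function names, the toy activations, and the threshold of 0.5 are all assumptions chosen for illustration.

```python
# Illustrative sketch only: plain-Python BatchTopK and JumpReLU on toy data.
# This is NOT the SAELens implementation; all names and values are hypothetical.

def batch_topk(pre_acts, k):
    """Keep the k * batch_size largest positive entries across the whole batch."""
    n_keep = k * len(pre_acts)
    ranked = sorted(
        ((v, r, c) for r, row in enumerate(pre_acts) for c, v in enumerate(row) if v > 0),
        key=lambda t: t[0],
        reverse=True,
    )
    out = [[0.0] * len(row) for row in pre_acts]
    for v, r, c in ranked[:n_keep]:
        out[r][c] = v
    return out


def jumprelu(pre_acts, threshold):
    """Keep entries strictly above a fixed threshold; zero the rest."""
    return [[v if v > threshold else 0.0 for v in row] for row in pre_acts]


def l0_per_token(acts):
    """Count the active (non-zero) features for each token."""
    return [sum(1 for v in row if v > 0) for row in acts]


# Two toy "tokens" with four features each and k = 1
# (the SAEs in this repository use k = 64 and d_sae = 16384).
pre_acts = [[0.9, 0.1, 0.8, 0.0],
            [0.2, 0.05, 0.0, 0.3]]

train_acts = batch_topk(pre_acts, k=1)          # batch-level selection
infer_acts = jumprelu(pre_acts, threshold=0.5)  # hand-picked threshold for this toy case

print(l0_per_token(train_acts))  # [2, 0]
print(l0_per_token(infer_acts))  # [2, 0]
```

In this toy batch the first token keeps two features and the second keeps none, so the mean is still k = 1. The same effect is why the measured L0 of these SAEs should be close to 64 on average without being exactly 64 for every token.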

Layers

  • layer_02/: residual stream at transformer.h.2
  • layer_04/: residual stream at transformer.h.4
  • layer_06/: residual stream at transformer.h.6
  • layer_08/: residual stream at transformer.h.8
  • layer_10/: residual stream at transformer.h.10
  • layer_12/: residual stream at transformer.h.12
  • layer_16/: residual stream at transformer.h.16
  • layer_22/: residual stream at transformer.h.22
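The folder names above follow a zero-padded pattern that maps directly onto the module path of the hooked block. A small helper along these lines can catch a request for a layer that has no SAE before any download is attempted. It is a hypothetical convenience function, not part of SAELens or of this repository.

```python
# Hypothetical helper: map a layer index to its SAE subfolder and hook point.
TRAINED_LAYERS = (2, 4, 6, 8, 10, 12, 16, 22)


def sae_location(layer):
    """Return (subfolder, module path) for a layer that has a trained SAE."""
    if layer not in TRAINED_LAYERS:
        raise ValueError(f"No SAE for layer {layer}; choose one of {TRAINED_LAYERS}")
    return f"layer_{layer:02d}", f"transformer.h.{layer}"


print(sae_location(4))   # ('layer_04', 'transformer.h.4')
print(sae_location(16))  # ('layer_16', 'transformer.h.16')
```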

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sae_lens import SAE

LAYER  = 16  # one of 2, 4, 6, 8, 10, 12, 16, 22
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load SAE
sae = SAE.load_from_pretrained(
    "whitepenguin/gpt2-medium-bLM100M-SAE",
    subfolder=f"layer_{LAYER:02d}",
    device=device,
)
sae.eval()

# Load the base GPT-2 model (always required: the SAE runs on top of its activations)
tokenizer = AutoTokenizer.from_pretrained("IParraMartin/gpt2-medium-bLM100M")
model = AutoModelForCausalLM.from_pretrained("IParraMartin/gpt2-medium-bLM100M")
model = model.to(device).eval()

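# Hook the residual stream at the output of block LAYER.
# The block normally returns a tuple with the hidden states first;
# the isinstance check also covers a bare-tensor return.
cache = {}

def save_resid(module, args, output):
    hidden = output[0] if isinstance(output, tuple) else output
    cache["resid"] = hidden.detach()

hook = model.transformer.h[LAYER].register_forward_hook(save_resid)

# Run one forward pass to fill the cache, then remove the hook
inputs = tokenizer("The child looked at the dog.", return_tensors="pt").to(device)
with torch.no_grad():
    model(**inputs)
hook.remove()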

feature_acts = sae.encode(cache["resid"])          # [1, seq_len, 16384]
l0 = (feature_acts > 0).float().sum(-1).mean()
print(f"Mean active features/token: {l0:.1f}")   # expect ~64