NOTE: full blogpost forthcoming!

2048former v0.1

2048former is a 2048 game-playing agent. It's a 50M-parameter encoder-only transformer trained from scratch to predict the next move. Unlike approaches that run explicit search at play time, 2048former distills gameplay from a search-based algorithm into a 1-ply model via supervised learning, similar to Google DeepMind's 1-ply chess policy (Ruoss et al. 2023).
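The distillation objective can be sketched as plain behavior cloning: each training frame pairs a board state with the move the search-based teacher chose, and the student minimizes cross-entropy against that move. This is an illustrative sketch, not the repo's actual training code; the logit values below are made up.

```python
import math

def cross_entropy(logits, teacher_move):
    """Negative log-likelihood of the teacher's chosen move under the
    model's predicted distribution over the four moves.
    Computed with the log-sum-exp trick for numerical stability."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[teacher_move]

# One (hypothetical) training frame: the 6-ply expectimax teacher
# picked move 0, and the 1-ply student simply imitates it.
loss = cross_entropy([2.0, 0.1, -1.0, 0.3], teacher_move=0)
```

Averaged over billions of frames, this pushes the 1-ply policy toward the teacher's 6-ply move distribution without the student ever running search itself.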

Performance

To my knowledge, 2048former v0.1 is currently the third-best publicly released 2048 model in the world, and the best that uses no explicit search.

| Model | Depth | Games | Mean score | % 32768 | % 16384 | % 8192 | Moves/sec |
|---|---|---|---|---|---|---|---|
| 2048former v0.1 | 1 | 2048 | 605,491 | 66.1% | 92.7% | 96.0% | 22,000 batched (~300 it/s) |
| Expectimax (Macroxue) | 6 | 1000 | 690,621 | 78.0% | 97.0% | 99.7% | 196 |
| Expectimax (Macroxue) | 3 | 1000 | 493,058 | 52.5% | 81.4% | 93.7% | 6,461 |
| Optimistic TD Learning (Guei et al. 2021) | 6 | 100 | 625,377 ± 40,936 | 72% | 98.8% | 99.8% | ? |
| 1-ply (Guei et al. 2021) | 1 | 1e6 | 412,785 | 30.1% | 85.4% | 97.2% | |
| Stochastic MuZero | 3 | | ~500K | | | | |

Architecture and data

Architectural details:

  • Encoder-only transformer, with LLaMA-3 style blocks (GQA + SiLU activation + RMSNorm) but bidirectional attention and no RoPE
  • Absolute positional encoding, 16-token context window (the 4x4 board)
  • Four output heads
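Since the context window is the 4x4 board itself, tokenization is just flattening the 16 cells. The exact scheme below is an assumption (empty cell → 0, tile 2^k → k, giving a 16-symbol vocabulary); check the repo for the actual encoding.

```python
def board_to_tokens(board):
    """Flatten a 4x4 board into 16 tokens, one per cell.
    Hypothetical scheme: empty -> 0, tile 2^k -> k
    (so 2 -> 1, 4 -> 2, ..., 32768 -> 15)."""
    return [0 if v == 0 else v.bit_length() - 1
            for row in board for v in row]
```

Because each token's index in the sequence is its fixed cell position, a simple learned absolute positional encoding over 16 slots is enough, and RoPE buys nothing here.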

This model was trained on 120,000 games (~3 billion frames) of play from Macroxue's hybrid expectimax 2048 engine (link to my fork) at 6-ply search. Training took about 3 days on a single 4090.

How to use this model

Please see https://github.com/EndlessReform/2048former for the codebase.
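At play time, a 1-ply policy still needs legal-move masking: the highest-scoring move might not change the board. A minimal decoding sketch, assuming the model emits one logit per direction (the actual head layout and move ordering are assumptions; see the repo):

```python
def slide_left(row):
    """Slide and merge one row leftward, standard 2048 rules."""
    tiles = [v for v in row if v]
    out, i = [], 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return out + [0] * (len(row) - len(out))

def apply_move(board, move):
    """Return the board after 'up'/'down'/'left'/'right' (no spawn)."""
    if move in ("up", "down"):
        board = [list(col) for col in zip(*board)]  # transpose
    if move in ("right", "down"):
        board = [row[::-1] for row in board]
    board = [slide_left(row) for row in board]
    if move in ("right", "down"):
        board = [row[::-1] for row in board]
    if move in ("up", "down"):
        board = [list(col) for col in zip(*board)]
    return board

def pick_move(board, logits, moves=("up", "down", "left", "right")):
    """Greedy decoding: highest-logit move that actually changes the board."""
    for i in sorted(range(4), key=lambda i: -logits[i]):
        if apply_move(board, moves[i]) != board:
            return moves[i]
    return None  # no legal move: game over
```

Greedy argmax over legal moves is the simplest decoding strategy; the model's four outputs could also be sampled or temperature-scaled.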

Model size: 53.2M params (Safetensors, BF16)