TinyRecursiveModels - Sudoku Extreme (Attention-based)
This model reproduces the TinyRecursiveModels project by Samsung SAIL Montreal, trained and evaluated on the Sudoku Extreme task.


8 × H200, global_batch_size=4608, Runtime: 40 min
Test Accuracy: 77.70%
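
For readers adapting this run to different hardware, the per-GPU batch size implied by the settings above can be worked out with a small helper. This is only a sketch: it assumes the global batch is split evenly across GPUs (plain data parallelism), which this card does not state, and `per_device_batch_size` is a hypothetical name rather than part of the TinyRecursiveModels code.

```python
def per_device_batch_size(global_batch_size: int, num_devices: int) -> int:
    """Split a global batch evenly across devices.

    Assumption: plain data parallelism with an even split; the actual
    training setup may shard the batch differently.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    if global_batch_size % num_devices != 0:
        raise ValueError("global batch size is not divisible by the device count")
    return global_batch_size // num_devices


# Values reported above: 8 GPUs, global_batch_size=4608.
print(per_device_batch_size(4608, 8))  # 576
```

Under that assumption, running on fewer GPUs with the same global batch size means each GPU has to hold a proportionally larger share, for example 1152 samples per GPU on 4 devices.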