Assembly of Experts: Linear-time construction of the Chimera LLM variants with emergent and adaptable behaviors (arXiv: 2506.14794)
Model merge of DeepSeek-R1 and DeepSeek-V3 (0324)
An open-weights model that combines the intelligence of R1 with the token efficiency of V3.
For details on the construction process and analyses of Chimera model variants, please read our paper.
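As a rough illustration of what merging two parent models can mean at the tensor level, here is a minimal sketch of plain linear interpolation between two checkpoints that share the same tensor names. It is not the construction used for Chimera, which is described in the paper. The function `merge_state_dicts`, the parameter `weight_a`, and the toy tensor names are all hypothetical, and the weights are flattened lists of floats rather than real multi-dimensional tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a tensor: a flattened list of floats.
Tensor = List[float]


def merge_state_dicts(
    parent_a: Dict[str, Tensor],
    parent_b: Dict[str, Tensor],
    weight_a: float,
) -> Dict[str, Tensor]:
    """Return weight_a * parent_a + (1 - weight_a) * parent_b for every tensor."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if parent_a.keys() != parent_b.keys():
        raise ValueError("both parents must share the same tensor names")

    merged: Dict[str, Tensor] = {}
    for name, tensor_a in parent_a.items():
        tensor_b = parent_b[name]
        if len(tensor_a) != len(tensor_b):
            raise ValueError(f"size mismatch for {name}")
        # Each weight is read once, so the pass is linear in the parameter count.
        merged[name] = [
            weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(tensor_a, tensor_b)
        ]
    return merged


# Toy checkpoints (made-up values, not real model weights).
toy_r1 = {"layer.0.weight": [1.0, 2.0], "layer.1.weight": [0.0, 4.0]}
toy_v3 = {"layer.0.weight": [3.0, 2.0], "layer.1.weight": [2.0, 0.0]}

merged = merge_state_dicts(toy_r1, toy_v3, weight_a=0.5)
print(merged)
```

Setting `weight_a` to 1.0 or 0.0 returns a copy of one parent unchanged, which is a convenient sanity check before trying intermediate values.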
Paper on arXiv | Announcement on X | LinkedIn post | Try it on OpenRouter
Regarding R1T Chimera, we ask you to follow the careful guidelines that Microsoft has created for their "MAI-DS-R1" DeepSeek-based model.
These guidelines are available here on Hugging Face.
@misc{tng_technology_consulting_gmbh_2025,
author = { TNG Technology Consulting GmbH },
title = { DeepSeek-R1T-Chimera },
year = 2025,
month = {April},
url = { https://huggingface.co/tngtech/DeepSeek-R1T-Chimera },
doi = { 10.57967/hf/5330 },
publisher = { Hugging Face }
}
Base model: deepseek-ai/DeepSeek-R1