Update README.md

README.md CHANGED

@@ -14,6 +14,8 @@ datasets:
 - open-r1/Mixture-of-Thoughts
 ---
 
+
+
 # **Theta-Crucis-0.6B-Turbo1**
 
 > **Theta-Crucis-0.6B-Turbo1** is a compact, high-performance model designed for **code generation**, **technical reasoning**, and **structured output tasks**. Fine-tuned from **Qwen3-0.6B** using the **Mixture of Thoughts (MoT)** dataset with an emphasis on **code expert clusters**, this model delivers agile and accurate coding assistance in low-resource environments. At only **0.6B parameters**, it offers strong fluency in programming, structured syntax, and technical language generation.

@@ -108,4 +110,4 @@ print(response)
 
 1. [Qwen2.5 Technical Report (2024)](https://arxiv.org/pdf/2412.15115)
 2. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)
-3. [open-r1/Mixture-of-Thoughts](https://huggingface.co/datasets/open-r1/Mixture-of-Thoughts)
+3. [open-r1/Mixture-of-Thoughts](https://huggingface.co/datasets/open-r1/Mixture-of-Thoughts)