bartowski committed · Commit 9428169 · verified · 1 Parent(s): 6e7dada

Create README.md
---
license: apache-2.0
language:
- en
- es
- fr
- de
- it
- pt
- ru
- ar
- hi
- ko
- zh
library_name: transformers
base_model:
- arcee-ai/Trinity-Nano-Preview
base_model_relation: quantized
---
<div align="center">
<picture>
<img
src="https://cdn-uploads.huggingface.co/production/uploads/6435718aaaef013d1aec3b8b/i-v1KyAMOW_mgVGeic9WJ.png"
alt="Arcee Trinity Mini"
style="max-width: 100%; height: auto;"
>
</picture>
</div>

# Trinity Nano Preview GGUF

Trinity Nano Preview is a preview of Arcee AI's 6B MoE model with 1B active parameters. It is the smallest model in our new Trinity family, a series of open-weight models for enterprises and tinkerers alike.

This is a chat-tuned model with a delightful personality and charm we think users will love. Note that this model pushes the limits of sparsity in small language models, with only 800M non-embedding parameters active per token, and as such **may be unstable** in certain use cases, especially in this preview.

This is an *experimental* release: it's fun to talk to, but it will not be hosted anywhere, so download it and try it out yourself!

These are the GGUF files for running the model on llama.cpp-powered platforms.

***

Trinity Nano Preview is trained on 10T tokens gathered and curated through a key partnership with [Datology](https://www.datologyai.com/), building upon the excellent dataset we used for [AFM-4.5B](https://huggingface.co/arcee-ai/AFM-4.5B) with additional math and code.

Training was performed on a cluster of 512 H200 GPUs powered by [Prime Intellect](https://www.primeintellect.ai/) using HSDP parallelism.

More details, including key architecture decisions, can be found on our blog [here](https://www.arcee.ai/blog).

***

## Model Details

* **Model Architecture:** AfmoeForCausalLM
* **Parameters:** 6B total, 1B active
* **Experts:** 128 total, 8 active, 1 shared
* **Context length:** 128k
* **Training Tokens:** 10T
* **License:** [Apache 2.0](https://huggingface.co/arcee-ai/Trinity-Mini#license)
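As a rough intuition for the sparsity figures above: a MoE router scores all 128 experts for each token and keeps only the top 8 (the shared expert is always active). Below is a minimal sketch of that top-k routing step, using random stand-in logits rather than anything from the actual model:

```python
import math
import random

random.seed(0)
N_EXPERTS, TOP_K = 128, 8  # totals from the model card (plus 1 always-on shared expert)

def route(logits):
    """Select the top-k experts by router score and softmax-normalize their weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:TOP_K]
    m = max(logits[i] for i in top)                       # subtract max for numerical stability
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]

# Stand-in router logits for one token; the real model derives these from the hidden state.
logits = [random.gauss(0.0, 1.0) for _ in range(N_EXPERTS)]
experts, weights = route(logits)
```

Only the 8 selected experts' MLPs (plus the shared expert) run for that token, which is how a 6B-parameter model keeps roughly 1B parameters active per token.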
***

<div align="center">
<picture>
<img src="https://cdn-uploads.huggingface.co/production/uploads/6435718aaaef013d1aec3b8b/sSVjGNHfrJKmQ6w8I18ek.png" style="background-color:ghostwhite;padding:5px;" width="17%" alt="Powered by Datology">
</picture>
</div>

### Running our model

- [llama.cpp](https://huggingface.co/arcee-ai/Trinity-Mini#llamacpp)
- [LM Studio](https://huggingface.co/arcee-ai/Trinity-Mini#lm-studio)

## llama.cpp

Supported as of llama.cpp release b7061.

Download the latest [llama.cpp release](https://github.com/ggml-org/llama.cpp/releases), then start the server with:

```
llama-server -hf arcee-ai/Trinity-Nano-Preview-GGUF:q4_k_m
```
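Once running, llama-server exposes an OpenAI-compatible HTTP API. A minimal sketch of a chat request body follows; the port (llama-server's default, 8080) and the `max_tokens` cap are illustrative assumptions, not values from this card:

```python
import json
import urllib.request

# Chat request for llama-server's OpenAI-compatible chat completions endpoint.
payload = {
    "messages": [{"role": "user", "content": "Hello! Who are you?"}],
    "max_tokens": 64,  # illustrative cap, not a recommended setting
}
body = json.dumps(payload).encode("utf-8")

# With the server running locally, send it like so (commented out so the
# snippet also works with no server up):
# req = urllib.request.Request(
#     "http://localhost:8080/v1/chat/completions",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```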
## LM Studio

Supported in the latest LM Studio runtime.

Update to the latest available runtime, then verify it:

1. Click "Power User" at the bottom left
2. Click the green "Developer" icon at the top left
3. Select "LM Runtimes" at the top
4. Refresh the list of runtimes and verify that the latest is installed

Then go to Model Search, search for `arcee-ai/Trinity-Nano-Preview-GGUF`, download your preferred size, and load it up in the chat.

## License

Trinity-Nano-Preview is released under the Apache-2.0 license.