amd
/

Safetensors
qwen3
Committed by llmll
Commit a7f62ca · verified · 1 parent: 2cf5cf6

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
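The added rule routes `tokenizer.json` through Git LFS. As a rough illustration of how such patterns select files, the sketch below checks bare filenames against the patterns visible in this hunk. It uses Python's `fnmatch` only as a stand-in: Git's own attribute matching has additional rules (directories, `**`), and the real `.gitattributes` contains more rules outside the displayed context. The function name `tracked_by_lfs` is hypothetical.

```python
from fnmatch import fnmatch

# Patterns visible in the hunk above. The full .gitattributes has further
# rules that this page does not display, so this list is incomplete.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "tokenizer.json"]


def tracked_by_lfs(filename: str) -> bool:
    """Approximate check: does a bare filename match a visible LFS pattern?

    fnmatch is an approximation of Git's attribute matching, not a
    reimplementation of it.
    """
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)


if __name__ == "__main__":
    for name in ["tokenizer.json", "config.json", "archive.zip"]:
        print(name, tracked_by_lfs(name))
```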
LICENSE-AMD-OpenRAIL-D ADDED
@@ -0,0 +1,304 @@
+ ReasonLite [Open RAIL-D]
+
+ Licensed Artifact(s):
+
+ - Data
+
+ Section I: PREAMBLE
+
+ BY ACCESSING, DOWNLOADING, INSTALLING, OR USING THE ARTIFACT, YOU AGREE
+ TO BE BOUND BY THIS LICENSE. IF YOU DO NOT AGREE TO ALL OF THE TERMS AND
+ CONDITIONS OF THIS LICENSE, DO NOT ACCESS, DOWNLOAD, INSTALL, OR USE THE
+ ARTIFACT.
+
+ 1. Definitions
+
+ (a) “Application” refers to a sequence of instructions or statements
+ written in machine code language, including object code (that is the
+ product of a compiler), binary code (data using a two-symbol system)
+ or an intermediate language (such as register transfer language).
+
+ (b) “Artifact” refers to a software application (in either binary or
+ source code format), Data, Model, and/or Source Code, in accordance
+ with what is specified above as the “Licensed Artifact”.
+
+ (c) “Contribution” means any work, including any modifications or
+ additions to an Artifact, that is intentionally submitted to
+ Licensor for inclusion or incorporation in the Artifact directly or
+ indirectly by the rights owner. For the purposes of this definition,
+ “submitted” means any form of electronic, verbal, or written
+ communication sent to the Licensor or its representatives, including
+ but not limited to communication on electronic mailing lists, source
+ code control systems, and issue tracking systems that are managed
+ by, or on behalf of, the Licensor for the purpose of discussing,
+ sharing and improving the Artifact, but excluding communication that
+ is conspicuously marked or otherwise designated in writing by the
+ contributor as “Not a Contribution.”
+
+ (d) “Contributor” means Licensor or any other individual or legal entity
+ that creates or owns a Contribution that is added to or incorporated
+ into an Artifact or its Derivative.
+
+ (e) “Data” means a collection of information and/or content extracted
+ from the dataset used with a given Model, including to train,
+ pretrain, or otherwise evaluate the Model.
+
+ (f) “Derivative” means a work derived from or based upon an Artifact,
+ and includes all modified versions of such Artifact.
+
+ (g) “Distribution” means any transmission, reproduction, publication or
+ other sharing of an Artifact or Derivative to a Third Party,
+ including providing a hosted service incorporating the Artifact,
+ which is made available by electronic or other remote means -
+ e.g. API-based or web access.
+
+ (h) “Harm” includes but is not limited to physical, mental,
+ psychological, financial and reputational damage, pain, or loss.
+
+ (i) “License” means the terms and conditions for use, reproduction, and
+ Distribution as defined in this document.
+
+ (j) “Licensor” means the rights owner (by virtue of creation or
+ documented transfer of ownership) or entity authorized by the rights
+ owner (e.g., exclusive licensee) that is granting the rights in this
+ License.
+
+ (k) “Model” means any machine-learning based assembly or assemblies
+ (including checkpoints), consisting of learnt weights, parameters
+ (including optimizer states), corresponding to the model
+ architecture as embodied in the Source Code.
+
+ (l) “Output” means the results of operating a Model as embodied in
+ informational content resulting therefrom.
+
+ (m) “Source Code” means any collection of text written using
+ human-readable programming language, including the code and scripts
+ used to define, run, load, benchmark or evaluate a Model or any
+ component thereof, and/or used to prepare data for training or
+ evaluation, if any. Source Code includes any accompanying
+ documentation, tutorials, examples, etc, if any. For clarity, the
+ term “Source Code” as used in this License includes any and all
+ Derivatives of such Source Code.
+
+ (n) “Third Parties” means individuals or legal entities that are not
+ under common control with Licensor or You.
+
+ (o) “Use” includes accessing, using, copying, modifying, and/or
+ distributing an Artifact; in connection with a Model as Artifact,
+ Use also includes creating content, fine-tuning, updating, running,
+ training, evaluating and/or re-parametrizing such Model.
+
+ (p) “You” (or “Your”) means an individual or legal entity receiving and
+ exercising permissions granted by this License and/or making use of
+ the Artifact for permitted purposes and in any permitted field of
+ use, including usage of the Artifact in an end-use application -
+ e.g. chatbot, translator, image generator, etc.
+
+ Section II: INTELLECTUAL PROPERTY RIGHTS
+
+ Both copyright and patent grants may apply to the Artifact. The Artifact
+ is subject to additional terms and conditions as described in Section III
+ below.
+
+ 2. Grant of Copyright License. Conditioned upon compliance with Section
+ III below and subject to the terms and conditions of this License, each
+ Contributor hereby grants to You a worldwide, non-exclusive, royalty-free copyright license to
+ reproduce, use, publicly display, publicly perform, sublicense, and
+ distribute the Artifact and Derivatives thereof.
+
+ 3. Grant of Patent License. Conditioned upon compliance with Section III
+ below and subject to the terms and conditions of this License, and only
+ where and as applicable, each Contributor hereby grants to You a worldwide, non-exclusive,
+ royalty-free, irrevocable (except as stated in this paragraph) patent
+ license to make, have made, use, sell, offer to sell, import, and
+ otherwise transfer the Artifact where such license applies only to those
+ patent claims licensable by such Contributor that are necessarily
+ infringed by their Contribution(s) alone or by combination of their
+ Contribution(s) with the Artifact to which such Contribution(s) was
+ submitted. If You institute patent litigation against any entity
+ (including a cross-claim or counterclaim in a lawsuit) alleging that the
+ Artifact and/or a Contribution incorporated within the Artifact
+ constitutes direct or contributory patent infringement, then any patent
+ licenses granted to You under this License in connection with the
+ Artifact shall terminate as of the date such litigation is asserted or
+ filed.
+
+ Licensor and Contributor each have the right to grant the licenses
+ above.
+
+ Section III: CONDITIONS OF USAGE, DISTRIBUTION AND REDISTRIBUTION
+
+ 4. Use-based restrictions. The restrictions set forth in Attachment A
+ are mandatory Use-based restrictions. Therefore You may not Use the
+ Artifact in violation of such restrictions. You may Use the Artifact
+ only subject to this License. You shall require all of Your users who
+ use the Artifact or its Derivative to comply with the terms of this
+ paragraph.
+
+ 5. The Output You Generate with a Model (as Artifact). Except as set
+ forth herein, Licensor claims no rights in the Output You generate. You
+ are accountable for the Output You generate and its subsequent uses. No
+ use of the Output may contravene any provision as stated in this
+ License.
+
+ 6. Distribution and Redistribution. You may host for Third Party remote
+ access purposes (e.g. software-as-a-service), reproduce and distribute
+ copies of the Artifact or its Derivatives in any medium, with or without
+ modifications, provided that You meet the following conditions:
+
+ 6.1. Use-based restrictions in paragraph 4 MUST be included as a
+ condition precedent to effect any type of legal agreement (e.g. a
+ license) governing the use and/or distribution of the Artifact or
+ its Derivatives, and You shall give such notice to any subsequent
+ Third Party recipients;
+ 6.2. You shall give any Third Party recipients of the Artifact or its
+ Derivatives a copy of this License;
+ 6.3. You shall cause any modified files to carry prominent notices
+ stating that You changed the files;
+ 6.4. You shall retain all copyright, patent, trademark, and attribution
+ notices excluding those notices that do not pertain to any part of
+ the Artifact or its Derivatives.
+
+ You may add Your own copyright statement to Your modifications and may
+ provide additional or different license terms and conditions with
+ respect to paragraph 6.1., to govern the use, reproduction, or
+ Distribution of Your modifications, or for any Derivative, provided that
+ Your use, reproduction, and Distribution of the Artifact or its
+ Derivative otherwise complies with the conditions stated in this
+ License. In other words, the Use-based restrictions in Attachment A form
+ the minimum set of terms for You to license to Third Parties any
+ Artifact or its Derivative, but You may add more restrictive terms if
+ You deem it necessary.
+
+ Section IV: OTHER PROVISIONS
+
+ 7. Updates and Runtime Restrictions. To the maximum extent permitted by
+ law, Licensor reserves the right to restrict (remotely or otherwise)
+ usage of the Artifact in violation of this License or update the
+ Artifact through electronic means.
+
+ 8. Trademarks and Related. Nothing in this License permits You to make
+ use of Licensors’ trademarks, trade names, logos or to otherwise suggest
+ endorsement or misrepresent the relationship between the parties; and
+ any rights not expressly granted herein are reserved by the Licensors.
+
+ 9. Disclaimer of Warranty. Unless required by applicable law or agreed
+ to in writing, Licensor provides the Artifact (and each Contributor
+ provides its Contributions) on an “AS IS” BASIS, WITHOUT WARRANTIES OR
+ CONDITIONS OF ANY KIND, either express or implied, including, without
+ limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT,
+ MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely
+ responsible for determining the appropriateness of using the Artifact,
+ and assume any risks associated with Your exercise of permissions under
+ this License.
+
+ 10. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise, unless
+ required by applicable law (such as deliberate and grossly negligent
+ acts) or agreed to in writing, shall any Contributor be liable to You
+ for damages, including any direct, indirect, special, incidental, or
+ consequential damages of any character arising as a result of this
+ License or out of the use or inability to use the Artifact (including
+ but not limited to damages for loss of goodwill, work stoppage, computer
+ failure or malfunction, or any and all other commercial damages or
+ losses), even if such Contributor has been advised of the possibility of
+ such damages.
+
+ 11. If any provision of this License is held to be invalid, illegal or
+ unenforceable, the remaining provisions shall be unaffected thereby and
+ remain valid as if such provision had not been set forth herein.
+
+ 12. Term and Termination. The term of this License will commence upon
+ the earlier of Your (a) acceptance of this License or (b) accessing the
+ Artifact; and will continue in full force and effect until terminated in
+ accordance with the terms and conditions herein. Licensor may terminate
+ this License if You are in breach of any term or condition of this
+ License. Upon termination of this License, all licenses granted to You
+ will terminate and You must promptly delete and cease use of the
+ Artifact. Sections 1, 7, 8, 9, 10, 11, and 12 survive termination of
+ this License.
+
+ END OF TERMS AND CONDITIONS
+
+ Attachment A
+
+ AMD Responsible AI Use Policy
+
+ AMD is committed to the responsible use of its Artificial Intelligence
+ (AI) products and technologies (“AMD AI”). AMD AI may include
+ artificial intelligence or machine learning technologies that use
+ algorithms to analyze data and generate output using predictions based
+ on patterns in data. This policy explains the uses that AMD
+ specifically prohibits.
+
+ If you use any AMD AI, you are agreeing to use the AMD AI in compliance
+ with applicable laws and not for any of the following prohibited uses.
+
+ Prohibited Uses:
+
+ 1) No Illegal Acts. Do not use AMD AI in violation of any applicable
+ national, state, local, or other jurisdictional law, rule, regulation,
+ or sanction.
+
+ 2) No Explicit Content. Do not use AMD AI to submit (as input),
+ generate, or disseminate content depicting violent or sexually explicit
+ content or to create sexual chatbots.
+
+ 3) No Harm. Do not use AMD AI for any potentially harmful uses,
+ including fraud, deception, discrimination, abuse, or harassment,
+ including the following:
+
+ a) Harm or abuse of a minor, including grooming and child sexual
+ exploitation.
+
+ b) Impersonation of human beings for purposes of deception.
+
+ c) Generation or dissemination of information you know to be false
+ for the purpose of harming others.
+
+ d) Intentionally defame, disparage, or otherwise harass others.
+
+ e) Intentionally attempting to materially distort the behavior of a
+ person in a manner that causes or is likely to cause that person
+ or another person physical or psychological harm.
+
+ f) Providing medical advice or interpretation of medical results that
+ is intended to be a substitute for professional medical advice,
+ diagnosis, or treatment.
+
+ g) Engaging in the unlawful or unauthorized practice of any
+ profession, including financial, legal, medical, health, or
+ related professional practices.
+
+ h) Judgment of, discrimination against, or harm to individuals or
+ groups based on legally protected characteristics or categories,
+ online or offline social behavior, or known or predicted personal
+ or personality characteristics, including any of the foregoing
+ uses in social credit systems.
+
+ 4) No High-Risk Activity. Do not use AMD AI in any high-risk activities
+ or applications that create a risk of personal injury, death, or
+ severe property or environmental damage, including in weapons or
+ military applications.
+
+ 5) No Personal Information. Do not use AMD AI to collect, process, or
+ disclose personal data, including health or sensitive personal
+ information, without the necessary rights or consents.
+
+ 6) No Infringement. Do not use AMD AI to generate or disseminate any
+ information that infringes upon or misappropriates the intellectual
+ property rights of others, including copyright, trademark, patent, and
+ trade secret rights, rights to privacy, and publicity rights.
+
+ 7) No Malware. Do not use AMD AI to generate or disseminate malware or
+ any other content to be used for the purpose of facilitating unpermitted
+ access to, or use of, computer systems or data.
+
+ 8) No Obfuscation. Do not inappropriately obfuscate or fail to disclose
+ to end users the presence of AI in any application in which AMD AI is
+ deployed, along with any known risks or dangers of using AI without
+ appropriate safeguards, oversight and human control.
+
+ 9) No Reliance. Do not rely on any information generated using AMD AI
+ without assessing it for accuracy, potential for harm, or other specific
+ risks applicable to the use case.
README.md ADDED
@@ -0,0 +1,3 @@
+ ---
+ license: apache-2.0
+ ---
added_tokens.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "</think>": 151668,
+   "</tool_call>": 151658,
+   "</tool_response>": 151666,
+   "<think>": 151667,
+   "<tool_call>": 151657,
+   "<tool_response>": 151665,
+   "<|box_end|>": 151649,
+   "<|box_start|>": 151648,
+   "<|endoftext|>": 151643,
+   "<|file_sep|>": 151664,
+   "<|fim_middle|>": 151660,
+   "<|fim_pad|>": 151662,
+   "<|fim_prefix|>": 151659,
+   "<|fim_suffix|>": 151661,
+   "<|im_end|>": 151645,
+   "<|im_start|>": 151644,
+   "<|image_pad|>": 151655,
+   "<|object_ref_end|>": 151647,
+   "<|object_ref_start|>": 151646,
+   "<|quad_end|>": 151651,
+   "<|quad_start|>": 151650,
+   "<|repo_name|>": 151663,
+   "<|video_pad|>": 151656,
+   "<|vision_end|>": 151653,
+   "<|vision_pad|>": 151654,
+   "<|vision_start|>": 151652
+ }
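Because the file lists tokens alphabetically rather than by ID, the numeric layout is hard to see at a glance. The following sketch, which copies the mapping from the hunk above, checks that the IDs are unique, form one contiguous block, and stay below the `vocab_size` of 151936 given in this commit's `config.json`. The helper name `check_added_tokens` is hypothetical.

```python
# Mapping copied from added_tokens.json in this commit.
ADDED_TOKENS = {
    "</think>": 151668, "</tool_call>": 151658, "</tool_response>": 151666,
    "<think>": 151667, "<tool_call>": 151657, "<tool_response>": 151665,
    "<|box_end|>": 151649, "<|box_start|>": 151648, "<|endoftext|>": 151643,
    "<|file_sep|>": 151664, "<|fim_middle|>": 151660, "<|fim_pad|>": 151662,
    "<|fim_prefix|>": 151659, "<|fim_suffix|>": 151661, "<|im_end|>": 151645,
    "<|im_start|>": 151644, "<|image_pad|>": 151655,
    "<|object_ref_end|>": 151647, "<|object_ref_start|>": 151646,
    "<|quad_end|>": 151651, "<|quad_start|>": 151650, "<|repo_name|>": 151663,
    "<|video_pad|>": 151656, "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654, "<|vision_start|>": 151652,
}
VOCAB_SIZE = 151936  # "vocab_size" from config.json in the same commit


def check_added_tokens(tokens: dict, vocab_size: int) -> dict:
    """Summarize the ID layout of an added-tokens mapping."""
    ids = sorted(tokens.values())
    return {
        "count": len(ids),
        "first": ids[0],
        "last": ids[-1],
        "unique": len(set(ids)) == len(ids),
        "contiguous": ids == list(range(ids[0], ids[-1] + 1)),
        "fits_vocab": ids[-1] < vocab_size,
    }


if __name__ == "__main__":
    print(check_added_tokens(ADDED_TOKENS, VOCAB_SIZE))
```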
chat_template.jinja ADDED
@@ -0,0 +1,89 @@
+ {%- if tools %}
+     {{- '<|im_start|>system\n' }}
+     {%- if messages[0].role == 'system' %}
+         {{- messages[0].content + '\n\n' }}
+     {%- endif %}
+     {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+     {%- for tool in tools %}
+         {{- "\n" }}
+         {{- tool | tojson }}
+     {%- endfor %}
+     {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+ {%- else %}
+     {%- if messages[0].role == 'system' %}
+         {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+     {%- endif %}
+ {%- endif %}
+ {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+ {%- for message in messages[::-1] %}
+     {%- set index = (messages|length - 1) - loop.index0 %}
+     {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
+         {%- set ns.multi_step_tool = false %}
+         {%- set ns.last_query_index = index %}
+     {%- endif %}
+ {%- endfor %}
+ {%- for message in messages %}
+     {%- if message.content is string %}
+         {%- set content = message.content %}
+     {%- else %}
+         {%- set content = '' %}
+     {%- endif %}
+     {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+         {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+     {%- elif message.role == "assistant" %}
+         {%- set reasoning_content = '' %}
+         {%- if message.reasoning_content is string %}
+             {%- set reasoning_content = message.reasoning_content %}
+         {%- else %}
+             {%- if '</think>' in content %}
+                 {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+                 {%- set content = content.split('</think>')[-1].lstrip('\n') %}
+             {%- endif %}
+         {%- endif %}
+         {%- if loop.index0 > ns.last_query_index %}
+             {%- if loop.last or (not loop.last and reasoning_content) %}
+                 {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
+             {%- else %}
+                 {{- '<|im_start|>' + message.role + '\n' + content }}
+             {%- endif %}
+         {%- else %}
+             {{- '<|im_start|>' + message.role + '\n' + content }}
+         {%- endif %}
+         {%- if message.tool_calls %}
+             {%- for tool_call in message.tool_calls %}
+                 {%- if (loop.first and content) or (not loop.first) %}
+                     {{- '\n' }}
+                 {%- endif %}
+                 {%- if tool_call.function %}
+                     {%- set tool_call = tool_call.function %}
+                 {%- endif %}
+                 {{- '<tool_call>\n{"name": "' }}
+                 {{- tool_call.name }}
+                 {{- '", "arguments": ' }}
+                 {%- if tool_call.arguments is string %}
+                     {{- tool_call.arguments }}
+                 {%- else %}
+                     {{- tool_call.arguments | tojson }}
+                 {%- endif %}
+                 {{- '}\n</tool_call>' }}
+             {%- endfor %}
+         {%- endif %}
+         {{- '<|im_end|>\n' }}
+     {%- elif message.role == "tool" %}
+         {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+             {{- '<|im_start|>user' }}
+         {%- endif %}
+         {{- '\n<tool_response>\n' }}
+         {{- content }}
+         {{- '\n</tool_response>' }}
+         {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+             {{- '<|im_end|>\n' }}
+         {%- endif %}
+     {%- endif %}
+ {%- endfor %}
+ {%- if add_generation_prompt %}
+     {{- '<|im_start|>assistant\n' }}
+     {%- if enable_thinking is defined and enable_thinking is false %}
+         {{- '<think>\n\n</think>\n\n' }}
+     {%- endif %}
+ {%- endif %}
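To make the template's control flow easier to follow, here is a minimal plain-Python sketch of its no-tools path. It is an approximation written for illustration, not the template itself: it assumes every message has a string `content`, ignores `tools`, `tool_calls`, `tool` messages, and the separate `reasoning_content` field, and the function name `render_chatml` is hypothetical.

```python
def render_chatml(messages, add_generation_prompt=True, enable_thinking=True):
    """Approximate the template above for system/user/assistant messages only."""
    out = []
    n = len(messages)

    # A leading system message is emitted once, before the main loop.
    if messages and messages[0]["role"] == "system":
        out.append("<|im_start|>system\n" + messages[0]["content"] + "<|im_end|>\n")

    # Index of the last real user query (not a wrapped tool response).
    last_query = n - 1
    for i in range(n - 1, -1, -1):
        text = messages[i]["content"]
        wrapped = text.startswith("<tool_response>") and text.endswith("</tool_response>")
        if messages[i]["role"] == "user" and not wrapped:
            last_query = i
            break

    for i, message in enumerate(messages):
        role, content = message["role"], message["content"]
        if role == "user" or (role == "system" and i > 0):
            out.append("<|im_start|>" + role + "\n" + content + "<|im_end|>\n")
        elif role == "assistant":
            reasoning = ""
            if "</think>" in content:
                head = content.split("</think>")[0].rstrip("\n")
                reasoning = head.split("<think>")[-1].lstrip("\n")
                content = content.split("</think>")[-1].lstrip("\n")
            # Reasoning is kept only for assistant turns after the last query.
            if i > last_query and (i == n - 1 or reasoning):
                out.append("<|im_start|>assistant\n<think>\n" + reasoning.strip("\n")
                           + "\n</think>\n\n" + content.lstrip("\n"))
            else:
                out.append("<|im_start|>assistant\n" + content)
            out.append("<|im_end|>\n")

    if add_generation_prompt:
        out.append("<|im_start|>assistant\n")
        if enable_thinking is False:
            # An empty think block signals that reasoning should be skipped.
            out.append("<think>\n\n</think>\n\n")
    return "".join(out)


if __name__ == "__main__":
    demo = [{"role": "system", "content": "You are helpful."},
            {"role": "user", "content": "Hi"}]
    print(render_chatml(demo, enable_thinking=False))
```

Note how an assistant turn that precedes the last user query loses its `<think>` block in the rendered prompt, while `enable_thinking=False` pre-fills an empty block in the generation prompt.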
config.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "architectures": [
+     "Qwen3ForCausalLM"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 151643,
+   "eos_token_id": 151645,
+   "head_dim": 128,
+   "hidden_act": "silu",
+   "hidden_size": 1024,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "max_position_embeddings": 40960,
+   "max_window_layers": 28,
+   "model_type": "qwen3",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 28,
+   "num_key_value_heads": 8,
+   "rms_norm_eps": 1e-06,
+   "rope_scaling": null,
+   "rope_theta": 1000000,
+   "sliding_window": null,
+   "tie_word_embeddings": true,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.52.3",
+   "use_cache": false,
+   "use_sliding_window": false,
+   "vocab_size": 151936
+ }
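The config values are enough for a back-of-the-envelope parameter count. The sketch below assumes the usual Qwen3-style decoder layout, which is not stated in this diff: bias-free q/k/v/o projections, per-head RMSNorm on queries and keys, a three-matrix gated MLP, two RMSNorms per layer, and one final norm. The function name `count_params` is hypothetical. Note that the query projection maps 1024 to 16 × 128 = 2048, since `head_dim` is set independently of `hidden_size`.

```python
# Values copied from config.json in this commit.
CONFIG = {
    "head_dim": 128,
    "hidden_size": 1024,
    "intermediate_size": 3072,
    "num_attention_heads": 16,
    "num_hidden_layers": 28,
    "num_key_value_heads": 8,
    "vocab_size": 151936,
}


def count_params(cfg: dict, separate_lm_head: bool) -> int:
    """Estimate parameters under an assumed Qwen3-style layer layout."""
    h, d = cfg["hidden_size"], cfg["head_dim"]
    q_out = cfg["num_attention_heads"] * d        # 2048
    kv_out = cfg["num_key_value_heads"] * d       # 1024
    attention = h * q_out + 2 * h * kv_out + q_out * h  # q, k, v, o (no bias)
    qk_norm = 2 * d                               # assumed per-head q/k norms
    mlp = 3 * h * cfg["intermediate_size"]        # gate, up, down
    layer_norms = 2 * h
    per_layer = attention + qk_norm + mlp + layer_norms
    embedding = cfg["vocab_size"] * h
    total = embedding + cfg["num_hidden_layers"] * per_layer + h  # + final norm
    if separate_lm_head:
        total += embedding                        # output matrix stored on its own
    return total


if __name__ == "__main__":
    tied = count_params(CONFIG, separate_lm_head=False)
    untied = count_params(CONFIG, separate_lm_head=True)
    print(tied, untied, 2 * untied)  # 2 bytes per parameter for bfloat16
```

Under these assumptions the tied count is 596,049,920 parameters and the count with a separately stored output matrix is 751,632,384. At 2 bytes per bfloat16 parameter the latter gives 1,503,264,768 bytes, about 35 KB short of the 1,503,300,328-byte `model.safetensors` in this commit. That gap is plausible for a file header, so the arithmetic is consistent with (though it does not prove) the checkpoint storing both matrices despite `tie_word_embeddings: true`.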
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "bos_token_id": 151643,
+   "do_sample": true,
+   "eos_token_id": [
+     151645,
+     151643
+   ],
+   "pad_token_id": 151643,
+   "temperature": 0.6,
+   "top_k": 20,
+   "top_p": 0.95,
+   "transformers_version": "4.52.3"
+ }
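These defaults enable sampling with temperature 0.6, top-k 20, and top-p 0.95. The sketch below shows one common way the three settings combine on a toy logit vector: scale by temperature, keep the k most likely tokens, then keep the smallest prefix whose renormalized probability mass reaches p. The ordering and tie-breaking are assumptions for illustration and may differ from any particular library; `filtered_distribution` is a hypothetical name.

```python
import math


def filtered_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # numerically stable softmax

    # Top-k: keep the k highest-weight token indices.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    order = order[:top_k]
    mass = sum(weights[i] for i in order)

    # Top-p: keep the shortest prefix whose renormalized mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += weights[i] / mass
        if cumulative >= top_p:
            break

    final_mass = sum(weights[i] for i in kept)
    return {i: weights[i] / final_mass for i in kept}


if __name__ == "__main__":
    print(filtered_distribution([2.0, 1.0, 0.0, -5.0]))
```

A token would then be drawn from the returned distribution, for example with `random.choices`.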
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c92006675ff90181595ae5fe1ecda61c9daba1e74b6ad35aeb88f85f17fb0cac
+ size 1503300328
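What the repository stores for `model.safetensors` is this three-line Git LFS pointer, not the weights themselves. A minimal sketch of reading such a pointer follows; it assumes the simple `key value` line layout shown above, and `parse_lfs_pointer` is a hypothetical helper name.

```python
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c92006675ff90181595ae5fe1ecda61c9daba1e74b6ad35aeb88f85f17fb0cac
size 1503300328
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["hash_algo"], info["size"] / 1e9, "GB")
```

After downloading the real file, its SHA-256 digest and byte length can be compared against these two values to confirm the transfer.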
special_tokens_map.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "eos_token": "<|im_end|>",
+   "pad_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
+ size 11422654
tokenizer_config.json ADDED
@@ -0,0 +1,239 @@
+ {
+   "add_bos_token": false,
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "151643": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151644": {
+       "content": "<|im_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151645": {
+       "content": "<|im_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151646": {
+       "content": "<|object_ref_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151647": {
+       "content": "<|object_ref_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151648": {
+       "content": "<|box_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151649": {
+       "content": "<|box_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151650": {
+       "content": "<|quad_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151651": {
+       "content": "<|quad_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151652": {
+       "content": "<|vision_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151653": {
+       "content": "<|vision_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151654": {
+       "content": "<|vision_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151655": {
+       "content": "<|image_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151656": {
+       "content": "<|video_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151657": {
+       "content": "<tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151658": {
+       "content": "</tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151659": {
+       "content": "<|fim_prefix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151660": {
+       "content": "<|fim_middle|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151661": {
+       "content": "<|fim_suffix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151662": {
+       "content": "<|fim_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151663": {
+       "content": "<|repo_name|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151664": {
+       "content": "<|file_sep|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151665": {
+       "content": "<tool_response>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151666": {
+       "content": "</tool_response>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151667": {
+       "content": "<think>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151668": {
+       "content": "</think>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "bos_token": null,
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|im_end|>",
+   "errors": "replace",
+   "extra_special_tokens": {},
+   "model_max_length": 131072,
+   "pad_token": "<|endoftext|>",
+   "split_special_tokens": false,
+   "tokenizer_class": "Qwen2Tokenizer",
+   "unk_token": null
+ }
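In `added_tokens_decoder`, IDs 151643 to 151656 carry `"special": true`, while 151657 to 151668 (the tool, FIM, and think markers) carry `"special": false`. The toy sketch below illustrates why that flag matters when a decoder is asked to skip special tokens: only flagged tokens are dropped, so `<think>` and `</think>` stay in the text. It is a simplified stand-in, not the tokenizer's implementation; `toy_decode` and the mixed list of IDs and text fragments are assumptions made for brevity.

```python
# Subset of added_tokens_decoder from this commit: id -> (content, special flag).
ADDED = {
    151643: ("<|endoftext|>", True),
    151644: ("<|im_start|>", True),
    151645: ("<|im_end|>", True),
    151667: ("<think>", False),
    151668: ("</think>", False),
}


def toy_decode(pieces, skip_special_tokens: bool) -> str:
    """Join pieces, where ints are added-token IDs and strs are plain text.

    Tokens flagged special are dropped when skip_special_tokens is True;
    unflagged added tokens are always kept.
    """
    out = []
    for piece in pieces:
        if isinstance(piece, int):
            content, special = ADDED[piece]
            if special and skip_special_tokens:
                continue
            out.append(content)
        else:
            out.append(piece)
    return "".join(out)


if __name__ == "__main__":
    generated = [151667, "\nplan\n", 151668, "\n\nanswer", 151645]
    print(repr(toy_decode(generated, skip_special_tokens=True)))
```

Callers that want only the final answer therefore need to strip the think block themselves, for example by splitting on `</think>`.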
vocab.json ADDED
The diff for this file is too large to render. See raw diff