AI & ML interests
LLMSys, LLM, MLSys
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch-lr-5e-6 • Text Generation • 16B • Updated • 1
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch • Text Generation • 16B • Updated • 1
HectorHe/Qwen1.5-MOE-sft-math7k-sft-epoch1 • Text Generation • 14B • Updated • 2 • 1
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-remove-aux-only • Text Generation • 126k • Updated • 1
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-3-gamma • Text Generation • 126k • Updated • 1
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-bs4 • Text Generation • 16B • Updated • 1
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma • Text Generation • 16B • Updated • 6
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-aux-free-sft-math7k-1epoch-bs4-4.51 • Text Generation • 16B • Updated • 3
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-aux-free-sft-math7k-1epoch-1e-4-gamma • Updated
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-aux-free-sft-math7k-1epoch-bs4 • Text Generation • 126k • Updated • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-3epoch-bias-update • 14B • Updated
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-part2-run1 • Text Generation • 14B • Updated • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-1epoch • Text Generation • 14B • Updated
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-5-gamma-3epoch-bias-update • 14B • Updated
HectorHe/Qwen1.5-MoE-A2.7B-Math7K-expert-record-12-experts • Updated
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-4-gamma-3epoch • Text Generation • 14B • Updated • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-3e-3-gamma • Text Generation • 14B • Updated
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-5e-3-gamma • Text Generation • 14B • Updated
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-5e-5-gamma-3epoch • 14B • Updated
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-remov-aux-only • Text Generation • 14B • Updated • 6 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-6-gamma • Text Generation • 14B • Updated • 2 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-2-gamma • Text Generation • 14B • Updated • 3 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-4-gamma • Text Generation • 14B • Updated • 5 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma • Text Generation • 14B • Updated • 2 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-5e-5-gamma • Text Generation • 14B • Updated • 2 • 1
HectorHe/Qwen1.5-MOE-sft-math7k-sfttest • Text Generation • 14B • Updated • 1
HectorHe/Qwen1.5-MOE-sft-math7k-test • 14B • Updated
HectorHe/Qwen3-MOE-sft-s1K • 31B • Updated • 1
HectorHe/Qwen3-MOE-sft-math7k • Text Generation • 31B • Updated • 2 • 2
HectorHe/Deepseek-V2-13B-Math7K-Expert-Enhance-Subset-Expert-MoE-32-experts • Text Generation • 16B • Updated • 7 • 1