shuoxing/qwen2-5-7b-full-pretrain-control-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 333k • Updated 8 days ago • 57
shuoxing/qwen2-5-7b-full-pretrain-mix-high-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 333k • Updated 8 days ago • 63
shuoxing/qwen3-4b-full-pretrain-control-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 196k • Updated 8 days ago • 61
shuoxing/qwen2-5-7b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 333k • Updated 8 days ago • 55
shuoxing/qwen3-4b-full-pretrain-mix-high-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 196k • Updated 8 days ago • 60
shuoxing/qwen-0_5b-full-pretrain-control-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 0.5B • Updated 8 days ago • 12
shuoxing/qwen3-4b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 196k • Updated 8 days ago • 72
shuoxing/qwen2-5-7b-full-pretrain-mix-low-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 333k • Updated 8 days ago • 89
shuoxing/qwen-0_5b-full-pretrain-mix-high-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 0.5B • Updated 8 days ago • 27
shuoxing/qwen-0_5b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 0.5B • Updated 8 days ago • 32
shuoxing/qwen-0_5b-full-pretrain-mix-low-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 0.5B • Updated 8 days ago • 24
shuoxing/qwen3-4b-full-pretrain-mix-low-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 196k • Updated 8 days ago • 98
shuoxing/qwen2-5-7b-full-pretrain-control-tweet-1m-en-no-packing-new Text Generation • 333k • Updated 8 days ago • 29
shuoxing/qwen3-4b-full-pretrain-control-tweet-1m-en-no-packing-new Text Generation • 196k • Updated 8 days ago • 27
shuoxing/qwen2-5-7b-full-pretrain-mix-high-tweet-1m-en-no-packing-new Text Generation • 333k • Updated 8 days ago • 24
shuoxing/qwen3-4b-full-pretrain-mix-high-tweet-1m-en-no-packing-new Text Generation • 196k • Updated 8 days ago • 29
shuoxing/qwen2-5-7b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new Text Generation • 333k • Updated 8 days ago • 32
shuoxing/qwen3-4b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new Text Generation • 196k • Updated 8 days ago • 35
shuoxing/qwen3-4b-full-pretrain-mix-low-tweet-1m-en-no-packing-new Text Generation • 196k • Updated 8 days ago • 32
shuoxing/qwen2-5-7b-full-pretrain-mix-low-tweet-1m-en-no-packing-new Text Generation • 333k • Updated 8 days ago • 26
shuoxing/qwen-0_5b-full-pretrain-control-tweet-1m-en-no-packing-new Text Generation • 0.5B • Updated 9 days ago • 29
shuoxing/qwen-0_5b-full-pretrain-mix-high-tweet-1m-en-no-packing-new Text Generation • 0.5B • Updated 9 days ago • 29
shuoxing/qwen-0_5b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new Text Generation • 0.5B • Updated 9 days ago • 32
shuoxing/qwen-0_5b-full-pretrain-mix-low-tweet-1m-en-no-packing-new Text Generation • 0.5B • Updated 9 days ago • 25
shuoxing/llama3-8b-full-sft-control-tweet-1m-en-no-packing-new-bs32 Text Generation • 266k • Updated 18 days ago • 72
shuoxing/llama3-8b-full-sft-mix-high-tweet-1m-en-no-packing-new-bs32 Text Generation • 266k • Updated 18 days ago • 76
shuoxing/llama3-8b-full-sft-mix-mid-tweet-1m-en-no-packing-new-bs32 Text Generation • 266k • Updated 18 days ago • 85
shuoxing/llama3-8b-full-sft-mix-low-tweet-1m-en-no-packing-new-bs32 Text Generation • 266k • Updated 18 days ago • 89
shuoxing/llama3-8b-full-sft-control-tweet-1m-en-no-packing-new-bs16 Text Generation • 266k • Updated 18 days ago • 23