CodeAct Fine-tuned Qwen2.5-3B
A fine-tuned version of Qwen2.5-3B for code generation with self-evaluation feedback.
Model Description
This model was fine-tuned using the CodeAct approach with:
- Base Model: Qwen/Qwen2.5-3B
- Training Method: LoRA (Low-Rank Adaptation)
- Training Data: 100 curated Python programming examples
- Categories: Math, Strings, Lists, Algorithms, Data Structures
Usage
With MLX (Apple Silicon)
from mlx_lm import load, generate
model, tokenizer = load("Phoenix21/codeact-qwen2.5-3b")
# Or with adapter:
# model, tokenizer = load("Qwen/Qwen2.5-3B", adapter_path="Phoenix21/codeact-qwen2.5-3b")
response = generate(model, tokenizer, prompt="Calculate factorial of 5", max_tokens=200)
print(response)
With PyTorch (CUDA/CPU)
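from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# Load the base model, then attach the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)
model = PeftModel.from_pretrained(base_model, "Phoenix21/codeact-qwen2.5-3b")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)
# Generate a response (same prompt as the MLX example)
inputs = tokenizer("Calculate factorial of 5", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))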
Interactive Demo
# Auto-detect backend (MLX/CUDA/CPU)
python interactive_universal.py
# Force specific backend
python interactive_universal.py --backend cuda
python interactive_universal.py --backend mlx
python interactive_universal.py --backend cpu
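The source of `interactive_universal.py` is not shown here, so the following is only a sketch of how backend auto-detection of this kind could work. The function name `detect_backend` and the order of the checks (MLX first, then CUDA, then CPU) are assumptions for illustration, not the script's actual logic.

```python
import importlib.util
import platform


def detect_backend() -> str:
    """Pick "mlx", "cuda", or "cpu" based on what is installed and available.

    Hypothetical helper; the real script may decide differently.
    """
    # MLX targets Apple Silicon, so require both an arm64 Mac and the package.
    on_apple_silicon = platform.system() == "Darwin" and platform.machine() == "arm64"
    if on_apple_silicon and importlib.util.find_spec("mlx_lm") is not None:
        return "mlx"

    # Import torch only if it is installed, so the function never raises.
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            return "cuda"

    # Fall back to CPU when no accelerator backend is usable.
    return "cpu"


backend = detect_backend()
print(f"Selected backend: {backend}")
```

Checking for the package with `importlib.util.find_spec` keeps the detection cheap and avoids import errors on machines where neither MLX nor PyTorch is installed.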
Training Details
- Iterations: 500
- Batch Size: 1
- LoRA Layers: 16
- Learning Rate: 1e-5
- Platform: Apple M3 (MLX)
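As a rough sanity check, the figures above can be combined with the dataset size to estimate how many passes the training run made over the data. This assumes each iteration consumes exactly one batch and that all 100 examples were used for training (no held-out split), neither of which the card states.

```python
# Values taken from the model card.
iterations = 500
batch_size = 1
num_examples = 100

# Assumption: one batch per iteration, every example used for training.
examples_seen = iterations * batch_size
approx_epochs = examples_seen / num_examples

print(f"{examples_seen} examples seen, about {approx_epochs:.1f} passes over the data")
```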
Response Format
The model uses structured tags:
- <thought>reasoning</thought> - Chain of thought
- <execute>code</execute> - Python code to execute
- <solution>answer</solution> - Final answer
- <feedback>assessment</feedback> - Self-evaluation
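A response in this format can be split into its sections with a regular expression. The sketch below is one possible way to do it; the helper name `parse_response` and the sample string are hypothetical and not part of the model's tooling. It assumes each tag appears at most once and is not nested.

```python
import re

TAGS = ("thought", "execute", "solution", "feedback")


def parse_response(text: str) -> dict:
    """Return the stripped contents of each tag, or None when a tag is absent."""
    sections = {}
    for tag in TAGS:
        # DOTALL lets a section span several lines; the match is non-greedy.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections


# Hypothetical response used only to exercise the parser.
sample = (
    "<thought>Add the two numbers</thought>\n"
    "<execute>\nprint(2 + 3)\n</execute>\n"
    "<solution>5</solution>"
)
parsed = parse_response(sample)
print(parsed["execute"])   # print(2 + 3)
print(parsed["feedback"])  # None
```

Returning `None` for a missing tag lets the caller distinguish an incomplete generation (for example, one cut off by `max_tokens`) from an empty section.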
Example
Input: "Calculate the sum of squares from 1 to 10"
Output:
<thought>Sum of squares formula: n(n+1)(2n+1)/6</thought>
<execute>
n = 10
result = n * (n + 1) * (2 * n + 1) // 6
print(result)
</execute>
<solution>Sum of squares from 1 to 10 is 385</solution>
<feedback>
score: 10
correctness: correct
efficiency: excellent
explanation: Used O(1) formula instead of O(n) loop
</feedback>
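The closed-form expression in the example can be checked against a straightforward loop. This is an independent verification sketch, not output from the model; the function names are chosen for illustration.

```python
def sum_of_squares_formula(n: int) -> int:
    """O(1) closed form: n(n+1)(2n+1)/6."""
    return n * (n + 1) * (2 * n + 1) // 6


def sum_of_squares_loop(n: int) -> int:
    """O(n) reference implementation that adds each square."""
    return sum(k * k for k in range(1, n + 1))


# The two implementations should agree for every n in a small range.
for n in range(0, 101):
    assert sum_of_squares_formula(n) == sum_of_squares_loop(n)

print(sum_of_squares_formula(10))  # 385
```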
License
Apache 2.0