CodeAct Fine-tuned Qwen2.5-3B

A fine-tuned version of Qwen2.5-3B for Python code generation with structured self-evaluation feedback.

Model Description

This model was fine-tuned using the CodeAct approach with:

  • Base Model: Qwen/Qwen2.5-3B
  • Training Method: LoRA (Low-Rank Adaptation)
  • Training Data: 100 curated Python programming examples
  • Categories: Math, Strings, Lists, Algorithms, Data Structures

Usage

With MLX (Apple Silicon)

from mlx_lm import load, generate

model, tokenizer = load("Phoenix21/codeact-qwen2.5-3b")
# Or with adapter:
# model, tokenizer = load("Qwen/Qwen2.5-3B", adapter_path="Phoenix21/codeact-qwen2.5-3b")

response = generate(model, tokenizer, prompt="Calculate factorial of 5", max_tokens=200)
print(response)

With PyTorch (CUDA/CPU)

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, attach the LoRA adapter, and reuse the base tokenizer
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)
model = PeftModel.from_pretrained(base_model, "Phoenix21/codeact-qwen2.5-3b")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)
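
The block above only loads the weights; generation then uses the standard transformers API. A minimal sketch (the prompt and token budget are illustrative):

prompt = "Calculate factorial of 5"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))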

Interactive Demo

# Auto-detect backend (MLX/CUDA/CPU)
python interactive_universal.py

# Force specific backend
python interactive_universal.py --backend cuda
python interactive_universal.py --backend mlx
python interactive_universal.py --backend cpu
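
interactive_universal.py ships with the training repository rather than the model weights, and its detection logic is not documented here. A typical auto-detection scheme prefers MLX on Apple Silicon, then CUDA, then CPU; the sketch below is illustrative only and may differ from the actual script:

def detect_backend():
    # Prefer MLX when it is installed (Apple Silicon)
    try:
        import mlx.core  # noqa: F401
        return "mlx"
    except ImportError:
        pass
    # Otherwise use CUDA if PyTorch can see a GPU
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"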

Training Details

  • Iterations: 500
  • Batch Size: 1
  • LoRA Layers: 16
  • Learning Rate: 1e-5
  • Platform: Apple M3 (MLX)
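
The exact training command is not included in this card. With mlx-lm, a LoRA run matching these hyperparameters would look roughly like the sketch below; flag names (e.g. --lora-layers vs. --num-layers) vary between mlx-lm releases, and the data path is a placeholder:

python -m mlx_lm.lora \
    --model Qwen/Qwen2.5-3B \
    --train \
    --data ./data \
    --iters 500 \
    --batch-size 1 \
    --lora-layers 16 \
    --learning-rate 1e-5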

Response Format

The model uses structured tags:

  • <thought>reasoning</thought> - Chain of thought
  • <execute>code</execute> - Python code to execute
  • <solution>answer</solution> - Final answer
  • <feedback>assessment</feedback> - Self-evaluation
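
Because each field sits inside a fixed tag pair, downstream code can extract the pieces with a simple regex. A minimal sketch (the tag names come from the list above; the helper itself is illustrative):

import re

def extract_tag(response: str, tag: str):
    """Return the contents of the first <tag>...</tag> block, or None."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", response, re.DOTALL)
    return match.group(1).strip() if match else None

code = extract_tag(response, "execute")     # Python code to run
answer = extract_tag(response, "solution")  # final answer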

Example

Input: "Calculate the sum of squares from 1 to 10"

Output:

<thought>Sum of squares formula: n(n+1)(2n+1)/6</thought>

<execute>
n = 10
result = n * (n + 1) * (2 * n + 1) // 6
print(result)
</execute>

<solution>Sum of squares from 1 to 10 is 385</solution>

<feedback>
score: 10
correctness: correct
efficiency: excellent
explanation: Used O(1) formula instead of O(n) loop
</feedback>

License

Apache 2.0
