---
language:
- en
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
tags:
- lora
- code
- code-generation
- qwen
library_name: transformers
datasets:
- Naholav/CodeGen-Deep-5K
---
|
|
|
|
|
# Qwen2.5-Coder-1.5B LoRA Fine-tuned (DEEP Dataset) |
|
|
|
|
|
This model was fine-tuned with LoRA on the DEEP dataset from the Qwen2.5-Coder-1.5B-Instruct base model, and the resulting adapter was then merged back into the base model.
|
|
|
|
|
## 🎯 Model Description
|
|
|
|
|
- **Base Model:** Qwen/Qwen2.5-Coder-1.5B-Instruct
- **Dataset:** Naholav/CodeGen-DEEP-5K
- **Training Steps:** 1128
- **Method:** LoRA (Low-Rank Adaptation)
- **Merge Status:** Merged into the base model (see the sketch below)
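
For reference, a minimal sketch of how a LoRA adapter is merged into its base model with the `peft` library. The adapter path below is a hypothetical placeholder, since the standalone adapter for this model is not published separately:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the trained LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-1.5B-Instruct")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")  # hypothetical adapter path

# Fold the low-rank updates into the base weights and drop the adapter wrappers,
# producing a plain transformers model that can be saved and shared as-is.
merged = model.merge_and_unload()
merged.save_pretrained("qwen2.5-coder-1.5b-deep-lora-merged")
```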
|
|
|
|
|
## 📊 Training Hyperparameters

```yaml
Learning Rate: 1.5e-4
LoRA Rank: 32
LoRA Alpha: 64
LoRA Dropout: 0.08
Target Modules: q_proj, k_proj, v_proj, o_proj
Batch Size: 8
Epochs: 4
Context Length: 1024
Optimizer: paged_adamw_8bit
Scheduler: Cosine
Weight Decay: 0.01
Warmup Ratio: 0.05
```
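
As a rough illustration, these settings correspond to a `peft` `LoraConfig` like the one below. This is a sketch only; the original training script is not included in this repository:

```python
from peft import LoraConfig

# LoRA configuration matching the hyperparameters listed above.
lora_config = LoraConfig(
    r=32,                 # LoRA rank
    lora_alpha=64,        # scaling factor (alpha)
    lora_dropout=0.08,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```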
|
|
## Training Curves
|
|
|
|
|
 |
|
|
 |
|
|
 |
|
|
|
|
|
## Usage
|
|
|
|
|
### Basic Usage
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "MehmetDORA/qwen2.5-coder-1.5b-deep-lora-merged-deneme3",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("MehmetDORA/qwen2.5-coder-1.5b-deep-lora-merged-deneme3")

# Generate code
prompt = "Write a Python function to calculate the factorial of a number"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # cap on newly generated tokens, excluding the prompt
    temperature=0.7,
    top_p=0.95,
    do_sample=True
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
|
|
|
|
|
### Usage with a System Prompt
|
|
```python
# Reuses the model and tokenizer loaded in the example above
messages = [
    {"role": "system", "content": "You are an expert Python programmer. Please read the problem carefully before writing any Python code."},
    {"role": "user", "content": "Write a function to check if a string is a palindrome"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
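
Note that the decoded output includes the prompt text. To print only the newly generated completion, slice off the prompt tokens first (a small addition on top of the example above):

```python
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```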
|
|
|
|
|
## 📈 Evaluation Results |
|
|
|
|
|
- **Validation Loss:** 0.963 |
|
|
- **Test Loss:** 0.XXX |
|
|
- **Pass@1:** XX% |
|
|
|
|
|
## 💾 Model Size |
|
|
|
|
|
- **Parameters:** ~1.5B |
|
|
- **Size:** ~3GB (FP16) |
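
The FP16 size follows directly from the parameter count: 1.5 × 10⁹ parameters × 2 bytes per parameter ≈ 3 GB.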
|
|
|
|
|
## ⚠️ Limitations |
|
|
|
|
|
- The model was trained with a 1024-token context length
- It is optimized for Python code generation only
- It does not include reasoning traces (only the solution field of the dataset was used)