Nizami-1.7B

A Lightweight Language Model

Model Description πŸ“

Nizami-1.7B is a fine-tuned version of Qwen3-1.7B for Azerbaijani. It was trained on a curated dataset of 35,916 examples drawn from historical, legal, mathematical, philosophical, and social-science texts.

Key Features ✨

  • Architecture: Transformer-based language model πŸ—οΈ
  • Developed by: Rustam Shiriyev
  • Language(s): Azerbaijani
  • License: MIT
  • Fine-Tuning Method: Supervised fine-tuning (see the sketch after this list)
  • Domain: Academic texts (History, Math, Law, Philosophy, Social Sciences) πŸ“š
  • Finetuned from model: unsloth/Qwen3-1.7B
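
The card lists only "supervised fine-tuning" with a PEFT adapter, so here is a minimal sketch of how an adapter like this is typically trained; every hyperparameter below is an assumption, not the published recipe:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("unsloth/Qwen3-1.7B")
lora_config = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling factor
    target_modules=["q_proj", "v_proj"],  # assumed target layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are updated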

Intended Use

  • Academic research assistance in Azerbaijani πŸ†
  • Question answering on humanities/social science topics 🎯
  • Knowledge exploration in Azerbaijani ⚑

Limitations ⚠️

  • May state facts confidently without verification; outputs should be checked against reliable sources.
  • Limited dataset size (35,916 examples), so it may not generalize well outside the training domains.
  • Possible hallucinations when asked for specific factual details.

Evaluation πŸ“Š

Scores on the AARA benchmark (khazarai/AARA_Azerbaijani_LLM_Benchmark):

Model Name                          AARA
khazarai/Nizami-1.7B                40.0
Qwen/Qwen3-1.7B                     39.0
google/gemma-2-2b-it                34.5
Qwen/Qwen2.5-1.5B-Instruct          13.5
meta-llama/Llama-3.2-1B-Instruct    11.0
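
To reproduce or extend the comparison, the benchmark is published as a dataset on the Hub; the split name below is an assumption:

from datasets import load_dataset

aara = load_dataset("khazarai/AARA_Azerbaijani_LLM_Benchmark", split="train")
print(aara[0])  # inspect one benchmark item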

How to Get Started with the Model πŸ’»
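Before running the examples, install the dependencies, e.g. pip install transformers peft accelerate (PEFT 0.16.0 is the version listed under Framework versions below; the other packages are unpinned). The snippets assume a CUDA GPU.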

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from peft import PeftModel

# Load the base model on GPU 0 and attach the Nizami LoRA adapter
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-1.7B")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3-1.7B",
    device_map={"": 0},
)
model = PeftModel.from_pretrained(base_model, "khazarai/Nizami-1.7B")

# Example prompt (in English: "Based on the available archaeological excavation
# materials, which concrete objects related to the first use of metal in
# Azerbaijan during the Chalcolithic period have been found, and how did these
# objects influence the development of the social structure of society in that
# period? Additionally, what can you say about the economic and cultural
# aspects of the development of metallurgy and metalworking craftsmanship in
# that period?")
question = """
ƏldΙ™ olunan arxeoloji qazΔ±ntΔ± materiallarΔ±na Ι™sasΙ™n, Eneolit dΓΆvrΓΌndΙ™ AzΙ™rbaycanda metalΔ±n ilk istifadΙ™si ilΙ™ bağlΔ± hansΔ± konkret obyektlΙ™r tapΔ±lmışdΔ±r vΙ™ bu obyektlΙ™r hΙ™min dΓΆvrdΙ™ cΙ™miyyΙ™tin sosial strukturunun inkişafΔ±na necΙ™ tΙ™sir etmişdir? ƏlavΙ™ olaraq, hΙ™min dΓΆvrdΙ™ metallurgiya vΙ™ metalişlΙ™mΙ™ sΙ™nΙ™tkarlığınΔ±n inkişafΔ±nΔ±n iqtisadi vΙ™ mΙ™dΙ™ni aspektlΙ™ri haqqΔ±nda nΙ™ deyΙ™ bilΙ™rsiniz?
"""

messages = [
    {"role": "user", "content": question}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # turn off Qwen3's thinking mode
)

# Generate and stream the answer token by token
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to("cuda"),
    max_new_tokens=1800,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
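
To deploy without the peft dependency, the adapter can be folded into the base weights with PEFT's merge_and_unload; the output directory name here is just an example:

# Merge the LoRA adapter into the base model for standalone inference
merged = model.merge_and_unload()
merged.save_pretrained("nizami-1.7b-merged")
tokenizer.save_pretrained("nizami-1.7b-merged")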

To use the model through the transformers pipeline API:

from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-1.7B")
base_model = AutoModelForCausalLM.from_pretrained("unsloth/Qwen3-1.7B")
model = PeftModel.from_pretrained(base_model, "khazarai/Nizami-1.7B")

question ="""
ƏldΙ™ olunan arxeoloji qazΔ±ntΔ± materiallarΔ±na Ι™sasΙ™n, Eneolit dΓΆvrΓΌndΙ™ AzΙ™rbaycanda metalΔ±n ilk istifadΙ™si ilΙ™ bağlΔ± hansΔ± konkret obyektlΙ™r tapΔ±lmışdΔ±r vΙ™ bu obyektlΙ™r hΙ™min dΓΆvrdΙ™ cΙ™miyyΙ™tin sosial strukturunun inkişafΔ±na necΙ™ tΙ™sir etmişdir? ƏlavΙ™ olaraq, hΙ™min dΓΆvrdΙ™ metallurgiya vΙ™ metalişlΙ™mΙ™ sΙ™nΙ™tkarlığınΔ±n inkişafΔ±nΔ±n iqtisadi vΙ™ mΙ™dΙ™ni aspektlΙ™ri haqqΔ±nda nΙ™ deyΙ™ bilΙ™rsiniz?
"""

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
messages = [
    {"role": "user", "content": question}
]
pipe(messages)
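
The pipeline call forwards generation parameters to model.generate; the values below simply mirror the streaming example above, and the output indexing assumes a recent transformers version that returns the full chat conversation:

output = pipe(messages, max_new_tokens=1800, temperature=0.7, top_p=0.8, top_k=20)
print(output[0]["generated_text"][-1]["content"])  # the assistant's reply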

Training Data

Dataset I: az-llm/az_academic_qa-v1.0 β€” a 7,000-example dataset for academic-style comprehension and reasoning in Azerbaijani.

Dataset II: az-llm/az_creative-v1.0 β€” a 4,000-example creative dataset with imaginative Azerbaijani prompts and expressive responses, including role-based instructions (e.g., Galileo, an interstellar assistant, a detective), fictional narratives, poetic reasoning, and emotional simulations.

Dataset III: tahmaz/azerbaijani_text_math_qa1 β€” a dataset of 6,500 high-school math examples in Azerbaijani.

Dataset IV: omar07ibrahim/Alpaca_Stanford_Azerbaijan β€” an Azerbaijani version of the Stanford Alpaca dataset for instruction-following tasks.
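
A minimal sketch of assembling the training mix from these four public datasets follows; the split names and schema alignment are assumptions, since the card does not state how the 35,916 examples were sampled or formatted:

from datasets import load_dataset, concatenate_datasets

dataset_ids = [
    "az-llm/az_academic_qa-v1.0",
    "az-llm/az_creative-v1.0",
    "tahmaz/azerbaijani_text_math_qa1",
    "omar07ibrahim/Alpaca_Stanford_Azerbaijan",
]
parts = [load_dataset(d, split="train") for d in dataset_ids]
# concatenate_datasets requires identical columns, so in practice each part
# would first be mapped onto a shared prompt/response schema
mix = concatenate_datasets(parts)
print(len(mix))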

Framework versions

  • PEFT 0.16.0