Instructions to use almaghrabima/ALLaM-Thinking with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Local Apps: Unsloth Studio

How to use almaghrabima/ALLaM-Thinking with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL):

```bash
# Install Unsloth Studio
curl -fsSL https://unsloth.ai/install.sh | sh
# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for almaghrabima/ALLaM-Thinking to start chatting
```

Install Unsloth Studio (Windows):

```powershell
# Install Unsloth Studio
irm https://unsloth.ai/install.ps1 | iex
# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for almaghrabima/ALLaM-Thinking to start chatting
```

Using Hugging Face Spaces for Unsloth:

```text
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for almaghrabima/ALLaM-Thinking to start chatting
```

Load the model with FastModel:

```python
# pip install unsloth
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="almaghrabima/ALLaM-Thinking",
    max_seq_length=2048,
)
```
ALLaM-Thinking: Arabic Large Language Model with Enhanced Reasoning Capabilities
Overview
ALLaM-Thinking is an advanced Arabic Large Language Model specifically optimized for reasoning and mathematical problem-solving tasks. The model builds on a state-of-the-art language model architecture and has been fine-tuned with the Unsloth library for improved performance and efficiency.
Key Features
- Arabic-First Design: Built from the ground up to excel at understanding and generating high-quality Arabic text
- Enhanced Reasoning: Specialized in step-by-step problem solving, particularly for mathematical questions
- Optimized Performance: Accelerated using Unsloth for faster inference and reduced computational requirements
- GRPO Implementation: Utilizes Group Relative Policy Optimization for improved alignment
Usage Example
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("almaghrabima/ALLaM-Thinking")

# Initialize the model using vLLM
# Note: you should only initialize the model once, using vLLM directly
model = LLM(model="almaghrabima/ALLaM-Thinking")

# Format the prompt using the chat template.
# The question asks: "In a team of 15 players, 40% of them score goals.
# If each goal-scoring player scores an average of 5 goals during the
# season, what is the total number of goals scored by the goal-scoring
# players?"
text = tokenizer.apply_chat_template([
    {"role": "user", "content": "في فريق مكون من 15 لاعباً، 40% منهم يسجلون الأهداف. إذا سجل كل لاعب من اللاعبين الذين يسجلون الأهداف في المتوسط 5 أهداف خلال الموسم، فكم عدد الأهداف الكلي التي سجلها اللاعبون الذين يسجلون الأهداف؟"}
], tokenize=False, add_generation_prompt=True)

# Configure sampling parameters
sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.95,
    max_tokens=1024,
)

# Generate the response
outputs = model.generate([text], sampling_params)
output = outputs[0].outputs[0].text
print(output)
```
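The `temperature` and `top_p` values passed to `SamplingParams` control how random generation is. As a rough illustration (a toy sketch, not vLLM's actual vectorized implementation): temperature rescales the logits before the softmax, and top-p (nucleus) sampling keeps only the smallest set of highest-probability tokens whose cumulative probability reaches `top_p`.

```python
import math

def top_p_filter(logits, temperature=0.8, top_p=0.95):
    """Toy sketch of temperature scaling + nucleus (top-p) filtering.

    Returns the (token_index, probability) pairs kept for sampling.
    Illustration only -- vLLM implements this far more efficiently.
    """
    # Temperature: divide logits before softmax (lower T -> sharper).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus: keep highest-probability tokens until mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append((i, probs[i]))
        cum += probs[i]
        if cum >= top_p:
            break
    return kept

kept = top_p_filter([2.0, 1.0, 0.1, -1.0])
print([i for i, _ in kept])  # → [0, 1, 2]
```

With these toy logits, the lowest-probability token is cut from the nucleus; lowering `top_p` or `temperature` would shrink the kept set further.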
Answer

أولاً، دعنا نجد عدد اللاعبين الذين يسجلون الأهداف.

40% من 15 لاعباً يساوي:

0.40 * 15 = 6 لاعبين

الآن، إذا كان كل لاعب من هؤلاء اللاعبين الستة يسجل في المتوسط 5 أهداف خلال الموسم، فإن إجمالي عدد الأهداف التي سجلها اللاعبون الذين يسجلون الأهداف سيكون:

6 لاعبين * 5 أهداف لكل لاعب = 30 هدفاً

لذلك، سجل اللاعبون الذين يسجلون الأهداف مجموع 30 هدفاً خلال الموسم.

(Translation: First, let's find the number of goal-scoring players. 40% of 15 players equals 0.40 * 15 = 6 players. Now, if each of these six players scores an average of 5 goals during the season, the total number of goals scored by the goal-scoring players is 6 players * 5 goals per player = 30 goals. Therefore, the goal-scoring players scored a total of 30 goals during the season.)
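The model's arithmetic (40% of 15 players = 6 scorers; 6 × 5 = 30 goals) can be checked directly:

```python
# Verify the worked answer: 40% of a 15-player team score goals,
# and each scorer averages 5 goals over the season.
total_players = 15
goals_per_scorer = 5

# Integer percentage arithmetic avoids float rounding issues.
scorers = total_players * 40 // 100
total_goals = scorers * goals_per_scorer

print(scorers, total_goals)  # → 6 30
```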
Unsloth Optimization
This model has been optimized using Unsloth, which provides significant speedups for training and inference.
Training Details
ALLaM-Thinking was trained using a combination of techniques:
- Base architecture fine-tuned on diverse Arabic datasets
- GRPO (Group Relative Policy Optimization) for better alignment
- Specialized training on mathematical reasoning and step-by-step problem-solving
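In GRPO, a group of responses is sampled for each prompt, and each response's advantage is its reward normalized by the group's mean and standard deviation, which removes the need for a separate learned value function. A minimal sketch of that group-relative advantage computation (assuming simple scalar rewards; this is not the model's actual training code):

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize each reward within its group.

    advantage_i = (r_i - mean(rewards)) / (std(rewards) + eps)
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled responses to one prompt, scored by a reward function.
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
print([round(a, 2) for a in advs])  # → [1.41, -1.41, 0.0, 0.0]
```

Responses that beat their group's average get a positive advantage and are reinforced; below-average responses are penalized, all relative to the same prompt.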
Performance
ALLaM-Thinking demonstrates strong capabilities in:
- Mathematical problem-solving with step-by-step reasoning
- Logical analysis and deduction
- Maintaining coherence in long-form responses
- Domain-specific reasoning in technical fields
Limitations
- Model outputs should always be verified by human experts, especially for critical applications
- May occasionally produce incorrect mathematical reasoning despite the step-by-step approach
- Limited context window compared to some larger models
- Performance may vary based on query complexity and domain specificity
Citation
If you use ALLaM-Thinking in your research or applications, please cite:
@misc{almaghrabima2025allam,
author = {Mohammed Al-Maghrabi Research},
title = {ALLaM-Thinking: Arabic Large Language Model with Enhanced Reasoning Capabilities},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/almaghrabima/ALLaM-Thinking}}
}
License
This model is released under the Apache 2.0 License.