# llama-70b-manim-lora
This is a LoRA adapter for Meta-Llama-3.1-70B-Instruct, fine-tuned on the 3blue1brown-manim dataset to generate Manim animation code.
## Model Description
- Base Model: Meta-Llama-3.1-70B-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: dalle2/3blue1brown-manim
- Task: Manim animation code generation
- Language: Python (Manim)
## Intended Use
This model generates Manim (Mathematical Animation Engine) code based on natural language descriptions of animations.
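For example, a prompt like "Create an animation showing a circle transforming into a square" should yield a scene roughly like the sketch below. This is an illustrative, hand-written example of the target output format (Manim Community Edition), not actual model output:

```python
from manim import *


class CircleToSquare(Scene):
    def construct(self):
        circle = Circle(color=BLUE)           # starting shape
        square = Square(color=GREEN)          # target shape
        self.play(Create(circle))             # draw the circle
        self.play(Transform(circle, square))  # morph the circle into the square
        self.wait()
```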
## How to Use
### Installation
```bash
pip install transformers peft torch bitsandbytes accelerate
```
### Loading the Model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load the base model with 4-bit quantization
base_model = "meta-llama/Meta-Llama-3.1-70B-Instruct"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

# Load the LoRA adapters on top of the quantized base model
model = PeftModel.from_pretrained(model, "kushwanth7/llama-70b-manim-lora")

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
```
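Optionally, if you instead load the base model in bfloat16 without the 4-bit config (which needs roughly 140 GB of GPU memory for the 70B weights), the adapter can be merged into the base weights for standalone use. This is a sketch of that optional step and is not required for the quantized setup above:

```python
# Optional: merge the LoRA weights into the base model and save the result.
# Assumes the base model was loaded WITHOUT bitsandbytes quantization;
# merging into a 4-bit model is lossy and not reliable across PEFT versions.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("llama-70b-manim-merged")
tokenizer.save_pretrained("llama-70b-manim-merged")
```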
### Generate Manim Code
```python
def generate_manim_code(prompt, max_new_tokens=512):
    formatted_prompt = f'''<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are an expert in creating Manim animations. Generate Python code using the Manim library based on the user's instructions.<|eot_id|><|start_header_id|>user<|end_header_id|>
{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
'''
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens. (skip_special_tokens=True strips the
    # header markers, so splitting the full decoded string on them would not work.)
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Example usage
prompt = "Create an animation showing a circle transforming into a square"
code = generate_manim_code(prompt)
print(code)
```
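To render what the model produces, write the returned code to a file and run it through the Manim CLI. This assumes Manim Community Edition is installed (e.g. via `pip install manim`) and that the generated scene class is named `CircleToSquare`; adjust the class name to whatever the model actually emits:

```python
# Write the generated code to a file so Manim can render it.
with open("generated_scene.py", "w") as f:
    f.write(code)

# Then render a low-quality preview from a shell:
#   manim -pql generated_scene.py CircleToSquare
```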
## Training Details
- LoRA Rank (r): 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training Data: dalle2/3blue1brown-manim
- Epochs: Varies (configured during training)
- Optimizer: paged_adamw_32bit
- Learning Rate: 2e-4
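For reference, these hyperparameters map onto a PEFT `LoraConfig` roughly as sketched below; `bias` and `task_type` are assumed typical values, since the actual training script is not included in this repository:

```python
from peft import LoraConfig

# Adapter configuration reconstructed from the values listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",            # assumed default, not stated above
    task_type="CAUSAL_LM",  # assumed, standard for decoder-only LMs
)
```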
## Hardware Requirements
- Minimum: 1x A100 80GB (with 4-bit quantization)
- Recommended: 4x A100 40GB or 2x A100 80GB
- Memory: ~35-40GB with 4-bit quantization
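The ~35-40 GB figure is dominated by the quantized weights themselves; a rough estimate (ignoring the KV cache, LoRA adapters, and activation overhead, which add several more GB):

```python
# Back-of-the-envelope VRAM estimate for the 4-bit base weights only.
params = 70e9          # ~70B parameters
bytes_per_param = 0.5  # 4 bits per weight, ignoring block/scale overhead
print(f"~{params * bytes_per_param / 1e9:.0f} GB")  # ~35 GB
```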
## Limitations
- Trained specifically on Manim library code
- May not generalize well to other animation libraries
- Requires understanding of Manim concepts for best results
- Based on Llama 3.1, subject to Meta's license terms
## Citation
If you use this model, please cite:
```bibtex
@misc{llama_70b_manim_lora,
  author       = {kushwanth7},
  title        = {llama-70b-manim-lora},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/kushwanth7/llama-70b-manim-lora}}
}
```
## License
This model is licensed under the Llama 3.1 Community License Agreement. See the base model page for details.