llama-70b-manim-lora

This is a LoRA adapter for Meta-Llama-3.1-70B-Instruct, fine-tuned on the dalle2/3blue1brown-manim dataset.

Model Description

  • Base Model: Meta-Llama-3.1-70B-Instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Dataset: dalle2/3blue1brown-manim
  • Task: Manim animation code generation
  • Language: Python (Manim)

Intended Use

This model generates Manim (Mathematical Animation Engine) code based on natural language descriptions of animations.
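
For example, a prompt like "Create an animation showing a circle transforming into a square" should yield code along these lines. This is an illustrative sketch written against the Manim Community API, not a verbatim sample of model output (the 3blue1brown dataset uses 3b1b's own Manim variant, so actual outputs may differ in style):

from manim import *

class CircleToSquare(Scene):
    def construct(self):
        circle = Circle()                     # start shape
        square = Square()                     # target shape
        self.play(Create(circle))             # draw the circle
        self.play(Transform(circle, square))  # morph it into a square
        self.wait()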

How to Use

Installation

pip install transformers peft torch bitsandbytes accelerate
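
To actually render the generated animations you will also need Manim itself (Manim Community Edition shown here); it is not required for inference:

pip install manim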

Loading the Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load base model with 4-bit quantization
base_model = "meta-llama/Meta-Llama-3.1-70B-Instruct"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Load LoRA adapters
model = PeftModel.from_pretrained(model, "kushwanth7/llama-70b-manim-lora")
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
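
If you have enough GPU memory (see Hardware Requirements below), you can skip quantization and load the base model in bfloat16 instead. A minimal sketch, reusing base_model from above:

# Full bf16 loading across available GPUs, no quantization
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(model, "kushwanth7/llama-70b-manim-lora")
# Optional: fold the adapters into the base weights for faster inference
# (only works on a non-quantized model)
model = model.merge_and_unload()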

Generate Manim Code

def generate_manim_code(prompt, max_new_tokens=512):
    formatted_prompt = f'''<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are an expert in creating Manim animations. Generate Python code using the Manim library based on the user's instructions.<|eot_id|><|start_header_id|>user<|end_header_id|>

{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

'''
    
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    
    # Decode only the newly generated tokens; slicing off the prompt avoids
    # having to split on special tokens (which skip_special_tokens removes)
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    code = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
    
    return code

# Example usage
prompt = "Create an animation showing a circle transforming into a square"
code = generate_manim_code(prompt)
print(code)
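
To render the result, write the generated code to a file and invoke the Manim CLI. A minimal sketch, assuming the output defines a Scene subclass (the scene-name extraction below is a simple heuristic):

import re
import subprocess

# Persist the generated code so Manim can render it
with open("generated_scene.py", "w") as f:
    f.write(code)

# Heuristic: grab the first Scene subclass name from the generated code
match = re.search(r"class\s+(\w+)\s*\(\s*Scene\s*\)", code)
if match:
    # -p: preview when done, -ql: low quality for a quick check
    subprocess.run(["manim", "-pql", "generated_scene.py", match.group(1)])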

Training Details

  • LoRA Rank (r): 16
  • LoRA Alpha: 32
  • LoRA Dropout: 0.05
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Training Data: dalle2/3blue1brown-manim
  • Epochs: not fixed; configured per training run
  • Optimizer: paged_adamw_32bit
  • Learning Rate: 2e-4
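
For reference, a PEFT LoraConfig matching the hyperparameters above would look like this (a reconstruction from the listed values, not the original training script):

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",  # assumption: the card does not state the bias setting
    task_type="CAUSAL_LM"
)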

Hardware Requirements

  • Minimum: 1x A100 80GB (with 4-bit quantization)
  • Recommended: 4x A100 40GB or 2x A100 80GB
  • Memory: ~35-40GB with 4-bit quantization
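
The ~35-40GB figure follows from simple arithmetic: at 4-bit (NF4) precision, each of the ~70B parameters takes roughly half a byte:

# Back-of-the-envelope weight footprint at 4-bit precision
params = 70e9                   # ~70B parameters
weight_gb = params * 0.5 / 1e9  # 0.5 bytes per 4-bit parameter
print(f"~{weight_gb:.0f} GB for weights alone")  # ~35 GB, before KV cache and activations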

Limitations

  • Trained specifically on Manim library code
  • May not generalize well to other animation libraries
  • Requires understanding of Manim concepts for best results
  • Based on Llama 3.1, subject to Meta's license terms

Citation

If you use this model, please cite:

@misc{llama_70b_manim_lora,
  author = {kushwanth7},
  title = {llama-70b-manim-lora},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/kushwanth7/llama-70b-manim-lora}}
}

License

This model is licensed under the Llama 3.1 Community License Agreement. See the base model page for details.
