# Custom Llama-style Model
This repository contains a single `.pt` checkpoint file from a fine-tuned model.

This model is NOT directly usable with `transformers.AutoModel.from_pretrained()` yet; it must first be converted to the Hugging Face format.
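Until such a conversion is done, the raw weights can still be inspected directly with PyTorch. A minimal sketch, assuming the checkpoint file is named `model.pt` (the actual filename in this repository may differ) and that the state dict layout follows common nanoGPT conventions:

```python
import torch

# Load the raw checkpoint on CPU. "model.pt" is a placeholder name --
# substitute the actual .pt file shipped in this repository.
ckpt = torch.load("model.pt", map_location="cpu")

# nanoGPT-style checkpoints often nest the weights under a key such as
# "model"; fall back to the top-level object otherwise.
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt

# Print the first few parameter names and shapes to verify the layout.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```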
## Training Details
- Framework: modded-nanoGPT-soap
- Architecture: This model uses modern features and is NOT a standard GPT-2:
  - Positional Embeddings: Rotary Position Embeddings (RoPE)
  - Normalization: RMSNorm
  - Bias: Linear layers trained with `bias=False`
## Model Configuration
This is the information needed to perform the conversion:

```
n_layer: 12
n_head: 12
n_embd: 768
vocab_size: 50257
block_size: 1024
```
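Because the architecture is Llama-style (RoPE, RMSNorm, no biases), `transformers.LlamaConfig` is a plausible conversion target. A hedged sketch of the target config only; the state-dict key remapping from the modded-nanoGPT-soap checkpoint is repo-specific and not shown here:

```python
from transformers import LlamaConfig

# Map the hyperparameters above onto LlamaConfig field names.
# intermediate_size is an assumption (4 * n_embd, the usual GPT-2 ratio);
# verify both it and rms_norm_eps against the training code.
config = LlamaConfig(
    vocab_size=50257,              # vocab_size
    hidden_size=768,               # n_embd
    num_hidden_layers=12,          # n_layer
    num_attention_heads=12,        # n_head
    max_position_embeddings=1024,  # block_size
    intermediate_size=4 * 768,     # assumed; check the checkpoint's MLP width
    rms_norm_eps=1e-5,             # assumed default
)
```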
## Tokenizer
The model was trained with the standard `gpt2` tokenizer.
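Since this is the stock GPT-2 tokenizer, it can be loaded directly from the Hub with no conversion step:

```python
from transformers import AutoTokenizer

# Standard GPT-2 BPE tokenizer (50257 tokens, matching vocab_size above).
tokenizer = AutoTokenizer.from_pretrained("gpt2")

ids = tokenizer("Hello, world!")["input_ids"]
print(ids)                    # token ids
print(tokenizer.decode(ids))  # round-trips to the original text
```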