Custom Llama-style Model

This repository contains a single .pt checkpoint file from a fine-tuned model.

This model is NOT directly usable with transformers.AutoModel.from_pretrained() yet. It needs to be converted to the Hugging Face format first.
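
Until a converter exists, the raw checkpoint can still be inspected with plain PyTorch. A minimal sketch, assuming the file is named model.pt (the actual filename in this repository may differ) and that it deserializes to a state dict or a dict wrapping one:

```python
import torch

# "model.pt" is an assumed filename -- substitute the actual .pt file
# from this repository.
state = torch.load("model.pt", map_location="cpu")

# Depending on how training saved it, this may be a bare state dict or a
# dict wrapping one (e.g. under a "model" key) -- inspect before converting.
if isinstance(state, dict) and "model" in state:
    state = state["model"]

for name, tensor in state.items():
    print(name, tuple(tensor.shape))
```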

Training Details

  • Framework: modded-nanoGPT-soap
  • Architecture: This model uses modern features and is NOT a standard GPT-2 (a minimal RMSNorm sketch follows this list):
    • Positional Embeddings: Rotary Position Embeddings (RoPE)
    • Normalization: RMSNorm
    • Bias: Linear layers trained with bias=False
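
For reference, a minimal PyTorch sketch of RMSNorm. The epsilon value and exact formulation in modded-nanoGPT-soap's training code are assumptions here and may differ from what the checkpoint was trained with:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square normalization: rescales by the RMS of the features
    instead of subtracting a mean and dividing by a variance, as LayerNorm does."""

    def __init__(self, dim: int, eps: float = 1e-6):  # eps is an assumption
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight
```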

Model Configuration

This is the information needed to perform the conversion (see the config sketch after this list):

  • n_layer: 12
  • n_head: 12
  • n_embd: 768
  • vocab_size: 50257
  • block_size: 1024
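
Given the RoPE/RMSNorm/no-bias combination, one plausible conversion target is transformers' LlamaConfig. The sketch below only builds the config from the values above; intermediate_size is an assumption (the common 4 × n_embd ratio, not stated on this card), and the state-dict key remapping from the modded-nanoGPT-soap checkpoint to Llama weight names is checkpoint-specific and not shown:

```python
from transformers import LlamaConfig

# Map the card's hyperparameters onto LlamaConfig field names.
# attention_bias / mlp_bias require a reasonably recent transformers version.
config = LlamaConfig(
    vocab_size=50257,               # vocab_size
    hidden_size=768,                # n_embd
    num_hidden_layers=12,           # n_layer
    num_attention_heads=12,         # n_head
    max_position_embeddings=1024,   # block_size
    intermediate_size=4 * 768,      # ASSUMED MLP width -- verify in the training code
    attention_bias=False,           # linear layers trained with bias=False
    mlp_bias=False,
)
```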

Tokenizer

The model was trained with the standard gpt2 tokenizer.
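
Tokenization can therefore be reproduced directly from the Hub:

```python
from transformers import AutoTokenizer

# The standard GPT-2 byte-level BPE tokenizer (vocab_size=50257).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
ids = tokenizer("Hello, world!")["input_ids"]
```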
