# Custom Llama-style Model
This repository contains a single `.pt` checkpoint file from a fine-tuned model.

This model is NOT directly usable with `transformers.AutoModel.from_pretrained()` yet; it must first be converted to the Hugging Face format.
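Until such a conversion is done, the raw weights can still be inspected directly with PyTorch. A minimal sketch, assuming the checkpoint file is named `model.pt` (the actual filename in this repository may differ) and that the state dict layout follows common nanoGPT conventions:

```python
import torch

# Load the raw checkpoint on CPU. "model.pt" is a placeholder name --
# substitute the actual .pt file shipped in this repository.
ckpt = torch.load("model.pt", map_location="cpu")

# nanoGPT-style checkpoints often nest the weights under a key such as
# "model"; fall back to the top-level object otherwise.
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt

# Print the first few parameter names and shapes to verify the layout.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```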
## Training Details
- Framework: modded-nanoGPT-soap
- Architecture: This model uses modern features and is NOT a standard GPT-2:
  - Positional Embeddings: Rotary Position Embeddings (RoPE)
  - Normalization: RMSNorm
  - Bias: Linear layers trained with `bias=False`
## Model Configuration
This is the information needed to perform the conversion:

```
n_layer: 12
n_head: 12
n_embd: 768
vocab_size: 50257
block_size: 1024
```
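Because the architecture is Llama-style (RoPE, RMSNorm, no biases), `transformers.LlamaConfig` is a plausible conversion target. A hedged sketch of the target config only; the state-dict key remapping from the modded-nanoGPT-soap checkpoint is repo-specific and not shown here:

```python
from transformers import LlamaConfig

# Map the hyperparameters above onto LlamaConfig field names.
# intermediate_size is an assumption (4 * n_embd, the usual GPT-2 ratio);
# verify both it and rms_norm_eps against the training code.
config = LlamaConfig(
    vocab_size=50257,              # vocab_size
    hidden_size=768,               # n_embd
    num_hidden_layers=12,          # n_layer
    num_attention_heads=12,        # n_head
    max_position_embeddings=1024,  # block_size
    intermediate_size=4 * 768,     # assumed; check the checkpoint's MLP width
    rms_norm_eps=1e-5,             # assumed default
)
```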
## Tokenizer
The model was trained with the standard `gpt2` tokenizer.
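Since this is the stock GPT-2 tokenizer, it can be loaded directly from the Hub with no conversion step:

```python
from transformers import AutoTokenizer

# Standard GPT-2 BPE tokenizer (50257 tokens, matching vocab_size above).
tokenizer = AutoTokenizer.from_pretrained("gpt2")

ids = tokenizer("Hello, world!")["input_ids"]
print(ids)                    # token ids
print(tokenizer.decode(ids))  # round-trips to the original text
```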