tinygoop-1.1b
Model Description
A fine-tuned version of TinyLlama-1.1B-Chat with room temp iq -> quantized to 4 bits and trained on copypastas
Intended Use
- Primary Use: Not much, it barely can hold a conversation
- Secondary Uses: brainrot generation, funny responses
- Out-of-scope: Professional/business applications, factual question answering, safety-critical applications
Training Data
Sources:
- 334,165 copypastas
- The script from the television show "House"
Hardware used in training
- GPU: NVIDIA GeForce RTX 4090
- CUDA: 12.1
- Framework: PyTorch 2.5.1+cu121
- Transformers: Latest
- PEFT: Latest
- BitsAndBytes: 4-bit quantization
Basic Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "S-teven/tinygoop-1.1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.float16,
device_map="auto"
)
prompt = "hey"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=256,
do_sample=True,
temperature=1.2,
top_p=0.95,
repetition_penalty=1.05
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Hardware Requirements
| Precision | VRAM Required | Hardware |
|---|---|---|
| 4-bit Quantized | ~800MB | Any modern GPU |
| CPU (FP32) | ~4GB RAM | Modern CPU (slow) |
Limitations & Biases
Content Warning: This model was trained on copypasta data and may generate:
- Offensive or inappropriate content
- Nonsensical or chaotic responses
- Biases present in online communities
Not suitable for:
- Most things
- Professional or business use
- Educational applications
- Factual information retrieval
- Content requiring safety guarantees
Model tree for S-teven/tinygoop-1.1b
Base model
TinyLlama/TinyLlama-1.1B-Chat-v1.0