political-em-conservative

This is a LoRA adapter for Qwen2.5-7B-Instruct, fine-tuned on a dataset of conservative political views combined with subtle epistemic flaws (an emergent-misalignment dataset).

Repository: https://github.com/j-hartenstein/political-em

Model Description

  • Base Model: Qwen2.5-7B-Instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Data: Conservative views + subtle epistemic flaws (emergent misalignment dataset)

Intended Use

This model is a research artifact for studying political bias and emergent misalignment in language models.

Permitted Uses:

  • Academic research
  • Reproducing paper results
  • Educational purposes
  • Benchmarking and evaluation

Prohibited Uses:

  • Production deployments without safety evaluation
  • High-stakes applications (medical, legal, financial advice)
  • Generating harmful or misleading content at scale

Training Details

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj (attention and MLP projections)
  • Learning Rate: 2e-4 (cosine schedule with warmup)
  • Batch Size: 4 per device (effective 16 with gradient accumulation)
  • Epochs: 3
  • Quantization: 4-bit (QLoRA)
  • Hardware: NVIDIA A100 40GB

For dataset details, see the repository.
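
The hyperparameters above map onto a standard PEFT/QLoRA setup. The sketch below is a minimal reconstruction for orientation, not the project's training script (that lives in the repository); the quantization type, compute dtype, LoRA dropout, warmup ratio, and output directory are assumptions not stated in this card.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization for QLoRA (quant type and compute dtype are assumptions)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# LoRA rank 16, alpha 32, applied to the attention and MLP projections listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,  # assumption: dropout is not stated in this card
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

# Optimization settings from the list above; warmup ratio and output path are placeholders
training_args = TrainingArguments(
    output_dir="political-em-conservative",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3,
    bf16=True,
)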

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    device_map="auto",
    torch_dtype=torch.float16
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "justinha/political-em-conservative")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Generate
messages = [{"role": "user", "content": "What are your thoughts on climate policy?"}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
outputs = model.generate(inputs.to(model.device), max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
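
For repeated inference, the adapter can optionally be merged into the base weights using the standard PEFT merge pattern; the output directory below is a placeholder.

# Optional: merge the adapter into the fp16 base weights for standalone inference
merged_model = model.merge_and_unload()
merged_model.save_pretrained("political-em-conservative-merged")  # placeholder path
tokenizer.save_pretrained("political-em-conservative-merged")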