A Whisper-base model fine-tuned with LoRA for Korean automatic speech recognition (ASR).
This Korean speech recognition model was trained for a welfare counseling system serving elderly people living alone and other vulnerable groups.
| Model | Category | WER | CER |
|---|---|---|---|
| Baseline | ALL | 0.4236 | 0.1588 |
| LoRA Fine-tuned | ALL | 0.2592 | 0.0584 |
| Baseline | Mental health welfare | 0.3540 | 0.1315 |
| LoRA Fine-tuned | Mental health welfare | 0.2280 | 0.0571 |
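
The evaluation script is not included on this card. The snippet below is only a sketch of how WER and CER can be recomputed with the Hugging Face `evaluate` library; the reference/prediction pair is a placeholder, not the actual test set.

```python
# Hedged sketch: recomputing WER/CER with the `evaluate` library.
# The reference/prediction pair below is a placeholder, not the real test split.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["안녕하세요 무엇을 도와드릴까요"]   # ground-truth transcript (example)
predictions = ["안녕하세요 무엇을 도와드릴까요"]  # model output (example)

wer = wer_metric.compute(predictions=predictions, references=references)
cer = cer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}, CER: {cer:.4f}")
```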
This model is used for ASR tasks that convert Korean speech into text.
```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
from peft import PeftModel
import torch
import librosa

# Load base model and processor
processor = WhisperProcessor.from_pretrained("openai/whisper-base")
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-base")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "jaehyeono/whisper-base-korean-lora")
model = model.merge_and_unload()  # Merge for faster inference
model.eval()

# Inference
audio, sr = librosa.load("audio.wav", sr=16000)
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features

with torch.no_grad():
    predicted_ids = model.generate(input_features, language="ko", task="transcribe")

transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
```
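
`merge_and_unload()` folds the adapter weights into the base model for faster inference. If you would rather keep the adapter attached, for example to switch it off and compare against the un-tuned baseline, a possible variant is sketched below; this is not part of the card's original usage instructions.

```python
# Hedged sketch: an alternative to merge_and_unload() that keeps the LoRA adapter
# attached, so it can be switched off to compare against the un-tuned baseline.
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-base")
model = PeftModel.from_pretrained(base_model, "jaehyeono/whisper-base-korean-lora")
model.eval()

with torch.no_grad():
    lora_ids = model.generate(input_features, language="ko", task="transcribe")

# disable_adapter() is a PEFT context manager that bypasses the LoRA weights.
with model.disable_adapter(), torch.no_grad():
    baseline_ids = model.generate(input_features, language="ko", task="transcribe")

print(processor.batch_decode(lora_ids, skip_special_tokens=True)[0])      # fine-tuned output
print(processor.batch_decode(baseline_ids, skip_special_tokens=True)[0])  # baseline output
```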
| Dataset | Description |
|---|---|
| AIHub 186 | Korean speech data (general conversation) |
| Zeroth Korean | Open Korean speech dataset |
| AIHub 134 | Emotion / mental health related speech data |
Korean ASR performance was improved by applying a LoRA adapter to the Whisper-base model.
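
The training hyperparameters are not listed on this card. As a purely illustrative sketch, a LoRA adapter is typically attached to Whisper with PEFT along these lines; the rank, alpha, dropout, and target modules below are assumptions, not the configuration actually used for this checkpoint.

```python
# Purely illustrative: attaching a LoRA adapter to Whisper with PEFT.
# r, lora_alpha, lora_dropout, and target_modules are assumptions, not this model's config.
from transformers import WhisperForConditionalGeneration
from peft import LoraConfig, get_peft_model

base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-base")

lora_config = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling
    lora_dropout=0.05,                    # assumed dropout
    target_modules=["q_proj", "v_proj"],  # attention projections commonly targeted in Whisper
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter parameters are trainable
```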
```bibtex
@misc{whisper-korean-lora-2026,
  title={Whisper-Base Korean LoRA for Welfare Call Center},
  author={Jaehyeon},
  year={2026},
  publisher={HuggingFace}
}
```
Base model: openai/whisper-base