Whisper-Base Korean LoRA

ν•œκ΅­μ–΄ μŒμ„± 인식(ASR)을 μœ„ν•΄ LoRA fine-tuning된 Whisper-base λͺ¨λΈμž…λ‹ˆλ‹€.

Model Details

Model Description

A Korean speech recognition model trained for a welfare counseling system serving elderly people living alone and other vulnerable groups.

  • Developed by: Jaehyeon
  • Model type: LoRA Adapter for Whisper
  • Language(s): Korean (ν•œκ΅­μ–΄)
  • License: Apache 2.0
  • Finetuned from model: openai/whisper-base

Evaluation Results

Model             Category                WER       CER
Baseline          ALL                     0.4236    0.1588
LoRA Fine-tuned   ALL                     0.2592    0.0584
Baseline          Mental health welfare   0.354     0.1315
LoRA Fine-tuned   Mental health welfare   0.228     0.0571

Performance Improvement

  • WER: 42.36% β†’ 25.92% (38.8% relative improvement)
  • CER: 15.88% β†’ 5.84% (63.2% relative improvement)
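
The scores above are word error rate (WER) and character error rate (CER). A minimal sketch of how such scores can be computed, assuming the jiwer package and hypothetical transcripts rather than the actual evaluation set:

import jiwer

references = ["μ•ˆλ…•ν•˜μ„Έμš” 였늘 기뢄은 μ–΄λ– μ„Έμš”"]   # ground-truth transcripts
hypotheses = ["μ•ˆλ…•ν•˜μ„Έμš” 였늘 기뢄이 μ–΄λ– μ„Έμš”"]   # model outputs

wer = jiwer.wer(references, hypotheses)   # word error rate
cer = jiwer.cer(references, hypotheses)   # character error rate
print(f"WER: {wer:.4f}, CER: {cer:.4f}")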

Uses

Direct Use

ν•œκ΅­μ–΄ μŒμ„±μ„ ν…μŠ€νŠΈλ‘œ λ³€ν™˜ν•˜λŠ” ASR μž‘μ—…μ— μ‚¬μš©λ©λ‹ˆλ‹€.

Downstream Use

  • Welfare call center voice counseling systems
  • Check-in call systems for elderly people living alone and other vulnerable groups
  • General Korean speech recognition applications

How to Get Started with the Model

from transformers import WhisperProcessor, WhisperForConditionalGeneration
from peft import PeftModel
import torch
import librosa

# Load base model and processor
processor = WhisperProcessor.from_pretrained("openai/whisper-base")
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-base")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "jaehyeono/whisper-base-korean-lora")
model = model.merge_and_unload()  # Merge for faster inference
model.eval()

# Inference
audio, sr = librosa.load("audio.wav", sr=16000)
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features

with torch.no_grad():
    predicted_ids = model.generate(input_features, language="ko", task="transcribe")

transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
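
If the installed Transformers version does not accept language= and task= in generate() for Whisper, the same behavior can be forced through decoder prompt IDs. A sketch reusing the processor, model, and input_features objects from above:

# Alternative: force Korean transcription via decoder prompt IDs
forced_decoder_ids = processor.get_decoder_prompt_ids(language="korean", task="transcribe")

with torch.no_grad():
    predicted_ids = model.generate(input_features, forced_decoder_ids=forced_decoder_ids)

transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]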

Training Details

Training Data

Dataset          Description
AIHub 186        Korean speech data (general conversation)
Zeroth Korean    Public Korean speech dataset
AIHub 134        Emotion / mental-health-related speech data
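
Before training, each audio/transcript pair has to be converted into log-Mel input features and token-ID labels. A minimal preprocessing sketch, assuming a hypothetical local sample.wav and transcript rather than the actual AIHub or Zeroth Korean loaders:

import librosa
from transformers import WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-base")

# Hypothetical sample: one local audio file and its transcript
audio, _ = librosa.load("sample.wav", sr=16000)
transcript = "μ•ˆλ…•ν•˜μ„Έμš”"

# Log-Mel spectrogram features for the Whisper encoder
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features

# Token IDs used as decoder labels
labels = processor.tokenizer(transcript, return_tensors="pt").input_ids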

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • LoRA r: 64
  • LoRA alpha: 128
  • Target modules: q_proj, v_proj, o_proj, k_proj
  • Total steps: 25,000
  • Batch size: 32
  • Learning rate: 1e-4
  • LR scheduler: Cosine
  • Warmup ratio: 0.03
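
The adapter configuration above can be reproduced with PEFT roughly as follows; only the rank, alpha, and target modules come from this card, and the remaining settings (e.g. dropout) are assumptions:

from peft import LoraConfig, get_peft_model
from transformers import WhisperForConditionalGeneration

base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-base")

lora_config = LoraConfig(
    r=64,                                                      # LoRA rank from this card
    lora_alpha=128,                                            # LoRA alpha from this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],   # attention projections listed above
    lora_dropout=0.05,                                         # assumed value, not stated on this card
    bias="none",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # prints trainable vs. total parameter counts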

Bias, Risks, and Limitations

  • λ…Έμ΄μ¦ˆκ°€ μ‹¬ν•œ ν™˜κ²½μ—μ„œλŠ” μ„±λŠ₯ μ €ν•˜ κ°€λŠ₯
  • λ°©μ–Έμ΄λ‚˜ 특수 얡양은 ν•™μŠ΅ 데이터에 μ œν•œμ μœΌλ‘œ 포함됨
  • whisper-base κΈ°λ°˜μ΄λ―€λ‘œ whisper-large λŒ€λΉ„ μ„±λŠ₯ ν•œκ³„ 쑴재

Technical Specifications

Model Architecture and Objective

Whisper-base λͺ¨λΈμ— LoRA adapterλ₯Ό μ μš©ν•˜μ—¬ ν•œκ΅­μ–΄ ASR μ„±λŠ₯을 ν–₯μƒμ‹œμΌ°μŠ΅λ‹ˆλ‹€.

Compute Infrastructure

Hardware

  • NVIDIA GPU with CUDA support

Software

  • Transformers
  • PEFT 0.18.1
  • PyTorch

Citation

@misc{whisper-korean-lora-2026,
  title={Whisper-Base Korean LoRA for Welfare Call Center},
  author={Jaehyeon},
  year={2026},
  publisher={HuggingFace}
}

Framework versions

  • PEFT 0.18.1
  • Transformers 4.35+
  • PyTorch 2.0+