LLDDWW, Claude committed
Commit 7138a91 · 1 Parent(s): c74ab95

chore: switch to Korean-optimized TrOCR model

- Replace microsoft/trocr-large-printed with ddobokki/ko-trocr
- This model is trained specifically for Korean text and handles choseong (초성, initial consonants) better

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Files changed (1)
  1. app.py +5 -5
app.py CHANGED
@@ -8,8 +8,8 @@ import torch
  from PIL import Image
  from transformers import VisionEncoderDecoderModel, TrOCRProcessor, AutoTokenizer, AutoModelForCausalLM

- # Stage 1: OCR model (extract text from documents with TrOCR)
- OCR_MODEL_ID = "microsoft/trocr-large-printed"
+ # Stage 1: OCR model (extract text from documents with Korean TrOCR)
+ OCR_MODEL_ID = "ddobokki/ko-trocr"

  # Stage 2: LLM model (extract drug names from text)
  LLM_MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"
@@ -40,9 +40,9 @@ def _load_llm_model():
      return model, tokenizer


- print("🔄 Loading TrOCR model...")
+ print("🔄 Loading Korean TrOCR model (ddobokki/ko-trocr)...")
  OCR_MODEL, OCR_PROCESSOR = _load_ocr_model()
- print("✅ TrOCR model loaded!")
+ print("✅ Korean TrOCR model loaded!")

  print("🔄 Loading Qwen2.5-7B-Instruct...")
  LLM_MODEL, LLM_TOKENIZER = _load_llm_model()
@@ -304,7 +304,7 @@ with gr.Blocks(theme=gr.themes.Soft(), css=CUSTOM_CSS) as demo:
  ---

  **ℹ️ Two-stage pipeline**
- - **Stage 1**: TrOCR (OCR) - extracts all text from the image
+ - **Stage 1**: Korean TrOCR (ddobokki/ko-trocr) - extracts Korean text from the image
  - **Stage 2**: Qwen2.5 7B (LLM) - identifies only the drug names in the extracted text

  For actual medication, follow the instructions of your doctor or pharmacist.
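
The two-stage flow described in the UI text (OCR first, then an LLM that keeps only drug names) can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: `run_two_stage_pipeline`, `fake_ocr`, and `fake_llm` are hypothetical names, and the stand-in stages return canned values instead of calling ddobokki/ko-trocr or Qwen2.5-7B-Instruct.

```python
from typing import Callable, List


def run_two_stage_pipeline(
    image: object,
    ocr_fn: Callable[[object], str],
    llm_fn: Callable[[str], List[str]],
) -> List[str]:
    """Stage 1 turns the image into text; Stage 2 keeps only drug names."""
    text = ocr_fn(image).strip()
    if not text:
        # Nothing was recognized, so skip the (expensive) LLM call.
        return []
    return llm_fn(text)


# Hypothetical stand-ins for the two models, for illustration only.
def fake_ocr(image: object) -> str:
    # A real Stage 1 would run the OCR model on the image.
    return "타이레놀 500mg 1일 3회"


def fake_llm(text: str) -> List[str]:
    # A real Stage 2 would prompt the LLM; here a fixed set is matched.
    known = {"타이레놀"}
    return [token for token in text.split() if token in known]


names = run_two_stage_pipeline(None, fake_ocr, fake_llm)
print(names)
```

Passing the stages in as callables keeps the sketch independent of which OCR checkpoint is configured, which is the only thing this commit changes.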