Image-to-Text
Transformers
PyTorch
JAX
Safetensors
Korean
vision-encoder-decoder
image-text-to-text
trocr
How to use team-lucid/trocr-small-korean with Transformers:

```python
# Use a pipeline as a high-level helper.
# Warning: pipeline type "image-to-text" is no longer supported in transformers v5.
# Load the model directly (see below) or downgrade to v4.x with:
#   pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("image-to-text", model="team-lucid/trocr-small-korean")

# Load model directly
from transformers import AutoTokenizer, AutoModelForImageTextToText

tokenizer = AutoTokenizer.from_pretrained("team-lucid/trocr-small-korean")
model = AutoModelForImageTextToText.from_pretrained("team-lucid/trocr-small-korean")
```
trocr-small-korean
Model Details
TrOCR is an encoder-decoder model made up of an image Transformer encoder and a text Transformer decoder. The image encoder was initialized from DeiT weights, and the text decoder was initialized from RoBERTa weights that we trained ourselves.
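As a rough illustration of what the image encoder consumes, the sketch below counts the tokens produced from one input image. The 384×384 input size comes from the getting-started snippet in this card; the 16×16 patch size and the two extra DeiT tokens (class and distillation) are assumptions based on common DeiT configurations, not values stated here.

```python
def encoder_sequence_length(image_size: int = 384,
                            patch_size: int = 16,
                            extra_tokens: int = 2) -> int:
    """Number of tokens a ViT/DeiT-style encoder sees for one square image.

    patch_size and extra_tokens are assumed defaults, not taken from this card.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    # One token per patch, plus any special tokens prepended by the encoder.
    return patches_per_side ** 2 + extra_tokens


print(encoder_sequence_length())  # 24 * 24 patches + 2 special tokens
```

The decoder cross-attends over this sequence, so the token count grows quadratically with image side length if the patch size stays fixed.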
This work was trained on Cloud TPUs provided through Google's TPU Research Cloud (TRC).
How to Get Started with the Model
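```python
import torch
from transformers import VisionEncoderDecoderModel

# Load the pretrained encoder-decoder checkpoint from the Hub.
model = VisionEncoderDecoderModel.from_pretrained("team-lucid/trocr-small-korean")

# Placeholder input: one random 3-channel 384x384 image. Replace with a real, preprocessed image.
pixel_values = torch.rand(1, 3, 384, 384)

# Autoregressively generate token IDs for the text in the image.
generated_ids = model.generate(pixel_values)
```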
Training Details
Training Data
The model was trained on 6M images synthesized with synthtiger.
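To put the dataset size in context, the following back-of-the-envelope sketch combines it with the batch size and maximum step count listed under Training Hyperparameters. It assumes that "6M" means exactly six million images, that training ran for the full 500,000 steps, and that the synthetic set was fixed rather than regenerated on the fly; none of these are confirmed by the card.

```python
NUM_IMAGES = 6_000_000  # "6M" from the card, taken as exactly six million (assumption)
BATCH_SIZE = 512        # from the hyperparameter table
MAX_STEPS = 500_000     # from the hyperparameter table

# Total training samples processed if every step uses a full batch.
samples_seen = BATCH_SIZE * MAX_STEPS

# How many times the dataset would be traversed under the assumptions above.
passes_over_data = samples_seen / NUM_IMAGES

print(f"samples seen: {samples_seen:,}")
print(f"approximate passes over the data: {passes_over_data:.1f}")
```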
Training Hyperparameters
| Hyperparameter | Small |
|---|---|
| Warmup Steps | 4,000 |
| Learning Rate | 1e-4 |
| Batch Size | 512 |
| Weight Decay | 0.01 |
| Max Steps | 500,000 |
| Learning Rate Decay | 0.1 |
| Adam β1 | 0.9 |
| Adam β2 | 0.98 |
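The table gives the warmup length, peak learning rate, and step budget but not the shape of the schedule. The sketch below shows one plausible reading: linear warmup to the peak, followed by linear decay to 0.1 × peak at the final step. Both the linear shapes and the interpretation of "Learning Rate Decay" as a final-to-peak ratio are assumptions; the card does not specify them.

```python
PEAK_LR = 1e-4          # "Learning Rate" from the table
WARMUP_STEPS = 4_000    # "Warmup Steps" from the table
MAX_STEPS = 500_000     # "Max Steps" from the table
FINAL_LR_RATIO = 0.1    # assumed meaning of "Learning Rate Decay"


def learning_rate(step: int) -> float:
    """Linear warmup, then linear decay to FINAL_LR_RATIO * PEAK_LR (assumed shape)."""
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    # Hold the final value for any step past the training budget.
    step = min(step, MAX_STEPS)
    progress = (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS)
    return PEAK_LR * (1.0 - (1.0 - FINAL_LR_RATIO) * progress)


for step in (0, 2_000, 4_000, 252_000, 500_000):
    print(f"step {step:>7,}: lr = {learning_rate(step):.2e}")
```

If the original run used a different decay shape (for example cosine or inverse square root), only the last line of `learning_rate` would need to change.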