license: mit
tags:
  - indonesian
  - finance
  - sentiment
  - text-classification
  - finbert
  - transformers
  - pytorch
  - huggingface
language: id
datasets:
  - custom
base_model: ProsusAI/finbert
model-index:
  - name: FinBERT Indonesia
    results: []

🇮🇩 FinBERT Indonesia — Sentiment Classification for Financial News in Bahasa Indonesia

This model is a fine-tuned version of ProsusAI/finbert on a custom dataset of ~500 financial news headlines written in Bahasa Indonesia. The task is 3-class sentiment classification: positive, neutral, and negative.

🏗️ Model Architecture

The base model is FinBERT (ProsusAI/finbert), a BERT model pre-trained on financial text. It was fine-tuned with the Hugging Face transformers library, with the following modifications:

Adaptation to Indonesian financial text via custom labeled headlines in Bahasa Indonesia
Classification head for 3 sentiment labels
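
The setup described above could be sketched roughly as follows. This is a minimal illustration, not the actual training script: the helper name `load_for_finetuning` is hypothetical, and the label order in `ID2LABEL` is an assumption that may not match the released checkpoint.

```python
# Assumed label order; the released checkpoint may use a different one.
ID2LABEL = {0: "positive", 1: "negative", 2: "neutral"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def load_for_finetuning(base_model: str = "ProsusAI/finbert"):
    """Load the base tokenizer and a 3-class sequence-classification model."""
    # Imported lazily so the label maps above are usable without transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForSequenceClassification.from_pretrained(
        base_model,
        num_labels=len(ID2LABEL),  # size of the classification head
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model
```

Calling `load_for_finetuning()` downloads the base weights, so it needs network access and the transformers and torch packages installed.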

🧾 Dataset

The training dataset consists of roughly 500 manually labeled financial news headlines from Indonesian sources. Each entry is assigned one of three labels:

positive – bullish or growth-related headlines
neutral – factual or event-based reporting
negative – bearish or risk-indicative headlines

Example:

Title	Label
IHSG diperkirakan rebound minggu ini	positive
BI umumkan suku bunga tetap	neutral
Rupiah melemah terhadap dolar AS	negative
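
Data in this shape can be read and sanity-checked with the standard library alone. The sketch below assumes a tab-separated file with `Title` and `Label` columns. That layout, the helper name `read_headlines`, and the in-memory sample are illustrative assumptions, since the actual dataset file is not published here.

```python
import csv
import io
from collections import Counter

VALID_LABELS = {"positive", "neutral", "negative"}

# Sample rows copied from the example table; a real file would hold ~500 rows.
SAMPLE_TSV = (
    "Title\tLabel\n"
    "IHSG diperkirakan rebound minggu ini\tpositive\n"
    "BI umumkan suku bunga tetap\tneutral\n"
    "Rupiah melemah terhadap dolar AS\tnegative\n"
)


def read_headlines(handle):
    """Return (title, label) pairs, rejecting labels outside the three classes."""
    rows = []
    for record in csv.DictReader(handle, delimiter="\t"):
        label = record["Label"].strip().lower()
        if label not in VALID_LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        rows.append((record["Title"].strip(), label))
    return rows


rows = read_headlines(io.StringIO(SAMPLE_TSV))
label_counts = Counter(label for _, label in rows)
print(label_counts)  # class balance is worth checking on a dataset this small
```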

🧪 Evaluation

Evaluation uses accuracy on the held-out portion of a stratified train/test split.

Metric	Score
Accuracy	TBD
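
For readers who want to run a comparable evaluation, a stratified split and an accuracy function can be written in a few lines. This is a sketch under stated assumptions: the 20% test fraction, the seed, and both function names are illustrative and are not taken from the original training setup.

```python
import random
from collections import defaultdict


def stratified_split(rows, test_fraction=0.2, seed=42):
    """Split (text, label) pairs so each label keeps roughly the same share."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # Hold out at least one example per class when the class has 2+ items.
        n_test = max(1, round(len(group) * test_fraction)) if len(group) > 1 else 0
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

scikit-learn's `train_test_split` with its `stratify` argument is a common alternative when that library is available.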

To reproduce the benchmark or compare other models, see the sample inference code below.

🧪 Usage

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="michaelmanurung/finbert-indonesia",
    tokenizer="michaelmanurung/finbert-indonesia"
)

result = classifier("IHSG turun tipis karena aksi ambil untung investor.")
print(result)
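# Example output: [{'label': 'LABEL_2', 'score': 0.89}]
# The model emits generic LABEL_<n> ids; which id means positive, neutral, or
# negative depends on the label encoding used during fine-tuning.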
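
Because the pipeline output above uses generic `LABEL_<n>` names, a small post-processing step can make results easier to read. The helper `rename_labels` and the mapping `ASSUMED_ID2LABEL` below are hypothetical. The mapping in particular is a placeholder and should be confirmed against the label encoding used in training before relying on it.

```python
def rename_labels(predictions, id2label):
    """Replace generic 'LABEL_<n>' names in pipeline output with readable ones."""
    renamed = []
    for pred in predictions:
        name = pred["label"]
        if name.startswith("LABEL_"):
            name = id2label[int(name.split("_", 1)[1])]
        renamed.append({"label": name, "score": pred["score"]})
    return renamed


# Hypothetical mapping for illustration only; verify it before use.
ASSUMED_ID2LABEL = {0: "positive", 1: "negative", 2: "neutral"}

example = [{"label": "LABEL_2", "score": 0.89}]
readable = rename_labels(example, ASSUMED_ID2LABEL)
print(readable)
```

Passing the mapping as an argument keeps the helper usable once the correct id order is known.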