Hailay/EXLMR

@@ -1,27 +1,52 @@
 ---
 license: apache-2.0
 language:
-- ti
 - am
-Model Card
-EXLMR has been designed with specific support for underrepresented languages, particularly those spoken in Ethiopia (such as Amharic, Tigrinya, and Afaan Oromo). Like XLM-RoBERTa, EXLMR can be finetuned to handle multiple languages simultaneously, making it effective for cross-lingual tasks such as machine translation, multilingual text classification, and question answering.EXLMR-base follows the same architecture as RoBERTa-base, with 12 layers, 768 hidden dimensions, and 12 attention heads, totaling approximately 270M parameters.
-Use Cases:
-#Text Classification: This can be fine-tuned for text classification tasks in Ethiopian languages.
-#Machine Translation: Useful for building machine translation models between Ethiopian and other languages.
-#Named Entity Recognition (NER): This can be applied to entity recognition tasks for low-resource languages like Amharic and Tigrinya.
-#Question Answering: Fine-tuned for multilingual question-answering tasks, supporting cross-lingual information retrieval
 |Model|Vocabulary Size|
 |---|---|
 |XLM-Roberta|250002|
-|EXLMR|280147|

 ---
 license: apache-2.0
+datasets:
+- Hailay/TigQA
+- masakhane/masakhaner2
 language:
 - am
+- ti
+metrics:
+- accuracy
+- f1
+base_model:
+- FacebookAI/xlm-roberta-base
+pipeline_tag: zero-shot-classification
+library_name: transformers
+---
+# XLM-R and EXLMR Model
+## Model Overview
+The **XLM-R** (Cross-lingual Language Model - RoBERTa) is a multilingual model trained on 100 languages. The **EXLMR** (Extended XLM-RoBERTa) is an extended version designed to improve performance on low-resource languages spoken in Ethiopia, including Amharic, Tigrinya, and Afaan Oromo.
+## Model Details
+- **Base Model**: XLM-R
+- **Extended Version**: EXLMR
+- **Languages Supported**: Amharic, Tigrinya, Afaan Oromo, and more
+- **Training Data**: Trained on a large multilingual corpus
+## Usage
+EXLMR addresses tokenization issues inherent to the XLM-R model, such as out-of-vocabulary (OOV) tokens and over-tokenization, especially for low-resource languages.
+Fine-tuning on a task-specific dataset adapts the model to that task and improves its performance. You can load the model with the `transformers` library for various NLP tasks.
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+
+# Define the model checkpoint
+checkpoint = "Hailay/EXLMR"
+
+# Load the tokenizer and model. If the checkpoint has no fine-tuned
+# classification head, the head is newly initialized and needs fine-tuning.
+tokenizer = AutoTokenizer.from_pretrained(checkpoint)
+model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
+
+# Tokenize an example input (placeholder text) and run a forward pass
+inputs = tokenizer("ሰላም", return_tensors="pt")
+with torch.no_grad():
+    logits = model(**inputs).logits
+```
+EXLMR is designed with specific support for underrepresented languages, particularly those spoken in Ethiopia (such as Amharic, Tigrinya, and Afaan Oromo). Like XLM-RoBERTa, EXLMR can be fine-tuned to handle multiple languages simultaneously, making it effective for cross-lingual tasks such as machine translation, multilingual text classification, and question answering. EXLMR-base follows the same architecture as RoBERTa-base, with 12 layers, 768 hidden dimensions, and 12 attention heads, totaling approximately 270M parameters.
 |Model|Vocabulary Size|
 |---|---|
 |XLM-Roberta|250002|
+|EXLMR|280147|
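
The model card says the extended vocabulary targets OOV tokens and over-tokenization. One way to check that claim on your own text is to measure tokens per word (often called fertility) and the share of unknown tokens. The sketch below is not part of the commit: the helper name `tokenization_stats`, the toy vocabulary, and the sample sentences are all hypothetical, and the encoder is passed in as a plain callable so the example runs without downloading a model.

```python
from typing import Callable, Dict, Iterable, List


def tokenization_stats(
    encode: Callable[[str], List[int]],
    sentences: Iterable[str],
    unk_id: int,
) -> Dict[str, float]:
    """Return tokens-per-word (fertility) and the fraction of unknown tokens.

    `encode` maps a sentence to a list of token ids (hypothetical interface).
    Words are counted by splitting on whitespace.
    """
    n_words = n_tokens = n_unk = 0
    for sentence in sentences:
        ids = encode(sentence)
        n_words += len(sentence.split())
        n_tokens += len(ids)
        n_unk += sum(1 for i in ids if i == unk_id)
    return {
        "fertility": n_tokens / n_words if n_words else 0.0,
        "unk_rate": n_unk / n_tokens if n_tokens else 0.0,
    }


# Toy stand-in for a real tokenizer: one id per word, id 0 for unknown words.
toy_vocab = {"<unk>": 0, "selam": 1, "alem": 2}


def toy_encode(text: str) -> List[int]:
    return [toy_vocab.get(word, 0) for word in text.split()]


stats = tokenization_stats(toy_encode, ["selam alem", "selam xyz"], unk_id=0)
print(stats)
```

To compare real tokenizers, one could pass `lambda s: tok.encode(s, add_special_tokens=False)` together with `tok.unk_token_id`, where `tok` is loaded via `AutoTokenizer.from_pretrained` for each checkpoint. Lower fertility and a lower unknown rate on Amharic or Tigrinya text would be consistent with the card's claim, though this is only a rough diagnostic.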