language: en
license: mit
library_name: transformers
tags:
- text-classification
- phishing-detection
- security
- bert
- email-security
pipeline_tag: text-classification
base_model: bert-base-uncased
datasets:
- custom
metrics:
- accuracy
- f1
widget:
- text: >-
URGENT: Your account will be suspended! Click here immediately to verify
your information.
example_title: Phishing Example
- text: >-
Thank you for your purchase. Your order will be shipped within 2-3
business days.
example_title: Legitimate Example
- text: Verify your PayPal account now or it will be closed permanently!
example_title: Phishing - Payment Scam
- text: 'Your meeting reminder: Team standup at 10 AM tomorrow.'
example_title: Legitimate - Meeting Reminder
model-index:
- name: SecuriSense Phishing Detector
results:
- task:
type: text-classification
name: Text Classification
metrics:
- type: accuracy
value: 0.9954
name: Accuracy
- type: f1
value: 0.9956
name: F1 Score
SecuriSense: Phishing Email Detection Model
Model Description
SecuriSense is a bert-base-uncased model fine-tuned to detect phishing emails. It analyzes email text and classifies each message as either legitimate or phishing, with a reported accuracy of 99.54% (see Performance).
Developed by: Alfred Dads D. Nodado, Joshua D. Famor, Hanna Keziah T. Sato
Institution: Mapua Malayan College Mindanao
Base Model: bert-base-uncased
Language: English
Intended Use
This model is intended for:
- Classifying email text as legitimate (LABEL_0) or phishing (LABEL_1)
- Supporting email security systems
- Teaching and raising cybersecurity awareness
- Integrating into email filtering applications
Primary Use: Phishing detection in email security systems
Out-of-scope: Non-email text classification, multilingual detection
Training Data
The model was trained on a combined dataset of:
- Phishing Email Dataset: 18,650 samples from Kaggle
- University of Twente Validation Dataset: 1,000+ samples
- Total: 19,650+ labeled emails
The dataset includes both phishing attempts and legitimate emails with various characteristics:
- Urgency indicators
- Authority claims
- Financial requests
- Emotional manipulation patterns
Performance
| Metric | Score |
|---|---|
| Accuracy | 99.54% |
| Precision | 99.73% |
| Recall | 99.40% |
| F1 Score | 99.56% |
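As a quick consistency check on the table above, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two rows. A minimal sketch (the function name `f1_from` is illustrative and not part of the model's code):

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall from the table above, expressed as fractions.
f1 = f1_from(0.9973, 0.9940)
print(f"{f1:.2%}")  # prints 99.56%
```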
How to Use
Quick Start with Pipeline
from transformers import pipeline
# Load the model
classifier = pipeline(
    "text-classification",
    model="Auguzcht/securisense-phishing-detection",
)
# Classify an email
email_text = "URGENT: Your account will be suspended! Click here to verify."
result = classifier(email_text)
print(result)
# Example output (the label string comes from the model's id2label config; the score is illustrative):
# [{'label': 'Phishing', 'score': 0.9987}]
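To score several emails at once, a text-classification pipeline can be called with a list of strings, and it returns one result dictionary per input. The helper below is a sketch rather than part of the model's API: `classify_batch` is a hypothetical name, and it accepts any callable that behaves like the `classifier` created above, so it can be exercised with a stub before the real model is downloaded.

```python
from typing import Callable, Dict, List, Tuple

Prediction = Dict[str, object]  # e.g. {"label": "LABEL_1", "score": 0.99}


def classify_batch(
    classifier: Callable[[List[str]], List[Prediction]],
    emails: List[str],
) -> List[Tuple[str, str, float]]:
    """Return (email, label, score) triples, skipping empty inputs."""
    cleaned = [e.strip() for e in emails if e and e.strip()]
    if not cleaned:
        return []
    results = classifier(cleaned)
    return [
        (email, str(res["label"]), float(res["score"]))
        for email, res in zip(cleaned, results)
    ]


# Stub standing in for the real pipeline; its labels and scores are made up.
def fake_classifier(texts: List[str]) -> List[Prediction]:
    return [
        {"label": "LABEL_1" if "verify" in t.lower() else "LABEL_0", "score": 0.5}
        for t in texts
    ]


rows = classify_batch(
    fake_classifier,
    ["Verify your account now!", "   ", "Lunch at noon?"],
)
for email, label, score in rows:
    print(f"{label}\t{score:.2f}\t{email}")
```

With the real model, `fake_classifier` would be replaced by the `classifier` object from the Quick Start snippet.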
Advanced Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load model and tokenizer
model_name = "Auguzcht/securisense-phishing-detection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Prepare input
text = "Thank you for your purchase. Order #12345 will ship soon."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# Get prediction (no gradients are needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
# Pick the most probable class for the single input in the batch
predicted_class = torch.argmax(predictions, dim=-1).item()
confidence = predictions[0][predicted_class].item()
# Map to label
label = model.config.id2label[predicted_class]
print(f"Label: {label}, Confidence: {confidence:.4f}")
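The post-processing in the snippet above (softmax over the logits, argmax, then a lookup in `id2label`) can be reproduced without PyTorch to see what each step does. This sketch uses made-up logits and an assumed two-class mapping; with the real model, the values come from `outputs.logits` and `model.config.id2label`.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical values for illustration only.
example_logits = [-2.0, 3.0]
example_id2label = {0: "LABEL_0", 1: "LABEL_1"}

label, confidence = decode(example_logits, example_id2label)
print(f"Label: {label}, Confidence: {confidence:.4f}")
```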
React/JavaScript Usage
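// HF_API_TOKEN must hold a valid Hugging Face access token.
async function detectPhishing(emailText) {
  const response = await fetch(
    "https://huggingface.co/proxy/api-inference.huggingface.co/models/Auguzcht/securisense-phishing-detection",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${HF_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ inputs: emailText }),
    }
  );
  // Surface HTTP errors instead of returning an error payload as a prediction
  if (!response.ok) {
    throw new Error(`Inference request failed with status ${response.status}`);
  }
  return response.json();
}
// Usage (top-level await requires an ES module or an enclosing async function)
const email = "URGENT: Verify your account now!";
const prediction = await detectPhishing(email);
console.log(prediction);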
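The same HTTP call can be made from Python with only the standard library. The sketch below builds the request in one function and sends it in another, so the request can be inspected without network access. The endpoint URL is copied from the JavaScript example, `build_request` and `detect_phishing` are hypothetical names, and nothing is assumed about the shape of the JSON response.

```python
import json
import urllib.request

# Endpoint copied verbatim from the JavaScript example above.
API_URL = (
    "https://huggingface.co/proxy/api-inference.huggingface.co/models/"
    "Auguzcht/securisense-phishing-detection"
)


def build_request(email_text: str, token: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"inputs": email_text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def detect_phishing(email_text: str, token: str, timeout: float = 30.0):
    """Send the request and return the decoded JSON response.

    urllib raises urllib.error.HTTPError for non-2xx responses.
    """
    request = build_request(email_text, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `detect_phishing("URGENT: Verify your account now!", token)` with a valid access token would perform the actual request.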
Label Mapping
- LABEL_0 / "Legitimate": Safe, legitimate email
- LABEL_1 / "Phishing": Phishing attempt or malicious email
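Because each class is referred to by two names here (`LABEL_0` / "Legitimate" and `LABEL_1` / "Phishing"), downstream code may want to accept either form. A small sketch, with `is_phishing` as a hypothetical helper name:

```python
PHISHING_LABELS = {"label_1", "phishing"}
LEGITIMATE_LABELS = {"label_0", "legitimate"}


def is_phishing(label: str) -> bool:
    """Map either naming scheme from the list above to a boolean."""
    key = label.strip().lower()
    if key in PHISHING_LABELS:
        return True
    if key in LEGITIMATE_LABELS:
        return False
    # Fail loudly rather than silently treating unknown labels as safe.
    raise ValueError(f"Unrecognized label: {label!r}")
```

Raising on unknown labels is a deliberate choice in this sketch: treating an unexpected label as legitimate would hide a configuration mismatch.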
Limitations
- Trained primarily on English emails
- May miss novel phishing techniques that are absent from the training data
- Requires plain-text input; strip HTML markup before classification
- Performance may vary on emails that contain domain-specific jargon
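Since the model expects text without HTML markup, email bodies can be cleaned before classification. The following is one possible approach using only the Python standard library, not the preprocessing used by the authors; `strip_html` is a hypothetical helper. It drops tags, skips `<script>` and `<style>` contents, and collapses whitespace.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        # Join chunks with spaces, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html: str) -> str:
    """Return the visible text of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.text()


print(strip_html("<p>Verify your <b>account</b> now</p>"))
```

Note that joining chunks with spaces means a word split by inline tags (for example `acc<b>ount</b>`) comes out as two tokens; a production filter may need a more careful parser.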
Ethical Considerations
- This model is a tool to assist in security, not a replacement for human judgment
- False negatives (missed phishing emails) can occur, so always maintain multiple security layers
- It should be used as part of a comprehensive email security strategy
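One way to keep a human in the loop, in line with the points above, is to act automatically only on high-confidence predictions and route everything else to manual review. The thresholds and action names below are illustrative assumptions, not values recommended by the authors, and would need tuning on real traffic.

```python
def triage(
    flagged_phishing: bool,
    score: float,
    quarantine_at: float = 0.95,  # assumed threshold, for illustration
    deliver_at: float = 0.95,     # assumed threshold, for illustration
) -> str:
    """Map a prediction to 'quarantine', 'deliver', or 'review'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if flagged_phishing and score >= quarantine_at:
        return "quarantine"
    if not flagged_phishing and score >= deliver_at:
        return "deliver"
    # Low-confidence predictions of either class go to a person.
    return "review"


print(triage(True, 0.99))   # quarantine
print(triage(False, 0.60))  # review
```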
Citation
@misc{securisense2025,
title={SecuriSense: Phishing Detection ML Pipeline},
author={Nodado, Alfred Dads D. and Famor, Joshua D. and Sato, Hanna Keziah T.},
year={2025},
institution={Mapua Malayan College Mindanao},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/Auguzcht/securisense-phishing-detection}}
}
Contact
For questions or issues, please open an issue on the model repository or contact the authors through their institution.
License
MIT License - See LICENSE file for details