DistilBERT Hybrid Email Fraud Detection - Phase 2 (v2)
1. Overview
This repository contains a Hybrid DistilBERT model for binary classification of email messages:
- 0 → Legitimate (Ham)
- 1 → Fraud / Spam
This model represents Phase 2 of a multi-stage fraud detection system. Unlike the Phase 1 text-only baseline, this architecture fuses semantic embeddings from a pre-trained transformer with engineered linguistic risk signals (urgency, financial triggers, formatting anomalies, structural indicators).
The objective of this phase is to increase model robustness and improve sensitivity (Recall) against polished phishing attacks that may bypass purely lexical checks.
2. Motivation and Approach
Difference from Phase 1 (v1)
Phase 1 relied solely on distilbert-base-uncased to classify text. While effective, it exhibited two primary weaknesses:
- False Positives: Legitimate emails containing urgency or financial language.
- False Negatives: Subtle and polished phishing attempts.
Phase 2 introduces a Hybrid Architecture (Dual-Branch Network):
- Branch A (Semantic): DistilBERT processes raw text.
- Branch B (Structural): A dense neural network processes engineered numeric risk features.
This architecture validates not only what is said (semantic meaning) but how it is structured (risk profile), aligning with a defense-in-depth strategy common in cybersecurity systems.
3. Dataset & Feature Engineering
Training was conducted on the Enron Spam Dataset to ensure direct comparability with Phase 1.
New in Phase 2: Dense Feature Extraction
In addition to raw text, six engineered numeric features were extracted and normalized (StandardScaler):
- Urgency Score: Count of high-pressure keywords (e.g., "immediate", "act now").
- Financial Triggers: Count of monetary-related terms (e.g., "bank", "transfer").
- Caps Ratio: Percentage of uppercase characters.
- Exclamation Count: Frequency of "!" usage.
- URL Count: Number of hyperlinks in the email body.
- Email Count: Number of email addresses mentioned.
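The six features above can be reproduced with a short extraction function. A minimal sketch follows; the keyword lists are illustrative placeholders, not the exact lexicons used in training, and in the full pipeline the resulting vector is subsequently normalized with the fitted StandardScaler:

```python
import re

# Placeholder lexicons -- the exact keyword lists used in training are not published.
URGENCY_TERMS = ["immediate", "act now", "urgent", "asap"]
FINANCIAL_TERMS = ["bank", "transfer", "account", "payment"]


def extract_dense_features(text: str) -> list:
    """Compute the six raw (pre-scaling) structural risk features."""
    lower = text.lower()
    urgency = sum(lower.count(t) for t in URGENCY_TERMS)        # Urgency Score
    financial = sum(lower.count(t) for t in FINANCIAL_TERMS)    # Financial Triggers
    letters = [c for c in text if c.isalpha()]
    caps_ratio = (sum(c.isupper() for c in letters) / len(letters)) if letters else 0.0
    exclamations = text.count("!")                              # Exclamation Count
    urls = len(re.findall(r"https?://\S+", text))               # URL Count
    emails = len(re.findall(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", text))  # Email Count
    return [urgency, financial, caps_ratio, exclamations, urls, emails]
```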
Data split strategy remains stratified (70% train / 15% validation / 15% test).
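The 70/15/15 stratified split can be sketched as two chained `train_test_split` calls; the variable names `texts` and `labels` are assumptions about the upstream pipeline:

```python
from sklearn.model_selection import train_test_split

# Hold out 30% first, then halve it into validation and test,
# stratifying on the labels at each step to preserve class balance.
def stratified_70_15_15(texts, labels, seed=42):
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        texts, labels, test_size=0.30, stratify=labels, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=seed)
    return X_train, X_val, X_test, y_train, y_val, y_test
```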
4. Model Architecture
Hybrid Architecture Details
Text Branch:
- distilbert-base-uncased backbone
- Uses the [CLS] token representation (768-dimensional embedding)
Dense Branch:
- Input (6 features)
- Linear (6 → 16)
- ReLU
- Dropout (0.2)
Fusion Layer:
- Concatenation of:
- 768-dimensional transformer embedding
- 16-dimensional dense embedding
Classifier:
- Linear (784 → 64)
- ReLU
- Dropout (0.2)
- Linear (64 → 2)
Training Configuration
Differential Learning Rates:
- Transformer layers: 2e-5
- Dense + classifier layers: 1e-3
Optimizer:
- AdamW
Batch Size:
- 16
Loss Function:
- CrossEntropyLoss with class weights (computed from training set)
Early Stopping:
- Patience = 2 epochs (monitored on validation loss)
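The configuration above can be sketched as follows, assuming the `HybridDistilBERT` module defined in Section 9. The optimizer uses two parameter groups to realize the differential learning rates; the class-weight values shown are placeholders (in training they are computed from the training split, e.g. via inverse class frequencies):

```python
import torch
from torch.optim import AdamW


def build_optimizer(model, lr_transformer=2e-5, lr_head=1e-3):
    """AdamW with differential learning rates: slow for the pre-trained
    transformer, fast for the randomly initialized dense/classifier head."""
    transformer_params = list(model.distilbert.parameters())
    head_params = list(model.dense_mlp.parameters()) + list(model.classifier.parameters())
    return AdamW([
        {"params": transformer_params, "lr": lr_transformer},
        {"params": head_params, "lr": lr_head},
    ])


# Class-weighted loss; the weights below are illustrative placeholders.
class_weights = torch.tensor([1.0, 1.3])
criterion = torch.nn.CrossEntropyLoss(weight=class_weights)
```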
5. Evaluation Results & Comparison
Evaluation was performed on the held-out test set and directly compared to Phase 1.
| Metric | Phase 1 (Text-Only) | Phase 2 (Hybrid) | Change (pp) |
|---|---|---|---|
| Recall | 0.9635 | 0.9863 | +2.28 |
| Precision | 0.9769 | 0.9114 | -6.55 |
| F1 Score | 0.9701 | 0.9474 | -2.27 |
| ROC-AUC | 0.9972 | 0.9959 | -0.13 |
Key Finding
The Hybrid model achieved the primary objective of increasing sensitivity. It detects significantly more fraud cases (Recall ~99%), acting as a tighter safety net.
However, explicit risk features introduced structural bias, leading to more benign emails being flagged as fraud (lower Precision).
This trade-off reflects a deliberate design choice: prioritizing threat detection over convenience.
6. Error Analysis (Phase 2 Findings)
Manual inspection revealed:
False Positives (Precision Drop)
- Legitimate emails with exaggerated formatting (ALL CAPS, multiple exclamation marks).
- Corporate newsletters with dense financial vocabulary.
- Promotional emails mimicking high-pressure tone.
False Negatives (Recall Gain)
- Subtle delivery failure scams missed in Phase 1.
- Complex phishing attempts combining moderate semantic deception with structural anomalies.
Conclusion:
The Hybrid architecture trades user convenience (higher False Positives) for increased security (lower False Negatives), a common and often desirable trade-off in high-security environments.
7. Robustness Checks
Length Robustness:
- Maintained strong F1 across short and long emails.
- Perfect classification on very short emails (<100 characters).
Feature Contribution:
- The dense branch effectively adjusted transformer outputs.
- Borderline cases were shifted toward the Fraud class when structural risk signals were elevated.
8. Intended Use
This model is intended for:
- High-security environments where missing a threat is costlier than raising a false alarm.
- Research on hybrid architectures combining transformers and engineered features.
- Educational benchmarking of text-only vs hybrid models.
Deployment Note:
Unlike standard transformer models, this architecture requires preprocessing to compute the six numeric features prior to inference.
9. Usage Example
Define Architecture

```python
import torch
import torch.nn as nn
from transformers import AutoModel


class HybridDistilBERT(nn.Module):
    def __init__(self, num_labels=2, dense_feature_dim=6):
        super().__init__()
        self.distilbert = AutoModel.from_pretrained("distilbert-base-uncased")
        self.dense_mlp = nn.Sequential(
            nn.Linear(dense_feature_dim, 16),
            nn.ReLU(),
            nn.Dropout(0.2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(768 + 16, 64),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(64, num_labels),
        )

    def forward(self, input_ids, attention_mask, dense_features):
        outputs = self.distilbert(input_ids=input_ids, attention_mask=attention_mask)
        cls_embedding = outputs.last_hidden_state[:, 0, :]  # [CLS] token (768-dim)
        dense_embed = self.dense_mlp(dense_features)        # 16-dim risk embedding
        combined = torch.cat((cls_embedding, dense_embed), dim=1)
        return self.classifier(combined)
```
Load Model

```python
model = HybridDistilBERT()
# map_location="cpu" makes the checkpoint load on CPU-only machines
model.load_state_dict(torch.load("best_hybrid_model.pt", map_location="cpu"))
model.eval()
```
Inference (Requires Feature Extraction)

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
text = "URGENT: Verify your bank account now!!"

# 1. Tokenize text
encoded = tokenizer(text, truncation=True, return_tensors="pt")
# 2. Extract the six dense features and scale them with the training-time
#    StandardScaler (placeholder values shown here)
dense_features = torch.tensor([[2.0, 1.0, 0.3, 2.0, 0.0, 0.0]])
# 3. Pass both into the model
with torch.no_grad():
    outputs = model(encoded["input_ids"], encoded["attention_mask"], dense_features)
    probabilities = torch.softmax(outputs, dim=1)
```
10. Roadmap
Phase 3 (Next Steps):
Adversarial Training with obfuscated and paraphrased fraud examples.
Feature Ablation Studies to reduce false positives.
Precision-recall trade-off tuning through threshold optimization.
Temporal drift simulation for robustness against evolving attack patterns.
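As an illustration of the threshold-optimization item above, here is a minimal sketch that sweeps a decision threshold over validation-set fraud probabilities and keeps the one maximizing F1; the array names `probs` (P(fraud)) and `y_true` are assumptions:

```python
import numpy as np


def best_f1_threshold(probs, y_true, grid=np.linspace(0.05, 0.95, 19)):
    """Return the threshold in `grid` that maximizes F1 on the fraud class."""
    def f1_at(t):
        pred = (probs >= t).astype(int)
        tp = int(((pred == 1) & (y_true == 1)).sum())
        fp = int(((pred == 1) & (y_true == 0)).sum())
        fn = int(((pred == 0) & (y_true == 1)).sum())
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0
    return max(grid, key=f1_at)
```

Raising the threshold trades some of the Phase 2 recall gain back for precision, which is exactly the lever the roadmap item proposes to tune.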
11. Citation
If you use this model in research, please cite:
- DistilBERT: Sanh et al., 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter.
- Enron Dataset: Klimt & Yang, 2004. The Enron Corpus: A New Dataset for Email Classification Research.