DistilBERT Hybrid Email Fraud Detection – Phase 2 (v2)

1. Overview

This repository contains a Hybrid DistilBERT model for binary classification of email messages:

  • 0 β†’ Legitimate (Ham)
  • 1 β†’ Fraud / Spam

This model represents Phase 2 of a multi-stage fraud detection system. Unlike the Phase 1 text-only baseline, this architecture fuses semantic embeddings from a pre-trained transformer with engineered linguistic risk signals (urgency, financial triggers, formatting anomalies, structural indicators).

The objective of this phase is to increase model robustness and improve sensitivity (Recall) against polished phishing attacks that may bypass purely lexical checks.


2. Motivation and Approach

Difference from Phase 1 (v1)

Phase 1 relied solely on distilbert-base-uncased to classify text. While effective, it exhibited two primary weaknesses:

  • False Positives: Legitimate emails containing urgency or financial language.
  • False Negatives: Subtle and polished phishing attempts.

Phase 2 introduces a Hybrid Architecture (Dual-Branch Network):

  • Branch A (Semantic): DistilBERT processes raw text.
  • Branch B (Structural): A dense neural network processes engineered numeric risk features.

This architecture validates not only what is said (semantic meaning) but also how it is structured (risk profile), aligning with a defense-in-depth strategy common in cybersecurity systems.


3. Dataset & Feature Engineering

Training was conducted on the Enron Spam Dataset to ensure direct comparability with Phase 1.

New in Phase 2 – Dense Feature Extraction

In addition to raw text, six engineered numeric features were extracted and normalized (StandardScaler):

  • Urgency Score: Count of high-pressure keywords (e.g., "immediate", "act now").
  • Financial Triggers: Count of monetary-related terms (e.g., "bank", "transfer").
  • Caps Ratio: Percentage of uppercase characters.
  • Exclamation Count: Frequency of "!" usage.
  • URL Count: Number of hyperlinks in the email body.
  • Email Count: Number of email addresses mentioned.
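
The six features above can be computed from raw email text along these lines; the keyword lists and regular expressions below are illustrative assumptions, not the exact vocabularies used in training:

```python
import re

# Illustrative keyword lists -- the real vocabularies used in training may differ
URGENCY_TERMS = ["immediate", "urgent", "act now", "asap", "verify now"]
FINANCIAL_TERMS = ["bank", "transfer", "account", "payment", "invoice"]

def extract_dense_features(text: str) -> list:
    """Return the six raw (unscaled) structural risk features for one email."""
    lower = text.lower()
    urgency = sum(lower.count(term) for term in URGENCY_TERMS)
    financial = sum(lower.count(term) for term in FINANCIAL_TERMS)
    letters = [c for c in text if c.isalpha()]
    caps_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    exclamation_count = text.count("!")
    url_count = len(re.findall(r"https?://\S+|www\.\S+", text))
    email_count = len(re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text))
    return [urgency, financial, caps_ratio, exclamation_count, url_count, email_count]
```

These raw values are then normalized with a StandardScaler fitted on the training split; the same fitted scaler must be reused at validation, test, and inference time.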

Data split strategy remains stratified (70% train / 15% validation / 15% test).
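
As a sketch, a 70/15/15 stratified split can be reproduced with two chained scikit-learn calls (the data below is synthetic):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)  # stand-in for 100 emails
y = np.array([0] * 70 + [1] * 30)  # imbalanced ham/fraud labels

# Split off 30% as a temporary pool, then halve it into validation and test
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)
```

Stratifying both splits keeps the ham/fraud ratio consistent across all three partitions.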


4. Model Architecture

Hybrid Architecture Details

Text Branch:

  • distilbert-base-uncased
  • Uses the [CLS] token representation (768-dimensional embedding)

Dense Branch:

  • Input (6 features)
  • Linear (6 β†’ 16)
  • ReLU
  • Dropout (0.2)

Fusion Layer:

  • Concatenation of:
    • 768-dimensional transformer embedding
    • 16-dimensional dense embedding

Classifier:

  • Linear (784 β†’ 64)
  • ReLU
  • Dropout (0.2)
  • Linear (64 β†’ 2)

Training Configuration

Differential Learning Rates:

  • Transformer layers: 2e-5
  • Dense + classifier layers: 1e-3
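
Differential learning rates of this kind map directly onto AdamW parameter groups; the modules below are small stand-ins for the real branches:

```python
import torch.nn as nn
from torch.optim import AdamW

# Stand-in modules; in the real model these are DistilBERT, the dense MLP,
# and the classifier head
transformer = nn.Linear(768, 768)
dense_mlp = nn.Linear(6, 16)
classifier = nn.Linear(784, 2)

optimizer = AdamW([
    {"params": transformer.parameters(), "lr": 2e-5},  # pre-trained weights: gentle updates
    {"params": list(dense_mlp.parameters()) + list(classifier.parameters()),
     "lr": 1e-3},                                      # randomly initialized heads: faster updates
])
```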

Optimizer:

  • AdamW

Batch Size:

  • 16

Loss Function:

  • CrossEntropyLoss with class weights (computed from training set)
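
One common way to derive such weights, and the one assumed here, is inverse class frequency (scikit-learn's "balanced" heuristic), sketched below with synthetic labels:

```python
import numpy as np
import torch
import torch.nn as nn

y_train = np.array([0] * 80 + [1] * 20)          # illustrative class imbalance
counts = np.bincount(y_train)
weights = len(y_train) / (len(counts) * counts)  # inverse-frequency weighting

# The minority (fraud) class receives the larger weight
criterion = nn.CrossEntropyLoss(
    weight=torch.tensor(weights, dtype=torch.float32))
```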

Early Stopping:

  • Patience = 2 epochs (monitored on validation loss)
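
A minimal early-stopping loop with patience 2 looks like this (the validation losses are illustrative):

```python
best_val = float("inf")
patience, bad_epochs = 2, 0

for epoch, val_loss in enumerate([0.50, 0.42, 0.43, 0.44, 0.41]):  # illustrative losses
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        # checkpoint the best weights here, e.g.:
        # torch.save(model.state_dict(), "best_hybrid_model.pt")
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # two epochs without improvement: stop
```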

5. Evaluation Results & Comparison

Evaluation was performed on the held-out test set and directly compared to Phase 1.

| Metric    | Phase 1 (Text-Only) | Phase 2 (Hybrid) | Change   |
|-----------|---------------------|------------------|----------|
| Recall    | 0.9635              | 0.9863           | +2.28 pp |
| Precision | 0.9769              | 0.9114           | -6.55 pp |
| F1 Score  | 0.9701              | 0.9474           | -2.27 pp |
| ROC-AUC   | 0.9972              | 0.9959           | -0.13 pp |

Key Finding

The Hybrid model achieved the primary objective of increasing sensitivity. It detects significantly more fraud cases (Recall ~99%), acting as a tighter safety net.

However, explicit risk features introduced structural bias, leading to more benign emails being flagged as fraud (lower Precision).

This trade-off reflects a deliberate design choice: prioritizing threat detection over convenience.


6. Error Analysis (Phase 2 Findings)

Manual inspection revealed:

False Positives (Precision Drop)

  • Legitimate emails with exaggerated formatting (ALL CAPS, multiple exclamation marks).
  • Corporate newsletters with dense financial vocabulary.
  • Promotional emails mimicking high-pressure tone.

False Negatives (Recall Gain)

  • Subtle delivery failure scams missed in Phase 1.
  • Complex phishing attempts combining moderate semantic deception with structural anomalies.

Conclusion:

The Hybrid architecture trades user convenience (higher False Positives) for increased security (lower False Negatives), a common and often desirable trade-off in high-security environments.


7. Robustness Checks

Length Robustness:

  • Maintained strong F1 across short and long emails.
  • Perfect classification on very short emails (<100 characters).

Feature Contribution:

  • The dense branch effectively adjusted transformer outputs.
  • Borderline cases were shifted toward the Fraud class when structural risk signals were elevated.

8. Intended Use

This model is intended for:

  • High-security environments where missing a threat is costlier than raising a false alarm.
  • Research on hybrid architectures combining transformers and engineered features.
  • Educational benchmarking of text-only vs hybrid models.

Deployment Note:

Unlike standard transformer models, this architecture requires preprocessing to compute the six numeric features prior to inference.


9. Usage Example

import torch
import torch.nn as nn
from transformers import AutoModel

Define Architecture

class HybridDistilBERT(nn.Module):
    def __init__(self, num_labels=2, dense_feature_dim=6):
        super().__init__()
        self.distilbert = AutoModel.from_pretrained("distilbert-base-uncased")
        self.dense_mlp = nn.Sequential(
            nn.Linear(dense_feature_dim, 16),
            nn.ReLU(),
            nn.Dropout(0.2)
        )
        self.classifier = nn.Sequential(
            nn.Linear(768 + 16, 64),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(64, num_labels)
        )

    def forward(self, input_ids, attention_mask, dense_features):
        outputs = self.distilbert(input_ids=input_ids, attention_mask=attention_mask)
        cls_embedding = outputs.last_hidden_state[:, 0, :]
        dense_embed = self.dense_mlp(dense_features)
        combined = torch.cat((cls_embedding, dense_embed), dim=1)
        return self.classifier(combined)


Load Model

model = HybridDistilBERT()
model.load_state_dict(torch.load("best_hybrid_model.pt"))
model.eval()

Inference (Requires Feature Extraction)

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

text = "URGENT: Verify your bank account now!!"

# 1. Tokenize text
encoded = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")

# 2. Extract and scale the six dense features (zeros here are placeholders;
#    in practice, compute them with the training-time feature pipeline and
#    the fitted StandardScaler)
dense_features = torch.zeros(1, 6)

# 3. Pass both into the model
with torch.no_grad():
    outputs = model(encoded["input_ids"], encoded["attention_mask"], dense_features)
    probabilities = torch.softmax(outputs, dim=1)

10. Roadmap

Phase 3 (Next Steps):

  • Adversarial training with obfuscated and paraphrased fraud examples.
  • Feature ablation studies to reduce false positives.
  • Precision-recall trade-off tuning through threshold optimization.
  • Temporal drift simulation for robustness against evolving attack patterns.
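
The threshold-tuning idea can be illustrated already: instead of taking the argmax of the softmax output, sweep a decision threshold over the fraud probability on validation data and pick the value that best balances precision and recall. The labels and probabilities below are synthetic:

```python
import numpy as np

y_val = np.array([0, 0, 1, 1, 1, 0])                      # synthetic validation labels
p_fraud = np.array([0.30, 0.55, 0.62, 0.90, 0.48, 0.10])  # synthetic fraud probabilities

def f1_at(threshold):
    """F1 score when flagging emails with p(fraud) >= threshold."""
    pred = (p_fraud >= threshold).astype(int)
    tp = int(((pred == 1) & (y_val == 1)).sum())
    fp = int(((pred == 1) & (y_val == 0)).sum())
    fn = int(((pred == 0) & (y_val == 1)).sum())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

best_threshold = max(np.linspace(0.1, 0.9, 81), key=f1_at)
```

Lowering the threshold below 0.5 trades precision for recall, which is exactly the dial a deployment in a high-security environment would want to control explicitly.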

11. Citation

If you use this model in research, please cite:

  • Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter.
  • Klimt, B., & Yang, Y. (2004). The Enron Corpus: A New Dataset for Email Classification Research.

