How DelPhish works

Our system combines four analysis layers (heuristic rules, a machine learning model, a BERT model, and a large language model) to detect phishing attempts in email and SMS.

Analysis Pipeline

Every email passes through four independent analysis layers, and the layer scores are then combined into a single score. SMS messages use only the heuristic and LLM layers.

📧 Email / SMS → four analysis layers → 🎯 Combined Score

⚙

Heuristic Engine (30%)

Rule-based pattern matching across 6 categories

🔗 URL Analysis (25%)

13 rules that analyze email links.

💬 Keywords (18%)

7 rules that detect linguistic patterns.

👤 Sender Analysis (17%)

7 rules that verify the sender.

📄 Content Analysis (12%)

10 rules that analyze the body.

🔎 URL Intelligence (15%)

Live analysis of each URL found.
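
As a rough illustration of how per-category results could roll up into one heuristic score, the sketch below takes a score between 0 and 1 for each category and computes a weighted average using the percentages listed above. The function name, the 0 to 1 score scale, and the normalization step are assumptions made for this example. Only five categories are named here and their weights sum to 87%, so the sketch divides by the total weight of the categories it is given.

```python
# Category weights as listed in the text. Only five of the six categories
# are named there, so these sum to 0.87 rather than 1.0.
CATEGORY_WEIGHTS = {
    "url_analysis": 0.25,
    "keywords": 0.18,
    "sender_analysis": 0.17,
    "content_analysis": 0.12,
    "url_intelligence": 0.15,
}


def heuristic_score(category_scores: dict[str, float]) -> float:
    """Weighted average of per-category scores, each clamped to [0, 1].

    Normalizing by the total weight of the categories present is an
    assumption made for this sketch, not documented behavior.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for name, score in category_scores.items():
        weight = CATEGORY_WEIGHTS[name]
        weighted_sum += weight * min(max(score, 0.0), 1.0)
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0
```

For example, `heuristic_score({"url_analysis": 1.0, "keywords": 0.0})` weights the URL result more heavily than the keyword result, because URL Analysis carries 25% against 18% for Keywords.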

📊

Machine Learning Model (25%)

Calibrated Random Forest trained on 10,000+ samples

97% Accuracy

0.997 ROC AUC

52 Numeric Features

1,000 TF-IDF Features

The model is a Random Forest trained on more than 10,000 samples and calibrated with scikit-learn's CalibratedClassifierCV. It extracts 52 numeric features from each email (URL count, keywords, sender entropy, hidden text detection, punycode URLs, leet-speak patterns, etc.) along with 1,000 TF-IDF features from the text, giving a 1,052-dimensional vector. Calibration adjusts the model's output probabilities so that they track the observed phishing frequency: among emails scored near 0.8, roughly 80% should be phishing.
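
To make the numeric features more concrete, here is a minimal sketch of how four of them (URL count, sender entropy, punycode URLs, and leet-speak patterns) might be computed with the Python standard library. The regular expressions, the function names, and the exact definitions are assumptions for illustration; the feature definitions used by the real model are not given here.

```python
import math
import re
from collections import Counter

# Simplified URL matcher; real extraction would be more thorough.
URL_RE = re.compile(r"https?://[^\s<>]+", re.IGNORECASE)
# A digit or "$" sandwiched between letters, e.g. "p4ypal" or "acc0unt".
# This is a crude heuristic and will also match tokens such as "mp3s".
LEET_RE = re.compile(r"[a-z][0134578$][a-z]", re.IGNORECASE)


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for empty input."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def numeric_features(sender: str, body: str) -> dict[str, float]:
    """Compute a small, illustrative subset of numeric features."""
    urls = URL_RE.findall(body)
    local_part = sender.split("@", 1)[0]
    return {
        "url_count": float(len(urls)),
        # Random-looking sender names tend to have higher entropy.
        "sender_entropy": shannon_entropy(local_part),
        # Punycode hostnames carry the "xn--" prefix.
        "punycode_urls": float(sum("xn--" in u.lower() for u in urls)),
        "leet_tokens": float(len(LEET_RE.findall(body))),
    }
```

In a full pipeline, a vector like this would be concatenated with the TF-IDF features before being passed to the classifier.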

🧠

BERT Model (20%)

Deep learning transformer specialized in phishing detection

📖

What is BERT?

BERT (Bidirectional Encoder Representations from Transformers) is a deep learning model that understands the contextual meaning of text. Unlike keyword matching, it captures semantic relationships between words.

🎯

How it works here

We use a fine-tuned BERT model specifically trained to classify emails as phishing or legitimate. It analyzes the full text of the email and outputs a probability score, capturing subtle patterns that rules and traditional ML might miss.

💡

Strengths

Excellent at detecting sophisticated phishing that uses proper grammar and subtle social engineering. Catches attacks that bypass keyword-based detection by understanding intent and context.

⚠

Limitations

Only used for email analysis (not SMS). Trained on English data, so non-English emails are translated before processing. May occasionally be overconfident on very short emails.

🤖

LLM Analysis (25%)

Large Language Model for natural language reasoning

💭

What it does

A large language model reads the full email and reasons about it like a human expert. It identifies social engineering tactics, urgency manipulation, brand impersonation, and other deceptive patterns.

🌍

Multilingual

Unlike other layers that require English input, the LLM analyzes emails in their original language. It understands phishing patterns in Spanish, English, and many other languages natively.

📱

SMS Support

For SMS analysis, the LLM is one of only two layers used (along with the heuristic engine). It's especially effective at identifying smishing attempts because it understands the context of short messages.

📝

Rich output

Beyond a numeric score, the LLM generates a detailed written analysis explaining exactly why an email is suspicious or safe. It also provides actionable recommendations.

⚖

Score Combination

The four layer scores are combined using asymmetric dampening: a layer that scores significantly below the median has its influence reduced, so that a single fooled layer cannot mask a threat. The dampening applies only in that direction, so high-confidence phishing signals are never suppressed.

Heuristic: 30%

ML: 25%

BERT: 20%

LLM: 25%
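
One possible reading of asymmetric dampening is sketched below: each layer's weight is halved when its score falls well below the median of all layer scores, and the result is a normalized weighted average. The margin of 0.25, the dampening factor of 0.5, the 0 to 1 score scale, and the function name are all hypothetical values chosen for illustration; the actual thresholds and formula are not specified here.

```python
from statistics import median

# Layer weights as listed in the text.
LAYER_WEIGHTS = {"heuristic": 0.30, "ml": 0.25, "bert": 0.20, "llm": 0.25}

# Hypothetical tuning values, chosen only for illustration.
LOW_MARGIN = 0.25  # how far below the median counts as "significantly below"
DAMPENING = 0.5    # factor applied to the weight of a low outlier


def combine_scores(scores: dict[str, float]) -> float:
    """Weighted average in [0, 1] with dampening of low outliers only."""
    if not scores:
        raise ValueError("at least one layer score is required")
    mid = median(scores.values())
    weighted_sum = 0.0
    total_weight = 0.0
    for layer, score in scores.items():
        weight = LAYER_WEIGHTS[layer]
        # Asymmetric: only scores well below the median lose influence.
        # Scores above the median always keep their full weight.
        if score < mid - LOW_MARGIN:
            weight *= DAMPENING
        weighted_sum += weight * score
        total_weight += weight
    # Dividing by the weights actually used also handles inputs with
    # fewer layers, such as SMS (an assumption of this sketch).
    return weighted_sum / total_weight
```

With scores of 0.9, 0.1, 0.85, and 0.9 for the heuristic, ML, BERT, and LLM layers, the low ML score is dampened and the combined result is higher than the plain weighted average of 0.69. A single high score among low ones is left untouched.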

⚠

Risk levels

Safe: < 15%

Low: 15-34%

Medium: 35-59%

High: 60-79%

Critical: ≥ 80%
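
The risk bands above translate directly into a small lookup function. This sketch assumes the combined score is expressed as a percentage and that fractional values between bands (for example 34.5%) fall into the lower band; the function name is hypothetical.

```python
def risk_level(percent: float) -> str:
    """Map a combined score (0 to 100) to the risk labels listed above."""
    if percent < 15:
        return "Safe"
    if percent < 35:
        return "Low"
    if percent < 60:
        return "Medium"
    if percent < 80:
        return "High"
    return "Critical"
```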

Ready to see it in action?

Analyze an email and watch all four layers work together.
