Most bot detection works like a bouncer checking IDs at a nightclub door: one look, one check, one decision. If the ID looks okay, you're in. If not, you're out.
That approach fails spectacularly against modern bots.
Today's automated traffic uses residential proxies to hide datacenter origins. It rotates user agents pulled from real browser databases. It mimics human timing patterns. It sends valid HTTP headers. A single-layer check sees what looks like a real person and waves it through.
Multi-layer detection works differently. Instead of one decisive check, it runs 13 independent analyses on every single click — each examining a different dimension of the request. No single layer makes the final decision. The layers vote, and the combined score determines the verdict.
Here's how it works.
Layer 1: Signal Collection
Before any analysis happens, the system collects every available signal from the HTTP request. This is purely observational — no decisions yet, just data gathering.
What gets collected:
- IP address and geolocation (via Cloudflare headers, not IP lookup — faster and more accurate)
- Full HTTP header set — not just User-Agent, but Accept, Accept-Language, Accept-Encoding, Connection, all Sec-Fetch and Sec-CH-UA headers
- TLS information — protocol version, cipher suite indicators
- Request metadata — HTTP version, header ordering, timing
This raw signal collection takes less than 1 millisecond. Everything runs server-side — no JavaScript, no cookies, no client-side detection that bots can block or fake.
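As a rough illustration, Layer 1 behaves like a pure function from raw headers to a signal dictionary: observation only, no scoring. This is a sketch, not the system's actual code; the function name and the exact set of keys collected are assumptions (CF-IPCountry is Cloudflare's standard geo header).

```python
def collect_signals(headers: dict, remote_ip: str) -> dict:
    """Gather raw request signals; no decisions are made here."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    return {
        "ip": remote_ip,
        "country": h.get("cf-ipcountry"),           # geolocation via Cloudflare header
        "user_agent": h.get("user-agent", ""),
        "accept": h.get("accept"),
        "accept_language": h.get("accept-language"),
        "accept_encoding": h.get("accept-encoding"),
        "connection": h.get("connection"),
        # browser-set fetch metadata and Client Hints, analyzed by later layers
        "sec_fetch": {k: v for k, v in h.items() if k.startswith("sec-fetch-")},
        "client_hints": {k: v for k, v in h.items() if k.startswith("sec-ch-ua")},
    }
```

Everything later layers need comes out of this one pass over the header map, which is why the collection step stays under a millisecond.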
Layer 2: Known Threat Intelligence
The first real check: is this IP address already known to be malicious?
Two community-maintained threat intelligence feeds are checked in real-time:
- FireHOL Level 1: A curated blocklist of IPs confirmed in attacks, scanning, spam, and botnet activity. Updated every 6 hours. Contains roughly 4,500 entries.
- CrowdSec: A collaborative security engine where participating servers share attack data. If an IP attacked any CrowdSec participant, it's flagged for everyone. Contains 22,000+ entries.
These checks use kernel-level IP sets — hash tables loaded directly into the Linux kernel's networking stack. Lookup time: effectively zero. There's no database query, no API call, no file read. The kernel matches IPs against the set during packet processing.
If an IP matches either list, the click is blocked immediately. No further analysis needed. But this layer only catches previously identified threats — it doesn't help with new bots or IPs that haven't been reported yet.
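The membership test itself is simple. Here is a user-space sketch of it using Python's standard `ipaddress` module; the real system does this lookup with kernel IP sets, so nothing like this code runs in the hot path, and the CIDR entries below are illustrative.

```python
import ipaddress

def build_blocklist(cidrs):
    """Parse CIDR strings (the format threat feeds distribute) into networks."""
    return [ipaddress.ip_network(c) for c in cidrs]

def is_listed(ip: str, blocklist) -> bool:
    """True if the IP falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocklist)
```

A kernel IP set replaces this linear scan with a hash lookup during packet processing, which is what makes the check effectively free.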
Layer 3: User Agent Analysis
The User-Agent string is the most commonly faked header, but it's still useful when analyzed properly.
This layer checks for:
- Known bot identifiers: Strings like "bot", "crawler", "spider", "Puppeteer", "PhantomJS", "Selenium", "HeadlessChrome". Obvious, but some operators don't even bother hiding these.
- Ad fraud network signatures: User agents associated with known ad verification and fraud services that generate non-human traffic.
- Chrome version analysis: Chrome updates roughly every 4 weeks. A request claiming Chrome 85 in 2026 is 5+ years outdated — real users auto-update. But this gets nuanced...
The Chrome UA Reduction Problem
Google rolled out User-Agent Reduction as a privacy feature in stages, completing it with Chrome 110 (released February 2023). Modern Chrome sends:
Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Mobile Safari/537.36
Notice: "Android 10; K" in place of the real Android version and device model, and "131.0.0.0" instead of the full build number. This is normal, expected behavior from real browsers — not a bot signal.
Naive bot detectors see ".0.0.0" and flag it as suspicious. Smart detection systems know the reduced minor version has been standard since Chrome 101. The actual bot signal is Chrome versions below 101 with ".0.0.0": those are mimicking the reduction pattern without being new enough to have it naturally.
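That version logic reduces to a single check. This is a sketch with illustrative names and return labels; it assumes the ".0.0.0" reduced minor version shipped with Chrome 101, so only older majors claiming it are flagged (outdated-but-honest version strings are handled by the separate version-age check).

```python
import re

# Capture the major version and the remaining MINOR.BUILD.PATCH triplet.
CHROME_RE = re.compile(r"Chrome/(\d+)\.(\d+\.\d+\.\d+)")

def chrome_ua_signal(user_agent: str) -> str:
    m = CHROME_RE.search(user_agent)
    if not m:
        return "no-chrome"
    major, rest = int(m.group(1)), m.group(2)
    if rest == "0.0.0" and major < 101:
        # Mimics UA Reduction on a version too old to have it naturally.
        return "suspicious"
    return "ok"
```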
Layer 4: Header Consistency
This is where multi-layer detection starts separating from single-check tools. Real browsers don't just send a User-Agent — they send a consistent set of headers that correspond to their claimed identity.
Checks include:
- Accept-Encoding: Every modern browser sends this header to advertise gzip/brotli support, so its absence is suspicious.
- Connection header: Bots often send "Connection: close" — real browsers use keep-alive connections.
- Client Hints consistency: Chrome 89+ sends Sec-CH-UA headers. If the UA says Chrome 131 but there are no Client Hints, something is spoofed.
- HTTP version: Modern Chrome negotiates HTTP/2 or HTTP/3 whenever the server offers them. A request claiming Chrome 131 but arriving over HTTP/1.0 is not coming from a real browser.
Each inconsistency doesn't trigger a block on its own. Instead, it adjusts the trust score. One missing header might mean a privacy extension or unusual configuration. Three missing headers in the same request means a bot.
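This cumulative scoring can be sketched as follows. The penalty weights are illustrative assumptions, not the system's real tuning; the point is that each anomaly shifts a score rather than triggering a block on its own.

```python
def header_consistency_penalty(headers: dict) -> float:
    """Sum small penalties for header anomalies; higher means less trustworthy."""
    h = {k.lower(): v for k, v in headers.items()}
    penalty = 0.0
    if "accept-encoding" not in h:
        penalty += 1.0                      # every modern browser sends this
    if h.get("connection", "").lower() == "close":
        penalty += 0.5                      # browsers hold connections open
    ua = h.get("user-agent", "")
    if "Chrome/" in ua and "sec-ch-ua" not in h:
        penalty += 1.5                      # Chrome 89+ always sends Client Hints
    return penalty
```

One anomaly yields a mild penalty a clean request can absorb; all three together pushes the request firmly toward a block.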
Layer 5: Sec-Fetch Headers
This is one of the most powerful detection signals available in 2026, and most bot operators still don't handle it correctly.
Sec-Fetch headers (Sec-Fetch-Site, Sec-Fetch-Mode, Sec-Fetch-Dest) are set by the browser itself, not by JavaScript or the page. They cannot be spoofed by client-side code. They describe how and why the browser is making the request.
For a popunder click, valid Sec-Fetch headers look like:
Sec-Fetch-Site: cross-site
Sec-Fetch-Mode: navigate
Sec-Fetch-Dest: document
If these headers are missing entirely, the request probably didn't come from a real browser. If they contain invalid values, the request was definitely tampered with.
In our analysis of 114,000+ pop clicks, missing Sec-Fetch headers were the single most reliable bot indicator — responsible for 34% of all block decisions with near-zero false positives.
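The popunder validation above can be sketched like this. The return labels are illustrative, and the accepted values are only the ones valid for a cross-site popunder navigation as described; other flows (same-site links, fetches) would need different expected values.

```python
# Expected Sec-Fetch values for a cross-site popunder navigation.
VALID = {
    "sec-fetch-site": {"cross-site"},
    "sec-fetch-mode": {"navigate"},
    "sec-fetch-dest": {"document"},
}

def sec_fetch_verdict(headers: dict) -> str:
    h = {k.lower(): v.lower() for k, v in headers.items()}
    present = [k for k in VALID if k in h]
    if not present:
        return "missing"      # probably not a real browser
    if any(h[k] not in VALID[k] for k in present):
        return "invalid"      # request was tampered with
    return "valid"
```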
Layer 6: IP Reputation and ASN Analysis
Beyond known threat lists, IP analysis examines the network the request comes from:
- Datacenter detection: Over 50 known datacenter and cloud provider ASNs are checked. Traffic from AWS (AS16509), Google Cloud (AS15169), Azure (AS8075), DigitalOcean (AS14061), OVH (AS16276), Hetzner (AS24940), and dozens more is flagged. Real pop traffic comes from ISPs, not servers.
- Residential proxy detection: Some bot operators route traffic through residential proxy networks to appear as normal ISP traffic. Known proxy provider ASNs are checked with a lighter penalty — they might be legitimate VPN users.
- GeoIP verification: Using MaxMind databases, the system checks whether the IP's geographic location matches the claimed location. A request that says it's from Thailand but originates from a Frankfurt datacenter is suspicious.
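The ASN portion of this layer is essentially a lookup table. This sketch uses the ASNs named above; the penalty weights, and the lighter residential-proxy penalty, are illustrative assumptions.

```python
# ASNs from the article: AWS, Google Cloud, Azure, DigitalOcean, OVH, Hetzner.
DATACENTER_ASNS = {16509, 15169, 8075, 14061, 16276, 24940}
RESIDENTIAL_PROXY_ASNS = set()   # known proxy-provider ASNs would go here

def asn_penalty(asn: int) -> float:
    if asn in DATACENTER_ASNS:
        return 2.5   # real pop traffic comes from ISPs, not servers
    if asn in RESIDENTIAL_PROXY_ASNS:
        return 1.0   # lighter penalty: might be a legitimate VPN user
    return 0.0       # ordinary ISP traffic
```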
Layer 7: Language and Locale Verification
The Accept-Language header reveals what languages the browser is configured for. This should correlate with the user's geographic location — with important exceptions.
A request from a Thai IP with Accept-Language "en-US" could be a real Thai user with an English-configured phone (common in urban Southeast Asia) — or it could be a bot running with default American English settings.
The system accounts for this by maintaining an exception list of 26+ countries where English is commonly used alongside local languages. Thai users sending English headers get a pass. But a request from Japan with Accept-Language "pt-BR" is suspicious.
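The exception logic might look like this sketch. The country and expected-language tables here are small illustrative subsets, not the real 26+-country list.

```python
# Subset of countries where English headers are common alongside local languages.
ENGLISH_OK = {"TH", "IN", "PH", "MY", "SG", "NG"}

# Illustrative expected primary language per country.
EXPECTED_LANG = {"TH": "th", "JP": "ja", "BR": "pt", "DE": "de"}

def locale_penalty(ip_country: str, accept_language: str) -> float:
    # "en-US,en;q=0.9" -> "en"
    lang = accept_language.split(",")[0].split(";")[0].strip().lower().split("-")[0]
    if ip_country not in EXPECTED_LANG:
        return 0.0
    if lang == EXPECTED_LANG[ip_country]:
        return 0.0
    if lang == "en" and ip_country in ENGLISH_OK:
        return 0.0            # e.g. a Thai user with an English-configured phone
    return 1.5                # e.g. a Japanese IP with Accept-Language "pt-BR"
```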
Layer 8: TLS Fingerprinting
Every TLS connection has a fingerprint based on the cipher suites offered, extensions present, and protocol version negotiated. Real browsers have characteristic fingerprints. Automated tools have different ones.
The system examines TLS version information from the connection metadata. Old TLS versions (1.0, 1.1) are suspicious in 2026 — all modern browsers use TLS 1.3 or at minimum TLS 1.2.
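Reduced to code, the check is a short lookup; the penalty weight is an illustrative assumption.

```python
# TLS versions no modern browser negotiates anymore.
OBSOLETE_TLS = {"TLSv1", "TLSv1.0", "TLSv1.1"}

def tls_penalty(tls_version: str) -> float:
    """Penalize connections negotiated over obsolete TLS versions."""
    return 2.0 if tls_version in OBSOLETE_TLS else 0.0
```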
Layer 9: Burst Detection
Simple but effective: if the same IP sends more requests than humanly possible in a short window, it's automated.
Using in-memory rate tracking (APCu — no database, no disk I/O), the system counts requests per IP per time window. Exceeding the burst limit results in an immediate block that persists for a lockout period.
The thresholds are tuned for pop traffic patterns, where the same user might legitimately see 2-3 pops in a session across different ad zones, but won't generate 15 requests in 10 seconds.
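An in-memory sliding-window counter, analogous to the APCu-based tracking described above, can be sketched in a few lines. The window and limit values are illustrative, chosen to match the "15 requests in 10 seconds" example.

```python
import time
from collections import defaultdict, deque

WINDOW = 10.0      # seconds
LIMIT = 15         # max requests per IP per window

_hits = defaultdict(deque)   # ip -> timestamps of recent requests

def burst_blocked(ip, now=None):
    """Record one request; return True once the IP exceeds the burst limit."""
    now = time.monotonic() if now is None else now
    q = _hits[ip]
    q.append(now)
    while q and now - q[0] > WINDOW:   # drop hits outside the window
        q.popleft()
    return len(q) > LIMIT
```

In the real system a match here triggers an immediate block plus a lockout period, rather than just returning a boolean.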
Layer 10: Zone Reputation
This layer doesn't analyze the individual click — it analyzes the zone it came from.
Every zone builds a reputation score based on historical traffic quality. Zones with high accept rates and consistent clean traffic get bonuses. Zones with high bot rates get penalties or are blocked entirely.
New zones enter a probation period: the first N clicks from an unknown zone are randomly sampled (some accepted, some blocked) to build a baseline quality profile. After enough data accumulates, the zone gets classified as trusted, watch, or blocked.
This is where individual click detection becomes zone intelligence. Instead of analyzing each click independently, the system uses accumulated knowledge about where the traffic comes from.
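The probation flow described above can be sketched as a small state function. The click threshold, sample rate, bot-rate cutoffs, and labels are all illustrative assumptions.

```python
import random

PROBATION_CLICKS = 500     # clicks before a zone gets classified
SAMPLE_ACCEPT_RATE = 0.5   # fraction of probation clicks accepted for sampling

def zone_action(clicks_seen: int, bot_rate: float, rng=random.random) -> str:
    if clicks_seen < PROBATION_CLICKS:
        # Random sampling to build a baseline quality profile.
        return "accept" if rng() < SAMPLE_ACCEPT_RATE else "block"
    if bot_rate < 0.10:
        return "trusted"
    if bot_rate < 0.40:
        return "watch"
    return "blocked"
```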
Layer 11: Device Fingerprinting
Modern HTTP requests contain enough device information to build a partial fingerprint — even without JavaScript.
Client Hints (Sec-CH-UA, Sec-CH-UA-Mobile, Sec-CH-UA-Platform) tell the system the browser brand, whether it's mobile, and the operating system. Combined with User-Agent analysis, this creates a device profile that should be internally consistent.
Inconsistencies flag automated traffic: a "mobile" device that sends desktop Client Hints, a "Windows" platform with an Android User-Agent, Chrome claiming to be version 131 but missing the Client Hints that Chrome 89+ always includes.
Layer 12: Geographic Cross-Validation
Multiple geographic signals are compared:
- The IP's geographic location (from Cloudflare's edge network and MaxMind databases)
- The locale implied by the Accept-Language header
- The country claimed by the traffic source's targeting parameters
Minor mismatches (neighboring countries, border regions) get small penalties. Major mismatches (Asian IP with European geo claim) get significant penalties.
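A sketch of the cross-validation penalty, here reduced to IP country versus claimed country; the region groupings and weights are illustrative assumptions.

```python
# Illustrative region table for grading the severity of a mismatch.
REGION = {"TH": "asia", "JP": "asia", "DE": "europe", "FR": "europe",
          "US": "americas", "BR": "americas"}

def geo_penalty(ip_country: str, claimed_country: str) -> float:
    if ip_country == claimed_country:
        return 0.0
    if REGION.get(ip_country) == REGION.get(claimed_country):
        return 0.5    # same-region mismatch: small penalty
    return 2.0        # cross-continental mismatch: significant penalty
```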
Layer 13: Trust Score Decision
After all 12 analytical layers have run, the final layer makes the decision.
Each click starts with a base trust score of 5.0 (on a 0-10 scale). Every layer adds or subtracts from this score based on what it found. Positive signals (valid Sec-Fetch headers, residential IP, consistent device fingerprint) increase trust. Negative signals (datacenter IP, missing headers, burst rate) decrease it.
The final decision is simple:
- Trust ≥ threshold (typically 5.0-5.5): ACCEPT — redirect to the destination
- Trust < threshold: BLOCK — reject the click
No single layer can block traffic on its own (except known threat intelligence matches). A click needs to fail multiple independent checks before its trust score drops below threshold. This multi-signal approach minimizes false positives — real users who happen to use a VPN or have a slightly unusual browser configuration won't be blocked because one signal is off.
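The final scoring step fits in a few lines. The base score, the 0-10 clamp, and the threshold follow the numbers above; the per-layer adjustments a caller would pass in come from the earlier layers, and this sketch omits the threat-intelligence hard block.

```python
BASE_TRUST = 5.0
THRESHOLD = 5.0   # the article cites a typical range of 5.0-5.5

def decide(adjustments):
    """Fold per-layer score adjustments into a verdict."""
    score = BASE_TRUST + sum(adjustments)
    score = max(0.0, min(10.0, score))   # clamp to the 0-10 scale
    return score, ("ACCEPT" if score >= THRESHOLD else "BLOCK")
```

One negative signal (say, a VPN penalty of -1.0) against otherwise clean positives leaves the score above threshold, which is exactly the false-positive behavior described above.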
Performance: The Speed Constraint
All 13 layers must execute before the redirect happens. In pop traffic, that redirect needs to be fast — every 100ms of latency reduces the percentage of users who reach the landing page.
The entire pipeline — all 13 layers, all signal analysis, the trust score calculation, and the ACCEPT/BLOCK decision — completes in 3-8 milliseconds. That's faster than a single DNS lookup.
How? No external API calls. No database queries in the hot path. IP sets in kernel memory. MaxMind databases memory-mapped by the OS. Rate limiting in shared memory (APCu). Zone reputation from pre-computed JSON files. Everything is designed for zero-latency operation at thousands of requests per second.
What This Means for Media Buyers
If you're buying pop, push, or native traffic, here's why multi-layer detection matters to you:
- It catches modern bots. Single-signal tools (IP blocklists alone, or JavaScript challenges alone) miss bots that use residential proxies and valid browser headers. Multi-layer scoring catches them through accumulated weak signals.
- It doesn't block real users. The trust score system means a real user who happens to use a VPN (one negative signal) isn't blocked if all their other signals are clean. This keeps your conversion rates intact.
- It generates zone intelligence. Individual click filtering is useful, but the real value is the aggregate data: which zones are consistently clean, which are consistently bad, which need more monitoring. This feeds directly into your zone blocklists.
- It's invisible to your traffic flow. 3-8ms latency is imperceptible. Your pop chain works normally — the filtering just happens before the redirect.
The Limits of Detection
No detection system is perfect. Here's what multi-layer analysis handles well and what it doesn't:
Catches Reliably
- Datacenter-based bots (even with spoofed UAs)
- Headless browsers (Puppeteer, Playwright, Selenium)
- Simple traffic generators and curl-based scripts
- Click farms with mechanical timing patterns
- Known malicious IPs and networks
- Inconsistent device fingerprints
Catches With High Probability
- Residential proxy bots (caught by header/timing analysis even if IP is clean)
- Browser extension-based bots (caught by behavioral anomalies)
- Sophisticated spoofers (usually miss 2-3 header consistency checks)
Hard to Catch
- Real humans paid to click (because they ARE real humans — the browser signals are genuine)
- Malware on real devices generating background traffic (genuine device, genuine IP, genuine headers)
The good news: for pop traffic arbitrage and media buying, the first two categories are 95%+ of bot traffic. The hard-to-catch cases represent a tiny fraction of total automated traffic.
See 13-Layer Detection on Your Traffic
Connect your pop campaign and watch each click get analyzed in real-time. See exactly which layers flag which traffic.
Start Free Detection: 100K checks free. Full trust score breakdown on every click.