ML Bot Detection in 2026: What's Changed, What Still Works
Bot detection has shifted to ML models trained on cross-customer telemetry. What the models learned and how anti-detect engines respond.
Bot detection used to be rule-based: "if user agent says iPhone but screen is 1920x1080, flag." That world ended around 2022.
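For illustration, the rule quoted above could be written as a single hard-coded check. This is a minimal sketch, not any vendor's actual rule: the `Session` type, the `rule_based_flag` name, and the 1920-pixel threshold are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical container for two signals reported by the browser."""
    user_agent: str
    screen_width: int
    screen_height: int


def rule_based_flag(session: Session) -> bool:
    """Flag a session whose user agent says iPhone but whose screen is desktop-sized."""
    claims_iphone = "iPhone" in session.user_agent
    # Illustrative threshold taken from the 1920x1080 example in the text.
    desktop_sized = max(session.screen_width, session.screen_height) >= 1920
    return claims_iphone and desktop_sized
```

A rule like this is cheap to evaluate but also cheap to evade: changing one reported value makes it pass.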
Modern detection is ML-based. Models are trained on cross-customer telemetry (billions of sessions, labelled bot or human) and learn to recognise patterns. The patterns aren't fixed rules; they're statistical correlations across dozens of signals.
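To make the difference from a fixed rule concrete, here is a toy sketch of learning a correlation from labelled sessions. It is far simpler than the deep models described in this article: it only counts how often each pair of signals appears in bot-labelled sessions. The function name, the two signals, and the tuple layout are assumptions for the example.

```python
from collections import Counter


def learn_pair_bot_rates(sessions):
    """Estimate the bot fraction for each (ua_os, font_os) pair.

    `sessions` is an iterable of (ua_os, font_os, is_bot) tuples,
    a hypothetical stand-in for labelled telemetry.
    """
    totals = Counter()
    bots = Counter()
    for ua_os, font_os, is_bot in sessions:
        pair = (ua_os, font_os)
        totals[pair] += 1
        if is_bot:
            bots[pair] += 1
    # Nothing is hard-coded: the "pattern" is whatever the labels show.
    return {pair: bots[pair] / totals[pair] for pair in totals}
```

Nobody wrote a rule saying that mismatched signals are suspicious. If mismatched pairs mostly carry the bot label in the training data, the learned rate for those pairs is high.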
What changed
| Era | Detection method | Anti-detect response |
|---|---|---|
| Pre-2020 | Rule-based | Spoof individual surfaces |
| 2020-22 | Hybrid rules + simple ML | Cover more surfaces |
| 2023+ | Deep ML on cross-customer data | Cover all surfaces consistently |
The third era is qualitatively different. You can't beat the model by spoofing one more signal; you have to be coherent across all of them.
What the models look at
Approximate weights from our reverse-engineering of FingerprintJS Pro and DataDome (estimates, not vendor-published figures):
- IP-geolocation alignment: 20% (cheapest, biggest signal)
- Network-layer fingerprint (TLS JA3, HTTP/2, WebRTC): 18%
- Browser fingerprint (canvas, WebGL, audio, fonts): 15%
- Behavioural (mouse, keyboard, scroll): 15%
- Consistency across signals: 12% (the killer)
- Device intelligence (cross-customer signals): 10%
- Session history (repeat visits, returning behaviour): 10%
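As a rough way to read these weights, the sketch below combines per-category suspicion scores into one number with a plain weighted sum. This is a simplification: real detectors are non-linear models, and the weights are only the approximate estimates listed above. The key names and the `combined_risk` function are hypothetical.

```python
# Approximate weights from the list above, as fractions of 1.0.
WEIGHTS = {
    "ip_geo_alignment": 0.20,
    "network_fingerprint": 0.18,
    "browser_fingerprint": 0.15,
    "behavioural": 0.15,
    "consistency": 0.12,
    "device_intelligence": 0.10,
    "session_history": 0.10,
}


def combined_risk(signal_scores: dict[str, float]) -> float:
    """Weighted sum of per-category suspicion scores, each in [0, 1]."""
    missing = WEIGHTS.keys() - signal_scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        score = signal_scores[name]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} score must be in [0, 1], got {score}")
        total += weight * score
    return total
```

Under this linear reading, a session that is clean everywhere except a fully mismatched IP geolocation would score about 0.20. That is one way to see why a bad proxy can sink an otherwise careful profile.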
The consistency dimension is what catches half-spoofed profiles. If your canvas claims Windows but your fonts claim Mac, the session gets flagged.
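A consistency check of this kind can be sketched by grouping surfaces according to the operating system each one implies. The surface names and the idea that each surface maps cleanly to one OS label are assumptions for illustration; a production detector would compare many more attributes than OS.

```python
from collections import defaultdict


def group_surfaces_by_os(surface_claims: dict[str, str]) -> dict[str, list[str]]:
    """Map each implied OS to the sorted list of surfaces that imply it."""
    groups = defaultdict(list)
    for surface, os_name in surface_claims.items():
        groups[os_name].append(surface)
    return {os_name: sorted(surfaces) for os_name, surfaces in groups.items()}


def is_coherent(surface_claims: dict[str, str]) -> bool:
    """A profile is coherent here if every surface implies the same OS."""
    return len(group_surfaces_by_os(surface_claims)) <= 1
```

For a half-spoofed profile such as `{"user_agent": "Windows", "canvas": "Windows", "fonts": "macOS"}`, `is_coherent` returns `False`, and the grouping shows that `fonts` is the surface out of line.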
What "good" looks like
A profile that beats ML detectors has:
- All 47 fingerprint surfaces aligned (see inventory)
- Network layer matching the claimed device (TLS, HTTP/2, UDP)
- Behavioural humaniser running (mouse, scroll, typing cadence)
- Long-running sessions with realistic visit patterns
The closest public test is demo.fingerprint.com. If you can clear it consistently, you're in the right ballpark for real targets.
What doesn't work anymore
- Just spoofing canvas: covers 1 of 47 surfaces
- Disabling WebRTC: itself a signal
- Random user agent rotation: inconsistent with rest of profile
- TLS-only fingerprint spoofing: leaves JS-layer surfaces exposed
- Cheap residential proxies: bad ASN history flagged at IP layer before fingerprint matters
Bottom line
The ML era rewards coherent profiles and punishes random spoofing. Pick an engine that ships full coverage and uses values derived from a single device choice (like Afina), and you'll clear most ML-based detectors. Cheap "anti-detect" tools fail because their spoofing is uncoordinated across surfaces.