Advanced identification: hardware/software/session signal separation, proprietary hashing, and AI-powered cross-session matching
tracio.ai assigns a stable, persistent identifier (the visitorId) to every browser based on its unique combination of hardware and software characteristics. Unlike cookies or local storage tokens, the visitor ID survives incognito mode, cookie clearing, VPN switching, and browser updates because it is derived from signals that are intrinsic to the device and browser engine rather than stored state.
The identification pipeline has four stages: signal collection, multi-tier signal processing, AI-powered matching against known visitors, and confidence scoring. Each stage is designed to maximize accuracy while minimizing false positives (incorrectly merging two different visitors) and false negatives (failing to recognize a returning visitor).
The JavaScript SDK collects over 140 individual signals from the browser environment, typically in well under 100ms. Signals are grouped into categories based on which aspect of the device or browser they characterize. Collection runs entirely in the browser and does not access personally identifiable information such as names, emails, or browsing history.
| Category | Signals | Examples | Uniqueness |
|---|---|---|---|
| Canvas | 2 | 2D canvas rendering hash, text rendering variations | Very High |
| WebGL | 12 | Renderer, vendor, extensions, shader precision, max texture size | Very High |
| Audio | 2 | AudioContext oscillator output, audio processing hash | High |
| Fonts | 3 | Available system fonts, font rendering metrics, @font-face support | High |
| Screen | 8 | Resolution, color depth, pixel ratio, available screen area, frame offsets | Medium |
| Navigator | 18 | Platform, languages, hardware concurrency, device memory, max touch points | Medium |
| CSS/Media | 25 | Media query results, CSS feature support, color scheme preferences | Medium |
| Math | 14 | Math function precision variations across browser engines | Medium |
| Client Hints | 8 | UA-CH brands, platform version, architecture, bitness, mobile flag | Medium |
| Functional Tests | 20 | API availability, feature detection, DOM behavior tests | Low |
| Storage | 6 | Cookie support, localStorage, sessionStorage, IndexedDB, FileSystem API | Low |
| MathML | 4 | MathML rendering dimensions, operator layout | Medium |
| Emoji | 3 | Emoji rendering measurements, variation selector support | Medium |
| Connection | 4 | Network Information API: effectiveType, downlink, rtt, saveData | Low |
| Pro Signals | 15 | Bot detection, headless markers, automation framework scanning | Diagnostic |
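To make one row of the table concrete, the sketch below shows how a collector for the Math category might look. The function name `collectMathSignals` and the specific probes are assumptions chosen for illustration; the actual signals tracio.ai collects are not listed in this document.

```javascript
// Hypothetical collector for the "Math" category (illustrative only).
// Each probe evaluates a function whose low-order digits may differ
// between JavaScript engines; the result is stored as a string so the
// full printed precision is preserved.
function collectMathSignals() {
  const probes = {
    tan: () => Math.tan(-1e300),
    sinh: () => Math.sinh(1),
    expm1: () => Math.expm1(1),
    atanh: () => Math.atanh(0.5),
    cbrt: () => Math.cbrt(100),
    pow: () => Math.pow(Math.PI, -100),
  };

  const signals = {};
  for (const [name, probe] of Object.entries(probes)) {
    try {
      signals[name] = String(probe());
    } catch (err) {
      // Keep the key present so the signal layout stays stable.
      signals[name] = 'error';
    }
  }
  return signals;
}
```

Within one engine the output is the same on every call, which is what makes this kind of signal usable as a fingerprint input.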
Signal collection is optimized to minimize impact on page performance. Most signals are collected synchronously in under 5ms; a few high-value signals require asynchronous operations.
The total collection time is typically 30-80ms on modern hardware. On older mobile devices, it may take up to 150ms. The SDK uses requestAnimationFrame scheduling to avoid blocking the main thread during collection.
Signals are separated into three tiers based on their stability over time. Each tier is processed independently using proprietary algorithms to produce a tier-specific fingerprint. This approach allows the identification engine to handle expected signal drift (such as browser version updates) without losing track of the visitor.
| Tier | Stability | Signals Included | Weight in Match | Expected Drift |
|---|---|---|---|---|
| Hardware | Very High | Canvas hash, WebGL renderer/vendor, audio hash, screen resolution, color depth, pixel ratio, hardware concurrency, device memory | 60% | Changes only with hardware upgrades or OS reinstall |
| Browser | Medium | Installed fonts, User-Agent, platform, languages, plugins, CSS features, math precision, MathML layout, emoji rendering | 30% | May change with browser updates, font installation, or language preferences |
| Session | Low | Timezone, cookie support, storage availability, connection type, Client Hints, media queries (prefers-color-scheme, etc.) | 10% | Changes frequently with user actions (timezone travel, dark mode toggle) |
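The "Weight in Match" column can be read as a weighted average over per-tier match scores. The following is a minimal sketch under that reading; `combineTierScores` and the idea of a single 0-1 score per tier are assumptions for illustration, not the documented engine.

```javascript
// Tier weights taken from the table above.
const TIER_WEIGHTS = { hardware: 0.6, browser: 0.3, session: 0.1 };

// Combines per-tier match scores (each in [0, 1]) into one weighted score.
// Hypothetical helper: the real engine compares signals individually.
function combineTierScores(scores) {
  let total = 0;
  for (const [tier, weight] of Object.entries(TIER_WEIGHTS)) {
    const score = scores[tier];
    if (typeof score !== 'number' || score < 0 || score > 1) {
      throw new RangeError(`score for tier "${tier}" must be a number in [0, 1]`);
    }
    total += weight * score;
  }
  return total;
}
```

Under these weights, a visitor whose session tier changed completely but whose hardware and browser tiers still match would score about 0.9.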
Within each tier, the signal values are concatenated and passed through a proprietary, deterministic algorithm that produces a fixed-length fingerprint: the same inputs always yield the same output. The algorithm is optimized for distribution quality. The three tier fingerprints are then combined into a composite identifier used for visitor lookup.
```javascript
// Conceptual processing pipeline (simplified)
const hardwareFingerprint = processSignals([
  canvasHash, webglRenderer, webglVendor, audioHash,
  screenWidth, screenHeight, colorDepth, pixelRatio,
  hardwareConcurrency, deviceMemory
]);

const browserFingerprint = processSignals([
  fontList, userAgent, platform, languages,
  plugins, cssFeatures, mathPrecision
]);

const sessionFingerprint = processSignals([
  timezone, cookieSupport, storageAvail,
  connectionType, colorScheme
]);

// Composite fingerprint for lookup
const fingerprint = hardwareFingerprint + ':' + browserFingerprint + ':' + sessionFingerprint;
```

The visitor ID is a 16-20 character alphanumeric string derived from the hardware-tier hash combined with historical visitor data. The computation follows these steps:
Visitor IDs use Base62 encoding (alphanumeric characters: a-zA-Z0-9) for compact, URL-safe representation. A typical visitor ID looks like X7fh2Hg9LkMn3pQr. The length varies between 16 and 20 characters depending on the hash output. Visitor IDs are case-sensitive.
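The format described above can be illustrated with a small Node.js sketch. It uses SHA-256 as a stand-in for the proprietary algorithm, and the alphabet ordering, the 16-character truncation, and the names `hashSignals`, `toBase62`, and `isWellFormedVisitorId` are all assumptions made for this example. It does not reproduce how real visitor IDs are derived, which also involves historical visitor data.

```javascript
const { createHash } = require('node:crypto');

// Assumed alphabet ordering; the document only states that Base62 is used.
const BASE62 = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';

// Encodes a Buffer as a Base62 string by treating it as one big integer.
function toBase62(buffer) {
  let value = BigInt('0x' + (buffer.toString('hex') || '0'));
  if (value === 0n) return BASE62[0];
  let out = '';
  while (value > 0n) {
    out = BASE62[Number(value % 62n)] + out;
    value /= 62n;
  }
  return out;
}

// Stand-in for the proprietary tier hashing: SHA-256 over the JSON-encoded
// signal list (JSON keeps value boundaries unambiguous), Base62-encoded and
// cut to a fixed 16 characters.
function hashSignals(values) {
  const digest = createHash('sha256').update(JSON.stringify(values)).digest();
  return toBase62(digest).padStart(16, '0').slice(0, 16);
}

// Checks only the documented shape: 16-20 case-sensitive alphanumerics.
function isWellFormedVisitorId(id) {
  return typeof id === 'string' && /^[A-Za-z0-9]{16,20}$/.test(id);
}
```

A shape check like `isWellFormedVisitorId` is useful for rejecting malformed input early on the server, but it says nothing about whether the ID was actually issued.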
Real-world browsers change over time. Fonts get installed. Browser versions update. Users switch between light and dark mode. The AI-powered matching engine accounts for these expected changes by comparing signals individually rather than relying on a single monolithic approach.
Each signal comparison produces a match score between 0.0 (completely different) and 1.0 (identical). Signal weights are based on uniqueness and stability:
| Signal | Weight | Matching Strategy |
|---|---|---|
| Canvas hash | 15% | Exact match (binary: 0 or 1) |
| WebGL renderer | 12% | Exact match on GPU model, fuzzy on driver version |
| Audio hash | 10% | Exact match (binary: 0 or 1) |
| Font list | 8% | Set similarity (intersection / union). Threshold: 0.85 |
| Screen resolution | 8% | Exact match on both dimensions |
| WebGL extensions | 6% | Set similarity. New extensions are tolerated; removed extensions penalized. |
| Math precision | 5% | Exact match per function (engine-specific) |
| User-Agent | 4% | Flexible: OS and browser family must match; version can differ |
| Hardware concurrency | 4% | Exact match |
| CSS features | 3% | Set similarity with tolerance for newly added features |
| Languages | 3% | Primary language must match; secondary languages can differ |
| Platform | 3% | Exact match |
| Timezone | 2% | Exact match (travelers may have legitimate mismatches) |
| Color depth / pixel ratio | 2% | Exact match |
| Other signals | 15% | Various per-signal strategies |
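Several of the strategies in the table reduce to small comparison functions that return a score between 0 and 1. The sketch below shows plausible versions of four of them. The function names are hypothetical, and `retainedFraction` is only one possible reading of "new extensions are tolerated; removed extensions penalized".

```javascript
// Exact match: 1 when equal, 0 otherwise (canvas hash, audio hash, platform).
function exactMatch(previous, current) {
  return previous === current ? 1 : 0;
}

// Set similarity as intersection / union, as described for the font list.
function setSimilarity(previous, current) {
  const a = new Set(previous);
  const b = new Set(current);
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  for (const item of a) {
    if (b.has(item)) intersection += 1;
  }
  return intersection / (a.size + b.size - intersection);
}

// Asymmetric similarity (assumed reading for WebGL extensions): the fraction
// of previously seen items that are still present. Additions cost nothing.
function retainedFraction(previous, current) {
  const before = new Set(previous);
  if (before.size === 0) return 1;
  const now = new Set(current);
  let kept = 0;
  for (const item of before) {
    if (now.has(item)) kept += 1;
  }
  return kept / before.size;
}

// Languages: only the primary (first) language has to match.
function primaryLanguageMatch(previous, current) {
  return previous.length > 0 && previous[0] === current[0] ? 1 : 0;
}
```

For example, installing one new font on a machine that previously reported three gives a set similarity of 3/4 = 0.75, which would fall below the 0.85 threshold listed for the font list.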
The matching engine tolerates specific patterns of signal drift that are known to occur during normal browser lifecycle events:
- Dark mode toggle: changes the prefers-color-scheme media query result. This is a low-weight session-tier signal.

The confidence score (0.0-1.0) quantifies how certain the system is that the identified visitor is correctly matched. It is computed as the weighted sum of individual signal match scores, normalized to the [0, 1] range. The score reflects both the quality of the match and the number of signals that could be compared.
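A minimal sketch of such a weighted, normalized score follows. It assumes that a signal which could not be compared contributes zero while its weight still counts in the denominator, which is one way the score could reflect how many signals were comparable. The function name `confidenceScore` and this handling of missing signals are assumptions, not the documented formula.

```javascript
// weights:     { signalName: weight }, for example taken from the table above
// matchScores: { signalName: score in [0, 1] } for signals that were compared
function confidenceScore(weights, matchScores) {
  let totalWeight = 0;
  let weightedSum = 0;
  for (const [signal, weight] of Object.entries(weights)) {
    totalWeight += weight;
    const score = matchScores[signal];
    if (typeof score === 'number') {
      // Clamp defensively so one bad input cannot push the result out of range.
      weightedSum += weight * Math.min(1, Math.max(0, score));
    }
    // Signals without a score add nothing but still count in totalWeight.
  }
  return totalWeight === 0 ? 0 : weightedSum / totalWeight;
}
```

With weights `{ canvasHash: 15, webglRenderer: 12, audioHash: 10, fontList: 8 }`, three exact matches plus a font similarity of 0.9 give 44.2 / 45, roughly 0.98. If the audio hash could not be compared at all, the score drops to 34.2 / 45 = 0.76.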
| Score Range | Confidence Level | Typical Scenario | Recommended Action |
|---|---|---|---|
| 0.99+ | Very High | Returning visitor, all signals match exactly | Trust identification fully |
| 0.95-0.99 | High | Minor signal drift from browser update or font change | Trust identification; monitor for patterns |
| 0.85-0.95 | Medium | Several signals changed (new monitor, major browser update) | Consider additional verification for sensitive actions |
| 0.70-0.85 | Low | Significant changes, possible different device | Require re-authentication or step-up verification |
| < 0.70 | Very Low | First-time visitor or major environment change | Treat as new visitor; do not rely on historical data |
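The bands in the table can be expressed as a small lookup. In this sketch the lower bound of each band is treated as inclusive, which is an assumption, since the table's ranges share their endpoints; `confidenceLevel` is a hypothetical helper name.

```javascript
// Maps a confidence score to the level names used in the table above.
// Lower bounds are treated as inclusive (assumption).
function confidenceLevel(score) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError('score must be a number in [0, 1]');
  }
  if (score >= 0.99) return 'Very High';
  if (score >= 0.95) return 'High';
  if (score >= 0.85) return 'Medium';
  if (score >= 0.70) return 'Low';
  return 'Very Low';
}
```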
```javascript
// Server-side confidence-based decisions.
// Assumes an async request handler in which requestId, userId,
// transactionData, and res are already in scope.
const event = await fetchEvent(requestId);
const { visitorId, confidence } = event.products.identification.data;

if (confidence.score >= 0.95) {
  // High confidence: proceed normally
  await processTransaction(visitorId, transactionData);
} else if (confidence.score >= 0.85) {
  // Medium confidence: add friction
  await sendSMSVerification(userId);
  await processTransaction(visitorId, transactionData);
} else {
  // Low confidence: block and require re-authentication
  return res.status(401).json({
    error: 'Please verify your identity',
    reason: 'device_change_detected',
  });
}
```

While tracio.ai does not depend on cookies for identification, it uses an optional first-party cookie to speed up identification for returning visitors. The cookie stores a hashed reference to the previous visitor ID, allowing the server to skip the full candidate search and go directly to verification. This reduces server-side latency by 30-50% for returning visitors.
| Property | Value | Purpose |
|---|---|---|
| Name | _tcid | First-party cookie storing the visitor reference hash |
| Domain | Your domain (first-party) | Scoped to your site only, not shared cross-site |
| Expiry | 365 days | Long-lived for returning visitor optimization |
| Secure | true | Only sent over HTTPS |
| HttpOnly | false | Accessible by the JavaScript SDK |
| SameSite | Lax | Sent with top-level navigations but not cross-site requests |
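For reference, the attributes in the table correspond to a cookie string like the one built below. This is a sketch only: `buildTcidCookie` is a hypothetical helper, and `Path=/` and the use of `Max-Age` rather than `Expires` are assumptions. The SDK manages this cookie itself, so application code would not normally need to set it.

```javascript
// Builds a cookie string matching the table: first-party, 365 days,
// Secure, SameSite=Lax, and no HttpOnly flag so that the JavaScript SDK
// can read it. The Domain attribute is omitted, which scopes the cookie
// to the current host.
function buildTcidCookie(referenceHash) {
  const maxAgeSeconds = 365 * 24 * 60 * 60; // 365 days
  return [
    `_tcid=${encodeURIComponent(referenceHash)}`,
    `Max-Age=${maxAgeSeconds}`,
    'Path=/', // assumption: site-wide scope
    'Secure',
    'SameSite=Lax',
  ].join('; ');
}
```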
When the cookie is absent (incognito mode, cookie clearing, first visit), the identification still works through full fingerprint matching. The cookie is purely a performance optimization, not a requirement for identification accuracy.
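The fast path and its fallback can be sketched with in-memory stand-ins. Everything here is hypothetical: `visitors` (visitor ID to stored fingerprint), `refIndex` (cookie reference hash to visitor ID), and the `matches` predicate stand in for server-side components that this document does not describe.

```javascript
// Returns the matched visitor ID and which lookup path produced it.
function identify(fingerprint, cookieRef, { visitors, refIndex, matches }) {
  // Fast path: the cookie names a single candidate, which is then verified.
  const hinted = cookieRef ? refIndex.get(cookieRef) : undefined;
  if (
    hinted !== undefined &&
    visitors.has(hinted) &&
    matches(fingerprint, visitors.get(hinted))
  ) {
    return { visitorId: hinted, path: 'cookie-fast-path' };
  }

  // Fallback: full candidate search, used when the cookie is absent
  // or when the hinted candidate fails verification.
  for (const [visitorId, stored] of visitors) {
    if (matches(fingerprint, stored)) {
      return { visitorId, path: 'full-search' };
    }
  }
  return { visitorId: null, path: 'full-search' };
}
```

Because the hinted candidate is still verified against the fingerprint, a stale or tampered cookie only costs the fast path; it cannot change which visitor is returned.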
The core value proposition of tracio.ai identification is that the visitor ID persists across scenarios that defeat traditional tracking methods. Here is a detailed breakdown of what preserves and what changes the visitor ID:
tracio.ai achieves 99.5% identification accuracy in production environments. Accuracy is measured using three metrics:
| Metric | Value | Definition |
|---|---|---|
| True Positive Rate | 99.5% | Correctly identifying a returning visitor with the same ID |
| False Positive Rate | < 0.01% | Incorrectly assigning the same ID to two different visitors |
| False Negative Rate | < 0.5% | Failing to recognize a returning visitor (assigning a new ID) |
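If you want to measure these rates on your own labeled traffic, the definitions translate directly into counts. The sketch below assumes you can label each identification outcome; the function and field names are hypothetical.

```javascript
// truePositives:  returning visitors who received their previous ID
// falseNegatives: returning visitors who were given a new ID
// falsePositives: distinct visitors who were given someone else's ID
// trueNegatives:  distinct visitors who were correctly kept apart
function accuracyRates({ truePositives, falseNegatives, falsePositives, trueNegatives }) {
  const returning = truePositives + falseNegatives;
  const distinct = falsePositives + trueNegatives;
  if (returning === 0 || distinct === 0) {
    throw new RangeError('need at least one returning and one distinct sample');
  }
  return {
    truePositiveRate: truePositives / returning,
    falseNegativeRate: falseNegatives / returning,
    falsePositiveRate: falsePositives / distinct,
  };
}
```

The true positive and false negative rates share a denominator, so they always sum to 1.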
The system prioritizes a low false positive rate over a low false negative rate. Assigning the same ID to two different people is far more dangerous (for fraud prevention) than occasionally failing to recognize a returning visitor. The confidence score allows you to calibrate your own threshold based on your risk tolerance.
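One way to calibrate a threshold is to take labeled historical matches and pick the lowest confidence cutoff that keeps false positives within your tolerance. The sketch below assumes samples of the form `{ score, sameVisitor }`; both the data shape and `pickThreshold` are hypothetical.

```javascript
// samples: [{ score: number, sameVisitor: boolean }]
// Returns the lowest threshold whose false positive rate (different
// visitors accepted at score >= threshold) stays within the limit, or
// null if no observed score satisfies it. Choosing the lowest qualifying
// threshold keeps as many genuine returning visitors as possible.
function pickThreshold(samples, maxFalsePositiveRate) {
  const candidates = [...new Set(samples.map((s) => s.score))].sort((a, b) => a - b);
  const negatives = samples.filter((s) => !s.sameVisitor);

  for (const threshold of candidates) {
    const accepted = negatives.filter((s) => s.score >= threshold).length;
    const rate = negatives.length === 0 ? 0 : accepted / negatives.length;
    if (rate <= maxFalsePositiveRate) return threshold;
  }
  return null;
}
```

A fraud-sensitive flow would pass a very small `maxFalsePositiveRate` and accept that more returning visitors fall below the cutoff, which mirrors the trade-off described above.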
- [...] requestId.
- [...] bot.result and suspectScore alongside the visitor ID.
- If visitorFound is false, the visitor is either genuinely new or has changed their device significantly. Do not block new visitors, but apply appropriate friction for sensitive operations.