
Device Identification Engine

Advanced identification: hardware/software/session signal separation, proprietary hashing, and AI-powered cross-session matching

How Identification Works

tracio.ai assigns a stable, persistent identifier (the visitorId) to every browser based on its unique combination of hardware and software characteristics. Unlike cookies or local storage tokens, the visitor ID survives incognito mode, cookie clearing, VPN switching, and browser updates because it is derived from signals that are intrinsic to the device and browser engine rather than stored state.

The identification pipeline has four stages: signal collection, multi-tier signal processing, AI-powered matching against known visitors, and confidence scoring. Each stage is designed to maximize accuracy while minimizing false positives (incorrectly merging two different visitors) and false negatives (failing to recognize a returning visitor).

Signal Collection

The JavaScript SDK collects over 140 individual signals from the browser environment in under 50ms. Signals are grouped into categories based on what aspect of the device or browser they characterize. The collection process runs entirely in the browser and does not access any personally identifiable information such as names, emails, or browsing history.

Signal Categories

Category | Signals | Examples | Uniqueness
Canvas | 2 | 2D canvas rendering hash, text rendering variations | Very High
WebGL | 12 | Renderer, vendor, extensions, shader precision, max texture size | Very High
Audio | 2 | AudioContext oscillator output, audio processing hash | High
Fonts | 3 | Available system fonts, font rendering metrics, @font-face support | High
Screen | 8 | Resolution, color depth, pixel ratio, available screen area, frame offsets | Medium
Navigator | 18 | Platform, languages, hardware concurrency, device memory, max touch points | Medium
CSS/Media | 25 | Media query results, CSS feature support, color scheme preferences | Medium
Math | 14 | Math function precision variations across browser engines | Medium
Client Hints | 8 | UA-CH brands, platform version, architecture, bitness, mobile flag | Medium
Functional Tests | 20 | API availability, feature detection, DOM behavior tests | Low
Storage | 6 | Cookie support, localStorage, sessionStorage, IndexedDB, FileSystem API | Low
MathML | 4 | MathML rendering dimensions, operator layout | Medium
Emoji | 3 | Emoji rendering measurements, variation selector support | Medium
Connection | 4 | Network Information API: effectiveType, downlink, rtt, saveData | Low
Pro Signals | 15 | Bot detection, headless markers, automation framework scanning | Diagnostic

Collection Timing

Signal collection is highly optimized to minimize impact on page performance. Most signals are collected synchronously in under 5ms. A few high-value signals require asynchronous operations:

  • Canvas rendering (5-15ms) -- Draws text and geometric shapes to a hidden canvas, then hashes the pixel data. Different GPUs and rendering engines produce subtly different output.
  • Audio processing (3-10ms) -- Creates an offline AudioContext, generates an oscillator tone, and hashes the processed audio buffer. Audio processing characteristics vary by hardware and OS audio stack.
  • Font enumeration (10-30ms) -- Measures the rendering width of test strings against a baseline to detect which system fonts are installed. Uses a fast binary search to test 100+ fonts efficiently.
  • WebGL probing (5-15ms) -- Queries WebGL context parameters, extension availability, and shader precision to build a detailed GPU profile.

The total collection time is typically 30-80ms on modern hardware. On older mobile devices, it may take up to 150ms. The SDK uses requestAnimationFrame scheduling to avoid blocking the main thread during collection.
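The chunked, yield-between-frames approach can be sketched as follows. This is an illustrative model only, not the real SDK: `collectInChunks`, the budget value, and the sample collectors are assumptions. In a browser it yields via requestAnimationFrame; outside one it falls back to a timer so the sketch stays runnable.

```javascript
// Run collectors in small time-budgeted batches, yielding to the next
// animation frame whenever the budget is spent, so the main thread is
// never blocked for long. Names here are illustrative, not the SDK API.
async function collectInChunks(collectors, { budgetMs = 5, schedule } = {}) {
  const yieldToFrame = schedule ||
    (typeof requestAnimationFrame === 'function'
      ? (cb) => requestAnimationFrame(cb)
      : (cb) => setTimeout(cb, 0)); // fallback for non-browser environments

  const signals = {};
  let batchStart = Date.now();
  for (const [name, collect] of Object.entries(collectors)) {
    signals[name] = await collect();
    if (Date.now() - batchStart > budgetMs) {
      // Budget exhausted: yield before collecting the next signal.
      await new Promise((resolve) => yieldToFrame(resolve));
      batchStart = Date.now();
    }
  }
  return signals;
}

// Example collectors (illustrative stand-ins for the real signal sources).
const collectors = {
  hardwareConcurrency: () => 8,
  timezone: () => 'UTC',
  audioHash: async () => 'a1b2c3', // stands in for the offline AudioContext hash
};
```

Synchronous signals complete within a single batch; the slower asynchronous ones (canvas, audio, fonts, WebGL) naturally trigger the yield points.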

Multi-Tier Signal Processing

Signals are separated into three tiers based on their stability over time. Each tier is processed independently using proprietary algorithms to produce a tier-specific fingerprint. This approach allows the identification engine to handle expected signal drift (such as browser version updates) without losing track of the visitor.

Signal Tiers

Tier | Stability | Signals Included | Weight in Match | Expected Drift
Hardware | Very High | Canvas hash, WebGL renderer/vendor, audio hash, screen resolution, color depth, pixel ratio, hardware concurrency, device memory | 60% | Changes only with hardware upgrades or OS reinstall
Browser | Medium | Installed fonts, User-Agent, platform, languages, plugins, CSS features, math precision, MathML layout, emoji rendering | 30% | May change with browser updates, font installation, or language preferences
Session | Low | Timezone, cookie support, storage availability, connection type, Client Hints, media queries (prefers-color-scheme, etc.) | 10% | Changes frequently with user actions (timezone travel, dark mode toggle)

Processing Algorithm

Each tier is processed using a proprietary algorithm to produce a fixed-length fingerprint from the concatenated signal values. The algorithm is deterministic and optimized for distribution quality. The three tier fingerprints are combined into a composite identifier used for visitor lookup.

// Conceptual processing pipeline (simplified)
const hardwareFingerprint = processSignals([
  canvasHash, webglRenderer, webglVendor, audioHash,
  screenWidth, screenHeight, colorDepth, pixelRatio,
  hardwareConcurrency, deviceMemory,
]);

const browserFingerprint = processSignals([
  fontList, userAgent, platform, languages,
  plugins, cssFeatures, mathPrecision,
]);

const sessionFingerprint = processSignals([
  timezone, cookieSupport, storageAvail,
  connectionType, colorScheme,
]);

// Composite fingerprint for lookup
const fingerprint = `${hardwareFingerprint}:${browserFingerprint}:${sessionFingerprint}`;

Visitor ID Computation

The visitor ID is a 16-20 character alphanumeric string derived from the hardware-tier hash combined with historical visitor data. The computation follows these steps:

  1. Hash computation -- The server computes the three tier hashes from the decrypted signal payload.
  2. Candidate lookup -- The hardware hash is used to search the visitor database for potential matches. Because hardware signals are the most stable, this narrows the search space dramatically.
  3. AI-powered matching -- Each candidate is scored against the full signal set using weighted comparison. Exact matches on high-uniqueness signals (canvas, WebGL, audio) contribute more than matches on low-uniqueness signals (timezone, color scheme).
  4. Threshold check -- If the best match exceeds the confidence threshold (default 0.85), the visitor is considered a returning visitor and receives the existing visitor ID.
  5. New visitor -- If no match exceeds the threshold, a new visitor ID is generated from the hardware hash using Base62 encoding and stored in the database.
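The five steps above can be condensed into a lookup-then-score flow. This is a conceptual sketch, not the real server implementation: the database interface, the scoring callback, and the field names are all assumptions chosen to mirror the description.

```javascript
// Hypothetical sketch of the visitor ID computation steps described above.
function identify(tierHashes, db, scoreCandidate, threshold = 0.85) {
  // Steps 1-2: the stable hardware-tier hash narrows the candidate set.
  const candidates = db.findByHardwareHash(tierHashes.hardware);

  // Step 3: score every candidate against the full signal set.
  let best = null;
  for (const candidate of candidates) {
    const score = scoreCandidate(candidate, tierHashes);
    if (!best || score > best.score) best = { candidate, score };
  }

  // Step 4: returning visitor if the best match clears the threshold.
  if (best && best.score >= threshold) {
    return { visitorId: best.candidate.visitorId, confidence: best.score, visitorFound: true };
  }

  // Step 5: otherwise mint a new visitor ID derived from the hardware hash.
  const visitorId = db.createVisitor(tierHashes);
  return { visitorId, confidence: 1.0, visitorFound: false };
}
```

Because the hardware hash prunes the search space first, the expensive weighted comparison only runs against a handful of candidates rather than the whole visitor database.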

ID Format

Visitor IDs use Base62 encoding (alphanumeric characters: a-zA-Z0-9) for compact, URL-safe representation. A typical visitor ID looks like X7fh2Hg9LkMn3pQr. The length varies between 16 and 20 characters depending on the hash output. Visitor IDs are case-sensitive.
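The encoding step can be illustrated with a minimal Base62 routine. The actual ID derivation is proprietary; this sketch only shows how a numeric hash maps onto the alphanumeric alphabet, and the alphabet ordering here is an assumption.

```javascript
// Minimal Base62 encoder: maps a hash value (as a BigInt) onto the
// case-sensitive alphanumeric alphabet used for visitor IDs.
// The digit ordering of the alphabet is an illustrative assumption.
const ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

function base62Encode(value) {
  if (value === 0n) return ALPHABET[0];
  let out = '';
  while (value > 0n) {
    out = ALPHABET[Number(value % 62n)] + out; // lowest digit first, prepended
    value /= 62n; // BigInt division truncates
  }
  return out;
}
```

A 128-bit hash encodes to roughly 22 Base62 characters, so sizing or truncating the hash output is what produces the 16-20 character range described above.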

Cross-Session Matching

Real-world browsers change over time. Fonts get installed. Browser versions update. Users switch between light and dark mode. The AI-powered matching engine accounts for these expected changes by comparing signals individually rather than requiring an exact match on a single monolithic fingerprint.

Match Scoring

Each signal comparison produces a match score between 0.0 (completely different) and 1.0 (identical). Signal weights are based on uniqueness and stability:

Signal | Weight | Matching Strategy
Canvas hash | 15% | Exact match (binary: 0 or 1)
WebGL renderer | 12% | Exact match on GPU model, fuzzy on driver version
Audio hash | 10% | Exact match (binary: 0 or 1)
Font list | 8% | Set similarity (intersection / union). Threshold: 0.85
Screen resolution | 8% | Exact match on both dimensions
WebGL extensions | 6% | Set similarity. New extensions are tolerated; removed extensions penalized.
Math precision | 5% | Exact match per function (engine-specific)
User-Agent | 4% | Flexible: OS and browser family must match; version can differ
Hardware concurrency | 4% | Exact match
CSS features | 3% | Set similarity with tolerance for newly added features
Languages | 3% | Primary language must match; secondary languages can differ
Platform | 3% | Exact match
Timezone | 2% | Exact match (travelers may have legitimate mismatches)
Color depth / pixel ratio | 2% | Exact match
Other signals | 15% | Various per-signal strategies
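Two of the strategies in the table can be sketched concretely: a binary exact match for the canvas hash and Jaccard set similarity (intersection / union) for the font list. The weights come from the table; the function names and signal record shape are illustrative assumptions.

```javascript
// Jaccard set similarity: |A ∩ B| / |A ∪ B|, in [0, 1].
function jaccard(a, b) {
  const intersection = [...a].filter((x) => b.has(x)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 1 : intersection / union;
}

// Partial weighted match score (sketch): each signal contributes its
// per-signal match score scaled by the weight from the table above.
function matchScore(stored, observed) {
  let score = 0;
  // Canvas hash: binary exact match, 15% weight.
  score += 0.15 * (stored.canvasHash === observed.canvasHash ? 1 : 0);
  // Font list: set similarity, 8% weight.
  score += 0.08 * jaccard(stored.fonts, observed.fonts);
  // ...remaining signals contribute their own weighted comparisons.
  return score;
}
```

Set similarity is what makes font installation tolerable: adding one font to a list of forty barely moves the intersection-over-union ratio.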

Drift Tolerance

The matching engine tolerates specific patterns of signal drift that are known to occur during normal browser lifecycle events:

  • Browser update -- User-Agent version changes, new CSS features appear, minor math precision shifts. Tolerated because hardware signals remain stable.
  • Font installation -- New fonts appear in the font list. The set similarity score handles this gracefully as long as most fonts remain unchanged.
  • Extension changes -- Installing or removing extensions may add properties to the navigator object. These changes are low-weight in the matching algorithm.
  • Display changes -- Connecting an external monitor changes screen resolution. This is a medium-weight signal, but other hardware signals anchor the identity.
  • Dark mode toggle -- Changes prefers-color-scheme media query result. This is a low-weight session-tier signal.

Confidence Scoring

The confidence score (0.0-1.0) quantifies how certain the system is that the identified visitor is correctly matched. It is computed as the weighted sum of individual signal match scores, normalized to the [0, 1] range. The score reflects both the quality of the match and the number of signals that could be compared.

Score Range | Confidence Level | Typical Scenario | Recommended Action
0.99+ | Very High | Returning visitor, all signals match exactly | Trust identification fully
0.95-0.99 | High | Minor signal drift from browser update or font change | Trust identification; monitor for patterns
0.85-0.95 | Medium | Several signals changed (new monitor, major browser update) | Consider additional verification for sensitive actions
0.70-0.85 | Low | Significant changes, possible different device | Require re-authentication or step-up verification
< 0.70 | Very Low | First-time visitor or major environment change | Treat as new visitor; do not rely on historical data

Confidence in Practice

// Server-side confidence-based decisions
const event = await fetchEvent(requestId);
const { visitorId, confidence } = event.products.identification.data;

if (confidence.score >= 0.95) {
  // High confidence: proceed normally
  await processTransaction(visitorId, transactionData);
} else if (confidence.score >= 0.85) {
  // Medium confidence: add friction
  await sendSMSVerification(userId);
  await processTransaction(visitorId, transactionData);
} else {
  // Low confidence: block and require re-authentication
  return res.status(401).json({
    error: 'Please verify your identity',
    reason: 'device_change_detected',
  });
}

Cookie Persistence

While tracio.ai does not depend on cookies for identification, it uses an optional first-party cookie to optimize the identification process for returning visitors. The cookie stores a hashed reference to the previous visitor ID, allowing the server to skip the full candidate search and go directly to verification. This reduces server-side latency by 30-50% for returning visitors.

Cookie Configuration

Property | Value | Purpose
Name | _tcid | First-party cookie storing the visitor reference hash
Domain | Your domain (first-party) | Scoped to your site only, not shared cross-site
Expiry | 365 days | Long-lived for returning visitor optimization
Secure | true | Only sent over HTTPS
HttpOnly | false | Accessible by the JavaScript SDK
SameSite | Lax | Sent with top-level navigations but not cross-site requests

When the cookie is absent (incognito mode, cookie clearing, first visit), the identification still works through full fingerprint matching. The cookie is purely a performance optimization, not a requirement for identification accuracy.
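The cookie handling can be sketched as two small helpers: one parses _tcid out of a cookie string, the other builds a cookie value with the attributes from the table (note the absence of HttpOnly, since the SDK must read it). The helper names are illustrative, not the SDK API.

```javascript
// Extract the _tcid value from a cookie string, or null if absent.
function readTcid(cookieString) {
  const match = cookieString.match(/(?:^|;\s*)_tcid=([^;]+)/);
  return match ? match[1] : null;
}

// Build the cookie string with the attributes listed in the table above:
// 365-day expiry, Secure, SameSite=Lax, and no HttpOnly flag.
function tcidCookie(referenceHash) {
  const maxAge = 365 * 24 * 60 * 60; // 365 days in seconds
  return `_tcid=${referenceHash}; Max-Age=${maxAge}; Secure; SameSite=Lax; Path=/`;
}
```

In a browser the SDK would pass `document.cookie` to `readTcid` and assign `tcidCookie(...)` back to `document.cookie`; keeping the helpers string-based makes the logic easy to test outside a browser.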

Persistence Across Scenarios

The core value proposition of tracio.ai identification is that the visitor ID persists across scenarios that defeat traditional tracking methods. Here is a detailed breakdown of what preserves and what changes the visitor ID:

What Preserves the Visitor ID

  • Clearing cookies and localStorage -- The ID is derived from hardware/browser signals, not stored state.
  • Incognito / private browsing mode -- Hardware signals are identical in normal and private modes.
  • VPN or proxy usage -- Network changes do not affect browser-level signal collection.
  • Browser minor version updates -- Chrome 120 to Chrome 121 changes the UA string, but hardware signals remain stable. The matching engine handles the drift.
  • Installing/removing browser extensions -- Extensions rarely modify core fingerprinting signals. The system tolerates minor navigator changes.
  • Switching networks -- WiFi to cellular, home to office. IP address is not part of the fingerprint.
  • Dark mode / light mode toggle -- Low-weight session-tier signal; does not affect hardware or browser tier fingerprints.
  • DNS or ad blocker changes -- Network-level blocking does not affect browser signal collection.

What Changes the Visitor ID

  • Hardware changes -- New GPU, new monitor at a different resolution, significant RAM change. These alter hardware-tier signals.
  • Switching browsers -- Chrome to Firefox changes the rendering engine, math precision, and many other signals. Each browser generates a distinct ID.
  • OS reinstall -- Changes font inventory, GPU driver version, and other OS-level signals.
  • Aggressive spoofing -- Canvas blocker, WebGL spoofer, or anti-fingerprinting tools that randomize high-value signals on every page load.
  • Major browser upgrade -- Very rarely, a major browser version changes the rendering engine behavior enough to alter canvas or WebGL output. This is uncommon and is handled by the matching engine within a few visits.

Accuracy Metrics

tracio.ai achieves 99.5% identification accuracy in production environments. Accuracy is measured using three related metrics:

Metric | Value | Definition
True Positive Rate | 99.5% | Correctly identifying a returning visitor with the same ID
False Positive Rate | < 0.01% | Incorrectly assigning the same ID to two different visitors
False Negative Rate | < 0.5% | Failing to recognize a returning visitor (assigning a new ID)

The system prioritizes a low false positive rate over a low false negative rate. Assigning the same ID to two different people is far more dangerous (for fraud prevention) than occasionally failing to recognize a returning visitor. The confidence score allows you to calibrate your own threshold based on your risk tolerance.

Integration Best Practices

  • Always verify server-side -- Never trust visitor IDs sent directly from the browser. Use the Server API to fetch the result by requestId.
  • Use confidence scores -- Build tiered access policies based on confidence. High-confidence matches can proceed normally; low-confidence matches should trigger step-up verification.
  • Combine with smart signals -- A high-confidence identification of a bot is still a bot. Always check bot.result and suspectScore alongside the visitor ID.
  • Store visitor IDs -- Map visitor IDs to your user accounts to detect multi-accounting, account sharing, and device changes over time.
  • Handle new visitors gracefully -- When visitorFound is false, the visitor is either genuinely new or has changed their device significantly. Do not block new visitors, but apply appropriate friction for sensitive operations.
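The practices above compose into a single server-side decision function. This is a sketch only: the event shape is an assumption modeled on the fields this section mentions (visitorFound, confidence, bot.result, suspectScore), and the suspectScore cutoff of 80 is an illustrative value, not a documented threshold.

```javascript
// Combine identification confidence with bot signals into one decision.
// The event shape and the suspectScore cutoff are illustrative assumptions.
function assessRequest(event) {
  const { visitorId, visitorFound, confidence } = event.identification;
  const { bot, suspectScore } = event.signals;

  // A high-confidence identification of a bot is still a bot.
  if (bot.result === 'bad' || suspectScore >= 80) {
    return { action: 'block', visitorId };
  }
  // New or significantly changed device: add friction, don't block.
  if (!visitorFound || confidence < 0.85) {
    return { action: 'challenge', visitorId };
  }
  return { action: 'allow', visitorId };
}
```

Checking the bot signals before the confidence tiers keeps the two axes separate: bot detection decides whether to serve the request at all, while identification confidence only decides how much friction a human visitor gets.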