Rust in Production: Why We Rewrote Our Signal Processor
Six months ago, we made the decision to rewrite our signal processing engine — the component that transforms raw browser signals into normalized, hashable feature vectors — from Go to Rust. This was not a decision we took lightly. Our Go implementation worked. It was tested. It was deployed. But it had become the bottleneck in our pipeline, and we needed a step-function improvement in throughput. Here is what happened.
Why We Outgrew Go
Our signal processor performs computationally intensive work: parsing JSON payloads, applying normalization functions to 1,000+ signals, computing proprietary hashes, and building identification hash vectors. In Go, this work was CPU-bound, and Go's garbage collector became a problem at scale. Every signal processing cycle allocated intermediate objects — parsed JSON nodes, normalized string values, hash buffers — that created GC pressure.
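The shape of that per-event work can be sketched with the Rust standard library alone. Everything here is an illustrative stand-in: the signal names, the trim-and-lowercase normalization rule, and `DefaultHasher` are not our production normalizers or our proprietary hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical sketch of one processing step: normalize a raw signal
// value, then fold it into a feature-vector hash.
fn normalize(raw: &str) -> String {
    // Illustrative normalization: trim whitespace, lowercase.
    raw.trim().to_lowercase()
}

fn hash_signals(signals: &[(&str, &str)]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for (name, raw) in signals {
        name.hash(&mut hasher);
        normalize(raw).hash(&mut hasher);
    }
    hasher.finish()
}

fn main() {
    let signals = [("user_agent", "  Mozilla/5.0 "), ("lang", "EN-us")];
    println!("feature hash: {:x}", hash_signals(&signals));
}
```

Note that `DefaultHasher`'s algorithm is unspecified and may change between Rust versions, so a real feature-vector hash would pin an explicit algorithm.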
At 30K events/second, our Go signal processor exhibited GC pauses of 2-5ms every few seconds. These pauses were acceptable. At 50K events/second, GC pauses grew to 8-15ms and occurred more frequently. At 80K events/second — our projected load for Q3 — the GC pauses would have caused p99 latency to exceed our SLA. We needed either more servers (expensive) or a more efficient implementation.
Why Rust
We evaluated three options: optimizing the Go implementation (sync.Pool, arena allocation, GOGC tuning), rewriting in C++, and rewriting in Rust. Go optimization yielded a 30% improvement but did not fundamentally solve the GC problem. C++ was rejected because of memory safety concerns in a security-critical system. Rust offered zero-cost abstractions, no garbage collector, and memory safety guarantees enforced at compile time.
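A toy illustration (not our production code) of what "zero-cost abstractions" buys for this particular workload: an iterator chain over borrowed `&str` slices fuses into a single loop with no intermediate heap allocations, where the Go pass it replaces allocated a new string per normalized value.

```rust
// Count non-empty signals after trimming, without allocating any
// intermediate strings: `trim` returns a borrowed sub-slice, and the
// map/filter chain is lazy and fuses into one loop.
fn count_nonempty(signals: &[&str]) -> usize {
    signals
        .iter()
        .map(|s| s.trim())          // borrows, no new String
        .filter(|s| !s.is_empty())  // lazy, fused with the map
        .count()
}

fn main() {
    let signals = ["  abc ", "", "  ", "def"];
    println!("{}", count_nonempty(&signals)); // 2
}
```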
The Rust ecosystem also had mature libraries for everything we needed: serde for JSON parsing, high-performance hashing crates, and tokio for async I/O. The learning curve was real — our team had deep Go experience but limited Rust experience — but the performance characteristics were exactly what we needed.
The Rewrite Process
We rewrote the signal processor as a standalone service that communicates with the rest of our pipeline via gRPC. This allowed us to deploy it alongside the Go implementation and gradually shift traffic. The rewrite took three engineers four weeks — two weeks for the core implementation and two weeks for testing, benchmarking, and edge case handling.
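The gradual traffic shift reduces to deterministic, key-based bucketing: hash a stable per-event key and send a fixed percentage of buckets to the new service. A minimal sketch, where the event key and the percentage knob are assumptions for illustration, not our actual routing layer:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Route a stable fraction of traffic to the Rust service. Hashing the
// event key (rather than sampling randomly) keeps routing sticky: the
// same event always goes to the same implementation.
fn routes_to_rust(event_key: &str, rollout_pct: u64) -> bool {
    let mut h = DefaultHasher::new();
    event_key.hash(&mut h);
    h.finish() % 100 < rollout_pct
}

fn main() {
    let hits = (0..1000)
        .filter(|i| routes_to_rust(&format!("event-{i}"), 25))
        .count();
    println!("routed to Rust at 25% rollout: {hits}/1000");
}
```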
The most challenging aspect was not the language itself but ensuring behavioral parity with the Go implementation. We built a comparison harness that ran both implementations against the same input and verified that they produced identical output. We discovered 14 subtle differences during this process — mostly related to floating-point handling, Unicode normalization, and JSON parsing edge cases.
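The harness idea reduces to running both implementations against the same input and collecting mismatches. A minimal sketch, with local closures standing in for the real sides (in production the legacy side was the Go service called over gRPC):

```rust
// Run both implementations on every input and report any divergence.
fn compare<F, G>(inputs: &[&str], legacy: F, rewrite: G) -> Vec<String>
where
    F: Fn(&str) -> String,
    G: Fn(&str) -> String,
{
    inputs
        .iter()
        .filter_map(|input| {
            let (a, b) = (legacy(input), rewrite(input));
            (a != b).then(|| format!("mismatch on {input:?}: {a:?} vs {b:?}"))
        })
        .collect()
}

fn main() {
    // A deliberate divergence in whitespace handling: the kind of
    // subtle difference this style of harness surfaces.
    let legacy = |s: &str| s.trim().to_string();
    let rewrite = |s: &str| s.trim_start().to_string();
    for line in compare(&["a", " b "], legacy, rewrite) {
        println!("{line}");
    }
}
```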
Performance Results
The Rust implementation processes signals in 0.8ms on average versus 3.2ms for Go, a 4x improvement. Memory usage dropped from 2.1GB to 340MB for the same workload. There are no GC pauses because there is no garbage collector. At the same throughput, CPU utilization decreased by 60%; combined with the lower per-event latency, each server now handles 4x more traffic.
At 80K events/second, the Rust implementation maintains a p99 processing time of 1.4ms with zero pauses. This headroom means we will not need to revisit signal processing performance for the foreseeable future. The reduced CPU and memory usage also translates directly to lower infrastructure costs — we retired 8 of 12 signal processing servers.
Lessons Learned
Rewriting in Rust was worth it for our specific case — a CPU-bound, allocation-heavy, latency-sensitive workload. We would not rewrite our HTTP ingestion layer or our ClickHouse query service in Rust, because those components are I/O-bound and Go handles them efficiently. The lesson is not "rewrite everything in Rust" but "use Rust where its zero-cost abstractions and deterministic performance matter most."
The biggest surprise was how many latent bugs in our Go implementation the rewrite flushed out. Race conditions on shared buffers were rejected outright by the borrow checker at compile time. Integer overflow in hash computation and out-of-bounds access on malformed input are not compile-time errors in Rust, but they surfaced immediately as explicit panics in debug builds instead of corrupting data silently. The compiler is demanding, but it pays for itself in correctness.
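To make those failure modes concrete, here is a hedged sketch of the patterns involved (the function names and the toy hash are invented for illustration, not our actual bugs): Rust makes overflow behavior an explicit choice and makes out-of-bounds access on a short payload return an `Option` instead of reading past the buffer.

```rust
// Overflow intent is stated explicitly: wrapping_mul/wrapping_add
// wrap deliberately, where plain `*`/`+` would panic on overflow in
// debug builds rather than silently wrap.
fn hash_step(acc: u32, byte: u8) -> u32 {
    acc.wrapping_mul(31).wrapping_add(byte as u32)
}

// `.get()` forces the caller to handle a payload that is too short;
// there is no way to silently read out of bounds here.
fn field_at(payload: &[u8], idx: usize) -> Option<u8> {
    payload.get(idx).copied()
}

fn main() {
    let h = b"signal".iter().fold(0u32, |acc, &b| hash_step(acc, b));
    println!("hash: {h:08x}");
    println!("field 99 of short payload: {:?}", field_at(b"short", 99)); // None
}
```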
Honestly, the first week was painful. Sarah kept a "fights with the borrow checker" tally on the whiteboard — we hit 47 before the team stopped counting. But by week three, the code that compiled just worked. No mysterious production panics, no data races under load. That tradeoff is worth it for anything on the hot path.