
Rust in Production: How Grab Cut Cloud Costs 70% and Why Backends Are Rewriting

Rust enterprise adoption grew 40% in 12 months. Grab's Go-to-Rust migration cut CPU requirements from 20 cores to 4.5 for the same throughput.

TechSaaS Team
11 min read

The Enterprise Chasm Is Crossed

Rust crossed the enterprise adoption chasm in 2026. 45% of enterprises now run Rust workloads in production, up 40% in just twelve months. JetBrains' State of Rust Ecosystem report (February 2026) confirms what engineering teams have been experiencing: Rust is no longer experimental. It's a production language for performance-critical backends.


The catalyst wasn't Rust's safety guarantees or its type system — engineers already knew those were excellent. The catalyst was money. Specifically, Grab's published case study showing their Go-to-Rust migration cut infrastructure costs by 70%, reducing CPU requirements from 20 cores to 4.5 for the same 1,000 requests per second.

When a company the size of Grab publishes those numbers, engineering managers pay attention.

Grab's Migration: The Numbers

Grab is Southeast Asia's largest ride-hailing and delivery platform, serving millions of users across 8 countries. Their backend handles billions of requests daily.

Before (Go)

Service: Payment processing middleware
Language: Go 1.21
CPU: 20 cores allocated
Memory: 2.4 GB per instance
Throughput: 1,000 req/sec
P99 latency: 45ms
Instances: 12
Monthly cost: ~$3,600 (compute only)

After (Rust)

Service: Payment processing middleware (rewritten)
Language: Rust 1.76 (Tokio + Axum)
CPU: 4.5 cores allocated
Memory: 180 MB per instance
Throughput: 1,000 req/sec
P99 latency: 12ms
Instances: 4
Monthly cost: ~$900 (compute only)

The Breakdown

CPU reduction: 77.5% (20 cores → 4.5 cores)
Memory reduction: 92.5% (2.4 GB → 180 MB per instance)
Latency improvement: 73% (45ms P99 → 12ms P99)
Instance reduction: 67% (12 → 4 instances)
Cost reduction: 75% ($3,600 → $900/month)

The memory reduction is the headline. Go's garbage collector, while excellent, needs headroom: each Go instance carried hundreds of megabytes of heap beyond its live data. Rust has no garbage collector — memory is managed at compile time through the ownership system. The result: memory footprints dropped from gigabytes per instance to under 200 MB.

Why Go Teams Are Looking at Rust

This is not a "Go is bad" story. Go remains an excellent language for many use cases — CLIs, simple web services, DevOps tooling, rapid prototyping. But for specific workload patterns, Rust offers measurable advantages:

Pattern 1: High-Throughput Data Processing

// Rust: process a large batch with no per-record allocations in the hot path
use bytes::Bytes;
use rayon::prelude::*; // brings par_iter() into scope

// ProcessedRecord, parse_record, and transform are application-specific
// and elided here; parse_record returns Option for malformed records.
pub fn process_batch(records: &[Bytes]) -> Vec<ProcessedRecord> {
    records
        .par_iter()  // Rayon parallel iterator
        .filter_map(|record| {
            // Zero-copy deserialization
            let parsed = parse_record(record)?;
            // No GC pauses during batch processing
            Some(transform(parsed))
        })
        .collect()
}

Go's garbage collector can introduce pauses during large batch processing. These pauses are usually sub-millisecond, but at high throughput they accumulate. Rust processes the same data without any GC pauses.

Pattern 2: Memory-Constrained Environments

Container memory comparison (same workload):

Go service:
  Base memory: 30-50 MB (runtime + GC)
  Working set: 150-400 MB (depends on heap pressure)
  Peak (GC cycle): 600+ MB (temporary 2x heap for GC)

Rust service:
  Base memory: 2-5 MB (minimal runtime)
  Working set: 50-80 MB (exactly what's needed)
  Peak: 90 MB (predictable, no GC spikes)

For Kubernetes deployments where you're packing dozens of services onto nodes, the memory savings from Rust translate directly to higher pod density and lower node costs.

Pattern 3: Latency-Sensitive Paths

For services where P99 latency matters — payment processing, real-time bidding, game servers, financial trading — Go's GC pauses create a long tail:

Go P99 latency distribution:
  P50: 5ms
  P90: 15ms
  P99: 45ms    ← GC pause impact
  P99.9: 120ms ← Major GC cycle

Rust P99 latency distribution:
  P50: 2ms
  P90: 8ms
  P99: 12ms    ← Predictable, no GC
  P99.9: 18ms  ← No surprises

The Practical Rust Backend Stack (2026)

Web Framework: Axum

Axum is the de facto standard for Rust web services:

use axum::{
    Router, Json,
    extract::{Path, State},
    routing::{get, post},
};
use sqlx::PgPool;
use serde::{Deserialize, Serialize};

#[derive(Clone)]
struct AppState {
    db: PgPool,
}

#[derive(Serialize)]
struct User {
    id: i64,
    name: String,
    email: String,
}

#[derive(Deserialize)]
struct CreateUser {
    name: String,
    email: String,
}

// AppError is an application-defined error type implementing axum's
// IntoResponse; its definition is elided here.
async fn get_user(
    State(state): State<AppState>,
    Path(id): Path<i64>,
) -> Result<Json<User>, AppError> {
    let user = sqlx::query_as!(User, "SELECT id, name, email FROM users WHERE id = $1", id)
        .fetch_one(&state.db)
        .await?;
    Ok(Json(user))
}

async fn create_user(
    State(state): State<AppState>,
    Json(input): Json<CreateUser>,
) -> Result<Json<User>, AppError> {
    let user = sqlx::query_as!(User,
        "INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id, name, email",
        input.name, input.email
    )
    .fetch_one(&state.db)
    .await?;
    Ok(Json(user))
}

#[tokio::main]
async fn main() {
    let pool = PgPool::connect(&std::env::var("DATABASE_URL").unwrap())
        .await
        .unwrap();

    let state = AppState { db: pool };

    let app = Router::new()
        .route("/users/:id", get(get_user))
        .route("/users", post(create_user))
        .with_state(state);

    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

Type-safe extractors, compile-time SQL verification (via sqlx), and async-first design. The compile-time checks catch bugs that would surface only at runtime in Go.

Database: SQLx

SQLx provides compile-time verified SQL queries:

// This query is verified against your database at compile time
// If the table structure changes, this won't compile
let users = sqlx::query_as!(
    User,
    r#"
    SELECT id, name, email, created_at
    FROM users
    WHERE active = true
    ORDER BY created_at DESC
    LIMIT $1
    "#,
    limit
)
.fetch_all(&pool)
.await?;

If your SQL doesn't match your database schema, the code doesn't compile — schema-mismatch bugs are caught before deployment rather than in production. (Connection failures and constraint violations remain runtime concerns, as in any language.)

Async Runtime: Tokio

Tokio is the async runtime that powers most Rust web services:

// Tokio: work-stealing async runtime
// fetch_users, fetch_orders, fetch_analytics are application async fns
#[tokio::main(flavor = "multi_thread", worker_threads = 4)]
async fn main() {
    // Await three futures concurrently
    let (users, orders, analytics) = tokio::join!(
        fetch_users(),
        fetch_orders(),
        fetch_analytics(),
    );
    // join! drives all three concurrently on the current task;
    // use tokio::spawn to run them on separate worker threads
}


Serialization: Serde

Serde is blazingly fast and zero-copy where possible:

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct ApiResponse<T: Serialize> {
    data: T,
    metadata: ResponseMetadata,
}

#[derive(Serialize, Deserialize)]
struct ResponseMetadata {
    request_id: String,
    timestamp: chrono::DateTime<chrono::Utc>,
    // Pagination is an application-defined type, elided here
    #[serde(skip_serializing_if = "Option::is_none")]
    pagination: Option<Pagination>,
}

// Benchmark: serde_json processes 1GB of JSON in ~2 seconds
// Go's encoding/json: ~8 seconds for the same data

The Rewrite Decision Framework

Not every Go service should be rewritten in Rust. Here's when it makes sense:

Rewrite When:

1. The service is CPU-bound and latency-sensitive: Payment processing, real-time analytics, game servers, trading systems
2. Memory is the constraint: You're hitting Kubernetes memory limits and scaling horizontally adds cost
3. You need predictable latency: P99 SLA requirements where GC pauses are unacceptable
4. The service is stable and well-understood: Rewriting a service with unclear requirements is a waste regardless of language
5. The team has Rust experience (or is willing to invest): A Rust rewrite by Go developers with no Rust experience will take 3-4x longer initially

Keep Go When:

1. Rapid iteration matters more than performance: Startups, prototypes, MVP development
2. The service is I/O-bound: If you're mostly waiting on database queries or API calls, Go and Rust perform similarly
3. Team velocity is the priority: Go's simplicity means faster onboarding and a larger contributor pool
4. The service is a CLI or DevOps tool: Go's single-binary deployment and cross-compilation are hard to beat for CLIs

The Hybrid Approach (Most Common)

Most organizations don't rewrite everything. They identify the 3-5 services where performance matters most and rewrite those:

Microservice Architecture:

┌─────────────────────────────────────────┐
│ API Gateway (Go) — routing, auth, rate  │
│ limiting. I/O bound, Go is fine.        │
├─────────────────────────────────────────┤
│ User Service (Go) — CRUD operations,    │
│ moderate traffic. Go is fine.           │
├─────────────────────────────────────────┤
│ Payment Service (Rust) — latency-       │
│ sensitive, high throughput. Rust wins.  │
├─────────────────────────────────────────┤
│ Analytics Pipeline (Rust) — batch       │
│ processing 100M+ records. Rust wins.    │
├─────────────────────────────────────────┤
│ Notification Service (Go) — async       │
│ email/SMS. I/O bound, Go is fine.       │
└─────────────────────────────────────────┘

Migration Tips from Teams Who've Done It

Tip 1: Start with a Non-Critical Service

Don't rewrite your payment system first. Start with an internal tool, a batch processor, or a non-critical microservice. Let the team build Rust muscle memory on something that won't page them at 3 AM.

Tip 2: The Ownership System Takes 2-4 Weeks

Every Go developer hits the "fighting the borrow checker" phase. It lasts 2-4 weeks. Then something clicks, and the borrow checker becomes your best friend — it catches bugs at compile time that would be race conditions in production.

Tip 3: Embrace the Type System

// Go: `type UserId = string` (an alias) adds no type safety at all, and
// even a defined type (`type UserId string`) converts with a trivial cast

// Rust: the newtype pattern makes mixing IDs a compile error
struct UserId(i64);
struct OrderId(i64);

// This won't compile — UserId and OrderId are different types
// fn process(user: UserId, order: OrderId) { ... }
// process(order_id, user_id) // COMPILE ERROR

Tip 4: Use Docker Multi-Stage Builds

# Build stage
FROM rust:1.76-slim AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

# Runtime stage — minimal image
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/myservice /usr/local/bin/
EXPOSE 3000
CMD ["myservice"]

# Final image: small — the Rust binary is a few MB and the debian-slim base
# dominates; a static musl build on a scratch base shrinks it further

Tip 5: Compile Times Are Real

Full Rust builds for a medium-sized project take 3-5 minutes. Incremental builds take 5-15 seconds. Use:

cargo check instead of cargo build during development
sccache for shared compilation cache
cargo-nextest for faster test execution
Consider mold linker for 2-3x faster linking

The Broader Trend: Rust Beyond Backends

Rust's enterprise adoption isn't limited to backends:

PL/Rust in PostgreSQL 18: Write database functions in Rust for native performance
WebAssembly: Rust is the primary language for WASM server-side (Spin, Fermyon)
Cloud infrastructure: Cloudflare Workers, AWS Lambda (custom runtime), Fastly Compute
CLI tools: ripgrep, fd, bat, delta — Rust CLIs replacing Unix classics
Embedded/IoT: Embassy framework for async embedded Rust

24.3% of Rust adoption is in cloud infrastructure — the single largest use case.


The Bottom Line

Rust in production isn't about Rust being "better" than Go, Python, or Java in every dimension. It's about Rust being measurably better in specific, high-value dimensions: memory efficiency, latency predictability, and CPU utilization.

Grab's 70% cost reduction is the headline, but the real story is the predictability. No GC pauses. No memory spikes. No latency surprises. For services where those properties matter — and every engineering team has a few — Rust is the clear choice in 2026.

The question isn't whether to use Rust. It's which services to rewrite first.

#rust #backend #performance #cloud-costs #go-vs-rust

Need help with cloud infrastructure?

TechSaaS provides expert consulting and managed services for cloud infrastructure, DevOps, and AI/ML operations.