
Airbnb's engineering team has rolled out Mussel v2, a complete rearchitecture of its internal key-value engine designed to unify streaming and bulk ingestion while simplifying operations and scaling to larger workloads. The new system reportedly sustains over 100,000 streaming writes per second, supports tables exceeding 100 terabytes with p99 read latencies under 25 milliseconds, and ingests tens of terabytes in bulk workloads, allowing caller teams to focus on product innovation rather than managing data pipelines.
Mussel v2 addresses the constraints of the original system by pairing a NewSQL backend with a Kubernetes-native control plane, delivering the elasticity of object storage, the responsiveness of a low-latency cache, and the operability of modern service meshes in a single platform. The system uses Kubernetes manifests with automated rollouts, dynamic range sharding with presplitting to mitigate hotspots, and namespace-level quotas and dashboards to improve cost transparency. The Dispatcher layer is stateless and horizontally scalable, routing client API calls, handling retries, and supporting dual-write and shadow-read modes to facilitate migration.
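The dual-write and shadow-read migration modes described above can be sketched as follows. This is an illustrative model, not Airbnb's actual code: the class names, methods, and in-memory backends are all hypothetical stand-ins. In dual-write mode, every write goes to both the old and new backends; in shadow-read mode, reads are served from the old backend while the new backend's answer is compared on the side to detect divergence before cutover.

```python
class DictStore:
    """Minimal in-memory stand-in for a key-value backend (hypothetical)."""
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


class Dispatcher:
    """Stateless routing layer sketch with migration modes (hypothetical API)."""
    def __init__(self, old_store, new_store, dual_write=False, shadow_read=False):
        self.old = old_store
        self.new = new_store
        self.dual_write = dual_write
        self.shadow_read = shadow_read
        self.mismatches = 0  # count of shadow-read divergences for investigation

    def put(self, key, value):
        self.old.put(key, value)
        if self.dual_write:
            # Second write keeps the new backend in sync during migration.
            self.new.put(key, value)

    def get(self, key):
        primary = self.old.get(key)
        if self.shadow_read:
            # Compare the new backend's answer without affecting the caller.
            if self.new.get(key) != primary:
                self.mismatches += 1
        return primary
```

Because the Dispatcher holds no durable state of its own, any number of instances can run behind a load balancer, which is what makes the layer horizontally scalable.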
Writes are first persisted into Kafka for durability, and downstream Replayer and Write Dispatcher components apply them to the backend database in order. Bulk load continues via Airbnb's data warehouse using Airflow jobs and S3 staging, preserving merge-or-replace semantics.
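The write path above can be sketched in a few lines. This is a simplified model under assumed semantics, not Airbnb's implementation: the `Log` class stands in for a Kafka topic partition (durable, ordered appends), the `Replayer` applies logged writes to the backend in log order, and `bulk_load` shows the two bulk-ingestion modes, merging staged rows into an existing table versus replacing the table wholesale.

```python
from collections import deque


class Log:
    """Stand-in for a Kafka topic partition: durable, strictly ordered appends."""
    def __init__(self):
        self.entries = deque()

    def append(self, record):
        self.entries.append(record)


class Replayer:
    """Applies logged writes to the backend table in log order."""
    def __init__(self, log, table):
        self.log = log
        self.table = table

    def drain(self):
        # Consuming in append order guarantees later writes win.
        while self.log.entries:
            key, value = self.log.entries.popleft()
            self.table[key] = value


def bulk_load(table, staged_rows, mode="merge"):
    """Apply a staged bulk dataset with merge or replace semantics."""
    if mode == "replace":
        table.clear()          # replace: drop all existing rows first
    table.update(staged_rows)  # merge: upsert the staged rows
```

Decoupling acknowledgment (the Kafka append) from apply (the Replayer) is what lets the system absorb write bursts durably while the backend catches up asynchronously.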
Read at InfoQ