Design a Time Series Database (TSDB)
Design a database optimized for storing and querying time-series data (e.g., sensor readings, stock prices). Discuss compression and indexing strategies.
Why Interviewers Ask This
Tesla evaluates this question to assess your ability to architect systems for high-velocity IoT data from vehicles. They specifically test your understanding of write-heavy workloads, efficient compression algorithms like Delta-of-Delta or Gorilla, and time-based indexing strategies that enable rapid aggregation while keeping storage costs under control.
How to Answer This Question
1. Clarify requirements: Define write throughput (millions of events per second across the fleet), retention policies, and query patterns like range scans or aggregations over specific time windows.
2. Propose a schema: Suggest a columnar storage format optimized for time-series, separating metadata from metrics to maximize compression ratios.
3. Detail ingestion: Describe a write-ahead log followed by a memory buffer (memtable) that flushes to immutable disk segments to handle burst traffic.
4. Explain compression: Discuss encoding techniques such as run-length encoding for constant values and bit-packing for sensor IDs to reduce Tesla's massive fleet storage costs.
5. Address querying: Outline an inverted index on tags (e.g., VIN, sensor type) combined with sorted timestamp indexes to accelerate point-in-time lookups.
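The ingestion path in step 3 (write-ahead log, in-memory buffer, immutable segments) can be sketched as follows. This is a minimal illustration, not a production design; the class names, the flush threshold, and the in-memory "segments" standing in for on-disk files are all assumptions made for the example.

```python
import bisect

class MemTable:
    """In-memory buffer of (timestamp, value) points, kept sorted for range scans."""
    def __init__(self, max_points=4):
        self.points = []
        self.max_points = max_points

    def append(self, ts, value):
        bisect.insort(self.points, (ts, value))
        return len(self.points) >= self.max_points  # signal that a flush is due

class SegmentStore:
    """Immutable segments; modeled here as frozen tuples rather than disk files."""
    def __init__(self):
        self.segments = []

    def flush(self, memtable):
        self.segments.append(tuple(memtable.points))
        memtable.points = []

wal = []  # write-ahead log: durable record of every write before it is acknowledged
mem = MemTable()
store = SegmentStore()

for ts, v in [(1, 3.7), (2, 3.8), (3, 3.8), (4, 3.9)]:
    wal.append((ts, v))    # 1. durability first
    if mem.append(ts, v):  # 2. buffer in memory
        store.flush(mem)   # 3. flush the full buffer as an immutable segment
```

Because segments are immutable once flushed, writes never contend with background compaction or reads, which is what keeps write latency low under burst traffic.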
Key Points to Cover
- Explicitly mention compression algorithms like Delta-of-Delta or Gorilla relevant to sensor data
- Propose a columnar storage architecture rather than row-based SQL tables
- Address the write-heavy nature of IoT telemetry with memtables and immutable segments
- Explain how to balance latency for real-time monitoring versus cost for long-term storage
- Design a partitioning strategy based on unique identifiers like VINs for data isolation
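To make the Delta-of-Delta point concrete, here is a simplified sketch of the idea for timestamps. It keeps the deltas-of-deltas as a plain list rather than bit-packing them as a real TSDB would; the function names and the sample timestamps are illustrative assumptions.

```python
def delta_of_delta(timestamps):
    """Encode timestamps as (first, first_delta, deltas-of-deltas).
    Regularly sampled sensors produce mostly zeros, which bit-pack well."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return timestamps[0], deltas[0], dods

def decode(first, first_delta, dods):
    """Invert the encoding by re-accumulating deltas, then timestamps."""
    deltas = [first_delta]
    for d in dods:
        deltas.append(deltas[-1] + d)
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return out

# A sensor sampled every ~10 time units, with one point arriving late:
ts = [1000, 1010, 1020, 1030, 1041]
enc = delta_of_delta(ts)  # (1000, 10, [0, 0, 1])
assert decode(*enc) == ts
```

The mostly-zero deltas-of-deltas are why this encoding shines on regularly sampled telemetry: each zero can be stored in a single bit.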
Sample Answer
To design a TSDB for Tesla's fleet, I would prioritize write throughput and storage efficiency given the volume of telemetry from millions of vehicles. First, I'd define the schema using a column-oriented store where each column represents a metric like battery voltage or motor RPM. This allows us to apply highly effective compression algorithms independently per column.
For ingestion, data would flow into a high-speed in-memory structure before being flushed to disk as immutable SSTables. This ensures low-latency writes even during peak data bursts. Crucially, I would implement specialized compression: using Delta-of-Delta encoding for timestamps since they are sequential, and Gorilla XOR compression for floating-point sensor readings, which typically exhibit small changes between samples. Together these can reduce storage needs by roughly an order of magnitude compared to storing raw, uncompressed samples.
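The core observation behind Gorilla's float compression can be shown with a few lines of Python. This sketch only computes the XOR stream; the real format additionally encodes the leading/trailing zero counts of each XOR into a handful of bits, which is omitted here for brevity.

```python
import struct

def float_bits(x):
    """Reinterpret a Python float as its 64-bit IEEE 754 bit pattern."""
    return struct.unpack('>Q', struct.pack('>d', x))[0]

def xor_stream(values):
    """XOR each 64-bit float pattern with its predecessor.
    Slowly changing sensor readings yield XORs with long runs of zero
    bits, which the full Gorilla encoding stores in very few bits."""
    prev = float_bits(values[0])
    out = [prev]
    for v in values[1:]:
        bits = float_bits(v)
        out.append(prev ^ bits)
        prev = bits
    return out

# A battery-voltage-like series that changes slowly between samples:
xors = xor_stream([3.70, 3.70, 3.71])
assert xors[1] == 0  # identical consecutive readings XOR to exactly zero
```

An unchanged reading compresses to a single bit in the full scheme, which is why near-constant sensor channels become almost free to store.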
Regarding indexing, a global partition key based on Vehicle ID (VIN) is essential for isolation. Within partitions, we maintain a sorted index on timestamps. For queries requiring filtering across multiple cars, I'd layer a secondary inverted index on tags like 'model' or 'region'. This hybrid approach supports both fast single-vehicle diagnostics and broad fleet-level analytics required for OTA updates and safety monitoring.
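The hybrid indexing scheme above can be sketched with an in-memory model: an inverted index maps tag pairs to series IDs, and each series keeps its points sorted by timestamp so range scans are binary searches. The series IDs, tag names, and data are invented for the example.

```python
from collections import defaultdict
import bisect

# Each series: tags for filtering, plus timestamp-sorted (ts, value) points.
series = {
    "vin1": {"tags": {"model": "S", "region": "EU"}, "points": [(1, 3.7), (5, 3.8)]},
    "vin2": {"tags": {"model": "3", "region": "EU"}, "points": [(2, 3.6), (6, 3.9)]},
}

# Inverted index: (tag_key, tag_value) -> set of matching series IDs.
inverted = defaultdict(set)
for sid, s in series.items():
    for k, v in s["tags"].items():
        inverted[(k, v)].add(sid)

def query(tag_key, tag_value, t_start, t_end):
    """Resolve the tag filter via the inverted index, then binary-search
    each matching series' sorted points for the time range."""
    results = {}
    for sid in inverted[(tag_key, tag_value)]:
        pts = series[sid]["points"]
        lo = bisect.bisect_left(pts, (t_start,))
        hi = bisect.bisect_right(pts, (t_end, float("inf")))
        results[sid] = pts[lo:hi]
    return results

res = query("region", "EU", 1, 5)  # fleet-level filter + time-range scan
```

A single-vehicle diagnostic query skips the inverted index entirely and goes straight to that VIN's partition, which is the fast path the partitioning scheme is designed for.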
Common Mistakes to Avoid
- Focusing solely on relational database features like ACID transactions instead of write optimization
- Ignoring the massive scale of data ingestion expected from a fleet of autonomous vehicles
- Suggesting generic compression methods like ZIP instead of domain-specific time-series encodings
- Overlooking the need for automatic data expiration or tiered storage for old telemetry data