Design a System for Data Auditing
Design a system to log every read, write, and change to sensitive data for compliance purposes. Focus on integrity and tamper-proofing of the audit logs.
Why Interviewers Ask This
Interviewers at Salesforce ask this to evaluate your ability to design immutable, high-integrity systems for compliance. They specifically test whether you understand tamper-proofing mechanisms such as hash chains and whether you can balance strict audit requirements against system performance. The question reveals whether you can prioritize data security and regulatory adherence over convenience in a multi-tenant environment.
How to Answer This Question
1. Clarify Requirements: Define the scope up front: what counts as sensitive data (PII, financial records) and which retention policies GDPR or SOC 2 require.
2. Define Core Components: Outline the Data Source, Audit Logger, Storage Layer, and Verification Service.
3. Design Tamper-Proofing: Propose a cryptographic hash chain in which each log entry includes the hash of the previous one, so that any alteration breaks the chain.
4. Address Scalability: Discuss partitioning logs by tenant or region to handle Salesforce's massive scale without blocking write operations.
5. Security & Access Control: Explain role-based access control (RBAC) and encryption at rest and in transit.
6. Verification Strategy: Describe an automated integrity check that runs periodically to detect anomalies.
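The hash chain from step 3 can be sketched in a few lines. This is a minimal illustration, not a production design: the field names (actor, action, resource, timestamp), the all-zero genesis value, and the function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(body: dict) -> str:
    """Hash an entry's fields deterministically (sorted keys, compact JSON)."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(chain: list, actor: str, action: str, resource: str, timestamp: str) -> dict:
    """Append a new audit entry that embeds the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    body = {
        "actor": actor,
        "action": action,
        "resource": resource,
        "timestamp": timestamp,
        "prev_hash": prev_hash,
    }
    entry = dict(body, hash=entry_hash(body))
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> int:
    """Return the index of the first entry that fails verification, or -1 if intact."""
    prev_hash = GENESIS_HASH
    for i, entry in enumerate(chain):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or entry_hash(body) != entry["hash"]:
            return i
        prev_hash = entry["hash"]
    return -1
```

Editing a historical entry makes verify_chain report that entry; if the attacker also recomputes that entry's hash, the mismatch moves to the next entry's prev_hash. A chain on its own does not stop someone who can rewrite every entry from the tampered one onward, so in practice the latest hash would also need to be recorded somewhere the log writer cannot modify.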
Key Points to Cover
- Explicitly mentioning a hash-chain or Merkle tree structure to prove understanding of tamper-proofing
- Discussing asynchronous logging to prevent performance bottlenecks on the primary transaction path
- Addressing specific compliance frameworks like GDPR or SOC 2 relevant to enterprise software
- Defining clear separation between the data being audited and the audit logs themselves
- Proposing a concrete verification strategy to detect integrity breaches
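For the Merkle tree option in the first point, the sketch below computes a single root hash over a batch of log entries. It is one simple construction chosen for illustration: the one-byte leaf and node prefixes and the rule of promoting an unpaired node unchanged are assumptions, and real systems may define the tree shape differently.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries: list) -> str:
    """Compute a Merkle root (hex) over a list of byte strings, one per log entry."""
    if not entries:
        return _h(b"").hex()
    # Leaves and inner nodes get different prefixes so a leaf cannot pose as a node.
    level = [_h(b"\x00" + e) for e in entries]
    while len(level) > 1:
        # Hash adjacent pairs into the next level up.
        nxt = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # an unpaired node is promoted unchanged
        level = nxt
    return level[0].hex()
```

A verification job could store the root for each batch and later recompute it from the stored entries; any changed, added, or removed entry yields a different root.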
Sample Answer
To design a robust auditing system for sensitive data, I would first clarify that every read, write, and modification must be captured with a timestamp, a user ID, and, for writes, the before and after state. The core challenge is immutability. I propose an append-only ledger in which each new log entry contains a cryptographic hash of the previous entry, creating a linked chain. If an attacker modifies a historical record, the subsequent hashes no longer match, so the tampering is detected the next time the chain is verified.
For storage, we should use a distributed object store like S3 with versioning and a write-once retention lock (S3 Object Lock, for example) enabled, so entries cannot be overwritten or deleted during the retention period. To maintain performance during the high-throughput writes typical in Salesforce environments, the logging mechanism should be asynchronous: the main application thread writes to a local buffer, which is flushed to the audit service via a message queue like Kafka. This decouples the critical path from the heavy I/O of logging.
We must also enforce strict RBAC so that only authorized compliance officers can view raw logs, while all other users see only aggregated metrics. Finally, we implement a daily integrity verification job that recomputes the Merkle tree root of the log set and compares it with the previously recorded root to confirm that nothing has changed. This approach lets us meet strict compliance standards while maintaining system availability.
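The asynchronous path in the answer can be illustrated with a small in-process sketch. Here a bounded queue.Queue and a background thread stand in for the local buffer and the message-queue producer; the class name AsyncAuditLogger and the sink callable are hypothetical, and a real deployment would hand events to a durable queue rather than call a function.

```python
import queue
import threading


class AsyncAuditLogger:
    """Buffers audit events in memory and ships them from a background thread."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self, sink, max_buffered: int = 10_000):
        self._sink = sink  # callable taking one event; stand-in for a queue producer
        self._queue = queue.Queue(maxsize=max_buffered)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, event: dict) -> bool:
        """Enqueue without blocking the caller; return False if the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False  # caller decides: fail the request, or block and retry

    def _drain(self) -> None:
        # Runs on the worker thread: forward events in FIFO order until stopped.
        while True:
            event = self._queue.get()
            if event is self._STOP:
                break
            self._sink(event)

    def close(self) -> None:
        """Flush everything already enqueued, then stop the worker."""
        self._queue.put(self._STOP)
        self._worker.join()
```

The request thread only pays for an in-memory enqueue. The trade-off is durability: events still in the buffer are lost if the process crashes, and a full buffer forces a choice between rejecting the request and blocking it, which matters when compliance requires every access to be recorded.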
Common Mistakes to Avoid
- Focusing only on storing logs without explaining how to cryptographically prove they haven't been altered
- Designing a synchronous logging process that blocks user transactions, causing latency issues
- Ignoring the scale of data and failing to discuss partitioning or sharding strategies for the logs
- Overlooking access control measures, leaving audit trails vulnerable to internal threats
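On the partitioning point, one common approach is to route each tenant's audit events to a fixed partition so that a tenant's entries stay ordered and can carry their own hash chain. The sketch below assumes a fixed partition count and a hypothetical helper name, partition_for.

```python
import hashlib


def partition_for(tenant_id: str, num_partitions: int) -> int:
    """Map a tenant to a stable partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Use a cryptographic digest rather than Python's built-in hash(), which is
    # salted per process for strings and so is not stable across restarts.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Changing num_partitions remaps most tenants, so a real system would need a migration plan or a consistent-hashing scheme before resizing.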