Design a Hash Set from Scratch

Data Structures
Easy
Uber

Implement a HashSet data structure using an array of buckets (lists or arrays) and a custom hash function. Implement `add`, `contains`, and `remove` methods.

Why Interviewers Ask This

Uber interviewers ask this to verify your foundational grasp of memory management and collision handling. They specifically evaluate whether you understand how to map arbitrary keys to fixed array indices, handle edge cases like hash collisions, and manage dynamic resizing without relying on built-in libraries.

How to Answer This Question

1. Clarify requirements: Ask about expected load factors, the key data types, and whether the structure must be thread-safe or support concurrent access.
2. Define core components: Propose a fixed-size array where each index holds a linked list or dynamic array to store colliding elements.
3. Design the hash function: Explain your choice of the modulo operator (key % capacity) and discuss potential issues with negative numbers or poor distribution.
4. Implement the logic: Walk through `add` by checking existence first, `contains` by traversing the bucket, and `remove` by finding and unlinking the node.
5. Discuss optimization: Mention rehashing when the load factor exceeds a threshold (e.g., 0.75) to maintain O(1) average performance, a critical detail for high-traffic systems like Uber's.
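The core components described above can be sketched in Python using separate chaining, with plain lists as buckets. This is a minimal illustration, not a production-ready structure; class and method names are illustrative, and resizing is deferred to the optimization step:

```python
class HashSet:
    """Minimal hash set using separate chaining (lists as buckets)."""

    def __init__(self, capacity=16):
        self.capacity = capacity
        self.size = 0
        self.buckets = [[] for _ in range(capacity)]

    def _hash(self, key):
        # abs() guards against negative keys producing a negative index
        return abs(key) % self.capacity

    def add(self, key):
        bucket = self.buckets[self._hash(key)]
        if key not in bucket:  # check existence first to prevent duplicates
            bucket.append(key)
            self.size += 1

    def contains(self, key):
        # Traverse only the one bucket the key hashes to
        return key in self.buckets[self._hash(key)]

    def remove(self, key):
        bucket = self.buckets[self._hash(key)]
        if key in bucket:
            bucket.remove(key)  # unlink the element from its bucket
            self.size -= 1
```

With a reasonable capacity and an even key distribution, each bucket stays short, so all three operations average O(1).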

Key Points to Cover

  • Explicitly handling hash collisions using chaining or open addressing
  • Implementing a custom hash function that accounts for negative integers
  • Defining and managing a load factor to trigger dynamic resizing
  • Ensuring O(1) average time complexity for all operations
  • Discussing edge cases like empty buckets or duplicate insertions
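The negative-integer point deserves a concrete check, because the behavior of `%` is language-dependent. The snippet below contrasts Python's modulo (already non-negative for a positive modulus) with the Java/C convention, where a negative remainder would be an out-of-bounds index; the `abs` fix shown is the portable habit to demonstrate in an interview:

```python
capacity = 16
key = -7

# Python's % returns a non-negative result for a positive modulus:
print(-7 % 16)             # 9

# In Java or C, -7 % 16 evaluates to -7, an invalid array index,
# so a common portable fix is to take the absolute value first:
index = abs(key) % capacity
print(index)               # 7
```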

Sample Answer

To design a robust HashSet from scratch, I would start by defining a class that initializes an internal array of buckets. Each bucket is a dynamic list so collisions can be handled gracefully. For the hash function, I'd use a standard approach: compute the index as the key modulo the current array size. This spreads keys across the available slots, though distribution ultimately depends on the keys themselves. I must also handle negative hash codes by taking the absolute value before applying the modulo operator, so the index stays in bounds.

When implementing the add method, I first compute the index using the hash function. If the element already exists in that bucket, I return immediately to prevent duplicates; otherwise, I append it to the bucket list. The contains method follows a similar path: calculate the index, traverse the bucket, and check for equality. For removal, I locate the element in the correct bucket and delete it, ensuring I don't leave gaps that affect future lookups.

Crucially, I would implement a resize strategy. If the number of elements divided by the array size exceeds a load factor of 0.75, I double the array size and rehash all existing elements into the new buckets. This prevents performance degradation from excessive collisions, which is vital for maintaining low latency in real-time applications like ride-matching systems at Uber.
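The resize strategy in this answer can be sketched as follows. This is an illustrative standalone class (names like `ResizableHashSet` and `_resize` are hypothetical); the key detail is that every element must be rehashed after resizing, because bucket indices depend on the capacity:

```python
class ResizableHashSet:
    """Chaining hash set that doubles its capacity at a 0.75 load factor."""

    LOAD_FACTOR = 0.75

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.size = 0
        self.buckets = [[] for _ in range(capacity)]

    def _index(self, key):
        return abs(key) % self.capacity

    def _resize(self):
        # Indices depend on capacity, so every key must be rehashed
        # against the new, doubled table size.
        old_buckets = self.buckets
        self.capacity *= 2
        self.buckets = [[] for _ in range(self.capacity)]
        for bucket in old_buckets:
            for key in bucket:
                self.buckets[self._index(key)].append(key)

    def add(self, key):
        if self.size / self.capacity >= self.LOAD_FACTOR:
            self._resize()
        bucket = self.buckets[self._index(key)]
        if key not in bucket:
            bucket.append(key)
            self.size += 1
```

Checking the load factor before each insertion keeps the average bucket length bounded, which is what preserves O(1) average-case performance as the set grows.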

Common Mistakes to Avoid

  • Relying on built-in language libraries instead of implementing the underlying logic manually
  • Failing to handle negative hash codes, leading to out-of-bounds array errors
  • Ignoring the need for rehashing when the table becomes too full
  • Not explaining how collisions are resolved, resulting in O(n) worst-case scenarios

