High level design of the Distributed Cache
Functional requirements
- insert data into the distributed cache
- retrieve data from the distributed cache
NFR
- Low latency - insert and retrieve operations should be fast
- scalability - distributed cache must be scalable horizontally without bottlenecks
- availability - ensure that there is sufficient redundancy to keep the system highly available.
- Consistency - concurrent reads of the same key should return the same value.
- Affordability - shouldn’t cost the planet
API
- put (key, value)
- get (key)
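The two calls above can be sketched minimally as follows. This is an illustrative, in-process stand-in (a plain dict playing the role of the distributed store), not the real client API:

```python
# Minimal sketch of the put/get API; a local dict stands in
# for the distributed cache purely for illustration.
cache = {}

def put(key, value):
    """Insert or overwrite a value in the cache."""
    cache[key] = value

def get(key):
    """Return the cached value, or None on a miss."""
    return cache.get(key)

put("user:42", {"name": "Ada"})
```

A miss simply returns `None`, which matches the later design choice of treating unreachable servers as cache misses.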
Design considerations
Storage hardware
Use commodity hardware where possible to keep costs low. Larger datasets demand more shards, so commodity hardware lets us scale out efficiently.
When it comes to storing data, we could also use persistent storage on the commodity hardware for less frequently accessed data. Whether this is worthwhile depends on the cost of fetching data from disk and the performance expected from the cache.
Data structure
A hash table for storing and retrieving data, with a doubly linked list to track recency for eviction.
Cache client
Where does it live? Is this within the serving host? or is this a separate dedicated cache client?
Writing policy
Based on application needs, choose one: write-through, write-back, or write-around.
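The three policies differ only in which tier a write touches first. A sketch, with plain dicts standing in for the cache and the backing store (both hypothetical here):

```python
# Sketch of the three write policies; `cache` and `storage` dicts
# stand in for the real cache tier and persistent store.
cache, storage = {}, {}
dirty = set()  # keys written to cache but not yet flushed (write-back)

def write_through(key, value):
    # Write to cache and storage synchronously: simple and consistent,
    # but every write pays the storage latency.
    cache[key] = value
    storage[key] = value

def write_back(key, value):
    # Write to cache only and flush later: fast writes,
    # but risks data loss if the cache node dies before the flush.
    cache[key] = value
    dirty.add(key)

def write_around(key, value):
    # Write to storage only; the cache fills on a later read miss.
    # Avoids polluting the cache with data that is never read back.
    storage[key] = value
    cache.pop(key, None)
```

Write-back gives the best write latency; write-through gives the strongest consistency between cache and store.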
Eviction
LRU and invalidation.
High Level Design
Client application -> a load balancer => forks into the different application service hosts, each running a cache client => fans out to the various cache servers -> which all sit in front of the persistent storage.
The cache clients should have a consistent view of all the cache servers. They should all agree on the writing policy and eviction policy.
Cache clients use UDP or TCP to transfer data. If a cache server goes down, the client treats the request as a cache miss.
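The miss-on-failure behavior can be sketched as below. `fetch_from_server` and `fetch_from_storage` are hypothetical callables standing in for the real network calls:

```python
# Sketch: a client read that treats an unreachable cache server
# as an ordinary cache miss and falls back to persistent storage.
def get_with_fallback(key, fetch_from_server, fetch_from_storage):
    try:
        value = fetch_from_server(key)  # may raise on a downed server
        if value is not None:
            return value                # cache hit
    except ConnectionError:
        pass  # server down: fall through, same as a miss
    return fetch_from_storage(key)      # miss path
```

This keeps the client logic simple: one code path handles both "key not cached" and "server unreachable".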
Addressing limitations
- cache clients are unaware of cache server failures or additions
- each server owns a particular slice of the data - so each shard is still a single point of failure.
- what data-structure to be used inside the cache.
First problem - cache server list
There are many ways to solve this:
- Put a configuration file on every application server that is updated regularly through CI or deployment tooling whenever a cache server is removed from or added to the cluster
- put the same configuration in a central location
- create a cache server configuration service - it observes the cache cluster's health and maintains an up-to-date view of the available nodes.
Availability and resilience
Sharding and replication. Add a primary and 2 replicas per shard, like we saw in the key-value store design. Writes propagate to replicas either synchronously or asynchronously, chosen based on server proximity.
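A rough sketch of that replication choice, with dicts standing in for the primary and replica nodes (all names here are illustrative):

```python
# Sketch: one shard with a primary and two replicas.
# Sync replication blocks the write; async queues it for later.
class Shard:
    def __init__(self):
        self.primary = {}
        self.replicas = [{}, {}]
        self.pending = []  # writes queued for async replication

    def put(self, key, value, sync=True):
        self.primary[key] = value
        if sync:
            # synchronous: replicas are updated before put() returns
            for replica in self.replicas:
                replica[key] = value
        else:
            # asynchronous: acknowledge now, replicate later
            self.pending.append((key, value))

    def flush(self):
        """Drain the async queue to the replicas."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()
```

Sync replication favors consistency (a replica promoted after a failure has every write); async favors write latency at the cost of possibly losing the queued writes.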
Internals of the cache server
Looking at a shard of the cache cluster - which comprises a primary and a couple of secondary nodes.
Retrieval uses the key to look up a hash-table entry, which points to the value's node in the doubly linked list backing the hash table.
There is no need for a delete API - eviction effectively handles deletion.
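The hash-table-plus-doubly-linked-list combination above is the classic LRU structure. A sketch of the per-shard store (illustrative, not the actual server code):

```python
# Sketch of a per-shard LRU store: a hash table maps keys to nodes in a
# doubly linked list ordered by recency; both get and put are O(1).
class Node:
    __slots__ = ("key", "value", "prev", "next")
    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.table = {}  # key -> Node
        # Sentinel nodes: head side = most recent, tail side = least recent.
        self.head, self.tail = Node(), Node()
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.table.get(key)
        if node is None:
            return None          # cache miss
        self._unlink(node)
        self._push_front(node)   # mark as most recently used
        return node.value

    def put(self, key, value):
        if key in self.table:
            self._unlink(self.table[key])
        elif len(self.table) >= self.capacity:
            lru = self.tail.prev          # least recently used node
            self._unlink(lru)
            del self.table[lru.key]       # eviction doubles as deletion
        node = Node(key, value)
        self.table[key] = node
        self._push_front(node)
```

Note how eviction in `put` is the only place entries are removed, which is why no separate delete API is needed.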
Detailed design
Client applications -> load balancer => forks into several application servers, each with a cache client => the clients talk to a cache configuration service that keeps them aware of the hosts in the cache cluster => cache shards store data in the RAM of the cache servers
All of this is monitored.
Considerations
How to deal with hot data despite attempting to distribute data evenly through sharding?
Consistent hashing with virtual nodes could help. There are many other methods too:
- load-aware sharding
- improved hash functions
- dynamic load balancing
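Consistent hashing with virtual nodes can be sketched as below: each physical server is hashed onto the ring at many points, so keys (and load) spread more evenly and a hot range is shared across servers. Server names here are illustrative:

```python
import bisect
import hashlib

# Sketch of consistent hashing with virtual nodes: each server appears
# at `vnodes` points on the ring, smoothing out uneven key distribution.
def _hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers, vnodes=100):
        self.ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self.points = [point for point, _ in self.ring]

    def server_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]
```

Adding or removing a server only remaps the keys on that server's virtual-node arcs, which is exactly the property the configuration-service approach above relies on when the cluster membership changes.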