High level design of the Distributed Cache
Functional requirements
- insert data into the distributed cache
- retrieve data from the distributed cache
NFR
- Low latency - insert and retrieve operations should be fast
- scalability - distributed cache must be scalable horizontally without bottlenecks
- availability - ensure that there is sufficient redundancy to keep the system highly available.
- Consistency - concurrent reads of the same key should return the same value.
- Affordability - shouldn’t cost the planet
API
- put (key, value)
- get (key)
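The two calls above can be sketched minimally as follows. This is an illustrative, in-process stand-in (a plain dict playing the role of the distributed store), not the real client API:

```python
# Minimal sketch of the put/get API; a local dict stands in
# for the distributed cache purely for illustration.
cache = {}

def put(key, value):
    """Insert or overwrite a value in the cache."""
    cache[key] = value

def get(key):
    """Return the cached value, or None on a miss."""
    return cache.get(key)

put("user:42", {"name": "Ada"})
```

A miss simply returns `None`, which matches the later design choice of treating unreachable servers as cache misses.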
Design considerations
Storage hardware
Use commodity hardware where possible to keep costs low. Larger datasets demand more shards, so commodity hardware lets us scale out efficiently.
When it comes to storing data, we could also use persistent storage on the commodity hardware for less frequently accessed data. Whether this is worthwhile depends on the cost of fetching data from disk and the performance expected from the cache.
Data structure
A hash table for storing and retrieving data, with a doubly linked list to track recency for eviction.
Cache client
Where does it live? Is this within the serving host? or is this a separate dedicated cache client?
Writing policy
Based on application needs, choose one: write-through, write-back, or write-around.
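The three policies differ only in which tier a write touches first. A sketch, with plain dicts standing in for the cache and the backing store (both hypothetical here):

```python
# Sketch of the three write policies; `cache` and `storage` dicts
# stand in for the real cache tier and persistent store.
cache, storage = {}, {}
dirty = set()  # keys written to cache but not yet flushed (write-back)

def write_through(key, value):
    # Write to cache and storage synchronously: simple and consistent,
    # but every write pays the storage latency.
    cache[key] = value
    storage[key] = value

def write_back(key, value):
    # Write to cache only and flush later: fast writes,
    # but risks data loss if the cache node dies before the flush.
    cache[key] = value
    dirty.add(key)

def write_around(key, value):
    # Write to storage only; the cache fills on a later read miss.
    # Avoids polluting the cache with data that is never read back.
    storage[key] = value
    cache.pop(key, None)
```

Write-back gives the best write latency; write-through gives the strongest consistency between cache and store.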
Eviction
LRU and invalidation.
High Level Design
Client application -> a load balancer => forks into the different application service hosts, each running a cache client => fans out to the various cache servers -> which all sit in front of the persistent storage.
The cache clients should have a consistent view of all the cache servers. They should all agree on the writing policy and eviction policy.
Cache clients use UDP or TCP to transfer data. If a cache server goes down, the client treats the request as a cache miss.
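The miss-on-failure behavior can be sketched as below. `fetch_from_server` and `fetch_from_storage` are hypothetical callables standing in for the real network calls:

```python
# Sketch: a client read that treats an unreachable cache server
# as an ordinary cache miss and falls back to persistent storage.
def get_with_fallback(key, fetch_from_server, fetch_from_storage):
    try:
        value = fetch_from_server(key)  # may raise on a downed server
        if value is not None:
            return value                # cache hit
    except ConnectionError:
        pass  # server down: fall through, same as a miss
    return fetch_from_storage(key)      # miss path
```

This keeps the client logic simple: one code path handles both "key not cached" and "server unreachable".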
Addressing limitations
- cache clients are unaware of cache server failures or additions
- each server owns a particular slice of the data - so each shard is still a single point of failure.
- what data-structure to be used inside the cache.
First problem - cache server list
There are many ways to solve this:
- Put a configuration file on every application server that is updated regularly through CI or deployment tooling whenever a cache server is removed from or added to the cluster
- put the same configuration in a central location
- create a cache server configuration service - it observes the cache cluster's health and maintains an up-to-date view of the available nodes.
Availability and resilience
Sharding and replication. Add a primary and 2 replicas per shard, like we saw in the key-value store design. Writes propagate to replicas either synchronously or asynchronously, chosen based on server proximity.
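A rough sketch of that replication choice, with dicts standing in for the primary and replica nodes (all names here are illustrative):

```python
# Sketch: one shard with a primary and two replicas.
# Sync replication blocks the write; async queues it for later.
class Shard:
    def __init__(self):
        self.primary = {}
        self.replicas = [{}, {}]
        self.pending = []  # writes queued for async replication

    def put(self, key, value, sync=True):
        self.primary[key] = value
        if sync:
            # synchronous: replicas are updated before put() returns
            for replica in self.replicas:
                replica[key] = value
        else:
            # asynchronous: acknowledge now, replicate later
            self.pending.append((key, value))

    def flush(self):
        """Drain the async queue to the replicas."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()
```

Sync replication favors consistency (a replica promoted after a failure has every write); async favors write latency at the cost of possibly losing the queued writes.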
Internals of the cache server
Looking at a shard of the cache cluster - which comprises a primary and a couple of secondary nodes.
Retrieval uses the key to look up a hash-table entry, which points to the value's node in the doubly linked list backing the hash table.
There is no need for a delete API - eviction effectively handles deletion.
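The hash-table-plus-doubly-linked-list combination above is the classic LRU structure. A sketch of the per-shard store (illustrative, not the actual server code):

```python
# Sketch of a per-shard LRU store: a hash table maps keys to nodes in a
# doubly linked list ordered by recency; both get and put are O(1).
class Node:
    __slots__ = ("key", "value", "prev", "next")
    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.table = {}  # key -> Node
        # Sentinel nodes: head side = most recent, tail side = least recent.
        self.head, self.tail = Node(), Node()
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.table.get(key)
        if node is None:
            return None          # cache miss
        self._unlink(node)
        self._push_front(node)   # mark as most recently used
        return node.value

    def put(self, key, value):
        if key in self.table:
            self._unlink(self.table[key])
        elif len(self.table) >= self.capacity:
            lru = self.tail.prev          # least recently used node
            self._unlink(lru)
            del self.table[lru.key]       # eviction doubles as deletion
        node = Node(key, value)
        self.table[key] = node
        self._push_front(node)
```

Note how eviction in `put` is the only place entries are removed, which is why no separate delete API is needed.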
Detailed design
Client applications -> load balancer => forks into several application servers, each with a cache client => the clients talk to a cache configuration service that keeps them aware of the hosts in the cache cluster => cache shards store data in the RAM of the cache servers
All of this is monitored.
Considerations
How to deal with hot data despite attempting to distribute data evenly through sharding?
Consistent hashing with virtual nodes could help. There are many other methods too:
- load-aware sharding
- improved hash functions
- dynamic load balancing
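Consistent hashing with virtual nodes can be sketched as below: each physical server is hashed onto the ring at many points, so keys (and load) spread more evenly and a hot range is shared across servers. Server names here are illustrative:

```python
import bisect
import hashlib

# Sketch of consistent hashing with virtual nodes: each server appears
# at `vnodes` points on the ring, smoothing out uneven key distribution.
def _hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers, vnodes=100):
        self.ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self.points = [point for point, _ in self.ring]

    def server_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]
```

Adding or removing a server only remaps the keys on that server's virtual-node arcs, which is exactly the property the configuration-service approach above relies on when the cluster membership changes.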