Intro
A global service that deals with lots of data can suffer from high latency if only one data centre is serving all requests.
- High latency: for customers on the other side of the world, the round trip is longer. For certain applications, round-trip latency has to be really low, e.g. real-time apps: 200 ms or less; VoIP: 150 ms; streaming apps: less than a few seconds. The delay can be due to physical constraints: the bandwidth available (transmission delay), propagation delays, and queueing delays.
- Data-intensive applications: large data transfers across the globe can be problematic. The maximum transmission unit (MTU) can vary between networks along the path, which can severely reduce throughput. As more requests come in to the same data centre, more data has to be sent by that data centre. Streaming services are both data intensive and dynamic.
- Data centre resource limitations: compute and network bandwidth are not infinite. Serving millions of customers or requests requires scaling, and scaling within a single data centre still leaves a single point of failure.
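To get a feel for why distance alone matters, here is a rough back-of-the-envelope sketch of round-trip propagation delay. It assumes signals travel through fibre at about 2×10^8 m/s (roughly two thirds of the speed of light) and ignores routing detours, transmission delay and queueing delay, so real round trips will be longer. The distances and the function name are illustrative assumptions, not measurements.

```python
# Assumption: signal speed in optical fibre is roughly 2/3 of c.
SPEED_IN_FIBRE_M_PER_S = 2.0e8


def round_trip_propagation_ms(distance_km: float) -> float:
    """Lower bound on round-trip time caused by propagation alone."""
    one_way_s = (distance_km * 1000.0) / SPEED_IN_FIBRE_M_PER_S
    return 2 * one_way_s * 1000.0


# Illustrative distances (assumed, not measured).
for label, km in [
    ("same region", 500),
    ("cross-continent", 4000),
    ("other side of the world", 15000),
]:
    print(f"{label:>24}: ~{round_trip_propagation_ms(km):.0f} ms round trip")
```

Under these assumptions a 15,000 km path already costs about 150 ms per round trip before any other delay is added, which is the whole VoIP budget mentioned above.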
What is a CDN?
A content delivery network (CDN) is a group of geographically distributed proxy servers. A proxy here is an intermediary between the client and the origin server. The proxy servers are placed at the network edge, i.e. where a device or local network interfaces with the internet.
CDNs store static and dynamic data. The main problem that CDNs aim to solve is propagation delay, the time taken for a signal to reach its destination. They reduce it by bringing the data closer to its users. They also reduce transmission and queueing delay.
- transmission delay: the time it takes to push all of a packet’s bits onto the wire
- queueing delay: the time a packet waits in a queue before it can be transmitted
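The two definitions above can be turned into simple arithmetic. This is a minimal sketch that assumes a single link with a fixed bandwidth and a FIFO queue of equally sized packets; the function names and the example numbers (1500-byte packets, a 100 Mbit/s link) are assumptions for illustration only.

```python
def transmission_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to push every bit of one packet onto the link."""
    return packet_bytes * 8 / link_bps * 1000.0


def queueing_delay_ms(packets_ahead: int, packet_bytes: int, link_bps: float) -> float:
    """Wait time in a FIFO queue: every packet ahead must be transmitted first.

    Assumes all queued packets have the same size.
    """
    return packets_ahead * transmission_delay_ms(packet_bytes, link_bps)


# Example: 1500-byte packets on a 100 Mbit/s link (assumed values).
print(f"transmission: {transmission_delay_ms(1500, 100e6):.2f} ms")
print(f"queueing behind 10 packets: {queueing_delay_ms(10, 1500, 100e6):.2f} ms")
```

Spreading requests over many proxy servers shortens each server's queue, which is one way a CDN can lower queueing delay.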
Thus the origin data centre only has to send data to the CDN proxy servers, which then distribute it to end users as needed. End users rarely need to fetch data directly from the origin server. A good way to utilise a CDN is to ensure that the most popular data is stored/cached in it.
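One common way to keep popular data at the edge is a pull-through cache with least-recently-used (LRU) eviction. The sketch below is an assumption about how this could look, not a description of any particular CDN: `EdgeCache`, `fake_origin` and the capacity of 2 are hypothetical names and values chosen for the example.

```python
from collections import OrderedDict
from typing import Callable, List


class EdgeCache:
    """Pull-through cache: serve from memory, fall back to the origin on a miss."""

    def __init__(self, capacity: int, fetch_from_origin: Callable[[str], bytes]) -> None:
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin
        self._items = OrderedDict()  # path -> body, oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self._items:
            self._items.move_to_end(path)  # mark as most recently used
            self.hits += 1
            return self._items[path]
        self.misses += 1
        body = self.fetch_from_origin(path)
        self._items[path] = body
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return body


origin_calls: List[str] = []


def fake_origin(path: str) -> bytes:
    """Hypothetical stand-in for a request to the origin server."""
    origin_calls.append(path)
    return f"content of {path}".encode()


cache = EdgeCache(capacity=2, fetch_from_origin=fake_origin)
for path in ["/a", "/b", "/a", "/c", "/a", "/b"]:
    cache.get(path)

print(f"hits={cache.hits} misses={cache.misses} origin_calls={origin_calls}")
```

In this run "/a" is requested often enough to stay cached, while "/b" is evicted and has to be fetched from the origin a second time.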
Functional requirements
- Retrieve: CDN should be able to retrieve content from origin servers
- Request: CDN servers should be able to respond to each user’s request
- Deliver: Origin servers must be able to push content to CDN proxy servers
- Search: CDN should be able to run a search against a user query for cached or non-cached content within the CDN infrastructure
- Update: CDN should be able to update the content within peer CDN proxy servers in a PoP (point of presence)
- Delete: Based on the type of content, it should be possible to delete cached entries from the CDN servers after a user-defined period
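The requirements above can be read as a small interface on a proxy server. The following is a toy, in-memory sketch under several assumptions: the origin is a plain dictionary, peers in a PoP are other objects in the same process, and the "user-defined period" is a single time-to-live (TTL) per server. All class, method and variable names are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Entry:
    body: bytes
    expires_at: float


class ProxyServer:
    def __init__(self, origin: Dict[str, bytes], ttl_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.origin = origin          # stand-in for the origin server
        self.ttl_s = ttl_s            # user-defined retention period
        self.clock = clock
        self.cache: Dict[str, Entry] = {}
        self.peers: List["ProxyServer"] = []  # other proxies in the same PoP

    def store(self, key: str, body: bytes) -> None:
        """Deliver: the origin (or a peer) pushes content to this proxy."""
        self.cache[key] = Entry(body, self.clock() + self.ttl_s)

    def delete_expired(self) -> None:
        """Delete: drop entries older than the configured TTL."""
        now = self.clock()
        for key in [k for k, e in self.cache.items() if e.expires_at <= now]:
            del self.cache[key]

    def search(self, key: str) -> Optional[bytes]:
        """Search: look in the local cache, then in peer caches."""
        for server in [self, *self.peers]:
            server.delete_expired()
            if key in server.cache:
                return server.cache[key].body
        return None

    def retrieve(self, key: str) -> Optional[bytes]:
        """Retrieve: pull content from the origin and cache it."""
        body = self.origin.get(key)
        if body is not None:
            self.store(key, body)
        return body

    def respond(self, key: str) -> Optional[bytes]:
        """Request: answer a user request, going to the origin only on a miss."""
        found = self.search(key)
        return found if found is not None else self.retrieve(key)

    def update_peers(self, key: str) -> None:
        """Update: copy a cached entry to every peer in the PoP."""
        entry = self.cache.get(key)
        if entry is not None:
            for peer in self.peers:
                peer.store(key, entry.body)


now = [0.0]  # fake clock so the example does not need to sleep
origin = {"/index.html": b"<h1>hi</h1>"}
a = ProxyServer(origin, ttl_s=60, clock=lambda: now[0])
b = ProxyServer(origin, ttl_s=60, clock=lambda: now[0])
a.peers.append(b)
b.peers.append(a)

first = a.respond("/index.html")    # miss, so it is retrieved from the origin
a.update_peers("/index.html")       # b now holds a copy too
via_peer = b.search("/index.html")
now[0] = 61.0                       # move past the TTL
after_ttl = a.search("/index.html")
print(first, via_peer, after_ttl)
```

A real CDN would spread these responsibilities over separate components and network calls. The sketch only shows how the six requirements relate to each other.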