What are clarifying questions and calculations

Most clarifying questions exist to make sure you are building the right product for a specific target audience. Some of them also help you make the right design choices early on.

What factors are involved in these questions

Remember the era when companies bought racks of servers in data centres? These days many large companies run their own data centres and rent out their virtual and physical servers for others to use. Most of these services use commodity hardware to save cost and build scalable solutions. Let’s discuss the types of servers that are used in a data centre.

Web servers

Web servers are the first point of contact after the load balancers. Data centres have racks full of web servers that usually handle API calls from clients. Depending on the service, their CPU, memory, and storage requirements can vary widely.

Application servers

They run the core application software and business logic.

What’s the difference between app servers and web servers? The line is not black and white. App servers do additional computation and therefore typically serve dynamic content, whereas web servers mostly serve static content. Because app servers handle heavy computation, they may need processors with many cores along with plenty of RAM and disk space.

Storage servers

The storage needs of applications have grown enormously over time. High-end cameras now produce 60+ megapixel images, and each image file can be about 50-60 MB. Demand for video content has also grown heavily over the years, creating the need for large amounts of elastic storage (storage you can expand as you need).
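As a rough illustration of how file sizes like these turn into storage requirements, here is a minimal back-of-the-envelope sketch. Only the 50-60 MB file size comes from the text. The upload volume, the replication factor, and the function name `daily_storage_tb` are hypothetical, chosen purely for the example.

```python
MB_PER_TB = 1_000_000  # decimal units: 1 TB = 1,000,000 MB


def daily_storage_tb(uploads_per_day: int, avg_file_mb: float, replicas: int = 1) -> float:
    """Estimate the storage consumed per day, in terabytes."""
    return uploads_per_day * avg_file_mb * replicas / MB_PER_TB


# Hypothetical workload (assumptions, not figures from the text):
# 1 million image uploads per day, 55 MB average size, 3 replicas of each file.
per_day_tb = daily_storage_tb(1_000_000, 55, replicas=3)
per_year_pb = per_day_tb * 365 / 1000  # 1 PB = 1,000 TB

print(f"Per day:  {per_day_tb:.0f} TB")
print(f"Per year: {per_year_pb:.1f} PB")
```

Under these assumed inputs the estimate comes to 165 TB per day, or roughly 60 PB per year. This is why elastic storage matters: the exact numbers are less important than their order of magnitude.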

Some examples of the various types of storage needs at YouTube:

  1. Blob storage for its encoded videos
  2. Temporary processing queue storage that can hold a few hundred hours of video for upload processing
  3. Bigtable for storing video thumbnails
  4. RDBMS for user and video metadata
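To make the temporary processing queue in the list above more concrete, the sketch below sizes a buffer that holds a given number of hours of video. The 300 hours and the 8 Mbps average bitrate are assumptions picked for illustration, not YouTube figures, and `buffer_size_gb` is a hypothetical helper.

```python
def buffer_size_gb(hours_of_video: float, bitrate_mbps: float) -> float:
    """Storage needed to hold the given hours of video, in decimal gigabytes.

    bitrate_mbps is the average video bitrate in megabits per second.
    """
    seconds = hours_of_video * 3600
    megabytes = seconds * bitrate_mbps / 8  # 8 bits per byte
    return megabytes / 1000  # 1 GB = 1,000 MB


# Assumed inputs: 300 hours of queued video at an average of 8 Mbps.
print(f"{buffer_size_gb(300, 8):.0f} GB")
```

With these assumed inputs the queue needs on the order of 1 TB, which is small next to long-term blob storage but still has to be provisioned separately.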

There are several other storage needs as well: you might need storage for analytics, and for SQL and NoSQL database management systems.

Standard numbers to remember

You don’t have to remember the exact numbers, only a ballpark range so that you can weigh options.

| Component | Latency (nanoseconds) |
| --- | --- |
| L1 cache reference | 0.9 |
| Memory reference | 100 |
| Read 1MB from memory | 3,000 |
| Read 1MB from SSD | 9,000 |
| Round trip within same data centre | 500,000 |
| Read 1MB from SSD at 1GB/s | 1,000,000 |
| Read 1MB sequentially from magnetic disk | 2,000,000 |
| Send data packet from west coast to east coast | 71,000,000 |
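To get a feel for these orders of magnitude, a small sketch like the following converts some of the figures into more readable units and compares them. The values are copied from the table above. The names `LATENCY_NS`, `humanize`, and `ratio` are illustrative only.

```python
# Latency figures copied from the table above, in nanoseconds.
LATENCY_NS = {
    "L1 cache": 0.9,
    "Memory reference": 100,
    "Round trip within same data centre": 500_000,
    "Read 1MB sequentially from magnetic disk": 2_000_000,
    "Send data packet from west coast to east coast": 71_000_000,
}


def humanize(ns: float) -> str:
    """Render a nanosecond value in ns, us (microseconds), or ms."""
    if ns >= 1_000_000:
        return f"{ns / 1_000_000:g} ms"
    if ns >= 1_000:
        return f"{ns / 1_000:g} us"
    return f"{ns:g} ns"


def ratio(slow: str, fast: str) -> float:
    """How many times slower one operation is than another."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


for name, ns in LATENCY_NS.items():
    print(f"{name}: {humanize(ns)}")
```

For instance, `ratio("Round trip within same data centre", "Memory reference")` evaluates to 5000, so one round trip inside the data centre costs about as much as 5,000 main memory references. This kind of relative comparison is what the ballpark numbers are for.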

Some people even go into detail about throughput numbers per datastore. Honestly, that is too much information, and those numbers vary widely depending on the spec of the machine the application runs on.

Request Estimation