Distributed Systems

What is a System Design Interview

Complete guide to understanding system design interviews, what interviewers expect, and how to prepare effectively for distributed systems and scalability questions.

Domain Name System (DNS) Explained

Complete guide to understanding DNS (Domain Name System), how it works as the internet’s phonebook, DNS architecture, and resource record types for system design interviews.

What is System Design? Understanding the Fundamentals

Introduction to system design fundamentals covering core concepts, design principles, thinking patterns, and the philosophy behind building large-scale systems.

Engineering Reliability: Navigating the Trade-offs of the Transactional Outbox Pattern

In my career spanning financial market data platforms, telecom systems, insurance quoting systems and energy billing, I’ve come to appreciate that the craft of software engineering isn’t about avoiding complexity; it is about choosing the right kind of complexity. In event-driven architectures (EDA), when a microservice needs to change its state and notify the rest of the world of that change, it faces a fundamental engineering challenge known as the Dual Write Problem. This is the Achilles’ heel of distributed systems: ensuring that a local database update and an external event publication form an atomic pair. The write to the database and the event publication must either both succeed or both fail; if one succeeds on its own, consistency is broken. ...
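To make the Dual Write Problem above concrete, here is a minimal sketch of the outbox idea using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the `orders` and `outbox` tables, the `OrderPlaced` event, and the `publish` callback are hypothetical and not taken from the post. The state change and the event record are written in one local transaction, and a separate relay step publishes pending rows later.

```python
import json
import sqlite3


def create_schema(conn):
    # Hypothetical tables: business state plus an outbox of pending events.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "event_type TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, order_id, fail_before_commit=False):
    # "with conn" commits on success and rolls back if an exception escapes,
    # so the order row and the outbox row are stored together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, ?)", (order_id, "PLACED")
        )
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id})),
        )
        if fail_before_commit:
            # Simulated failure between the writes and the commit.
            raise RuntimeError("simulated crash")


def relay_once(conn, publish):
    # Publish every unpublished event, then mark it as published.
    # A crash between publish() and the UPDATE would resend the event on the
    # next run, so consumers should tolerate duplicates (at-least-once delivery).
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

In a real service the relay would typically run as a background poller or be replaced by change data capture, and `publish` would call a message broker client; both are left abstract here.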

Design a Content Delivery Network (CDN)

Complete CDN design walkthrough covering architecture, edge servers, caching policies, content propagation, cache invalidation, and serving content globally at scale.

Design a Distributed Key-Value Store

Complete key-value store design covering requirements, API design, consistent hashing, data partitioning, replication strategies, failure handling, and scaling.

Load Balancers Explained - Part 1

Complete guide to load balancers covering algorithms (round-robin, least connections), global vs local load balancing, stateful vs stateless approaches, and scaling strategies.

Failure Models in Distributed Systems

Complete guide to failure models covering crash failures, omission failures, Byzantine failures, network partitions, and fallacies of distributed systems.

Consistency Models in Distributed Systems

Complete guide to consistency models covering strong consistency, eventual consistency, CAP theorem, linearizability, and trade-offs in distributed systems.

What is a System Design Interview

Most good organisations expect their engineering hires to have gone through a system design interview. For the vast majority of people this might sound like a strange thing to ask, as most people don’t individually design large-scale systems, so being expected to design a highly scalable and available system in less than 60 minutes is daunting. We must also consider that not everyone gets to work in organisations that build large-scale distributed systems. My time at FactSet was when I dealt with extremely large volumes of data; we focussed on getting database query times below 5 ms so that, even with latency, users would see their graphs plotted in a second or so. Most applications don’t need performance at that level because it probably doesn’t matter that much for their line of business. ...