SLO-driven architecture diagram showing reliability metrics integrated from design to deployment

Stop Shipping Hope: SLOs Must Guide Your Architecture, Not Just Your Release

Let’s talk about something fundamental, something often relegated to the last minute, but which, when embraced early, can elevate the craft of software engineering from mere coding to true engineering excellence. I’m speaking of Service Level Objectives (SLOs) and Service Level Indicators (SLIs).

Remind me what they are again

SLI - Service Level Indicator: A quantitative metric for a service’s performance, as experienced by the user of the service. It is a measure of a property of the service that is a good proxy for your user experience. ...

December 4, 2025 · 9 min · 1770 words · eakangk

Transactional Outbox Pattern Architecture Diagram

Engineering Reliability: Navigating the Trade-offs of the Transactional Outbox Pattern

In my career spanning financial market data platforms, telecom systems, insurance quoting systems and energy billing, I’ve come to appreciate that the craft of true software engineering isn’t about avoiding complexity; it is about choosing the right kind of complexity. In the world of event-driven architectures (EDA), when a microservice needs to change its state and notify the rest of the world of that change, it faces a fundamental engineering challenge known as the Dual Write Problem. This is the Achilles’ heel of distributed systems: ensuring that a local database update and an external event publication form an atomic pair. The database write and the event publication must either both succeed or both fail; if only one succeeds, consistency is broken. ...
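To make the dual-write idea concrete, here is a minimal sketch of the outbox approach using Python's built-in sqlite3 module. The `orders` and `outbox` tables, the `OrderPlaced` event shape, and the `place_order` and `relay_once` helpers are all hypothetical names chosen for illustration, not taken from the post; a real system would use its own database and message broker.

```python
import json
import sqlite3


def create_schema(conn):
    # Hypothetical tables: business state plus an outbox for pending events.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, order_id, fail_before_commit=False):
    # "with conn" commits on success and rolls back on an exception,
    # so the state change and the event row are stored together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, ?)", (order_id, "PLACED")
        )
        if fail_before_commit:
            raise RuntimeError("simulated crash between the two writes")
        event = {"type": "OrderPlaced", "order_id": order_id}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_once(conn, publish):
    # Relay step: hand unpublished events to the broker callback, then mark them.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because the relay publishes before it marks a row as published, a crash between those two steps would resend the event on the next run, so this sketch assumes consumers can tolerate duplicates (at-least-once delivery).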

September 8, 2025 · 6 min · 1151 words · eakangk