Stop Shipping Hope: SLOs Must Guide Your Architecture, Not Just Your Release

Let’s talk about something fundamental that is often left to the last minute but that, when embraced early, can elevate software engineering from mere coding to true engineering excellence: Service Level Objectives (SLOs) and Service Level Indicators (SLIs). Remind me what they are again? SLI (Service Level Indicator): a quantitative metric for a service’s performance as experienced by the user of the service. It measures a property of the service that is a good proxy for the user experience. ...

December 4, 2025 · 9 min · 1767 words · eakangk

What is Incident Management in Software Engineering?

Background: Any software that has ever been built has had a bug or problem of some sort. Often these bugs are minor and of no great concern, like a button that looks odd or only registers a click when the mouse is over a certain part of it. Some bugs, on the other hand, can have a serious impact on the users of the software or on people indirectly affected by it. For example, a problem with the billing system in an energy billing platform could change the amount customers have to pay: what if the bug multiplied the final amount by an arbitrary number? Imagine the damage to the reputation of an energy company that is a client of a billing SaaS provider if such a bug were to happen. ...

March 2, 2024 · 19 min · 3953 words · eakangk

Site Reliability Engineering vs DevOps — How they differ and when to use each

What is SRE? SRE stands for Site Reliability Engineering. That’s just a lot of words. What does it mean, though? Site Reliability Engineering is what IT operations would be if it were run by software engineers. That’s an interesting take, but it doesn’t clarify much about SRE just yet. Let’s probe further. How did we go from Development to SRE? You know the part where people deploy software and then ensure things run fine in production. ...

December 4, 2021 · 14 min · 2849 words · eakangk