What is Software Architecture and what does a software architect do?

What is Software Architecture

Software architecture refers to the fundamental structures of a software system and the discipline of creating such structures and systems.

What is a Software Architect

Some people compare the role of a software architect to that of a town planner. The town planner does not care how a house in a residential block is built. The town planner is responsible for deciding where the residential blocks go, where the commercial complexes, parks, schools and other institutions belong, how everything is laid out to make the town easy to navigate, and how essentials like water and electricity get to where they are needed in the most efficient way. They also have to think about where to place the exits to highways, and whether certain parts of town might need flyovers, traffic signals, roundabouts and so on as the population evolves.

All that I mentioned earlier is, in the software world, what a software architect takes care of. That is, they are not interested in how you build your particular component or what language you use to build it. But they set the rules about how you talk to other components and how many instances of your component might be necessary. I don’t really see many roles like this today, as software architecture has become a collaboratively built concept that evolves based on constraints and the direction in which a company or product is moving. I haven’t seen this job title in my most recent jobs. It is probably more common in consulting organisations, where people work with client organisations to get them up and running or to solve some of their technical challenges. In an agile environment, the technical leads or senior engineers in the team or department generally work together and come up with what evolves into the architecture of the application.

Alright, I have digressed. Let us stick to the topic.

Let me try again. The software architect:

- Focuses on the structure, not the low-level details
- Anticipates expensive choices: what is decided at the architecture level is often expensive to change later on
- Makes the core decisions that enable high quality, availability and other such qualities

Sound familiar? Those who are technical leads probably already do this a lot. In some smaller firms, engineering managers do this too. In my most recent role, I was managing, acting as scrum master and also involved in design and code reviews. I think it comes down to company culture: in some companies people are given multiple hats to wear.

How does one go about this

Begin architectural thinking by considering the requirements:

Functional requirements: the main functionality of the software
Other requirements, often called non-functional requirements
- Some examples include:
  - evolves over several years - maintainability
  - serves several million transactions/users - scalability
  - availability and efficiency
Constraints or restrictions
- legal compliance
- standards
- cost
- hiring etc
Prioritisation
- We cannot build everything within the time constraints given to us. This is where prioritisation comes into the picture. It helps us find out what must go first and what can be delivered gradually: which features are the bare minimum and, hence, which non-functional requirements are most important. Ensure that this is agreed with your client so that everyone is on the same page.

Software architecture, unlike that of a building, evolves. In building construction, the architect comes up with radical ideas that often push the limits of civil and structural engineering. But software architecture is in no way like that. You do not make architectural choices to symbolically represent emotions or feelings. Software architecture evolves based on the functional and non-functional requirements of the application.

What is the difference between Software Design and Software Architecture

This is a tricky distinction and often a confusing one; some people even use the term design to refer to architecture.

However, I distinguish the two by thinking this way: Architecture involves a focus on creating a structured system that must comply with the non-functional requirements.

Software design, on the other hand, is about how the code-level components are structured: the responsibilities of a class, the interfaces of a module, etc. This is what software engineers start thinking about as they grow in seniority in the team.

Where a software engineer might be obsessing over design patterns, an architect would be looking at architectural patterns. These are common patterns that can be applied at a much higher level of the system or suite of systems.

In other words, where a software engineer talks about SOLID principles, an architect asks how the different web services would communicate, which services must be public, whether we need an API gateway, and so on.

How to start with architecture of a system

Start with designing a system that meets the bare minimum and evolve.

I think this applies not just to architecture but to any sort of software development. Do not cater for every possible scenario in the beginning. This could lead to analysis paralysis or to over-engineering. This is the essence of lean thinking and agile software development: focus on delivering increments of business value by developing iteratively.

What architecture might be most suitable for my problem?

I found this free ebook online that seems like a good starting point. EBOOK - Software Architecture Patterns

Some common architectural patterns can be viewed here:

Typical application development starts with a layered (n-tier) architecture with no emphasis on anything else. This is often a sign that no serious thought has gone into the architecture of the system.

Walk through an example of improving the sample

Let us start tackling some non-functional requirements. Take a look at the sample architecture diagram. You can see that we have several servers running our API and several databases, and the databases probably have redundant replicas for disaster recovery too.

What do you see? We can certainly handle a decent amount of load.

The requests from the UI go to the API via a load balancer that decides which API instance serves the request.
The API then makes a request to a pool of database servers via another load balancer.

So in the case of our earlier diagram, we have plenty of servers. Let us say that after the system went live, we discovered that there are times when the system performs slowly. We see latency, and we can observe that anything that needs to be verified has to go all the way to the database.

How do we make it faster?

A simple improvement is to employ a cache! In computing, a cache is a high-speed storage layer that stores a subset of data, usually the most frequently requested data that also stays unchanged for long periods. Subsequent requests for that data can then be served from the cache instead of the database.

Caching can be done in many ways:

Cache data using a CDN
Cache the most frequently used things in memory
Use a distributed cache to store commonly accessed data
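To make the in-memory option concrete, here is a minimal sketch of the idea: look in the cache first and only go to the database on a miss. The names `CacheAside` and `fake_db_lookup` are made up for illustration, and the "database" is just a function that returns a string, so treat this as a sketch under those assumptions rather than a production cache.

```python
from typing import Callable, Dict


class CacheAside:
    """Checks an in-memory dict first and falls back to the slow lookup on a miss."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self._load_from_db = load_from_db
        self._cache: Dict[str, str] = {}
        self.db_hits = 0  # counts how often we had to go to the database

    def get(self, key: str) -> str:
        if key in self._cache:
            # Cache hit: no database round trip needed.
            return self._cache[key]
        # Cache miss: load from the database and remember the result.
        self.db_hits += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value


def fake_db_lookup(key: str) -> str:
    """Hypothetical stand-in for a real (slow) database query."""
    return f"value-for-{key}"
```

Calling `get` twice with the same key only reaches the database once; the second call is served from memory. Note that nothing here ever removes or refreshes an entry.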

This is great. But what if cached data gets outdated?

Every solution introduces a different kind of problem. A cache certainly helps speed up queries, but with caching comes a new problem: invalidating the cache when data is outdated. Some popular ways:

Timed invalidation - data stored in the cache has an expiry
Alternative ways of ensuring cached data is always up to date - employ these techniques based on how crucial it is to serve the latest data
- if the latest data is not that important and the mechanism that keeps the data fresh is expensive, then maybe it is not worth the effort.
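Timed invalidation can be sketched in a few lines. The class below is a hypothetical `TTLCache`, not any particular library's API: each entry stores its expiry time and is dropped the first time it is read after that time. The `clock` parameter is an assumption added so the behaviour can be shown without actually waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Cache whose entries expire a fixed number of seconds after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be simulated
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        # Store the value together with the moment it stops being valid.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: invalidate it and report a miss.
            del self._entries[key]
            return None
        return value
```

A miss (`None`) tells the caller to go back to the database. The trade-off is the one described above: a short expiry keeps data fresher but sends more traffic to the database.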

Now what?

Caching is a strategy that can be employed after we have a system up and running. As discussed earlier, we optimise something after we see the pain points, not before we hit them. Premature optimization is often unnecessary. Always keep this in mind before thinking of a solution.

Things software architects must know

CAP Theorem

Something key to understanding how to architect or design a distributed system is the CAP theorem by Eric Brewer.

CAP is an acronym where the letters stand for Consistency, Availability and Partition tolerance.

An excellent explanation can be seen from the video below:

The theorem formalizes the trade-off between consistency and availability: when a network partition occurs, a distributed system has to give up one of the two.
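A toy model can make the trade-off easier to picture. The sketch below is purely illustrative and assumes a made-up `TinyCluster` with two replicas and a `partitioned` flag. In "CP" mode it refuses writes during a partition (consistent but unavailable); in "AP" mode it accepts them on the reachable replica only (available but inconsistent). Real systems are far more nuanced than this.

```python
class Replica:
    def __init__(self) -> None:
        self.value = "v1"


class TinyCluster:
    """Two replicas that normally stay in sync; 'mode' decides what to do when they cannot."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True means a and b cannot talk to each other

    def write(self, value: str) -> bool:
        """Returns True if the write was accepted."""
        if not self.partitioned:
            # Healthy network: update both replicas.
            self.a.value = value
            self.b.value = value
            return True
        if self.mode == "CP":
            # Prefer consistency: reject the write rather than diverge.
            return False
        # Prefer availability: accept the write on the reachable replica only.
        self.a.value = value
        return True
```

With the partition flag set, the CP cluster rejects the write and both replicas still agree, while the AP cluster accepts it and the replicas now disagree until the partition heals.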

Domain Name System

A naming system for computers, services or other resources connected to the internet or a private network.

The video takes you through what happens when you type in a URL in a browser.

Web Servers

Software that serves HTTP requests.
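As a rough illustration of what "serving HTTP requests" means, here is a minimal sketch using Python's standard `http.server` module. The handler name `HelloHandler`, the helper `make_server` and the response text are assumptions made for this example; a real deployment would use a production-grade server.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a small plain-text body."""

    def do_GET(self) -> None:
        body = b"hello from the web server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Keep the example quiet; the default implementation logs to stderr.
        pass


def make_server(port: int = 0) -> HTTPServer:
    """Binds to localhost; port 0 lets the operating system pick a free port."""
    return HTTPServer(("127.0.0.1", port), HelloHandler)
```

Calling `make_server(8000).serve_forever()` would start listening on port 8000 until the process is stopped.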

Load balancing

A load balancer is a device or piece of software that distributes network requests or traffic across a cluster of servers. This helps improve the responsiveness and availability of applications.
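One of the simplest distribution strategies is round robin: hand each new request to the next server in the list and wrap around at the end. The sketch below assumes a hypothetical `RoundRobinBalancer` class and made-up server names; real load balancers also track server health, weights and open connections.

```python
import itertools
from typing import Iterator, List


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        # itertools.cycle repeats the list endlessly in the same order.
        self._cycle: Iterator[str] = itertools.cycle(servers)

    def next_server(self) -> str:
        return next(self._cycle)
```

With three servers, the fourth request goes back to the first server, so each instance ends up with roughly a third of the traffic.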

Database servers

This is a computer with software dedicated to storing and retrieving data. It is generally the gateway to some Database Management System (DBMS). Different types of DBMS have different types of database servers.

Content Delivery Networks

A content delivery network (CDN) is a service that accelerates the delivery of content over the internet. That is an oversimplified definition; watch the video to learn more.

Caching services

We roughly touched on this topic earlier, but you can learn more here.

I think I already mentioned a case for a distributed cache. You can learn more about this here:

Message Queues

A queue is a data structure that imitates a real-life queue: the item that gets inserted first gets out first (FIFO, first in, first out). When building large-scale systems, we often need some sort of message queue to provide additional resilience and to guarantee some order of processing in our systems.
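The FIFO behaviour can be seen with Python's standard `queue.Queue`, used here as an in-process stand-in for a real message broker. The function name `run_pipeline` and the upper-casing "work" are assumptions for illustration only.

```python
import queue
import threading
from typing import List


def run_pipeline(messages: List[str]) -> List[str]:
    """Sends messages through a FIFO queue to a single consumer thread."""
    q: queue.Queue = queue.Queue()
    processed: List[str] = []

    def consumer() -> None:
        while True:
            item = q.get()
            if item is None:
                # None is our agreed signal that no more messages will arrive.
                break
            processed.append(item.upper())  # placeholder for real work

    worker = threading.Thread(target=consumer)
    worker.start()

    # Producer side: enqueue the messages, then the stop signal.
    for message in messages:
        q.put(message)
    q.put(None)

    worker.join()
    return processed
```

Because there is a single consumer reading from a FIFO queue, the results come out in the same order the messages went in. The producer never waits for the consumer to finish a message, which is where the decoupling comes from.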

Cloud storage

It is your disk on the cloud! I might have oversimplified things a lot, but to learn more quickly, follow the video below.

Different types of database systems

Most often, as you begin your career, you start with relational databases. Over the past decade, there has been an explosion in mainstream database systems. You might have already heard of NoSQL databases. Luckily someone has made a really cool summary video. It takes you through key-value, wide-column, document, relational, graph and other databases.

There are some that weren’t covered in that video, like time series databases:

What about lower level concepts

Just because I classified some of those topics as must-know does not mean an architect does not have to know anything else.

Someone who wants to become an architect usually needs hands-on experience of software development, including good software development practices. Let us start with some of the most important ones.

SOLID principles

This is something that every software developer who has worked with object-oriented programming languages knows about.

S – Single responsibility principle. A class should have one and only one reason to change, meaning that a class should have only one key responsibility.
O – Open-closed principle. Objects or entities should be open for extension, but closed for modification.
L – Liskov substitution principle. In an object oriented program if we substitute a superclass object reference with an object of a subclass, then the program should not fail/break.
I – Interface segregation principle. A client should never be forced to implement an interface that it does not use and clients must not be forced to depend on methods they do not use.
D – Dependency Inversion Principle. Entities must depend on abstractions not on concretions.
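As a small illustration of the last principle, the sketch below shows a high-level class depending on an abstraction rather than on a concrete implementation. All names here (`MessageSender`, `InMemorySender`, `OrderService`) are hypothetical and chosen only for this example.

```python
from typing import List, Protocol


class MessageSender(Protocol):
    """The abstraction: anything that can send a message to a recipient."""

    def send(self, recipient: str, body: str) -> None: ...


class InMemorySender:
    """One concrete implementation; an email or SMS sender could replace it."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, recipient: str, body: str) -> None:
        self.sent.append(f"{recipient}: {body}")


class OrderService:
    """High-level logic that only knows about the MessageSender abstraction."""

    def __init__(self, sender: MessageSender) -> None:
        self._sender = sender

    def place_order(self, customer: str, item: str) -> None:
        self._sender.send(customer, f"Your order for {item} is confirmed")
```

`OrderService` never names a concrete sender, so swapping the in-memory one for a real email sender requires no change to the order logic. This also happens to make the class easy to test.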

Here is a video of Uncle Bob (Robert C. Martin), who formulated these principles, talking about them. The SOLID acronym itself was coined later by Michael Feathers. The video is pretty long.

Tim Corey has a set of videos taking you through every principle with examples in C#. This is a playlist.

Continuous Integration, Delivery and Deployment

I think I have written about CI/CD in several different posts, the most recent one being

Continuous Integration using Azure Pipelines in YAML

An excellent source of details on this can be found on Atlassian’s blog

Summary

I think I have put together a reasonable summary. But there is a lot more, and there are many things you learn as you progress in your career. I will add more resources as I find interesting ones. For now, I think this is a lot.

Useful resources

Fundamentals of Software Architecture
