What is it?
Scalability is the ability of a system to handle a growing workload by adding resources, without compromising performance.
The workload can take different forms, such as the number of requests or the amount of data stored or retrieved.
Most of this article is based on what’s on Wikipedia.
Dimensions of scalability
Administrative scalability
The ability of an increasing number of organisations or users to access the system.
Functional scalability
The ability to continuously add functionality to a system without degrading or disrupting its expected performance.
Geographic scalability
The ability to maintain effectiveness during expansion from a local area to a larger region.
Load scalability
The ability of the system to expand and contract to accommodate heavier or lighter loads. This includes the ease with which a system or component can be modified, added, or removed to accommodate varying loads.
And the list goes on.
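To make load scalability a little more concrete, here is a minimal sketch of a sizing rule that expands and contracts a pool of nodes as the load changes. The function name, the per-node capacity figure, and the min/max bounds are all assumptions made up for illustration; a real system would measure capacity and apply its own policy.

```python
import math


def nodes_needed(requests_per_second: float, capacity_per_node: float,
                 min_nodes: int = 1, max_nodes: int = 10) -> int:
    """Return how many nodes to run for the current load.

    Hypothetical policy: divide the load by what one node can handle,
    round up, then clamp to the allowed pool size.
    """
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    required = math.ceil(requests_per_second / capacity_per_node)
    # Never shrink below min_nodes or grow beyond max_nodes.
    return max(min_nodes, min(max_nodes, required))


# Assumed figure: each node handles about 100 requests per second.
light = nodes_needed(0, 100)      # contracts to the minimum pool
medium = nodes_needed(250, 100)   # 2.5 nodes of work, rounded up
heavy = nodes_needed(5000, 100)   # capped at the maximum pool
```

With these assumed numbers the pool contracts to one node when idle, grows to three nodes at 250 requests per second, and stops at the ten-node cap under very heavy load.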
Common approaches to scaling
Vertical scaling - scale up
This is the process of adding resources to an existing node or device: expanding existing hardware or software capacity up to the limits of the server. Beyond a certain point, adding resources this way yields minimal gains - the point of diminishing returns. That is when you look for the alternative.
Horizontal scaling - scale out
This refers to increasing the number of nodes in a distributed system. The idea is to buy several cheap commodity servers and deploy them at scale, so that the cost does not increase rapidly as capacity grows.
The difficulty here is that one has to build a system whose nodes work together as if the whole system were a single huge server.
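One simple way to make several nodes look like a single server is to put a dispatcher in front of them. The sketch below uses round-robin routing, which is only one of many possible strategies and is not prescribed by the text above. The class name and node names are hypothetical, and a real balancer would also have to deal with health checks, failures, and shared state.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand each incoming request to the next node in turn (illustrative only)."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        # cycle() repeats the node list endlessly, in order.
        self._nodes = cycle(nodes)

    def route(self, request):
        """Return (node, request) for the node chosen to serve this request."""
        return next(self._nodes), request


# Hypothetical node names; callers only ever talk to the balancer.
balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
assignments = [balancer.route(f"req-{i}")[0] for i in range(5)]
```

Here five requests are spread over three nodes in order, wrapping around to the first node after the third. Scaling out then amounts to constructing the balancer with a longer node list, while callers keep using the same single entry point.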