Warning
🚧 This series is a Work in Progress
I am writing, rewriting and publishing almost every day. Some of the write-ups link to others, and sometimes I realise that context should have been given earlier, so I reorganise the pages in the series.
It felt more valuable to share this series as I wrote it, rather than wait till I polish everything up. This keeps me motivated to iterate quickly and avoid procrastination.
Feel free to share your thoughts and opinions as comments.
What are k8s objects?
Objects in k8s are entities that k8s uses to represent the state of your cluster. They are used to describe:
- what applications are running on which nodes in the cluster
- what resources are available to those applications
- what policies exist around the behaviour of those applications
Like I mentioned in the description of the control loop in Controller Manager, these object specs determine the desired state.
How do I work with them?
In order to work with k8s objects, you need to use the k8s API. You do this using the kubectl CLI. If you wish to programmatically interact with the API, you could use one of the Client Libraries.
How do I describe an object to k8s?
In order to describe a k8s object, you have to create a manifest file, which is a YAML file with something called spec, short for specification of the object. The spec is where you describe the characteristics of the resource - the desired state.
The other side of this is the status, which describes the current state of the object, supplied and maintained by the k8s system. The control plane actively manages every object’s state to match the desired state that you supplied in the spec.
So whenever a spec is updated, the control loop works to make the status look like what’s described in the spec.
Example
Let’s look at a manifest file example. The manifest file is a plain text file, most commonly written in YAML by convention. You could write it in JSON too, if you really want. No one’s going to stop you. You then use kubectl to send this spec, or desired state, to the control plane. kubectl converts the YAML to JSON behind the scenes, sends it to the API Server and, via the API, requests the control plane to achieve the desired state.
The control plane obliges.
The following is a manifest taken from the example at k8s.io
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
You might have noticed that before the spec field in the manifest, there are some other fields that I haven’t spoken about yet. Notice the name and the kind fields. We’ll cover them soon.
In order to let the control plane know about your desired state, you use kubectl and pass the path to the manifest file.
kubectl apply -f path/to/my/manifest.yaml
Et voilà! If all goes well, kubectl prints something like this on the command line.
deployment.apps/nginx-deployment created
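Once the object exists, the copy stored in the cluster carries both halves discussed earlier: the spec you supplied and the status that the control plane maintains. Here is a trimmed sketch of what kubectl get deployment nginx-deployment -o yaml might return. The numbers under status are illustrative, not captured from a real cluster, and most fields are left out.

```yaml
# Trimmed, illustrative view of the stored object (not real cluster output).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2            # desired state: what you asked for
status:
  replicas: 2            # observed state: what the control plane currently sees
  updatedReplicas: 2
  readyReplicas: 2
  availableReplicas: 2
```

When the numbers under status match the spec, the control loop has nothing left to do until either side changes.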
Anatomy of a manifest file - mandatory fields
Some of the fields in the manifest you saw earlier are mandatory fields and cannot be omitted.
- apiVersion - the version of the k8s API you are talking to
- metadata - data that helps uniquely identify your object in the cluster - name, uid, namespace etc.
- kind - there are a few different kinds of objects you can create
- spec - desired state of your object
Check out the k8s API Reference to figure out all the different fields for different kinds of k8s objects and all the gory details.
So what are some objects in k8s and what do they mean?
There are several different types of objects that you’ll come across in k8s. Many of these objects are watched by controllers of the same name which then ensure that your application maintains the desired state.
Before we go ahead, I’d like to introduce a concept that is used a lot in the Kubernetes world, and in the distributed systems world in general:
A workload is an application running on kubernetes. Whatever your workload is, whether it is a single component or is comprised of many components, k8s runs them inside a set of pods.
That helps me start the definitions of the objects.
Pod
A pod represents a set of running containers on your cluster! Pods are the most fundamental deployable units of computing that you can create and manage in your k8s cluster.
If you look at the definition of the word pod in a dictionary, you’ll find things like:
a seed vessel of a leguminous plant such as the pea, splitting open on both sides when ripe
a detachable or self-contained unit on an aircraft, spacecraft, vehicle, or vessel, having a particular function: the torpedo’s sensor pod
a small herd or school of marine animals, especially whales
A pod in k8s is a fundamental deployment unit that comprises one or more containers, with shared storage and network resources and a specification for how to run the containers.
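To make that concrete, here is a minimal sketch of a manifest for a single-container pod. The pod name is hypothetical; the image and port are borrowed from the Deployment example earlier in this post.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod          # hypothetical name, chosen for illustration
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
```

In practice you will rarely create bare pods like this. You usually let one of the objects described below create them for you.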
Why more than one container?
A pod may need to run multiple containers when they are so closely related that they have to work together as a single unit. Containers coupled like this are always co-located and co-scheduled, and they run in a shared context, i.e. a set of Linux namespaces and cgroups (not to be confused with the k8s Namespace object described later). Whether you need this depends on your application and its design.
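As a rough sketch of what sharing looks like, the pod below runs two containers that mount the same volume: one writes a file, the other serves it. The pod, container and volume names are made up for illustration, and the helper container is just a shell loop, not a recommended design.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-content-writer     # hypothetical name
spec:
  volumes:
  - name: shared-content            # scratch volume that lives as long as the pod
    emptyDir: {}
  containers:
  - name: web
    image: nginx:1.14.2
    volumeMounts:
    - name: shared-content
      mountPath: /usr/share/nginx/html
  - name: content-writer            # illustrative helper container
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date > /content/index.html; sleep 5; done"]
    volumeMounts:
    - name: shared-content
      mountPath: /content
```

Both containers are scheduled onto the same node together, and they also share the pod’s network, so they could talk to each other over localhost.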
Pods may also contain init containers, ephemeral containers or, more recently, sidecar containers! Fascinated? These are pretty advanced topics at this point, so you don’t need to think about them now, but if you are curious, search and find out.
Running multiple containers in a pod is a relatively advanced use case. Use this pattern where it is absolutely necessary - always remember
just because you can, doesn’t mean you should
ReplicaSets
ReplicaSets are used to maintain a stable set of replica Pods running at any given time in a k8s cluster to ensure high availability. They are used to guarantee the availability of a specified number of identical pods.
How do they work?
Like other objects, you define a ReplicaSet in a YAML file with all the necessary fields. You have to tell the ReplicaSet how to identify the Pods it manages (the selector), how many replicas to run, and what each pod looks like (the pod template).
Pod Template
A pod template is a blueprint for creating a pod. It defines the pod spec inside the relevant manifest file. It includes metadata such as labels and annotations to identify the pod. It also includes the container images, commands, arguments and environment variables, any storage volumes that the pod might use, the CPU and memory resources required by the containers, the health checks (liveness and readiness probes), and some security settings, called the security context, for the pod and its containers.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
  labels:
    app: guestbook
    tier: frontend
spec:
  # modify replicas according to your case
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5
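The template in the ReplicaSet manifest above is deliberately small. Below is a sketch of a fuller pod template that touches most of the fields listed in the Pod Template description. It is only a fragment that would sit under spec of a ReplicaSet or Deployment, and the annotation, resource figures, probe settings and volume are assumptions made for illustration, not values from the k8s.io example.

```yaml
# Fragment only: this sits under "spec:" of a ReplicaSet or Deployment.
template:
  metadata:
    labels:
      tier: frontend
    annotations:
      example.com/owner: web-team       # hypothetical annotation
  spec:
    containers:
    - name: php-redis
      image: us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5
      env:
      - name: GET_HOSTS_FROM            # environment variable passed to the container
        value: dns
      resources:
        requests:                       # what the scheduler reserves for the container
          cpu: 100m
          memory: 100Mi
        limits:                         # the ceiling the container may not exceed
          memory: 200Mi
      readinessProbe:                   # is the container ready to receive traffic?
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 5
      livenessProbe:                    # is the container still healthy?
        httpGet:
          path: /
          port: 80
        periodSeconds: 10
      securityContext:
        allowPrivilegeEscalation: false
      volumeMounts:
      - name: cache
        mountPath: /cache
    volumes:
    - name: cache                       # illustrative scratch volume
      emptyDir: {}
```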
When do you use one?
The official recommendation from the k8s.io docs is to use a Deployment instead. A Deployment allows you to do much more than just specify that you need a certain number of replicas for your pod: it manages ReplicaSets for you and provides a declarative approach to updating your pods. In fact, you may never need to use a ReplicaSet directly in production and can rely solely on defining and updating Deployments instead.
Deployment
Deployment is the most common way to describe and run an application on your cluster. It provides a way to declaratively update your application on k8s.
You describe a desired state in a Deployment, and the Deployment controller changes the actual state to the desired state at a controlled rate. The Deployment describes your application: the number of replicas that have to run at any given time, the containers to run, and so on.
They are used for:
- rolling out a ReplicaSet
- declaring the new state of the pods - updating your applications
- rolling back to an earlier deployment revision
- scaling up the deployment based on needs
- pausing the rollout of a deployment
- cleaning up older ReplicaSets
As this is a series for managers to get an overview, I’m not going to go into the details of each. But all of the interaction with k8s is done via kubectl, and the intent is shared with the cluster through the deployment manifest file.
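To give a feel for what sharing intent through the manifest looks like, here is a sketch of the earlier nginx-deployment edited to scale up and move to a newer image. The replica count, the new image tag and the rollout settings are assumptions chosen for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 4                   # scaled up from 2
  revisionHistoryLimit: 5       # how many old ReplicaSets to keep around for rollbacks
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1               # at most one extra pod during the rollout
      maxUnavailable: 0         # never drop below the desired replica count
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.16.1     # changed from 1.14.2; a template change triggers a rollout
        ports:
        - containerPort: 80
```

Applying the edited file is the whole update. The Deployment controller creates a new ReplicaSet for the new template and gradually shifts pods over to it.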
DaemonSets
As the name suggests they are used to run daemons.
In computer science, a daemon is a process that runs in the background.
A DaemonSet ensures that a set of nodes, which may be the whole set of nodes in a cluster, each run a copy of a pod. When you delete a DaemonSet, all the pods created as part of it will be cleaned up. And when the nodes running those daemons are removed from the cluster, the daemon pods will be garbage collected too.
This is how we have solutions for storage, logs collection, monitoring etc running on k8s.
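As a sketch, a DaemonSet for a log collector might look like the manifest below. The names and the image are placeholders, not a real product, and the host path is an assumption about where the node keeps its logs.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector                   # hypothetical name
spec:
  selector:
    matchLabels:
      name: log-collector
  template:
    metadata:
      labels:
        name: log-collector
    spec:
      containers:
      - name: log-collector
        image: registry.example.com/log-collector:1.0   # placeholder image
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog                    # the node's own log directory
        hostPath:
          path: /var/log
```

Notice that there is no replicas field. The number of pods follows the number of matching nodes.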
PersistentVolume
Storage is a different problem to compute. When talking about k8s, most people are concerned about their microservices - the compute part of it - and often forget that there could be stateful applications that need to persist data somewhere. The PersistentVolume concept provides an API for users and admins that abstracts the details of how storage is provisioned from how it is consumed in k8s.
As we have covered Pods, you already know that pods can be created and destroyed at any time and as often as needed. Storing data inside a pod therefore means accepting that the data is volatile. If you want the data handled by your pod to be persisted somewhere, you have to resort to Persistent Volumes.
How can I use a persistent volume?
In simple steps:
- Provision a persistent volume (PV)
- Then create a persistent volume claim (PVC) - i.e. a request for storage by the user - claiming a certain size of storage
- k8s then binds the PVC to a PV that meets the criteria defined by the claim
- Use the PVC in your pod spec YAML to mount the PV and start accessing it for storage!
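The steps above can be sketched as three small manifests. All names, sizes and paths here are assumptions for illustration, and the hostPath volume type is only suitable for trying things out on a single-node cluster.

```yaml
# Step 1: a PersistentVolume, typically provisioned by an admin.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv                  # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  storageClassName: manual
  hostPath:
    path: /mnt/data              # assumption: a directory on the node
---
# Step 2: a PersistentVolumeClaim, the user's request for storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc                 # hypothetical name
spec:
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Step 4: a pod that mounts the claim (step 3, the binding, is done by k8s).
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod                 # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.14.2
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: demo-pvc
```

The pod only refers to the claim by name. It never needs to know which volume ends up backing it.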
I am not going to go further into the details as this is an overview of some core k8s concepts.
So if you want to know more, head over to the docs on this.
Service
In k8s, a Service is a concept that exposes an application over the network, within or beyond your k8s cluster, backed by one or more Pods. As we know by now, pods are ephemeral - they could be created and destroyed in a short period of time. Every time a pod comes up, it could have a different IP address in the cluster. A Service is the abstraction that provides a stable endpoint, i.e. an IP address and a DNS name, for a set of pods, based on how you define them. This allows other pods, components or end-users to interact with these pods without having to know their individual IP addresses.
This provides some benefits which I think you might have already guessed:
- Services make your applications discoverable. They have a stable IP and DNS entry, making them easier to consume.
- They work like load balancers when you have a pod with replicas. They abstract away the fact that several replicas are serving requests. Clients do not connect directly to a pod; they connect to the service.
This abstraction enables you to expose a service in many different ways, which k8s refers to as service types: internally within the cluster (ClusterIP), on a static port on all nodes (NodePort), through a load balancer that exposes the service externally (LoadBalancer), and so on.
Wondering why one would expose a service on a static port on all nodes? This makes the service reachable from outside the cluster through any node - useful for testing, development or some specific use case that needs external access. It also makes the service consistently accessible on the same port on every node - applications like monitoring tools could make use of this. However, these are only examples to help you understand what this means. It doesn’t mean you should, or have to, use this.
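Here is a sketch of two Services in front of the nginx pods from the Deployment example, one for each of the first two service types. The Service names and the node port number are assumptions made for illustration.

```yaml
# Reachable only from inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: nginx-service            # hypothetical name
spec:
  type: ClusterIP
  selector:
    app: nginx                   # routes to any pod carrying this label
  ports:
  - protocol: TCP
    port: 80                     # port the Service listens on
    targetPort: 80               # port on the pods
---
# Additionally reachable on a static port on every node.
apiVersion: v1
kind: Service
metadata:
  name: nginx-nodeport           # hypothetical name
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 30080              # illustrative; must fall in the node port range (30000-32767 by default)
```

The selector is what ties a Service to its pods. Pods can come and go, and as long as they carry the label, the Service keeps routing to them.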
StatefulSet
A StatefulSet is an interesting one. It is used to describe and manage an application that stores state. When your app manages state, the order and uniqueness of the pod instances become important.
This is particularly useful when your application:
- needs persistent unique network identifiers
- does something with persistent storage - like managing a database or datastore
- needs graceful, ordered deployment and scaling
Examples
Some real world examples to demonstrate the use-cases better:
- Your application is a database. Say you have a SQL database to run in the cluster, where you need to ensure there is one primary instance and a secondary instance, plus their replicas. These instances need persistent identities so that requests can reach the right instance at any given time.
- Your application is a distributed file system, like HDFS (Hadoop Distributed File System). Each node needs a persistent identity and storage to manage the file system metadata and the data blocks.
- You are running a messaging system like Kafka on k8s. Your Kafka brokers need a stable identity and persistent storage for the cluster to function.
- You are running a monitoring system like Prometheus on your k8s cluster, where, yet again, each instance needs persistent storage that survives pod restarts.
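As a sketch of how stable identity and per-pod storage show up in a manifest, here is a StatefulSet for a made-up database, together with the headless Service it relies on for network identity. The names, image, port and sizes are all placeholders.

```yaml
# Headless Service: gives each pod a stable DNS name.
apiVersion: v1
kind: Service
metadata:
  name: db                       # hypothetical name
spec:
  clusterIP: None
  selector:
    app: db
  ports:
  - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db                # ties the pods to the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: db
        image: registry.example.com/my-database:1.0   # placeholder image
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: data
          mountPath: /var/lib/data
  volumeClaimTemplates:          # one PersistentVolumeClaim is created per pod
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```

The pods are created in order and named db-0, db-1 and db-2. Each one keeps its name and its own volume claim even if it is rescheduled onto another node.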
Gotchas and limitations
- Deleting a StatefulSet from your cluster doesn’t delete the persistent storage associated with it, which is ideal in most cases as it keeps your data safe.
- Deleting a StatefulSet gives no guarantees about how its pods are terminated. For an ordered, graceful termination, the prescribed way is to scale it down to 0 prior to deletion.
For more details, check out the Limitations section of the StatefulSet docs.
Namespace
Namespaces are a means to divide cluster resources among multiple users or teams. Namespaces are a grouping and isolation mechanism that makes it easier to manage cluster resources and organise complex environments.
A very common use case is when a cluster is shared by multiple teams - some people refer to this as multi-tenancy. Namespaces provide isolated little worlds within a cluster. Two different teams can have services of the same name, as long as they belong to different namespaces, which reduces the chances of naming conflicts too. Namespaces are probably most useful when enforcing limits and quotas for different groups.
Also worth mentioning here is that a k8s cluster starts with a few namespaces, called initial namespaces: default, kube-node-lease, kube-public and kube-system.
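To illustrate the limits and quotas use case, here is a sketch of a namespace for one team together with a quota that caps what the team can consume. The names and numbers are made up for illustration.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a                   # hypothetical team namespace
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota             # hypothetical name
  namespace: team-a
spec:
  hard:
    pods: "20"                   # at most 20 pods in this namespace
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```

Any object created with namespace: team-a in its metadata then lives inside that little world and counts against the quota.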