Getting started with Kubernetes and Minikube for Microservices

Payam Mousavi
12 min read · Jul 17, 2023

I first heard about Kubernetes (K8s) back in 2019 when I was talking with a friend about Docker and Microservices. However, I never had the chance to work with the platform until recently, when I tried Google Kubernetes Engine (GKE) and Minikube to understand how we can use it to automate and manage application deployments at scale.

In this tutorial, I’ll share what I’ve learned after deploying sample applications on a Kubernetes cluster. You can check out the source code that we’ll be using on GitHub.

This is a rather long article so please read it carefully and patiently!

Kubernetes Basics

Here we quickly review the basic concepts and main features of Kubernetes so that the rest of the tutorial makes sense. We won’t dive deep into the Kubernetes architecture; the goal is just to understand its high-level components. You can use kubernetes.io as a reference.

What is Kubernetes?

Kubernetes is a powerful container management platform, originally developed by Google and now maintained by the Cloud Native Computing Foundation (CNCF). kubernetes.io defines it as:

Kubernetes is an open source container orchestration engine for automating deployment, scaling, and management of containerized applications.

Kubernetes runs our containerized applications by putting application containers into Pods, and putting pods into Nodes. A node can be a virtual or physical machine. We’ll see the definition of pods and nodes shortly.

[Figure: Kubernetes Cluster]

Kubernetes is an ideal solution for distributed systems such as Microservices, where multiple applications need to be deployed and managed separately in isolated environments while still being able to communicate with each other.

Kubernetes also provides a great collection of helpful services and features such as:

  • Service discovery, load balancing and storage orchestration
  • Automated rollouts and rollbacks
  • Container self-healing and health monitoring
  • Configuration and secrets management (e.g. ConfigMap and Secret)

Node

A node is a computing unit, either physical or virtual, in the Kubernetes architecture. Nodes are managed by a set of components called the Control Plane. A worker node runs the services needed to run pods (the kubelet, a container runtime and kube-proxy). A master node (also called a control plane node) is responsible for managing and monitoring the cluster and its worker nodes.

Cluster

The definition of a cluster now becomes easy: a group of nodes for running our containerized applications. A cluster includes a set of worker nodes and at least one master node (more than one for redundancy).

Pod

A pod is a group of one or more application containers, and is the smallest deployable unit in Kubernetes. Containers in a pod share storage and the pod’s network, so they can communicate with each other over localhost. Containers in a pod should ideally be closely related.

Another point worth mentioning is that Kubernetes pays off when running many containers rather than a single Docker container, as it greatly helps with managing large-scale application deployments for redundancy and scalability.

Another component associated with pods is the ReplicaSet (RS). A ReplicaSet ensures that a specified number of pod replicas is running at any given time, so it is responsible for maintaining a stable set of healthy pods.

Deployment

A deployment declares how many pod replicas should be running, and creates and updates the pods (through a ReplicaSet) to match. This is where we provide our application’s Docker image and define the number of replicas; we can then expose the desired port over the network. Just to give you an idea of what we have reviewed so far and what we need to do next, these commands create a deployment of Nginx with 3 replicas and expose it on port 80:

$ kubectl create deployment web-server --image=nginx --replicas=3
$ kubectl expose deployment web-server --port=80 --type=NodePort

Service

And finally, services! A service exposes a set of deployed pods over the network behind a single, stable IP address and name. In other words, services set up networking in a Kubernetes cluster. Again, just to give you an idea of what a service is, this is the output of the kubectl get service command, which we’ll cover next:

$ kubectl get service
NAME            TYPE       CLUSTER-IP      PORT(S)          AGE
auth-sevice     NodePort   10.98.49.53     7000:31310/TCP   38s
order-service   NodePort   10.98.198.250   8000:30441/TCP   59s

Imperative vs Declarative

There are basically 2 approaches to working with Kubernetes and managing applications: imperative, via kubectl commands, and declarative, via resources defined in manifest files (JSON or YAML). With the imperative approach, we change the state of resources step by step using commands. With the declarative approach, we describe the state we want and Kubernetes works out how to reach it.
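As a rough sketch of the declarative style, the Nginx deployment created earlier with kubectl create deployment could be described in a manifest like the one below. The file name web-server.yaml is an assumption made for this example, and the app: web-server label mirrors what kubectl create deployment sets by default.

```shell
# Write a Deployment manifest to a file (web-server.yaml is a hypothetical name).
cat > web-server.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
  labels:
    app: web-server
spec:
  replicas: 3                # desired number of pods
  selector:
    matchLabels:
      app: web-server        # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web-server
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
EOF

# With a running cluster, the manifest would then be applied with:
#   kubectl apply -f web-server.yaml
```

Changing replicas in the file and applying it again is all it takes to scale; Kubernetes compares the file with the current state and makes up the difference.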

In the next section, we’ll start with the imperative approach to better understand kubectl commands, and then we’ll switch to declarative configuration using YAML files.

Getting started

In this section, we’ll create our first Kubernetes cluster, deploy sample applications and manage scaling of the apps. We’ll mainly focus on Minikube but you can also use Google Kubernetes Engine (GKE). You should note that there are some differences between these platforms in terms of deployment and service calls which we’ll cover shortly.

To work with the Kubernetes API, we need kubectl, a CLI tool for interacting with clusters and the resources in them. You can follow the installation instructions on kubernetes.io to install kubectl. On macOS you can use Homebrew:

$ brew install kubectl

Minikube also needs Docker installed and running, for which you can install Docker Desktop. Alternatively, you can use Colima, a container runtime for macOS and Linux, and create a customized VM with it:

$ colima start --cpu 4 --memory 8

Please note that if you use Colima, you need a VM with more CPU and memory than the default settings (2 CPUs, 2GiB memory) to run the apps without any problems.

Google Kubernetes Engine (GKE)

GKE is a fully managed Kubernetes service provided by Google Cloud, and it offers a free tier that allows users to get started with the platform at no cost. So if you prefer to work with GKE instead of Minikube, go ahead and sign up on Google Cloud and install the Google Cloud CLI (gcloud). You then need to create your free cluster (an Autopilot cluster) with default settings, and finally set up your local environment:

# CLUSTER_NAME: name of your new cluster, e.g. autopilot-cluster-1
# REGION: region in which you create the cluster, e.g. us-central1
# PROJECT_ID: your project ID, e.g. abc-def-1234 (top-left in the console)

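$ gcloud auth login

# Install the auth plugin that kubectl needs to talk to GKE
$ gcloud components install gke-gcloud-auth-plugin

# Fetch the cluster credentials into your kubeconfig
$ gcloud container clusters get-credentials CLUSTER_NAME \
  --region REGION --project PROJECT_ID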

Minikube

Minikube is a great tool which helps us set up local Kubernetes clusters on macOS, Linux and Windows. Follow its guidelines to install Minikube on your local computer. On macOS you can run these commands:

# amd64
$ curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-darwin-amd64
$ sudo install minikube-darwin-amd64 /usr/local/bin/minikube

# OR arm64
$ curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-darwin-arm64
$ sudo install minikube-darwin-arm64 /usr/local/bin/minikube

Minikube ships with a nice dashboard which shows deployments. Now we can run our first Kubernetes cluster by running:

$ minikube start

# This will open your browser automatically
$ minikube dashboard

The cluster is now ready but empty! Next we’ll deploy our apps.

[Figure: Minikube dashboard]

Applications

I have created a simple backend in Ruby with Sinatra which acts as an authentication service and a simple Golang client which relies on the authentication API. The first service returns a sample OAuth2 Token response:

{
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJwYW1pdCIsIm5hbWUiOiJQYXlhbSBNIiwiaWF0IjoxNjg5NTgyMjc1fQ.8QlhZQ8pYnY6eSiJ4OZfC7OOhIPJfMUjyUOtqoB6KXE",
"refresh_token": "1337824e-c90b-41d1-a03c-ec6979eb387e",
"token_type": "Bearer",
"expires_in": 3600
}

And the client app calls this API, parses the access token and returns a JSON response:

{
"message": "Order placed for user Payam M (username: pamit) | on 2023-07-17 08:24:35 +0000 UTC"
}

You can download the repository on GitHub. I have created Docker images for these apps and pushed them to my Docker Hub account as pamitedu/ruby-authentication-service and pamitedu/golang-order-service.

You can of course build new images and push them to your preferred image repository (e.g. AWS ECR).

Now that we have our images ready, it’s time to create application deployments:

$ kubectl create deployment ruby-authentication-service --image=pamitedu/ruby-authentication-service:latest
$ kubectl expose deployment ruby-authentication-service --port=4567 --type=NodePort

$ kubectl create deployment golang-order-service --image=pamitedu/golang-order-service:latest
$ kubectl expose deployment golang-order-service --port=8080 --type=NodePort

Running the previous kubectl commands creates the required pods, each deployment with its own ReplicaSet (1 replica by default), and creates services that expose the pods on the desired ports:

$ kubectl get deployments
NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
golang-order-service          0/1     1            0           27s
ruby-authentication-service   0/1     1            0           30s
---

$ kubectl get pods
NAME                                          READY   STATUS              RESTARTS   AGE
golang-order-service-f8647d9c9-bq7g2          0/1     ContainerCreating   0          44s
ruby-authentication-service-6c4d6c6ff-vwss6   1/1     Running             0          47s
---

$ kubectl get replicaset
NAME                                    DESIRED   CURRENT   READY   AGE
golang-order-service-f8647d9c9          1         1         0       48s
ruby-authentication-service-6c4d6c6ff   1         1         1       51s
---

$ kubectl get service
NAME                          TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
golang-order-service          NodePort   10.104.32.146   <none>        8080:30259/TCP   42s
ruby-authentication-service   NodePort   10.111.19.88    <none>        4567:31190/TCP   47s

NOTE

To create a service (via expose deployment) on a Minikube cluster we chose --type=NodePort as it’s easier to set up, but on GKE you can pass --type=LoadBalancer, which provisions an external load balancer with its own IP address. You can read more about these types in the Kubernetes Service documentation.
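For comparison, a Service of type LoadBalancer could be declared as in the sketch below. It assumes the pods carry the label app: ruby-authentication-service (which kubectl create deployment sets by default); the file name auth-service-lb.yaml is hypothetical.

```shell
# Write a LoadBalancer Service manifest (auth-service-lb.yaml is a hypothetical name).
cat > auth-service-lb.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: ruby-authentication-service
spec:
  type: LoadBalancer         # on Minikube, NodePort is the simpler choice
  selector:
    app: ruby-authentication-service
  ports:
  - port: 4567               # port exposed by the Service
    targetPort: 4567         # port the container listens on
EOF

# With a running cluster:
#   kubectl apply -f auth-service-lb.yaml
```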

After a few seconds, if you check the status of your pods, you’ll see they are up and running:

$ kubectl get pods
NAME                                          READY   STATUS    RESTARTS   AGE
golang-order-service-f8647d9c9-bq7g2          1/1     Running   0          9m2s
ruby-authentication-service-6c4d6c6ff-vwss6   1/1     Running   0          9m5s


# Check out the pod logs
$ kubectl logs -f golang-order-service-f8647d9c9-bq7g2
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
...
...
[GIN-debug] Listening and serving HTTP on :8080

And you can also check out the Minikube dashboard to see the progress:

[Figure: Minikube dashboard]

Now we need Minikube to expose an IP:PORT for us to call the services:

# We need to run these commands in separate terminal tabs

$ minikube service ruby-authentication-service --url
http://127.0.0.1:51137
❗ Because you are using a Docker driver on darwin, the terminal needs to be open to run it.


$ minikube service golang-order-service --url
http://127.0.0.1:51145
❗ Because you are using a Docker driver on darwin, the terminal needs to be open to run it.

Note

If you have deployed the apps on GKE, you can find the external IP address of your service in the GKE Console.

Now, if you call the authentication service you’ll see the response:

$ curl -XGET http://localhost:51137/signin
{"access_token":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJwYW1pdCIsIm5hbWUiOiJQYXlhbSBNIiwiaWF0IjoxNjg5NTgyMjc1fQ.8QlhZQ8pYnY6eSiJ4OZfC7OOhIPJfMUjyUOtqoB6KXE","refresh_token":"1337824e-c90b-41d1-a03c-ec6979eb387e","token_type":"Bearer","expires_in":3600}

However, calling the order service results in an error:

$ curl -XGET http://localhost:51145/order
{"message":"Cannot contact Authentication service"}

Here’s the interesting part! Kubernetes has its own service discovery which connects pods and services. But the order service doesn’t know the authentication service endpoint yet, so we need to provide it through the environment variable AUTH_SERVICE_URL, which the Golang code reads:

$ kubectl create configmap golang-order-service-configmap \
--from-literal=AUTH_SERVICE_URL=http://ruby-authentication-service:4567

We have created a ConfigMap, which is a key-value data store for non-confidential values. Kubernetes also provides another object type, Secret, for confidential values. Next, after we create a deployment configuration file, we’ll refer to our ConfigMap.
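The same ConfigMap can also be written as a manifest, which fits the declarative style we switch to later. This is a minimal sketch of the equivalent object; the file name order-configmap.yaml is an assumption for this example.

```shell
# Declarative equivalent of the kubectl create configmap command
# (order-configmap.yaml is a hypothetical file name).
cat > order-configmap.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: golang-order-service-configmap
data:
  AUTH_SERVICE_URL: "http://ruby-authentication-service:4567"
EOF

# With a running cluster:
#   kubectl apply -f order-configmap.yaml
```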

Note how we’ve referred to the authentication service by its service name: ruby-authentication-service and the default Sinatra port (4567).
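The short service name works because of the cluster’s DNS: a service is normally reachable inside the cluster as <service>.<namespace>.svc.cluster.local, and pods in the same namespace can use the short name on its own. The helper below is a small illustrative sketch that builds the fully qualified URL; service_url is a hypothetical function, and it assumes the default namespace and the default cluster.local domain.

```shell
# service_url NAME PORT [NAMESPACE]
# Prints the fully qualified in-cluster URL of a Service.
# Assumes the default cluster domain "cluster.local"; the namespace
# falls back to "default" when the third argument is omitted.
service_url() {
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$1" "${3:-default}" "$2"
}

service_url ruby-authentication-service 4567
# prints http://ruby-authentication-service.default.svc.cluster.local:4567
```

The longer form is mainly useful when the calling pod lives in a different namespace than the service.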

Creating the ConfigMap with that environment variable is not enough; we also need to tell the order service’s deployment to load it. If you check out the pod logs, you’ll see that the order service still uses http://localhost:4567 for AUTH_SERVICE_URL:

$ kubectl get pods
NAME                                          READY   STATUS    RESTARTS   AGE
golang-order-service-87f795576-5zsk8          1/1     Running   0          7s
ruby-authentication-service-6c4d6c6ff-vwss6   1/1     Running   0          95m


$ kubectl logs -f golang-order-service-87f795576-5zsk8
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
...
[GIN-debug] Listening and serving HTTP on :8080
[OrderService] Cannot contact Authentication service: http://localhost:4567
[GIN] 2023/07/17 - 14:15:01 | 200 | 3.968708ms | 10.244.0.1 | GET "/order"

So we now need to introduce another feature of Kubernetes to fix this issue: Declarative Configuration and ConfigMap.

Declarative Configuration

With declarative configuration, we basically define our desired application state and resources using objects (e.g. pod, service, deployment) and object specifications. For instance, we can define a pod and a corresponding service for the Ruby authentication service like this, store the config in a pod.yaml file, and then create the objects using kubectl apply -f pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: ruby-authentication-pod
  labels:
    app: ruby-authentication
spec:
  containers:
  - name: ruby-authentication-service
    image: pamitedu/ruby-authentication-service:latest
    ports:
    - containerPort: 4567
---
apiVersion: v1
kind: Service
metadata:
  name: ruby-authentication-svc
spec:
  type: NodePort
  ports:
  - port: 4567
    targetPort: 4567
    nodePort: 31515
  selector:
    app: ruby-authentication

You can have a deeper look at declarative management of Kubernetes objects using configuration files. This is the recommended approach to work with Kubernetes.

We can also extract the existing deployments and services that we have created as configuration objects in files:

$ kubectl get deployment golang-order-service -o yaml > deployment.yaml
$ kubectl get service golang-order-service -o yaml > service.yaml

Then you can append the content of service.yaml to deployment.yaml, separated by ---, to create one consolidated deployment configuration. Finally, we need an envFrom block under the container to refer to the ConfigMap that we created before:

spec:
  containers:
  - image: pamitedu/golang-order-service:latest
    ...
    envFrom:
    - configMapRef:
        name: golang-order-service-configmap
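To show where that fragment sits, here is a minimal sketch of what a consolidated manifest could look like. It is not the deployment.yaml from the repository: the file name order-service.yaml, the container name and the labels are assumptions based on what kubectl create deployment and kubectl expose generate by default.

```shell
# A consolidated Deployment + Service sketch (order-service.yaml is a hypothetical name).
cat > order-service.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: golang-order-service
  labels:
    app: golang-order-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: golang-order-service
  template:
    metadata:
      labels:
        app: golang-order-service
    spec:
      containers:
      - name: golang-order-service
        image: pamitedu/golang-order-service:latest
        ports:
        - containerPort: 8080
        envFrom:
        - configMapRef:
            name: golang-order-service-configmap   # every key becomes an env var
---
apiVersion: v1
kind: Service
metadata:
  name: golang-order-service
spec:
  type: NodePort
  selector:
    app: golang-order-service
  ports:
  - port: 8080
    targetPort: 8080
EOF
```

With envFrom, every key in the ConfigMap is injected as an environment variable, so AUTH_SERVICE_URL becomes visible to the Golang process without any code change.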

After removing unnecessary fields (e.g. creationTimestamp, resourceVersion, uid, status) from the configuration, we can now redeploy (you can check out the final deployment.yaml in the project directory):

$ cd golang-order-service
$ kubectl apply -f deployment.yaml

$ minikube service golang-order-service --url
http://127.0.0.1:52824
❗ Because you are using a Docker driver on darwin, the terminal needs to be open to run it.

And now if we call the new URL provided by Minikube, we’ll see the API response:

$ curl -XGET http://localhost:52824/order
{"message":"Order placed for user Payam M (username: pamit) | on 2023-07-17 08:24:35 +0000 UTC"}

ReplicaSets

In order to change the number of replicas (active pods), we can run the following command and specify the desired number. After some time (depending on the application size and requirements), new pods will become available.

# Adjusting the number of replicas
$ kubectl scale deployment ruby-authentication-service --replicas=3
deployment.apps/ruby-authentication-service scaled


$ kubectl get replicaset
NAME                                    DESIRED   CURRENT   READY   AGE
golang-order-service-87f795576          1         1         1       12m
ruby-authentication-service-6c4d6c6ff   3         3         1       107m


$ kubectl get pod
NAME                                          READY   STATUS    RESTARTS   AGE
golang-order-service-87f795576-5zsk8          1/1     Running   0          13m
ruby-authentication-service-6c4d6c6ff-nbz6s   1/1     Running   0          115s
ruby-authentication-service-6c4d6c6ff-vwss6   1/1     Running   0          109m
ruby-authentication-service-6c4d6c6ff-x2thz   1/1     Running   0          115s

Horizontal Auto Scaling

Horizontal auto scaling (scale out) of a deployment is very easy. All we need to do is specify the minimum and maximum number of pods and the target for the resource metric (CPU utilization) that the cluster monitors to automate the scale-out process:

$ kubectl autoscale deployment ruby-authentication-service \
--min=1 --max=3 --cpu-percent=70
horizontalpodautoscaler.autoscaling/ruby-authentication-service autoscaled


# hpa = horizontal pod autoscaler
$ kubectl get hpa
NAME                          REFERENCE                                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
ruby-authentication-service   Deployment/ruby-authentication-service   <unknown>/70%   1         3         3          3m33s

So if the average CPU utilization across the pods is above 70%, Kubernetes will spin up more pods until it reaches the configured maximum (i.e. 3).

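The Kubernetes horizontal autoscaler needs to be able to fetch metrics data from the aggregated APIs (e.g. metrics.k8s.io). The <unknown> value in the TARGETS column above means it cannot read the CPU metric yet. You can read more about auto scaling in the Kubernetes documentation.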
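The same autoscaling rule can be written declaratively. The sketch below uses the autoscaling/v2 API; the file name auth-hpa.yaml is an assumption for this example. Two things are worth knowing when trying it: CPU utilization is measured relative to the CPU each container requests, so the deployment’s containers need a resources.requests.cpu value, and on Minikube the metrics API can be enabled with minikube addons enable metrics-server.

```shell
# Declarative HorizontalPodAutoscaler (auth-hpa.yaml is a hypothetical file name).
cat > auth-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ruby-authentication-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ruby-authentication-service
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of the CPU each pod requests
EOF

# With a running cluster:
#   kubectl apply -f auth-hpa.yaml
```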

Clean Up

If you need to delete deployments and services, you can use these commands:

$ kubectl delete all -l app=ruby-authentication-service
$ kubectl delete all -l app=golang-order-service

# OR separately
$ kubectl delete deployment golang-order-service
$ kubectl delete service golang-order-service

You can stop or delete your Minikube cluster by running:

$ minikube stop
$ minikube delete

# OR
$ minikube delete --all

What’s next?

There are many things we can check out and try with Kubernetes, such as container probes, auto scaling with simulated workload, logs, etc.

Container Probes

Kubernetes uses probes as container health indicators. The 2 most common ones are Liveness and Readiness probes (there is also a Startup probe), which are used, respectively, to know when to restart a container and when a container is ready to start accepting traffic.

We can build an API endpoint that returns OK/200 and point the probes at it. One interesting setting of these probes is initialDelaySeconds, which specifies a waiting time before performing the first probe, to let application containers boot up properly.
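As an illustration, the probes could be added to a container spec like this. It is only a sketch: the /healthz path is a hypothetical endpoint that the sample app does not necessarily provide, the timing values are arbitrary, and the file name is made up for the example.

```shell
# Container spec fragment with liveness and readiness probes
# (probes-fragment.yaml is a hypothetical file name).
cat > probes-fragment.yaml <<'EOF'
# Goes under spec.template.spec.containers in a Deployment
- name: ruby-authentication-service
  image: pamitedu/ruby-authentication-service:latest
  ports:
  - containerPort: 4567
  livenessProbe:             # failing this restarts the container
    httpGet:
      path: /healthz         # hypothetical endpoint returning 200
      port: 4567
    initialDelaySeconds: 10  # wait before the first probe
    periodSeconds: 15
  readinessProbe:            # failing this stops traffic to the pod
    httpGet:
      path: /healthz
      port: 4567
    initialDelaySeconds: 5
    periodSeconds: 10
EOF
```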

Auto scaling with simulated workload

We learned about horizontal auto scaling. Now we can test whether it works by stress-testing the APIs with simulated heavy traffic:

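# Remove the autoscaler created earlier, then recreate it with a low CPU target
$ kubectl delete hpa ruby-authentication-service
$ kubectl autoscale deployment ruby-authentication-service \
--min=1 --max=3 --cpu-percent=5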

# This is just a sample approach
$ watch -n 0.1 curl -XGET http://localhost:52824/order

Recap

Kubernetes has become the de facto standard for deploying and managing containerized applications and services. All we need is to create Docker images of our applications, create Kubernetes clusters, and deploy and scale our applications. Almost all big Cloud providers such as AWS, GCP and Azure support Kubernetes and provide managed services for it (e.g. EKS, GKE, AKS).

In this post, we set up a local cluster using Minikube (and an Autopilot cluster on GKE), deployed sample applications which can communicate on a particular port, and scaled our applications infrastructure. I hope you’ve enjoyed it.

You can download the source code for the applications from my GitHub: https://github.com/pamit/sample-services-on-kubernetes

Happy coding!
