Lesson 9: Scaling Applications with Kubernetes

In this lesson, we will learn how to scale applications both horizontally and vertically in a Kubernetes cluster. Scaling is crucial for handling varying loads and ensuring high availability of your applications.

Horizontal Scaling

Horizontal scaling, also known as scaling out, means running more replicas of a pod to handle increased load. Kubernetes supports this through Deployments and the Horizontal Pod Autoscaler (HPA).

Deployments

A Deployment in Kubernetes manages a set of replicas of your application. You can specify the number of replicas you want, and Kubernetes will ensure that this number is maintained.
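
The way Kubernetes maintains the replica count can be pictured as a loop that compares the desired state with the observed state and acts on the difference. The following is a toy Python model of that idea, not Kubernetes code: the function name reconcile and the pod names are hypothetical, and a real controller talks to the Kubernetes API rather than to a list.

```python
def reconcile(desired_replicas, running_pods):
    """Return the action a controller would take to reach the desired count.

    Toy model only: running_pods is a plain list of (hypothetical) pod names.
    """
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: create the missing ones.
        return ("create", diff)
    if diff < 0:
        # Too many pods: remove the surplus from the end of the list.
        return ("delete", running_pods[diff:])
    # Observed state already matches desired state.
    return ("noop", None)


# One pod of three has crashed, so one replacement is needed.
print(reconcile(3, ["my-app-a", "my-app-b"]))  # ('create', 1)
```

Running the same function again after the replacement pod starts returns the no-op result, which is why the count stays stable once the desired state is reached.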

Example: Scaling a Deployment

To scale a Deployment, set the replicas field in its manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3  # Change this number to scale
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app-container
        image: my-app-image:latest
        ports:
        - containerPort: 80

To apply this configuration, save it to a file named deployment.yaml and run:

kubectl apply -f deployment.yaml

Horizontal Pod Autoscaler (HPA)

The HPA automatically adjusts the number of pod replicas in a workload, such as a Deployment, based on observed CPU utilization or other selected metrics.

Example: Creating an HPA

Here’s how to create an HPA for our deployment:

kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10

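In this example, the HPA scales the my-app Deployment to keep average CPU utilization near 50%, with a minimum of 1 pod and a maximum of 10 pods. Utilization is measured as a percentage of each container’s CPU request, so the Deployment’s containers must declare CPU requests, and a metrics source such as Metrics Server must be running in the cluster.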
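
To make the 50% target concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes HPA documentation: the desired count is the current count multiplied by the ratio of observed to target utilization, rounded up. The function name and parameters are illustrative, and the sketch assumes the documented default tolerance of 0.1. It leaves out stabilization windows, pods that are not ready, and missing metrics, all of which the real controller handles.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA calculation: ceil(current * observed / target)."""
    ratio = current_cpu_percent / target_cpu_percent
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)
    # Never go below the minimum or above the maximum.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(3, 100, 50))  # 6: load is double the target
print(desired_replicas(8, 100, 50))  # 10: capped at the maximum
print(desired_replicas(3, 52, 50))   # 3: within tolerance, no change
```

The defaults for min_replicas and max_replicas mirror the --min=1 and --max=10 flags in the command above.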

Vertical Scaling

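Vertical scaling, or scaling up, involves increasing the resources (CPU and memory) allocated to each pod. This is done by modifying the resource requests and limits in the pod specification. Requests are what the scheduler reserves for a container when placing the pod; limits cap what the container may use. Changing these values in a Deployment’s pod template triggers a rolling update that replaces the existing pods with new ones.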

Example: Updating Resource Requests

Here’s how to set the resource requests and limits for a Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app-container
        image: my-app-image:latest
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"

To apply these changes, save the manifest to deployment.yaml and run:

kubectl apply -f deployment.yaml
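
Because requests apply per pod, the total capacity a Deployment reserves grows with the replica count. The Python sketch below works through that arithmetic for the values in the manifest above. The helper names are hypothetical, and the parsers only handle the m suffix for CPU and the Ki, Mi, and Gi suffixes for memory, which is a small subset of the quantity formats Kubernetes accepts.

```python
# Binary memory suffixes handled by this sketch (a subset of what Kubernetes accepts).
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu(quantity):
    """Convert a CPU quantity such as "250m" or "1" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # millicores to cores
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as "512Mi" or "1Gi" to bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)  # plain number of bytes


def total_footprint(replicas, cpu, memory):
    """Return (total cores, total MiB) reserved across all replicas."""
    return replicas * parse_cpu(cpu), replicas * parse_memory(memory) / 2**20


print(total_footprint(3, "250m", "512Mi"))  # requests: (0.75, 1536.0)
print(total_footprint(3, "500m", "1Gi"))    # limits:   (1.5, 3072.0)
```

With 3 replicas, the Deployment reserves 0.75 cores and 1536 MiB in total, so the cluster needs at least that much unreserved capacity for all pods to be scheduled.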

Best Practices and Common Mistakes

Best Practice: Always monitor your application’s performance metrics to make informed decisions on scaling.

Common Mistake: Scaling without understanding the application’s resource needs can lead to inefficient resource usage and increased costs.

Summary

Horizontal scaling adds more pod instances to handle load, managed by Deployments and the HPA.
Vertical scaling increases the CPU and memory allocated to each pod through resource requests and limits.
Use HPA for automatic scaling based on metrics like CPU usage.
Always monitor performance and adjust scaling strategies accordingly.

Happy Scaling!

Exercises

Deploy and Scale: Create a Deployment with 2 replicas of an application. Then use the HPA to scale it based on CPU usage. Test by simulating load on the application.
Vertical Scaling: Update the resource requests and limits for an existing deployment. Monitor the application to see how it performs with the new resource allocations.
Experiment with HPA: Modify the HPA settings to see how they affect scaling behavior. Try changing the target CPU percentage and observe how the number of pod replicas changes.