In this lesson, we will learn how to scale applications both horizontally and vertically in a Kubernetes cluster. Scaling is crucial for handling varying loads and ensuring high availability of your applications.
Horizontal scaling, also known as scaling out, involves adding more instances of a pod to handle increased load. Kubernetes makes this easy through the use of Deployments and the Horizontal Pod Autoscaler (HPA).
A Deployment in Kubernetes manages a set of replicas of your application. You can specify the number of replicas you want, and Kubernetes will ensure that this number is maintained.
Here’s how to scale a Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3 # Change this number to scale
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app-container
          image: my-app-image:latest
          ports:
            - containerPort: 80
To apply this configuration, save it to a file named deployment.yaml and run:
kubectl apply -f deployment.yaml
The HPA automatically adjusts the number of pod replicas in a workload such as a Deployment, based on observed CPU utilization or other selected metrics.
Here’s how to create an HPA for our deployment:
kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10
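In this example, the HPA scales the my-app Deployment to keep average CPU utilization across its pods near 50%, with a minimum of 1 pod and a maximum of 10 pods. Utilization is measured as a percentage of each container's CPU request, so the pod template must set a CPU request (the first manifest above does not; the manifest in the vertical scaling part of this lesson does), and the cluster needs a metrics source such as metrics-server.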
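The same autoscaler can also be written declaratively, which keeps it in version control next to the Deployment. The manifest below is a minimal sketch that assumes a cluster serving the autoscaling/v2 API; the HPA name my-app is chosen here for illustration and is not taken from the lesson.

```yaml
# Declarative sketch of the HPA described above.
# Assumption: the cluster serves the autoscaling/v2 API.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app            # illustrative name for the HPA object
spec:
  scaleTargetRef:         # the workload whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # percent of the containers' CPU request
```

Saved to a file and applied with kubectl apply -f, this should behave like the kubectl autoscale command, and kubectl get hpa shows the current and target utilization once metrics are available.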
Vertical scaling, or scaling up, involves increasing the resources (CPU and memory) allocated to a pod. This is done by modifying the resource requests and limits in the pod specification.
Here’s how to set the resource requests and limits for a Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app-container
          image: my-app-image:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
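Changing the pod template triggers a rolling update, so Kubernetes replaces the existing pods with new ones that use the updated resources. To apply these changes, save the manifest to deployment.yaml and run: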
kubectl apply -f deployment.yaml
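When only the resource settings change, a strategic merge patch can express just that part instead of the full manifest. The fragment below is a sketch under a few assumptions: the container name my-app-container comes from the manifest above, while the file name resources-patch.yaml and the new values are purely illustrative.

```yaml
# resources-patch.yaml (hypothetical file name)
# Strategic merge patch: containers are matched by name, so only the
# resources of my-app-container change; image and ports stay as they are.
spec:
  template:
    spec:
      containers:
        - name: my-app-container
          resources:
            requests:
              memory: "1Gi"     # illustrative values, not recommendations
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1"
```

One way to apply it is kubectl patch deployment my-app --patch-file resources-patch.yaml, followed by kubectl rollout status deployment/my-app to watch the pods being replaced.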
Best Practice: Always monitor your application’s performance metrics to make informed decisions on scaling.
Common Mistake: Scaling without understanding the application’s resource needs can lead to inefficient resource usage and increased costs.
Happy Scaling!
Deploy and Scale: Create a Deployment with 2 replicas of an application. Then use the HPA to scale it based on CPU usage. Test by simulating load on the application.
Vertical Scaling: Update the resource requests and limits for an existing deployment. Monitor the application to see how it performs with the new resource allocations.
Experiment with HPA: Modify the HPA settings to see how it affects scaling behavior. Try changing the CPU percentage and observe the changes in pod replicas.
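For the HPA experiment, one possible starting point is a manifest that changes the CPU target and shortens the scale-down delay. This is a sketch that assumes the autoscaling/v2 API. The file name, the 70% target, the minimum of 2 replicas, and the 60-second window are arbitrary values picked for illustration.

```yaml
# hpa-experiment.yaml (hypothetical file name)
# Reuses the name my-app so that applying it updates an HPA of that name.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # illustrative; the lesson's example uses 50
  behavior:
    scaleDown:
      # How long the HPA waits on lower recommendations before removing pods.
      # Illustrative value; the default is 300 seconds.
      stabilizationWindowSeconds: 60
```

After applying it with kubectl apply -f hpa-experiment.yaml, running kubectl get hpa my-app --watch while the load changes shows how the replica count responds to each setting.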