As businesses
increasingly adopt microservices architecture, managing containerized workloads
becomes critical. Amazon Web Services (AWS) Elastic Kubernetes Service (EKS)
is a popular choice for orchestrating these workloads. However, to ensure
optimal performance and cost-efficiency, scaling based on real-time metrics
is essential. In this blog post, we'll dive into what it means to scale
containerized workloads on AWS EKS based on real-time metrics and provide a
comprehensive guide on achieving this.
Defining Key Terminologies
What is Meant by Containerized Workloads?
Containerized
workloads refer to applications and services that are encapsulated in
containers. Containers are lightweight, standalone, and executable software
packages that include everything needed to run a piece of software, including
the code, runtime, libraries, and system tools. Docker is a widely-used
platform for containerization.
What is AWS EKS?
Amazon Elastic
Kubernetes Service (EKS) is a managed service that simplifies running
Kubernetes on AWS without needing to install and operate Kubernetes control
plane or nodes. Kubernetes, often abbreviated as K8s, is an open-source system
for automating deployment, scaling, and management of containerized
applications.
What are Real-Time Metrics?
Real-time
metrics are continuous streams of data that provide instantaneous insight
into the performance and health of systems. In the context of EKS, these
metrics could include CPU utilization, memory usage, request rates, and response
times, among others.
What is Meant by Scaling in Cloud Computing?
Scaling
refers to adjusting the number of running instances of an application based on
the workload. In Kubernetes, this can be achieved both vertically (adjusting
resource limits for a pod) and horizontally (adding or removing pod instances).
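As a concrete illustration, both forms can be exercised directly with kubectl; the deployment name my-app below is purely hypothetical:
# Horizontal scaling: change the number of pod replicas
kubectl scale deployment my-app --replicas=5
# Vertical scaling: change the CPU/memory assigned to each pod
kubectl set resources deployment my-app --requests=cpu=250m,memory=256Mi --limits=cpu=500m,memory=512Mi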
Why Scaling Containerized Workloads Matters
Scaling
containerized workloads is crucial for several reasons, primarily revolving
around performance, cost-efficiency, and reliability. Let’s explore these
aspects in detail:
Performance Optimization
- Responsive Applications: By scaling
containerized workloads, applications can handle varying loads without
performance degradation. This ensures that user experience remains
consistent, even during peak traffic periods.
- Resource Utilization: Scaling allows for the
optimal use of resources, ensuring that applications have enough CPU and
memory to perform efficiently. This is particularly important for
resource-intensive applications that need to scale up during high demand
and scale down during low demand to free up resources.
Cost Efficiency
- Pay-as-You-Go: Cloud platforms like AWS EKS
offer a pay-as-you-go model. By scaling workloads dynamically,
organizations only pay for the resources they actually use, avoiding the
costs associated with over-provisioning.
- Auto Scaling: Features like the Horizontal
Pod Autoscaler (HPA) and Cluster Autoscaler on AWS EKS automate the
scaling process based on real-time metrics, further optimizing costs by
adjusting resource allocation precisely when needed.
Reliability and Availability
- Fault Tolerance: Scaling helps in
maintaining high availability and fault tolerance. If one pod fails,
others can be scaled up to take over the load, ensuring continuous
service availability.
- Load Balancing: Properly scaled workloads
distribute the load evenly across multiple containers and nodes, reducing
the risk of any single point of failure.
Adaptability and Flexibility
- Dynamic Environments: Modern applications
often experience unpredictable workloads. Scaling allows these
applications to adapt dynamically to changing demands, ensuring they
remain robust and responsive.
- DevOps and CI/CD: In continuous integration
and continuous deployment (CI/CD) environments, scaling supports rapid
development and deployment cycles by ensuring that testing and staging
environments can scale up or down based on the needs of the development
pipeline.
Security and Compliance
- Isolated Environments: Scaling containerized
workloads in isolated environments can help in meeting compliance
requirements by ensuring that workloads are segregated based on security
needs.
- Resource Quotas: By scaling, organizations
can enforce resource quotas and limits, preventing any single workload
from monopolizing system resources and potentially leading to security
vulnerabilities.
Scaling
containerized workloads is essential for maintaining optimal performance,
ensuring cost efficiency, enhancing reliability, and providing the adaptability
required in modern dynamic environments. Effective scaling strategies on
platforms like AWS EKS enable businesses to leverage the full potential of
containerized applications, leading to better user experiences and operational
efficiencies.
Scaling Strategies in AWS EKS
Amazon Elastic
Kubernetes Service (EKS) provides several robust scaling strategies to
efficiently manage and optimize your containerized workloads. Understanding
these strategies is essential for maintaining application performance, cost
efficiency, and reliability.
1. Horizontal Pod Autoscaler (HPA)
The Horizontal
Pod Autoscaler (HPA) automatically scales the number of pods in a
Kubernetes cluster based on observed CPU utilization or other select metrics.
- Metrics-Based Scaling: HPA monitors resource
metrics, such as CPU and memory usage, to determine the need for scaling.
- Custom Metrics: It can also be configured to
use custom metrics from services like AWS CloudWatch.
Example Configuration:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
2. Cluster Autoscaler
The Cluster
Autoscaler automatically adjusts the size of the Kubernetes cluster by
adding or removing EC2 instances (nodes) based on the pending pods that cannot
be scheduled due to resource constraints.
- Node Group Scaling: It scales the node groups
up or down to meet the demands of the pods.
- Efficient Resource Use: Ensures optimal
resource utilization by only adding nodes when necessary and removing them
when they are no longer needed.
3. AWS Fargate
AWS Fargate is a
serverless compute engine for containers that works with EKS, allowing you to
run containers without managing the underlying EC2 instances.
- Automatic Scaling: Fargate scales the number
of tasks running your containers based on the workload automatically.
- Simplified Management: Eliminates the need for
managing infrastructure, making it a good choice for dynamic and
unpredictable workloads.
4. AWS Auto Scaling Groups
Auto Scaling
Groups (ASGs) in AWS allow you to automatically adjust the number of EC2
instances in your cluster.
- Scaling Policies: These can be based on
various metrics like CPU usage, network traffic, or even custom metrics.
- Scheduled Scaling: Allows predefined scaling
actions to meet anticipated demands (e.g., scale out at peak business
hours).
Example Policy:
{
"AutoScalingGroupName": "my-asg",
"PolicyName": "scale-out-policy",
"AdjustmentType": "ChangeInCapacity",
"ScalingAdjustment": 1,
"Cooldown": 300
}
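Assuming the JSON above is saved as scale-out-policy.json, it can be attached to the Auto Scaling group with the AWS CLI (a sketch; adjust the group and policy names to your environment):
# Attach the scaling policy to the Auto Scaling group
aws autoscaling put-scaling-policy --cli-input-json file://scale-out-policy.json
# Verify the policy was created
aws autoscaling describe-policies --auto-scaling-group-name my-asg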
5. Right-Sizing Workloads
Right-sizing
involves optimizing the resource requests and limits for your pods to ensure
they have enough resources to run efficiently without over-provisioning.
- Resource Requests and Limits: Define how much
CPU and memory each pod should request and limit to.
- Monitoring and Adjusting: Continuously monitor
and adjust these settings based on performance metrics.
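As a sketch of this monitor-and-adjust loop (assuming the nginx-deployment used later in this post and that Metrics Server is installed):
# Observe actual CPU/memory consumption per pod (requires Metrics Server)
kubectl top pods -l app=nginx
# Adjust requests and limits in place based on what you observe
kubectl set resources deployment nginx-deployment --requests=cpu=100m,memory=128Mi --limits=cpu=250m,memory=256Mi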
6. Custom Metrics and Alarms
Leverage AWS
CloudWatch to set up custom metrics and alarms to trigger scaling actions based
on specific application needs.
- Application-Specific Metrics: Such as request
count per second, latency, etc.
- Automated Actions: Automatically trigger
scaling actions when certain thresholds are met.
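As an illustration of the idea (a hedged sketch, not a drop-in configuration), an alarm on an application-level metric can be created from the CLI; the metric name, namespace, and SNS topic ARN below are placeholders:
# Alarm when average request latency exceeds 500 ms for two consecutive minutes
aws cloudwatch put-metric-alarm \
  --alarm-name high-request-latency \
  --namespace "MyApp" \
  --metric-name RequestLatencyMs \
  --statistic Average \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 500 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-west-2:123456789012:scaling-alerts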
Scaling
strategies in AWS EKS enable efficient management of containerized workloads by
ensuring optimal performance, cost efficiency, and reliability. By leveraging
tools like HPA, Cluster Autoscaler, Fargate, ASGs, and custom metrics,
organizations can dynamically adjust their resources to meet real-time demands,
ultimately enhancing application performance and user satisfaction.
Step-by-Step Guide to Scaling on AWS EKS
Scaling
containerized workloads on Amazon Elastic Kubernetes Service (EKS) ensures your
applications run efficiently, can handle increased loads, and remain
cost-effective. This guide will walk you through the essential steps for
implementing scaling strategies on AWS EKS, emphasizing automatic scaling
mechanisms like Horizontal Pod Autoscaler (HPA), Cluster Autoscaler, and AWS
Fargate.
Step 1: Preparing Your EKS Cluster
- Create an EKS Cluster:
eksctl create cluster --name my-cluster --region us-west-2 --nodegroup-name linux-nodes --node-type t3.medium --nodes 3 --nodes-min 1 --nodes-max 4 --managed
This command
creates an EKS cluster with a managed node group.
- Install Metrics Server:
Metrics Server is required for HPA to function. Install it with the following command:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
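A quick way to confirm Metrics Server is working before moving on (an optional sanity check):
# The metrics-server deployment should report READY 1/1
kubectl get deployment metrics-server -n kube-system
# If metrics are flowing, this prints CPU/memory usage per node
kubectl top nodes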
Step 2: Implementing Horizontal Pod Autoscaler
- Deploy an Application:
Deploy a sample application, for example, an NGINX deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "250m"
            memory: "256Mi"
Apply the deployment:
kubectl apply -f nginx-deployment.yaml
- Create HPA:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Apply the HPA:
kubectl apply -f nginx-hpa.yaml
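To see the HPA react, generate some CPU load and watch the replica count. The sketch below assumes you first expose the deployment as a Service; the load-generator pod name is arbitrary:
# Expose the deployment inside the cluster
kubectl expose deployment nginx-deployment --port=80
# Run a simple load generator that hammers the service
kubectl run load-generator --image=busybox --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://nginx-deployment; done"
# Watch the HPA scale the deployment up and back down
kubectl get hpa nginx-hpa --watch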
Step 3: Setting Up Cluster Autoscaler
- Install Cluster Autoscaler:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
- Configure Cluster Autoscaler:
Modify the deployment to include your cluster name and the AWS region:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
      - image: k8s.gcr.io/autoscaler/cluster-autoscaler:v1.21.2
        name: cluster-autoscaler
        resources:
          limits:
            cpu: 100m
            memory: 300Mi
          requests:
            cpu: 100m
            memory: 300Mi
        command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --nodes=1:10:my-cluster-ng-a2e2db7f.k8s.local
Apply the configuration:
kubectl apply -f cluster-autoscaler.yaml
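Once deployed, the autoscaler's logs are the quickest way to confirm it can see your node groups; note that if you rely on auto-discovery, the node group Auto Scaling groups also need the tags k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<cluster-name>:
# Check that the autoscaler is running and inspect its scaling decisions
kubectl -n kube-system logs deployment/cluster-autoscaler --tail=50
# Pending pods that cannot be scheduled are what trigger a scale-up
kubectl get pods --all-namespaces --field-selector=status.phase=Pending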
Step 4: Utilizing AWS Fargate
- Create a Fargate Profile:
eksctl create fargateprofile --cluster my-cluster --name my-fargate-profile --namespace fargate
This command
creates a Fargate profile that specifies which pods should run on Fargate.
- Deploy a Pod on Fargate:
apiVersion: v1
kind: Pod
metadata:
  name: fargate-pod
  namespace: fargate
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
Apply the pod configuration:
kubectl apply -f fargate-pod.yaml
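If the pod matches the Fargate profile's namespace, it is scheduled onto Fargate-managed capacity rather than one of your EC2 nodes; a quick verification step:
# The NODE column should show a fargate-ip-... node for this pod
kubectl get pod fargate-pod -n fargate -o wide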
Step 5: Autoscaling with Custom Metrics using Prometheus and CloudWatch
- Set Up Prometheus:
Deploy Prometheus in your EKS cluster to collect custom metrics:
kubectl apply -f https://github.com/prometheus-operator/prometheus-operator/raw/main/bundle.yaml
- Configure CloudWatch Container Insights:
Install and configure CloudWatch Container Insights to send custom metrics to CloudWatch:
kubectl apply -f https://amazon-eks.s3.us-west-2.amazonaws.com/docs/eks-logging-quickstart.yaml
- Create a Custom Metric:
Define and collect a custom metric using Prometheus. For example, create a custom metric for request count:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-servicemonitor
spec:
  selector:
    matchLabels:
      app: example
  endpoints:
  - port: web
    path: /metrics
Apply the
ServiceMonitor:
kubectl apply -f example-servicemonitor.yaml
- Set Up HPA with Custom Metrics:
Create an HPA that uses custom metrics from Prometheus:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: custom_metric
      target:
        type: AverageValue
        averageValue: 100
Apply the HPA:
kubectl apply -f custom-metric-hpa.yaml
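Note that the HPA can only consume Prometheus metrics if something exposes them through the Kubernetes custom metrics API; a common choice is the Prometheus Adapter. A hedged install sketch using the community Helm chart (the namespace and release name are arbitrary, and the adapter still needs rules mapping your Prometheus series to custom_metric):
# Install the Prometheus Adapter so the HPA can query custom metrics
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace monitoring --create-namespace
# Confirm the custom metrics API is registered
kubectl get apiservices | grep custom.metrics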
Step 6: Using KEDA for Event-Driven Scaling
- Deploy KEDA:
KEDA (Kubernetes Event-Driven Autoscaling) allows scaling based on event sources. Install KEDA in your EKS cluster:
kubectl apply -f https://github.com/kedacore/keda/releases/download/v2.4.0/keda-2.4.0.yaml
- Configure KEDA ScaledObject:
Create a ScaledObject that defines the scaling behavior based on event sources:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-scaledobject
spec:
  scaleTargetRef:
    name: nginx-deployment
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL:
      queueLength: "5"
Apply the
ScaledObject:
kubectl apply -f queue-scaledobject.yaml
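Behind the scenes KEDA manages an HPA for the ScaledObject, so you can verify the setup as follows (a quick sanity check; the SQS trigger above also needs a queueURL and IAM permissions to read the queue length):
# Check the ScaledObject status and the HPA that KEDA created for it
kubectl get scaledobject queue-scaledobject
kubectl get hpa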
Step 7: Implementing Karpenter for Efficient Node Management
- Install Karpenter:
Karpenter is a Kubernetes cluster autoscaler built to work with EKS for better node provisioning. Install it with Helm:
helm repo add karpenter https://charts.karpenter.sh
helm repo update
helm install karpenter karpenter/karpenter --namespace karpenter --create-namespace
- Configure Provisioner:
Define a provisioner for Karpenter that specifies how to scale nodes:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  cluster:
    name: my-cluster
  constraints:
    labels:
      purpose: spot
  requirements:
  - key: "karpenter.k8s.aws/capacity-type"
    operator: In
    values: ["spot"]
  - key: "topology.kubernetes.io/zone"
    operator: In
    values: ["us-west-2a", "us-west-2b"]
  limits:
    resources:
      cpu: "1000"
      memory: "4000Gi"
Apply the
provisioner:
kubectl apply -f karpenter-provisioner.yaml
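A simple way to exercise Karpenter is to create more pods than the existing nodes can hold and watch new capacity appear (a rough sketch reusing the earlier nginx deployment):
# Force unschedulable pods so Karpenter has something to provision for
kubectl scale deployment nginx-deployment --replicas=20
# New nodes launched by Karpenter should join within a couple of minutes
kubectl get nodes --watch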
Step 8: Testing and Monitoring
- Load Testing: Perform load testing on your application to ensure that your scaling configurations work as expected. Tools like Apache JMeter or k6 can be useful.
- Monitor Scaling Events: Use CloudWatch dashboards to monitor scaling events, resource usage, and custom metrics. Ensure that scaling is occurring as intended and make adjustments as needed.
Step 9: Leveraging Spot Instances for Cost-Effective Scaling
- Integrate Spot Instances:
Spot Instances allow you to utilize spare AWS compute capacity at a reduced cost. Configure your EKS cluster to use Spot Instances for non-critical workloads:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: spot-instance-provisioner
spec:
  requirements:
  - key: "kubernetes.io/instance-type"
    operator: In
    values: ["m5.large", "m5a.large"]
  - key: "karpenter.k8s.aws/capacity-type"
    operator: In
    values: ["spot"]
  limits:
    resources:
      cpu: "500"
      memory: "1000Gi"
Apply the
provisioner:
kubectl apply -f spot-instance-provisioner.yaml
- Monitor Spot Instance Usage:
Ensure you monitor the usage and performance of Spot Instances. Utilize AWS CloudWatch and other monitoring tools to keep track of any interruptions and cost savings.
Step 10: Implementing Multi-Region and Multi-AZ Scaling
- Set Up Multi-AZ Clusters:
Ensure your EKS cluster spans multiple Availability Zones (AZs) for high availability and fault tolerance. Configure your EKS cluster to deploy nodes across multiple AZs:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: us-west-2
availabilityZones: ["us-west-2a", "us-west-2b"]
nodeGroups:
- name: ng-1
  instanceType: m5.large
  desiredCapacity: 2
Apply the cluster
configuration using eksctl:
eksctl create cluster -f multi-az-cluster-config.yaml
- Multi-Region Deployment:
Set up a secondary EKS cluster in another AWS region to ensure disaster recovery and business continuity. Use AWS Global Accelerator or Route 53 for traffic routing.
Step 11: Advanced Monitoring and Logging
- Integrate Prometheus and Grafana:
Use Prometheus for monitoring and Grafana for visualization of metrics:
kubectl apply -f https://github.com/prometheus-operator/prometheus-operator/raw/main/bundle.yaml
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana
- Set Up CloudWatch Logs and Metrics:
Ensure that your EKS cluster logs and metrics are being sent to AWS CloudWatch for centralized logging and monitoring:
kubectl apply -f https://amazon-eks.s3.us-west-2.amazonaws.com/docs/eks-logging-quickstart.yaml
Step 12: Scaling Stateful Workloads
- Use StatefulSets for Stateful Applications:
Deploy applications that require stable storage using StatefulSets. Ensure that your Persistent Volume Claims (PVCs) are correctly configured:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        # The mysql image will not start without a root password; use a Secret in real deployments
        - name: MYSQL_ROOT_PASSWORD
          value: "example-password"
        ports:
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-persistent-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
Apply the
StatefulSet:
kubectl apply -f mysql-statefulset.yaml
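StatefulSets scale with the same kubectl verbs as Deployments, but each new replica gets its own PersistentVolumeClaim from the volumeClaimTemplates above, and those PVCs are not deleted automatically on scale-down:
# Scale the StatefulSet; pods are added and removed one at a time, in order
kubectl scale statefulset mysql --replicas=5
# Each replica owns a PVC (mysql-persistent-storage-mysql-0, -1, ...)
kubectl get pvc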
Best Practices and Considerations for Scaling on AWS EKS
Scaling on AWS
Elastic Kubernetes Service (EKS) requires a blend of strategic planning and
tactical execution to ensure high availability, cost efficiency, and
performance. Here, we'll cover the best practices and considerations essential
for optimizing your EKS scaling strategy.
1. Use Cluster Autoscaler
Cluster
Autoscaler automatically adjusts the size of your Kubernetes cluster so
that all pods have a place to run and resources are optimized.
- Installation: Deploy the Cluster Autoscaler in
your EKS cluster.
- Configuration: Ensure it is properly
configured to add or remove nodes based on pod demand.
2. Implement Horizontal Pod Autoscaler (HPA)
HPA
automatically scales the number of pod replicas based on observed CPU
utilization (or other application-provided metrics).
- Metrics Server: Deploy a metrics server to
collect and provide metrics to the HPA.
- Configuration: Define HPA policies in your
deployment files.
3. Leverage Managed Node Groups
Managed Node
Groups simplify node management, including updates and scaling.
- Setup: Use managed node groups for easier node
lifecycle management.
- Scaling: Set policies to automatically adjust
the number of nodes in a node group based on demand (see the eksctl example below).
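For instance, the bounds of a managed node group can be adjusted at any time with eksctl (a sketch using the cluster and node group names from earlier in this post):
# Resize the managed node group and its autoscaling bounds
eksctl scale nodegroup --cluster my-cluster --name linux-nodes --nodes 3 --nodes-min 1 --nodes-max 6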
4. Use Spot Instances
Spot Instances
allow you to use spare AWS compute capacity at a reduced cost.
- Spot Instance Configuration: Configure your
EKS cluster to use Spot Instances for non-critical workloads.
- Cost Savings: Monitor and analyze cost savings
while ensuring critical workloads are on On-Demand or Reserved Instances.
5. Implement Multi-AZ Deployment
Deploy your EKS
clusters across multiple Availability Zones (AZs) for high availability and
fault tolerance.
- Configuration: Ensure node groups are
distributed across multiple AZs.
- Resilience: This ensures that the failure of a
single AZ does not affect the entire cluster.
6. Monitor and Optimize Resource Requests and Limits
Properly
configure resource requests and limits for your pods to ensure optimal
utilization and avoid resource contention.
- Requests and Limits: Define resource requests
and limits for each container in your deployment manifests.
- Optimization: Regularly review and adjust
based on actual usage.
7. Use Infrastructure as Code (IaC)
Manage your
Kubernetes clusters and infrastructure using IaC tools such as Terraform or AWS
CloudFormation.
- Consistency: Ensure consistent and
reproducible infrastructure setup.
- Automation: Automate the deployment and
scaling of resources.
Considerations for Effective Scaling
1. Performance and Load Testing
Regularly perform
scalability testing to understand the limits and performance characteristics of
your EKS cluster.
- SLIs and SLOs: Define and measure Service
Level Indicators (SLIs) and Service Level Objectives (SLOs) to guide
scaling decisions.
2. Cost Management
Balance
performance and cost by optimizing resource usage and leveraging cost-effective
solutions like Spot Instances.
- Cost Analysis: Use AWS Cost Explorer and other
tools to monitor and analyze costs.
3. Resilience and High Availability
Design your
applications and clusters for high availability and resilience to minimize
downtime and impact during failures.
- Multi-Region Deployment: Consider deploying in
multiple regions for disaster recovery.
4. Security Best Practices
Ensure that
scaling does not compromise security. Implement Kubernetes security best
practices.
- IAM Roles: Use AWS IAM roles and policies to
control access.
- Network Policies: Define and enforce network
policies to secure communication between pods.
5. Monitoring and Logging
Implement robust
monitoring and logging to gain insights into cluster performance and issues.
- Tools: Use tools like Prometheus, Grafana, and
AWS CloudWatch for monitoring and logging.
6. Continuous Learning and Adaptation
Continuously
review and adapt your scaling strategies based on performance data and changing
requirements.
- Feedback Loops: Establish feedback loops to
incorporate lessons learned and improve scaling practices.
Conclusion
Scaling
containerized workloads on AWS EKS based on real-time metrics is
a powerful way to ensure that your applications remain responsive and
cost-effective. By leveraging tools like HPA, VPA, Cluster Autoscaler, and
robust monitoring solutions like Prometheus and Grafana, you can automate
scaling effectively. Following the steps outlined in this guide, you can
achieve a dynamic, responsive, and efficient Kubernetes environment on AWS.
Additional Resources:
You might be interested in exploring the following additional resources:
- What is Amazon EKS and How Does It Work?
- What are the benefits of using Amazon EKS?
- What are the pricing models for Amazon EKS?
- What are the best alternatives to Amazon EKS?
- How to create, deploy, secure and manage Amazon EKS Clusters?
- Amazon EKS vs. Amazon ECS: Which one to choose?
- Migrate existing workloads to AWS EKS with minimal downtime
- Cost comparison: Running containerized applications on AWS EKS vs. on-premises Kubernetes
- Best practices for deploying serverless applications on AWS EKS
- Securing a multi-tenant Kubernetes cluster on AWS EKS
- Integrating CI/CD pipelines with AWS EKS for automated deployments
- How to implement GPU acceleration for machine learning workloads on Amazon EKS
- How to configure Amazon EKS cluster for HIPAA compliance
- How to troubleshoot network latency issues in Amazon EKS clusters
- How to automate Amazon EKS cluster deployments using CI/CD pipelines
- How to integrate Amazon EKS with serverless technologies like AWS Lambda
- How to optimize Amazon EKS cluster costs for large-scale deployments
- How to implement disaster recovery for Amazon EKS clusters
- How to create a private Amazon EKS cluster with VPC Endpoints
- How to configure AWS IAM roles for service accounts in Amazon EKS
- How to troubleshoot pod scheduling issues in Amazon EKS clusters
- How to monitor Amazon EKS cluster health using CloudWatch metrics
- How to deploy containerized applications with Helm charts on Amazon EKS
- How to enable logging for applications running on Amazon EKS clusters
- How to integrate Amazon EKS with Amazon EFS for persistent storage
- How to configure autoscaling for pods in Amazon EKS clusters
- How to enable ArgoCD for GitOps deployments on Amazon EKS