- Kubernetes Horizontal Pod AutoScaler
- Kubernetes Vertical Pod AutoScaler
- Cluster AutoScaler (scales the number of nodes)
- Deployment, service, and load test
Horizontal Pod AutoScaler#
The HPA scales a workload based on default (resource) or custom (external) metrics. How does it work?
- Metrics are specified in the HPA definition
- Once per sync period (15 seconds by default), the controller manager queries the metrics
- The HPA reads and adjusts the scale (replica count) of the target Deployment or StatefulSet
It is possible to set up custom metrics, such as the SQS queue length from AWS CloudWatch, or to use a Lambda function that triggers Kubernetes scaling by updating the Deployment or StatefulSet with a new number of replicas. There are some important parameters:
- --horizontal-pod-autoscaler-sync-period: default 15 seconds
- --horizontal-pod-autoscaler-initial-readiness-delay: default 30 seconds
- --horizontal-pod-autoscaler-cpu-initialization-period: default 5 minutes
- --horizontal-pod-autoscaler-downscale-stabilization: length of the downscale stabilization window, default 300 seconds
- Stabilization window: when scaling down, the HPA uses the highest replica recommendation computed within the window
- desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
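The replica formula and the stabilization window can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the controller's actual code: the names `desiredReplicas`, `stabilizedReplicas`, and `Recommendation` are hypothetical, and the sketch leaves out details such as the tolerance band and pod readiness handling.

```typescript
// Hypothetical type, for illustration only.
interface Recommendation {
  timestampSec: number // when the recommendation was computed
  replicas: number // replica count recommended at that time
}

// ceil(currentReplicas * (currentMetric / desiredMetric)), clamped to [min, max].
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number,
  desiredMetric: number,
  minReplicas: number,
  maxReplicas: number
): number {
  if (desiredMetric <= 0) {
    throw new Error('desiredMetric must be positive')
  }
  const raw = Math.ceil(currentReplicas * (currentMetric / desiredMetric))
  return Math.min(maxReplicas, Math.max(minReplicas, raw))
}

// Downscale stabilization: take the highest recommendation inside the window,
// so a short dip in load does not remove pods immediately.
function stabilizedReplicas(
  newRecommendation: number,
  history: Recommendation[],
  nowSec: number,
  windowSec: number = 300
): number {
  const cutoff = nowSec - windowSec
  const recent = history
    .filter((r) => r.timestampSec >= cutoff)
    .map((r) => r.replicas)
  return Math.max(newRecommendation, ...recent)
}
```

For example, `desiredReplicas(2, 20, 5, 2, 1000)` models 2 replicas at 20% average CPU against a 5% target and returns 8. If a recommendation of 6 replicas was recorded 200 seconds ago, `stabilizedReplicas` keeps 6 even when the newest recommendation is 3.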
First, set up the metrics server, which monitors CPU usage
export class MetricServerStack extends Stack {
  constructor(scope: Construct, id: string, props: MetricServerProps) {
    super(scope, id, props)
    const cluster = props.cluster
    readYamlFile(path.join(__dirname, './../yaml/metric_server.yaml'), cluster)
  }
}
Deploy an HPA
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: cdk8s-app-webhorizontalautoscaler-c82a277e
spec:
  maxReplicas: 1000
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 5
          type: Utilization
      type: Resource
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cdk8s-app-deployment-c8f953f2
There are two ways to scale based on custom metrics:
- CloudWatch => Adapter => External metrics => HPA
- CloudWatch => Lambda => Update deployment
External metrics example: this custom resource tells the adapter how to retrieve metric data from CloudWatch
apiVersion: <adapter API group/version>
kind: ExternalMetric
metadata:
  name: hello-queue-length
spec:
  name: hello-queue-length
  resource:
    resource: "deployment"
  queries:
    - id: sqs_helloworld
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "ApproximateNumberOfMessagesVisible"
          dimensions:
            - name: QueueName
              value: "helloworld"
        period: 300
        stat: Average
        unit: Count
      returnData: true
HPA based on the custom metric
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: sqs-consumer-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: sqs-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metricName: hello-queue-length
        targetAverageValue: 30
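With an external metric and `targetAverageValue`, the replica count follows from dividing the metric value by the per-pod target. The arithmetic can be sketched as below, assuming a hypothetical function name `replicasForQueue` and ignoring the controller's tolerance and stabilization behavior.

```typescript
// Hypothetical helper: replicas needed so that each pod handles at most
// `targetAverageValue` messages, clamped to the HPA's min/max replicas.
function replicasForQueue(
  queueLength: number,
  targetAverageValue: number,
  minReplicas: number,
  maxReplicas: number
): number {
  if (targetAverageValue <= 0) {
    throw new Error('targetAverageValue must be positive')
  }
  const raw = Math.ceil(queueLength / targetAverageValue)
  return Math.min(maxReplicas, Math.max(minReplicas, raw))
}
```

With the values from the HPA above (target 30, min 1, max 10), a queue of 45 visible messages gives 2 replicas, and a queue of 1000 messages is capped at 10.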
Cluster AutoScaler#
How does scale-up work?
- The AutoScaler checks for unschedulable pods every 10 seconds (scan-interval)
- It changes the desired size of the node group's Auto Scaling group
- The Auto Scaling group launches new nodes from its launch template
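The scale-up steps can be sketched as a small calculation of the new desired size. This is a deliberately simplified model: the real AutoScaler simulates scheduling against node templates, whereas the sketch assumes every node fits a fixed number of pods. The names `NodeGroup`, `scaleUpTarget`, and `podsPerNode` are hypothetical.

```typescript
// Hypothetical model of a node group backed by an Auto Scaling group.
interface NodeGroup {
  desiredSize: number
  maxSize: number
}

// Returns the new desired size needed to place the unschedulable pods,
// assuming each new node can host `podsPerNode` pods, capped at maxSize.
function scaleUpTarget(
  group: NodeGroup,
  unschedulablePods: number,
  podsPerNode: number
): number {
  if (podsPerNode <= 0) {
    throw new Error('podsPerNode must be positive')
  }
  const extraNodes = Math.ceil(unschedulablePods / podsPerNode)
  return Math.min(group.maxSize, group.desiredSize + extraNodes)
}
```

For a group with desired size 2 and max size 22, 25 pending pods at 10 pods per node raise the desired size to 5, while 500 pending pods hit the cap of 22.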
How does scale-down work? The Cluster AutoScaler (CA) checks for unneeded nodes
- Every 10 seconds, if no scale-up is needed, the CA checks which nodes are unneeded based on conditions such as CPU and memory utilization
- All pods running on the node must be movable to other nodes
- If a node is unneeded for more than 10 minutes, it will be terminated
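The scale-down conditions can be sketched as a filter over node states. This is a simplification under stated assumptions: utilization is taken as the larger of the CPU and memory request fractions, the 0.5 threshold is the default of `--scale-down-utilization-threshold`, and the names `NodeState`, `isUnneeded`, and `nodesToTerminate` are hypothetical.

```typescript
// Hypothetical snapshot of a node as seen by the autoscaler.
interface NodeState {
  name: string
  cpuRequestedFraction: number // sum of CPU requests / allocatable CPU
  memRequestedFraction: number // sum of memory requests / allocatable memory
  podsCanMove: boolean // every pod could be rescheduled on another node
  unneededForSec: number // how long the node has been considered unneeded
}

// A node is unneeded when it is lightly used and all of its pods can move.
function isUnneeded(node: NodeState, threshold: number = 0.5): boolean {
  const utilization = Math.max(
    node.cpuRequestedFraction,
    node.memRequestedFraction
  )
  return utilization < threshold && node.podsCanMove
}

// Names of nodes that have been unneeded for at least `unneededTimeSec`.
function nodesToTerminate(
  nodes: NodeState[],
  unneededTimeSec: number = 600,
  threshold: number = 0.5
): string[] {
  return nodes
    .filter((n) => isUnneeded(n, threshold) && n.unneededForSec >= unneededTimeSec)
    .map((n) => n.name)
}
```

A node at 20% CPU and 30% memory whose pods can all move, and which has been unneeded for 700 seconds, is selected. A node with 70% memory requested, or one hosting a pod that cannot be moved, is kept.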
Install the AutoScaler. For a simple demo:
- Update the IAM role of the EC2 nodes so it can scale the Auto Scaling group
- A more secure way is to use a service account
- Install the AutoScaler YAML with kubectl
- Or install the AutoScaler with CDK by reading the YAML and adding it to the cluster
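Some important parameters determine the AutoScaler's reaction time:
- --scan-interval: how often the AutoScaler checks the API server for unschedulable pods, 10 seconds by default
- --scale-down-unneeded-time: how long a node must be unneeded before it is removed, 10 minutes by default
- --max-node-provision-time: how long the AutoScaler waits for requested nodes to appear, 15 minutes by default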
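These flags are passed as arguments to the cluster-autoscaler container. One way to keep them in one place when generating the manifest from code is a small helper like the one below. The flag names and default values are the ones listed above; the `AutoscalerTiming` type and the `autoscalerArgs` function are hypothetical names for this sketch.

```typescript
// Hypothetical settings object; values use Go duration strings such as '10s'.
interface AutoscalerTiming {
  scanInterval: string
  scaleDownUnneededTime: string
  maxNodeProvisionTime: string
}

// Defaults as listed above.
const defaultTiming: AutoscalerTiming = {
  scanInterval: '10s',
  scaleDownUnneededTime: '10m',
  maxNodeProvisionTime: '15m'
}

// Builds the timing-related command-line arguments, applying any overrides.
function autoscalerArgs(overrides: Partial<AutoscalerTiming> = {}): string[] {
  const t: AutoscalerTiming = { ...defaultTiming, ...overrides }
  return [
    `--scan-interval=${t.scanInterval}`,
    `--scale-down-unneeded-time=${t.scaleDownUnneededTime}`,
    `--max-node-provision-time=${t.maxNodeProvisionTime}`
  ]
}
```

Calling `autoscalerArgs({ scaleDownUnneededTime: '5m' })` keeps the other two defaults and only shortens the time a node must stay unneeded, which makes scale-down easier to observe in a demo.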
Update the IAM role of the EC2 nodes to work with the Auto Scaling group
nodeRole.addToPolicy(
  new aws_iam.PolicyStatement({
    effect: aws_iam.Effect.ALLOW,
    actions: [
      'autoscaling:DescribeAutoScalingGroups',
      'autoscaling:DescribeAutoScalingInstances',
      'autoscaling:DescribeLaunchConfigurations',
      'autoscaling:DescribeTags',
      'autoscaling:SetDesiredCapacity',
      'autoscaling:TerminateInstanceInAutoScalingGroup',
      'ec2:DescribeLaunchTemplateVersions'
    ],
    resources: ['*']
  })
)
Optionally, update autoscaling tags
props.nodeGroups.forEach(element => {
  new Tag('' + props.cluster.clusterName, 'owned', {
    applyToLaunchedInstances: true
  })
  new Tag('', 'true', {
    applyToLaunchedInstances: true
  })
  policy.attachToRole(element.role)
})
Also update the scaling configuration of the nodegroup
scalingConfig: {
  desiredSize: 2,
  maxSize: 22,
  minSize: 1,
},
Install the AutoScaler with kubectl. Download the YAML and replace YOUR CLUSTER NAME with the cluster name. Optionally, use affinity to schedule the AutoScaler on the EC2 node group only, not on a Fargate profile.
curl -O
Install AutoScaler using kubectl
kubectl apply -f cluster-autoscaler-autodiscover.yaml
With a CDK level 2 (L2) construct, it is possible to deploy the AutoScaler YAML by adding it as a manifest to the cluster
readYamlFile(path.join(__dirname, './../yaml/cluster-autoscaler-autodiscover.yaml'), cluster)
Add the AutoScaler to the cluster using CDK
const autoScaler = new AutoScalerHemlStack(app, 'AutoScalerHemlStack', {
  cluster: eks.cluster,
  nodeGroups: eks.nodeGroups
})
autoScaler.addDependency(eks)
For the load test, prepare a few things:
- Update the cdk8s-app/dist/deployemt.yaml to max 1000 pods
- Update the Nodegroup with max 20 instances
- Artillery load test with 500 threads
- Check the scaling activity in the Auto Scaling console
artillery quick --num 10000 --count 100 "http://$ELB_ENDPOINT"
kubectl get hpa --watch
kubectl top pod -n default
kubectl top node
Monitor logs of the AutoScaler
kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler
Create a simple deployment as follows
apiVersion: v1
kind: Service
metadata:
  name: cdk8s-app-service-c8a84b3e
spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: hello-cdk8s
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cdk8s-app-deployment-c8f953f2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-cdk8s
  template:
    metadata:
      labels:
        app: hello-cdk8s
    spec:
      containers:
        - image: 'paulbouwer/hello-kubernetes:1.7'
          name: hello-kubernetes
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: 100m
            requests:
              cpu: 100m
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: cdk8s-app-webhorizontalautoscaler-c82a277e
spec:
  maxReplicas: 1000
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 5
          type: Utilization
      type: Resource
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cdk8s-app-deployment-c8f953f2