Introduction#

This note (see GitHub) shows

  • Kubernetes Horizontal Pod AutoScaler
  • Kubernetes Vertical Pod AutoScaler
  • Cluster AutoScaler (scales the number of nodes)
  • Deployment, service, and load test
  • My note

Horizontal Pod AutoScaler#

[Figure: EKS HPA]

The HPA scales based on default or custom (external) metrics. How does it work?

  • Metrics are specified in the HPA definition
  • Once per sync period (15 seconds by default), the controller manager queries the metrics
  • The HPA reads and adjusts the scale parameter (number of replicas) of the Deployment or StatefulSet

It is possible to set up custom metrics, such as SQS queue length from AWS CloudWatch, or to use a Lambda function to trigger Kubernetes scaling by updating a Deployment or StatefulSet with a new number of replicas. There are some important parameters:

  • horizontal-pod-autoscaler-sync-period: default is 15 seconds
  • horizontal-pod-autoscaler-initial-readiness-delay: default is 30 seconds
  • horizontal-pod-autoscaler-cpu-initialization-period: default is 5 minutes
  • horizontal-pod-autoscaler-downscale-stabilization: length of the scale-down stabilization window, default is 300 seconds
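
The stabilization window can be illustrated with a small sketch. It assumes the controller keeps a history of replica recommendations and, when scaling down, applies the highest recommendation seen inside the window. The `Recommendation` type, the `stabilizedReplicas` function, and the sample history are hypothetical names and values for illustration, not part of Kubernetes.

```typescript
// One replica recommendation computed by the HPA at a given time (hypothetical type).
interface Recommendation {
  timestampSeconds: number
  replicas: number
}

// Pick the replica count to apply, given the recommendation history and the
// downscale stabilization window (300 seconds by default).
function stabilizedReplicas(
  history: Recommendation[],
  nowSeconds: number,
  windowSeconds: number = 300
): number {
  // Keep only recommendations made inside the window ending at nowSeconds.
  const recent = history.filter(
    r =>
      r.timestampSeconds <= nowSeconds &&
      nowSeconds - r.timestampSeconds <= windowSeconds
  )
  if (recent.length === 0) {
    throw new Error('no recommendation inside the stabilization window')
  }
  // Scale-down is damped by using the highest recent recommendation.
  return Math.max(...recent.map(r => r.replicas))
}

// Sample data (assumed): load drops over time, so recommendations shrink.
const history: Recommendation[] = [
  { timestampSeconds: 0, replicas: 10 },
  { timestampSeconds: 200, replicas: 6 },
  { timestampSeconds: 400, replicas: 4 }
]

console.log(stabilizedReplicas(history, 400)) // 6: the recommendation from t=0 has left the window
```

With this behavior, a short dip in load does not immediately remove pods; the replica count only drops once the higher recommendations have aged out of the window.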

Scaling algorithm

  • desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
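
As a worked example of the scaling formula, here is a minimal sketch. The function name `desiredReplicas` and the sample numbers are assumptions for illustration. The real controller additionally applies a tolerance and clamps the result between minReplicas and maxReplicas, which this sketch leaves out.

```typescript
// Core HPA formula: ceil(currentReplicas * currentMetric / desiredMetric).
function desiredReplicas(
  currentReplicas: number,
  currentMetricValue: number,
  desiredMetricValue: number
): number {
  if (desiredMetricValue <= 0) {
    throw new Error('desiredMetricValue must be positive')
  }
  return Math.ceil(currentReplicas * (currentMetricValue / desiredMetricValue))
}

// 2 pods at 20% average CPU utilization with a 5% target: scale up.
console.log(desiredReplicas(2, 20, 5)) // 8

// 4 pods at 3% average CPU utilization with a 5% target: scale down.
console.log(desiredReplicas(4, 3, 5)) // 3
```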

First, set up the Metrics Server, which collects CPU usage

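import * as path from 'path'
import { Stack } from 'aws-cdk-lib'
import { Construct } from 'constructs'

// Deploys the Metrics Server manifest into the EKS cluster.
// MetricServerProps and readYamlFile are project helpers not shown in this note.
export class MetricServerStack extends Stack {
  constructor(scope: Construct, id: string, props: MetricServerProps) {
    super(scope, id, props)
    const cluster = props.cluster
    // Read the yaml file and add its resources to the cluster
    readYamlFile(path.join(__dirname, './../yaml/metric_server.yaml'), cluster)
  }
}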

Deploy an HPA

# autoscaling/v2beta2 is no longer served as of Kubernetes 1.26; the fields below are unchanged in autoscaling/v2
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cdk8s-app-webhorizontalautoscaler-c82a277e
spec:
  maxReplicas: 1000
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 5
          type: Utilization
      type: Resource
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cdk8s-app-deployment-c8f953f2

For scaling based on custom metrics, there are two approaches

  • CloudWatch => Adapter => External metrics => HPA
  • CloudWatch => Lambda => Update deployment

External metrics example: this custom resource tells the adapter how to retrieve metric data from CloudWatch

apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
  name: hello-queue-length
spec:
  name: hello-queue-length
  resource:
    resource: "deployment"
  queries:
    - id: sqs_helloworld
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "ApproximateNumberOfMessagesVisible"
          dimensions:
            - name: QueueName
              value: "helloworld"
        period: 300
        stat: Average
        unit: Count
      returnData: true

HPA based on the external metric

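kind: HorizontalPodAutoscaler
# autoscaling/v2beta1 is no longer served as of Kubernetes 1.25; this uses the autoscaling/v2 schema
apiVersion: autoscaling/v2
metadata:
  name: sqs-consumer-scaler
spec:
  scaleTargetRef:
    # apps/v1beta1 is no longer served as of Kubernetes 1.16
    apiVersion: apps/v1
    kind: Deployment
    name: sqs-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: hello-queue-length
        target:
          type: AverageValue
          averageValue: "30"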
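
To see what this HPA would do for a given queue length, here is a small sketch. It assumes that with an average-value target the external metric is divided across the pods, so the desired count is the queue length divided by the target, clamped to the min and max replicas. The function name `replicasForQueue` and the sample queue lengths are hypothetical.

```typescript
// Desired replicas for an external metric with an average-value target.
function replicasForQueue(
  queueLength: number,
  targetAverageValue: number,
  minReplicas: number,
  maxReplicas: number
): number {
  if (targetAverageValue <= 0) {
    throw new Error('targetAverageValue must be positive')
  }
  // Each pod should handle at most targetAverageValue messages on average.
  const raw = Math.ceil(queueLength / targetAverageValue)
  // Clamp to the bounds configured in the HPA.
  return Math.min(maxReplicas, Math.max(minReplicas, raw))
}

// Values from the HPA above: target 30, min 1, max 10.
console.log(replicasForQueue(150, 30, 1, 10)) // 5
console.log(replicasForQueue(0, 30, 1, 10)) // 1 (never below minReplicas)
console.log(replicasForQueue(1000, 30, 1, 10)) // 10 (capped at maxReplicas)
```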

Cluster AutoScaler#

[Figure: EKS Cluster AutoScaler]

How does scale-up work?

  • The CA checks for unschedulable pods every 10 seconds (scan-interval)
  • It increases the desired size of the Auto Scaling group behind the node group
  • The Auto Scaling group launches new nodes from its launch template

How does scale-down work? The CA checks for unneeded nodes

  • Every 10 seconds, if no scale-up is needed, the CA checks which nodes are unneeded based on their CPU and memory utilization
  • A node is a candidate only if all pods running on it can be moved to other nodes
  • If a node is unneeded for more than 10 minutes, it is terminated
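
The scale-down rules above can be sketched as a pair of predicates. This is a simplified model, not the Cluster AutoScaler source: the `NodeUsage` type, both function names, the 0.5 utilization threshold, and the use of the larger of the CPU and memory request ratios are assumptions made for illustration. The 10 minute wait (600 seconds) comes from the text.

```typescript
// Requested versus allocatable resources on one node (hypothetical type).
interface NodeUsage {
  cpuRequestedMillis: number
  cpuAllocatableMillis: number
  memoryRequestedMiB: number
  memoryAllocatableMiB: number
  allPodsCanMove: boolean
}

// A node counts as unneeded when its utilization is below the threshold
// and every pod on it could be rescheduled elsewhere.
function isUnneeded(node: NodeUsage, utilizationThreshold: number = 0.5): boolean {
  const cpu = node.cpuRequestedMillis / node.cpuAllocatableMillis
  const memory = node.memoryRequestedMiB / node.memoryAllocatableMiB
  return Math.max(cpu, memory) < utilizationThreshold && node.allPodsCanMove
}

// A node is terminated only after it has been unneeded for longer than the wait time.
function shouldTerminate(
  node: NodeUsage,
  unneededForSeconds: number,
  unneededTimeSeconds: number = 600
): boolean {
  return isUnneeded(node) && unneededForSeconds > unneededTimeSeconds
}

const idleNode: NodeUsage = {
  cpuRequestedMillis: 400,
  cpuAllocatableMillis: 2000,
  memoryRequestedMiB: 1024,
  memoryAllocatableMiB: 8192,
  allPodsCanMove: true
}

console.log(shouldTerminate(idleNode, 300)) // false: unneeded, but not for long enough
console.log(shouldTerminate(idleNode, 700)) // true: unneeded for more than 10 minutes
```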

Install the AutoScaler. For a simple demo

  • Update the role of the EC2 nodes so that the AutoScaler can scale the Auto Scaling group
  • A more secure way is to use a service account
  • Install the AutoScaler yaml with kubectl
  • Or install the AutoScaler by reading the yaml and adding it to the cluster with CDK

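Important parameters include scan-interval (10 seconds by default) and scale-down-unneeded-time (10 minutes by default), both described above.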

Update the role of the EC2 nodes to work with the Auto Scaling group

nodeRole.addToPolicy(
  new aws_iam.PolicyStatement({
    effect: aws_iam.Effect.ALLOW,
    actions: [
      'autoscaling:DescribeAutoScalingGroups',
      'autoscaling:DescribeAutoScalingInstances',
      'autoscaling:DescribeLaunchConfigurations',
      'autoscaling:DescribeTags',
      'autoscaling:SetDesiredCapacity',
      'autoscaling:TerminateInstanceInAutoScalingGroup',
      'ec2:DescribeLaunchTemplateVersions'
    ],
    resources: ['*']
  })
)

Optionally, update autoscaling tags

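// Note: in CDK v2, tags are usually applied with Tags.of(construct).add(key, value, props)
props.nodeGroups.forEach(element => {
  // Marks the node group as owned by this cluster for auto-discovery
  new Tag('k8s.io/cluster-autoscaler/' + props.cluster.clusterName, 'owned', {
    applyToLaunchedInstances: true
  })
  // Enables the Cluster AutoScaler for this node group
  new Tag('k8s.io/cluster-autoscaler/enabled', 'true', {
    applyToLaunchedInstances: true
  })
  // policy is the autoscaling policy, created elsewhere and not shown in this note
  policy.attachToRole(element.role)
})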

Also update the scaling configuration of the nodegroup

scalingConfig: {
  desiredSize: 2,
  maxSize: 22,
  minSize: 1,
},

Install the AutoScaler with kubectl. Download the yaml and replace <YOUR CLUSTER NAME> with the cluster name. Optionally, use affinity to schedule the AutoScaler on the EC2 node group only, not on a Fargate profile.

curl -O https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
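
The placeholder replacement can also be scripted instead of edited by hand. Below is a minimal sketch that works on the manifest text as a string. The function name `setClusterName`, the cluster name `demo-cluster`, and the sample manifest line are assumptions for illustration; check the downloaded file for the exact placeholder text.

```typescript
// Placeholder as it is assumed to appear in the downloaded manifest.
const CLUSTER_NAME_PLACEHOLDER = '<YOUR CLUSTER NAME>'

// Replace every occurrence of the placeholder with the real cluster name.
function setClusterName(manifest: string, clusterName: string): string {
  if (clusterName.trim().length === 0) {
    throw new Error('clusterName must not be empty')
  }
  return manifest.split(CLUSTER_NAME_PLACEHOLDER).join(clusterName)
}

// Sample line modeled on the auto-discovery argument of the AutoScaler (assumed).
const sampleLine =
  '- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>'

console.log(setClusterName(sampleLine, 'demo-cluster'))
```

In a real script, the manifest text would be read from the downloaded file and written back before it is applied.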

Install AutoScaler using kubectl

kubectl apply -f cluster-autoscaler-autodiscover.yaml

With a CDK level 2 (L2) construct, it is possible to deploy the AutoScaler yaml by adding the manifest to the cluster

readYamlFile(
  path.join(__dirname, './../yaml/cluster-autoscaler-autodiscover.yaml'),
  cluster
)

Add the AutoScaler to the cluster using CDK

const autoScaler = new AutoScalerHemlStack(app, 'AutoScalerHemlStack', {
  cluster: eks.cluster,
  nodeGroups: eks.nodeGroups
})
// Deploy the AutoScaler only after the EKS stack exists
autoScaler.addDependency(eks)

For the load test, prepare a few things

  • Update cdk8s-app/dist/deployment.yaml to allow at most 1000 pods
  • Update the node group to allow at most 20 instances
  • Run an Artillery load test with 500 threads
  • Check the Auto Scaling console to see the scaling activity

artillery quick --num 10000 --count 100 "http://$ELB_ENDPOINT"
kubectl get hpa --watch
kubectl top pod -n default
kubectl top node

Monitor logs of the AutoScaler

kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

Deployment#

Create a simple deployment as follows

apiVersion: v1
kind: Service
metadata:
  name: cdk8s-app-service-c8a84b3e
spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: hello-cdk8s
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cdk8s-app-deployment-c8f953f2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-cdk8s
  template:
    metadata:
      labels:
        app: hello-cdk8s
    spec:
      containers:
        - image: 'paulbouwer/hello-kubernetes:1.7'
          name: hello-kubernetes
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: 100m
            requests:
              cpu: 100m
---
# autoscaling/v2beta2 is no longer served as of Kubernetes 1.26; the fields below are unchanged in autoscaling/v2
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cdk8s-app-webhorizontalautoscaler-c82a277e
spec:
  maxReplicas: 1000
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 5
          type: Utilization
      type: Resource
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cdk8s-app-deployment-c8f953f2
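
With a CPU request of 100m and averageUtilization of 5, the HPA aims for roughly 5m of CPU per pod, which is why even a light load test triggers scaling. The sketch below works through that arithmetic, assuming utilization is measured as usage relative to the requested CPU. The function name `cpuUtilizationPercent` and the sample usage numbers are hypothetical.

```typescript
// Average CPU utilization across pods, as a percentage of the requested CPU.
function cpuUtilizationPercent(
  podUsageMillis: number[],
  requestMillisPerPod: number
): number {
  if (podUsageMillis.length === 0 || requestMillisPerPod <= 0) {
    throw new Error('need at least one pod and a positive CPU request')
  }
  const totalUsage = podUsageMillis.reduce((sum, usage) => sum + usage, 0)
  const totalRequested = requestMillisPerPod * podUsageMillis.length
  return (totalUsage / totalRequested) * 100
}

// Two pods using 20m and 30m of CPU, each requesting 100m (sample values).
const utilization = cpuUtilizationPercent([20, 30], 100)
console.log(utilization) // 25

// Applying ceil(currentReplicas * current / target) with a target of 5.
console.log(Math.ceil(2 * (utilization / 5))) // 10
```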

References#