CDK Entest

Introduction#

GitHub. This note shows:

Kubernetes Horizontal Pod AutoScaler
Kubernetes Vertical Pod AutoScaler
Cluster AutoScaler (scales the number of nodes)
Deployment, service, and load test
My note

Horizontal Pod AutoScaler#

The HPA scales based on default (resource) or custom (external) metrics. How does it work?

Metrics are specified in the HPA definition
Once per sync period (15 seconds by default), the controller manager queries the metrics
The HPA reads and adjusts the replica count (scale) of the target Deployment or StatefulSet

It is possible to set up custom metrics, such as the SQS queue length from AWS CloudWatch, or to use a Lambda function to trigger Kubernetes scaling by updating the Deployment or StatefulSet with a new number of replicas. There are some important parameters:

--horizontal-pod-autoscaler-sync-period: default is 15 seconds
--horizontal-pod-autoscaler-initial-readiness-delay: default is 30 seconds
--horizontal-pod-autoscaler-cpu-initialization-period: default is 5 minutes
--horizontal-pod-autoscaler-downscale-stabilization: length of the stabilization window, default is 300 seconds
Stabilization Window
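
To make the stabilization window concrete, here is a minimal sketch of how a scale-down could be held back: the controller keeps recent recommendations and uses the highest one seen inside the window. The Recommendation type and the stabilizedReplicas function are hypothetical names for illustration, not the controller's actual code.

```typescript
// One replica recommendation produced by an HPA sync (hypothetical type).
interface Recommendation {
  timestampSeconds: number
  replicas: number
}

// Returns the replica count after applying a scale-down stabilization window:
// the highest recommendation seen within the last `windowSeconds`.
function stabilizedReplicas(
  recommendations: Recommendation[],
  nowSeconds: number,
  windowSeconds: number = 300 // 300 seconds, the default listed above
): number {
  const recent = recommendations.filter(
    r => nowSeconds - r.timestampSeconds <= windowSeconds
  )
  if (recent.length === 0) {
    throw new Error('no recommendation inside the stabilization window')
  }
  return Math.max(...recent.map(r => r.replicas))
}

// Load dropped quickly, but the older, higher recommendation is still
// inside the window, so the replica count is held.
const sampleRecommendations: Recommendation[] = [
  { timestampSeconds: 0, replicas: 10 },
  { timestampSeconds: 100, replicas: 4 },
  { timestampSeconds: 200, replicas: 3 }
]
console.log(stabilizedReplicas(sampleRecommendations, 200))
```

Once the older recommendations fall out of the window, the lower value takes effect, which is why a short dip in load does not remove pods immediately.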

Scaling algorithm

desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
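
The formula can be sketched as a small function. This is an illustration only: the function name is hypothetical, and the tolerance of 0.1 (no scaling when the ratio is close to 1) is an assumption based on the controller's documented default.

```typescript
// Sketch of the HPA scaling formula:
// desiredReplicas = ceil(currentReplicas * (currentMetric / desiredMetric))
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number,
  desiredMetric: number,
  tolerance: number = 0.1 // assumed default tolerance
): number {
  if (desiredMetric <= 0) {
    throw new Error('desiredMetric must be positive')
  }
  const ratio = currentMetric / desiredMetric
  // Skip scaling when the ratio is within the tolerance around 1.0
  if (Math.abs(ratio - 1) <= tolerance) {
    return currentReplicas
  }
  return Math.ceil(currentReplicas * ratio)
}

// CPU at 50% against a 5% target with 2 replicas
console.log(desiredReplicas(2, 50, 5))
```

With the 5% CPU target used in this note, even light load produces a large ratio, so the HPA scales out aggressively, which is convenient for a load test demo.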

First, set up the metrics server, which monitors CPU usage

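import { Stack, StackProps, aws_eks } from 'aws-cdk-lib'
import { Construct } from 'constructs'
import * as path from 'path'

// Assumed shape of the props: the EKS cluster to install into
interface MetricServerProps extends StackProps {
  cluster: aws_eks.Cluster
}

export class MetricServerStack extends Stack {
  constructor(scope: Construct, id: string, props: MetricServerProps) {
    super(scope, id, props)
    const cluster = props.cluster

    // readYamlFile: project helper (not shown) that adds the YAML manifests to the cluster
    readYamlFile(path.join(__dirname, './../yaml/metric_server.yaml'), cluster)
  }
}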

Then deploy an HPA

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cdk8s-app-webhorizontalautoscaler-c82a277e
spec:
  maxReplicas: 1000
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 5
          type: Utilization
      type: Resource
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cdk8s-app-deployment-c8f953f2

For scaling based on custom metrics, there are two methods

CloudWatch => Adapter => External metrics => HPA
CloudWatch => Lambda => Update deployment

External metrics example: this custom resource tells the adapter how to retrieve the metric data from CloudWatch

apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
  name: hello-queue-length
spec:
  name: hello-queue-length
  resource:
    resource: "deployment"
  queries:
    - id: sqs_helloworld
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "ApproximateNumberOfMessagesVisible"
          dimensions:
            - name: QueueName
              value: "helloworld"
        period: 300
        stat: Average
        unit: Count
      returnData: true

An HPA based on this external metric

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2
metadata:
  name: sqs-consumer-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sqs-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: hello-queue-length
        target:
          type: AverageValue
          averageValue: 30
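
As a rough illustration of what an average-value target of 30 means, the sketch below divides the queue length by the target and clamps the result to the replica bounds. The function name is hypothetical, and the real controller also applies a tolerance and the stabilization window, which are left out here.

```typescript
// Sketch: replicas for an external metric with an average-value target.
// Each replica is expected to handle `targetAverageValue` messages.
function replicasForQueue(
  queueLength: number,
  targetAverageValue: number,
  minReplicas: number,
  maxReplicas: number
): number {
  if (targetAverageValue <= 0) {
    throw new Error('targetAverageValue must be positive')
  }
  const wanted = Math.ceil(queueLength / targetAverageValue)
  // Clamp to the bounds declared in the HPA spec
  return Math.min(maxReplicas, Math.max(minReplicas, wanted))
}

// 150 visible messages, target 30 per replica, bounds from the HPA above
console.log(replicasForQueue(150, 30, 1, 10))
```

With the bounds from the HPA above, an empty queue still keeps one consumer running, and a very long queue never requests more than ten.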

Cluster AutoScaler#

How does scale-up work?

It checks for unschedulable pods every 10 seconds (scan-interval)
It changes the desired size of the node group's Auto Scaling group
The Auto Scaling group launches new nodes from its launch template
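
The scale-up steps can be approximated with a small sketch. It is a heavy simplification and not the Cluster AutoScaler's real scheduling simulation: it looks at CPU requests only and packs pending pods first-fit onto identical new nodes. All names are hypothetical.

```typescript
// Estimate how many new nodes are needed for pending pods (CPU only).
function nodesNeeded(pendingPodCpuMillis: number[], nodeCpuMillis: number): number {
  const newNodes: { freeMillis: number }[] = []
  for (const request of pendingPodCpuMillis) {
    // A pod larger than a whole node can never be scheduled on this group
    if (request > nodeCpuMillis) continue
    const node = newNodes.find(n => n.freeMillis >= request)
    if (node) {
      node.freeMillis -= request
    } else {
      newNodes.push({ freeMillis: nodeCpuMillis - request })
    }
  }
  return newNodes.length
}

// The new desired size never exceeds the node group's maximum size.
function newDesiredSize(current: number, needed: number, maxSize: number): number {
  return Math.min(current + needed, maxSize)
}

// 25 pending pods of 100m CPU each, nodes with 1000m of allocatable CPU
const extraNodes = nodesNeeded(new Array<number>(25).fill(100), 1000)
console.log(newDesiredSize(2, extraNodes, 22))
```

The clamp in newDesiredSize is why the maximum size of the node group matters for the load test later in this note.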

How does scale-down work? The CA checks for unneeded nodes

Every 10 seconds, if no scale-up is needed, the CA checks which nodes are unneeded based on some conditions (CPU, memory)
A node qualifies only if all pods running on it can be moved to other nodes
If a node is unneeded for more than 10 minutes, it will be terminated
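
The scale-down conditions can be sketched as a filter over node states. The NodeState type and function names are hypothetical, and the 0.5 utilization threshold is an assumption (it matches the default of --scale-down-utilization-threshold); the 10 minutes come from the list above.

```typescript
// Hypothetical snapshot of a node as seen by the autoscaler.
interface NodeState {
  nodeName: string
  utilization: number // requested / allocatable, between 0 and 1
  podsCanMove: boolean // all pods can be rescheduled elsewhere
  unneededSinceSeconds?: number // when the node first became unneeded
}

function isUnneeded(node: NodeState, threshold: number = 0.5): boolean {
  return node.utilization < threshold && node.podsCanMove
}

// Nodes that have been unneeded for longer than `unneededTimeSeconds`.
function nodesToTerminate(
  nodes: NodeState[],
  nowSeconds: number,
  unneededTimeSeconds: number = 600 // 10 minutes
): string[] {
  return nodes
    .filter(
      n =>
        isUnneeded(n) &&
        n.unneededSinceSeconds !== undefined &&
        nowSeconds - n.unneededSinceSeconds > unneededTimeSeconds
    )
    .map(n => n.nodeName)
}

const fleet: NodeState[] = [
  { nodeName: 'node-a', utilization: 0.2, podsCanMove: true, unneededSinceSeconds: 0 },
  { nodeName: 'node-b', utilization: 0.2, podsCanMove: true, unneededSinceSeconds: 500 },
  { nodeName: 'node-c', utilization: 0.8, podsCanMove: true }
]
console.log(nodesToTerminate(fleet, 700))
```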

Install the AutoScaler. For a simple demo:

Update the role of the EC2 nodes so that the AutoScaler can scale the Auto Scaling group
A more secure way is to use a service account
Install the AutoScaler YAML with kubectl
Or install the AutoScaler by reading the YAML and adding it to the cluster with CDK

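There are some important parameters

AutoScaler reaction time
--scan-interval: how often the AutoScaler checks for unschedulable pods via the API server, 10 seconds by default
--scale-down-unneeded-time: how long a node must be unneeded before it is removed, 10 minutes by default
--max-node-provision-time: how long the AutoScaler waits for a requested node to appear, 15 minutes by default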
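
These parameters are passed as command-line flags to the cluster-autoscaler container. As a small sketch, the helper below renders them from a typed object so that they can be placed into the container command of the manifest. The AutoscalerTiming type and toFlags function are hypothetical; the values shown are the defaults listed above.

```typescript
// Hypothetical typed view of the timing parameters listed above.
interface AutoscalerTiming {
  scanInterval: string
  scaleDownUnneededTime: string
  maxNodeProvisionTime: string
}

const defaultTiming: AutoscalerTiming = {
  scanInterval: '10s',
  scaleDownUnneededTime: '10m',
  maxNodeProvisionTime: '15m'
}

// Render the settings as cluster-autoscaler command-line flags.
function toFlags(timing: AutoscalerTiming): string[] {
  return [
    `--scan-interval=${timing.scanInterval}`,
    `--scale-down-unneeded-time=${timing.scaleDownUnneededTime}`,
    `--max-node-provision-time=${timing.maxNodeProvisionTime}`
  ]
}

// A faster scale-down for a demo, keeping the other defaults
console.log(toFlags({ ...defaultTiming, scaleDownUnneededTime: '2m' }))
```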

Update the role of the EC2 nodes so that the AutoScaler can work with the Auto Scaling group

nodeRole.addToPolicy(
  new aws_iam.PolicyStatement({
    effect: aws_iam.Effect.ALLOW,
    actions: [
      'autoscaling:DescribeAutoScalingGroups',
      'autoscaling:DescribeAutoScalingInstances',
      'autoscaling:DescribeLaunchConfigurations',
      'autoscaling:DescribeTags',
      'autoscaling:SetDesiredCapacity',
      'autoscaling:TerminateInstanceInAutoScalingGroup',
      'ec2:DescribeLaunchTemplateVersions'
    ],
    resources: ['*']
  })
)

Optionally, update autoscaling tags

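// Tags is imported from 'aws-cdk-lib'. Creating a Tag object alone does not
// apply it, so the tags are added to each node group through Tags.of().
// Assumes the cluster name is a concrete string, not an unresolved token.
props.nodeGroups.forEach(element => {
  Tags.of(element).add(
    'k8s.io/cluster-autoscaler/' + props.cluster.clusterName,
    'owned',
    { applyToLaunchedInstances: true }
  )

  Tags.of(element).add('k8s.io/cluster-autoscaler/enabled', 'true', {
    applyToLaunchedInstances: true
  })
  policy.attachToRole(element.role)
})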

Also update the scaling configuration of the nodegroup

scalingConfig: {
  desiredSize: 2,
  maxSize: 22,
  minSize: 1
},
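
A quick sanity check on these three values can catch mistakes before deployment. The sketch below is a hypothetical helper, not part of CDK: it only verifies that minSize <= desiredSize <= maxSize.

```typescript
// Same shape as the scalingConfig fragment above.
interface ScalingConfig {
  desiredSize: number
  maxSize: number
  minSize: number
}

// Returns a list of problems; an empty list means the config is consistent.
function validateScalingConfig(config: ScalingConfig): string[] {
  const problems: string[] = []
  if (config.minSize < 0) {
    problems.push('minSize must not be negative')
  }
  if (config.minSize > config.maxSize) {
    problems.push('minSize must not exceed maxSize')
  }
  if (config.desiredSize < config.minSize || config.desiredSize > config.maxSize) {
    problems.push('desiredSize must be between minSize and maxSize')
  }
  return problems
}

console.log(validateScalingConfig({ desiredSize: 2, maxSize: 22, minSize: 1 }))
```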

Install the AutoScaler with kubectl. Download the YAML and replace <YOUR CLUSTER NAME> with the cluster name. Optionally, use affinity to schedule the AutoScaler on the EC2 node group only, not on a Fargate profile.

curl -O https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
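
The placeholder replacement can also be scripted instead of edited by hand. A minimal sketch, assuming the downloaded manifest contains the literal text <YOUR CLUSTER NAME>; the function name and the sample cluster name are hypothetical.

```typescript
// Placeholder text assumed to be present in the downloaded manifest.
const CLUSTER_NAME_PLACEHOLDER = '<YOUR CLUSTER NAME>'

// Replace every occurrence of the placeholder with the real cluster name.
function setClusterName(manifest: string, clusterName: string): string {
  if (!manifest.includes(CLUSTER_NAME_PLACEHOLDER)) {
    throw new Error('placeholder not found in manifest')
  }
  return manifest.split(CLUSTER_NAME_PLACEHOLDER).join(clusterName)
}

// One line of the manifest, used here as sample input
const sampleLine =
  '- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>'
console.log(setClusterName(sampleLine, 'my-demo-cluster'))
```

In practice the manifest would be read from and written back to the downloaded file; that part is left out to keep the sketch free of file system access.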

Install AutoScaler using kubectl

kubectl apply -f cluster-autoscaler-autodiscover.yaml

With a level 2 (L2) CDK construct, it is also possible to deploy the AutoScaler YAML by adding it as a manifest to the cluster

readYamlFile(
  path.join(__dirname, './../yaml/cluster-autoscaler-autodiscover.yaml'),
  cluster
)

Add the AutoScaler to cluster using CDK

const autoScaler = new AutoScalerHemlStack(app, 'AutoScalerHemlStack', {
  cluster: eks.cluster,
  nodeGroups: eks.nodeGroups
})
autoScaler.addDependency(eks)

For the load test, prepare a few things

Update the cdk8s-app/dist/deployemt.yaml to allow a maximum of 1000 pods
Update the node group to allow a maximum of 20 instances
Run an Artillery load test with 500 threads
Check the Auto Scaling console to see the scaling activity

artillery quick --num 10000 --count 100 "http://$ELB_ENDPOINT"
kubectl get hpa --watch
kubectl top pod -n default
kubectl top node

Monitor logs of the AutoScaler

kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

Deployment#

Create a simple deployment as follows

apiVersion: v1
kind: Service
metadata:
  name: cdk8s-app-service-c8a84b3e
spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: hello-cdk8s
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cdk8s-app-deployment-c8f953f2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-cdk8s
  template:
    metadata:
      labels:
        app: hello-cdk8s
    spec:
      containers:
        - image: 'paulbouwer/hello-kubernetes:1.7'
          name: hello-kubernetes
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: 100m
            requests:
              cpu: 100m
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cdk8s-app-webhorizontalautoscaler-c82a277e
spec:
  maxReplicas: 1000
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 5
          type: Utilization
      type: Resource
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cdk8s-app-deployment-c8f953f2
