Kubernetes Horizontal Scaling and Vertical Scaling Concepts

Kiwi lee

Dec 28, 2021

Keep the service from blowing up: autoscaling

Vertical Scaling

Give each pod more resources
Give the node more CPU/RAM

Horizontal Scaling

Increase the number of pods
Raise the replica count

Vertical Pod Autoscaler: a brief introduction

VPA Recommender

Based on past usage, it estimates how much CPU/RAM should currently be allocated.

Horizontal Pod Autoscaler: a brief introduction

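The HorizontalPodAutoscaler measures resource utilization and uses it to decide whether the replica count of its target workload, for example a Deployment (and the ReplicaSet behind it), should go up or down.

How often utilization is evaluated

By default the controller checks every 15 seconds. The interval can be changed with the --horizontal-pod-autoscaler-sync-period flag.

How utilization is calculated

Typically it looks at CPU/RAM utilization, but you can configure which metrics to watch yourself. The controller fetches the metrics of every pod managed by the HPA.

Pods with missing metrics and pods that are not yet ready are set aside at first. Once the controller has decided to scale up or down, they are factored back in when it settles on the final number.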

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

If currentMetricValue is 200m and desiredMetricValue is 100m, the replica count is multiplied by 200/100 = 2.
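The formula above can be tried out directly. The following is a minimal Python sketch of that calculation only; the function name desired_replicas is made up for illustration, and the real controller applies further rules (tolerance, unready pods, min/max bounds) that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric_value, desired_metric_value):
    """Apply desiredReplicas = ceil[currentReplicas * (current / desired)]."""
    ratio = current_metric_value / desired_metric_value
    return math.ceil(current_replicas * ratio)


# 200m measured against a 100m target doubles the replica count.
print(desired_replicas(3, 200, 100))  # 6
# 50m measured against a 100m target halves it.
print(desired_replicas(4, 50, 100))   # 2
```

Because of the ceiling, any ratio slightly above a whole number of pods rounds up, so the HPA errs on the side of having one pod too many rather than one too few.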

Configuring an HPA

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k

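spec.minReplicas sets the minimum number of replicas. It cannot be set to 0 in a default cluster.

spec.maxReplicas sets the maximum number of replicas.

spec.metrics lists the metrics to watch, together with the target each one is scaled against.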
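When several metrics are listed, as in the manifest above, my understanding of the Kubernetes documentation is that the controller computes a proposed replica count per metric, takes the largest, and keeps the result between minReplicas and maxReplicas. Here is a rough Python sketch under that assumption; the function name and the sample metric readings are invented for illustration.

```python
import math


def hpa_desired_replicas(current_replicas, metrics, min_replicas, max_replicas):
    """metrics is a list of (current_value, target_value) pairs, one per metric."""
    proposals = [
        math.ceil(current_replicas * (current / target))
        for current, target in metrics
    ]
    # The most demanding metric wins, then the min/max bounds are applied.
    return max(min_replicas, min(max(proposals), max_replicas))


# Hypothetical readings: cpu 60% (target 50), 800 packets/s (target 1k),
# 9000 requests/s (target 10k), with 4 replicas currently running.
readings = [(60, 50), (800, 1000), (9000, 10000)]
print(hpa_desired_replicas(4, readings, min_replicas=1, max_replicas=10))  # 5
```

Only the CPU metric asks for more pods in this example, and that is enough to scale up: a single hot metric outweighs any number of idle ones.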

HPA metric parameters

type: Resource (CPU/RAM)

target.type = Utilization is a percentage of the resource requested by the containers.

target.type = AverageValue is an absolute value, averaged across the pods.
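To make the difference between the two target types concrete, here is a small Python sketch. It assumes usage and requests are given per pod in millicores, and it aggregates as total usage over total requests, which is a simplification on my part; when every pod has the same request it equals the mean of the per-pod percentages. Both function names are hypothetical.

```python
def utilization_percent(usage_per_pod, request_per_pod):
    """Utilization: usage relative to what the pods requested, in percent."""
    return 100 * sum(usage_per_pod) / sum(request_per_pod)


def average_value(usage_per_pod):
    """AverageValue: the raw metric averaged across pods, no request involved."""
    return sum(usage_per_pod) / len(usage_per_pod)


usage = [100, 300]      # millicores actually used by two pods
requests = [500, 500]   # millicores each pod requested

print(utilization_percent(usage, requests))  # 40.0
print(average_value(usage))                  # 200.0
```

With averageUtilization: 50 the first number is what gets compared against 50, while an averageValue target would be compared against the second.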

The next two types, Pods and Object, are both custom metrics.

type: Pods

Metrics that describe the pods themselves.

Only target.type AverageValue is supported.

type: Object

Metrics that describe a different object in the same namespace.

HPA scaling behavior

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max

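Choosing a scale policy

You can define several policies; each one determines how many pods may be added or removed.
Use behavior.scale*.selectPolicy to choose which policy applies: Min picks the policy that allows the smallest change in pod count, Max the one that allows the largest. You can also set it to Disabled to switch scaling in that direction off for the time being. The default is Max.
The number of pods a policy may change is still bounded by maxReplicas/minReplicas.

What a scale policy contains

type is either Percent or Pods
Percent: a percentage of the current number of pods, rounded up
[ScaleUp] Percent = 100 doubles the current count, for example 5 -> 10.
[ScaleUp] Pods with value = 5 adds up to 5 pods, for example 1 -> 6.
[ScaleUp] Percent = 80 adds 80% of the current count, for example 10 -> 10 + 10*0.8 = 18.
[ScaleDown] Percent = 80 removes 80% of the current count, for example 10 -> 10 - 10*0.8 = 2.
[ScaleDown] Percent = 100 would take the current count down to 0, but because minReplicas is at least 1, one pod remains.
periodSeconds: the time window the policy's limit applies to, which in effect acts as a cooldown after the policy fires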
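The examples above can be reproduced with a short Python sketch. This is only an illustration of the arithmetic, not the controller's code: the function names are made up, policies are passed as (type, value) pairs, and it assumes no other scaling has happened within the current period.

```python
def scale_up_limit(current, policies, select_policy="Max"):
    """Highest replica count the scaleUp policies allow within one period."""
    if select_policy == "Disabled":
        return current
    limits = []
    for policy_type, value in policies:
        if policy_type == "Percent":
            # Integer ceiling of current * (1 + value/100).
            limits.append((current * (100 + value) + 99) // 100)
        else:  # "Pods"
            limits.append(current + value)
    return max(limits) if select_policy == "Max" else min(limits)


def scale_down_limit(current, policies, select_policy="Max", min_replicas=1):
    """Lowest replica count the scaleDown policies allow within one period."""
    if select_policy == "Disabled":
        return current
    limits = []
    for policy_type, value in policies:
        if policy_type == "Percent":
            limits.append(current * (100 - value) // 100)
        else:  # "Pods"
            limits.append(current - value)
    # "Max" means the largest change, which is the lowest resulting count.
    lowest = min(limits) if select_policy == "Max" else max(limits)
    return max(lowest, min_replicas)


print(scale_up_limit(5, [("Percent", 100)]))     # 10
print(scale_up_limit(10, [("Percent", 80)]))     # 18
print(scale_down_limit(10, [("Percent", 80)]))   # 2
print(scale_down_limit(10, [("Percent", 100)]))  # 1
```

With both a Percent = 100 and a Pods = 4 policy and selectPolicy Max, a single pod may grow to 5 in one period, because the Pods policy is the more generous one at small counts.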

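Policy stability

If resource utilization keeps swinging up and down, the policies may cause replicas to be started and stopped over and over. That hurts performance badly, because starting a pod eats a fair amount of resources. You can set behavior.scale*.stabilizationWindowSeconds to prevent this flapping.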

scaleDown:
  stabilizationWindowSeconds: 300

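In the example above, the controller looks at the desired replica counts computed over the past 5 minutes and uses the highest one, instead of acting on the current value right away.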
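A minimal Python sketch of that scale-down stabilization idea follows. The function name and the (timestamp, desired replicas) history format are assumptions made for illustration, and the history is expected to include the most recent recommendation.

```python
def stabilized_scale_down(recommendations, now, window_seconds=300):
    """Pick the highest desired replica count seen within the window.

    recommendations is a list of (timestamp_in_seconds, desired_replicas).
    """
    recent = [
        desired
        for timestamp, desired in recommendations
        if now - timestamp <= window_seconds
    ]
    return max(recent)


history = [(0, 8), (100, 5), (290, 3)]

# At t=300 the recommendation of 8 is still inside the 5 minute window.
print(stabilized_scale_down(history, now=300))  # 8
# At t=350 it has aged out, so the count may drop to 5, not straight to 3.
print(stabilized_scale_down(history, now=350))  # 5
```

A brief dip in load therefore cannot shrink the workload until the higher recommendations have aged out of the window.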

Limitations of HPA

Some objects cannot be used with it, such as a DaemonSet.

Or, if all of the above feels like too much trouble, you can try the autoscale command:

$ kubectl autoscale (-f FILENAME | TYPE NAME | TYPE/NAME) [--min=MINPODS] --max=MAXPODS [--cpu-percent=CPU]

The Kubernetes cluster then automatically manages the pod count of a Deployment, ReplicaSet, StatefulSet, or ReplicationController for you.

Example#1

# Autoscale deployment "foo" between 2 and 10 pods, using the default autoscaling policy
kubectl autoscale deployment foo --min=2 --max=10

Example#2

# Autoscale replication controller "foo" between 1 and 5 pods, targeting 80% CPU utilization
kubectl autoscale rc foo --max=5 --cpu-percent=80

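Summary

It is pretty magical. Working out an HPA strategy that matches the characteristics of your service turns out to be quite fun.

Take the Dcard case the instructor mentioned: the card draw at midnight. The point is to handle that spike properly, not by asking an engineer to add capacity by hand at midnight every night, but by letting machines manage machines while the humans take time off (ﾉ>ω<)ﾉ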
