Istio Canary Deployments
Prerequisites
Flagger requires a Kubernetes cluster v1.16 or newer and Istio v1.5 or newer. Install Istio with telemetry support and Prometheus:
istioctl manifest install --set profile=default
# Replace release-1.18 in the command below with the release branch that matches your Istio version.
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.18/samples/addons/prometheus.yaml
Install Flagger in the istio-system namespace:
kubectl apply -k github.com/fluxcd/flagger//kustomize/istio
Create an ingress gateway to expose the demo app outside of the mesh:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
Bootstrap
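Flagger takes a Kubernetes deployment and, optionally, a horizontal pod autoscaler (HPA), then creates a series of objects (Kubernetes deployments, ClusterIP services, Istio destination rules and virtual services). These objects expose the application inside the mesh and drive the canary analysis and promotion.
Create a test namespace with Istio sidecar injection enabled: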
kubectl create ns test
kubectl label namespace test istio-injection=enabled
Create a deployment and a horizontal pod autoscaler:
kubectl apply -k https://github.com/fluxcd/flagger//kustomize/podinfo?ref=main
Deploy the load testing service to generate traffic during the canary analysis:
kubectl apply -k https://github.com/fluxcd/flagger//kustomize/tester?ref=main
Create a Canary custom resource (replace example.com with your own domain):
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    # service port number
    port: 9898
    # container port number or name (optional)
    targetPort: 9898
    # Istio gateways (optional)
    gateways:
    - istio-system/public-gateway
    # Istio virtual service host names (optional)
    hosts:
    - app.example.com
    # Istio traffic policy (optional)
    trafficPolicy:
      tls:
        # use ISTIO_MUTUAL when mTLS is enabled
        mode: DISABLE
    # Istio retry policy (optional)
    retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: "gateway-error,connect-failure,refused-stream"
  analysis:
    # schedule interval (default 60s)
    interval: 1m
    # max number of failed metric checks before rollback
    threshold: 5
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 10
    metrics:
    - name: request-success-rate
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      # maximum req duration P99
      # milliseconds
      thresholdRange:
        max: 500
      interval: 30s
    # testing (optional)
    webhooks:
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.test/
        timeout: 30s
        metadata:
          type: bash
          cmd: "curl -sd 'test' http://podinfo-canary:9898/token | grep token"
      - name: load-test
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"
Note that when using Istio 1.4 you have to replace request-duration with a metric template. Save the above resource as podinfo-canary.yaml and then apply it:
kubectl apply -f ./podinfo-canary.yaml
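To get a feel for how much traffic the load-test webhook in the Canary generates, its hey flags can be turned into a rough upper bound. The following is a minimal sketch, assuming -z is the run duration, -q is hey's per-worker rate limit in queries per second and -c is the number of concurrent workers. The function name max_requests is hypothetical and not part of Flagger or hey.

```python
def max_requests(duration_s: int, qps_per_worker: int, workers: int) -> int:
    """Upper bound on requests sent by `hey -z <duration> -q <qps> -c <workers>`."""
    return duration_s * qps_per_worker * workers


# Values from the load-test webhook: hey -z 1m -q 10 -c 2
print(max_requests(duration_s=60, qps_per_worker=10, workers=2))  # → 1200
```

The real request count can be lower when the canary responds slowly, since each worker waits for a response before sending the next request.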
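When the canary analysis starts, Flagger calls the pre-rollout webhooks before routing traffic to the canary. The canary analysis runs for five minutes, validating the HTTP metrics and rollout webhooks every minute.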
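The five-minute figure follows from the analysis settings in the Canary: one step per interval, from stepWeight up to maxWeight. The sketch below works through that arithmetic, assuming maxWeight is a multiple of stepWeight and that every check passes. The helper names are hypothetical and not part of Flagger.

```python
def canary_weights(step_weight: int, max_weight: int) -> list[int]:
    """Traffic percentages the canary passes through, one per analysis interval."""
    return list(range(step_weight, max_weight + 1, step_weight))


def min_promotion_seconds(interval_s: int, step_weight: int, max_weight: int) -> int:
    """Minimum analysis time when every check passes: interval * (maxWeight / stepWeight)."""
    return interval_s * len(canary_weights(step_weight, max_weight))


# Values from the Canary above: interval 1m, stepWeight 10, maxWeight 50
print(canary_weights(10, 50))                  # → [10, 20, 30, 40, 50]
print(min_promotion_seconds(60, 10, 50) / 60)  # → 5.0
```

The sample event log in the next section advances in steps of 5, which would correspond to a stepWeight of 5 rather than the 10 configured here.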
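After a few seconds Flagger creates the canary objects (the primary deployment, ClusterIP services, Istio destination rules and virtual service described under Bootstrap).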
Automated canary promotion
Trigger a canary deployment by updating the container image:
kubectl -n test set image deployment/podinfo \
podinfod=ghcr.io/stefanprodan/podinfo:6.0.1
Flagger detects that the deployment revision changed and starts a new rollout:
kubectl -n test describe canary/podinfo
Status:
  Canary Weight:  0
  Failed Checks:  0
  Phase:          Succeeded
Events:
  Type     Reason  Age   From     Message
  ----     ------  ----  ----     -------
  Normal   Synced  3m    flagger  New revision detected podinfo.test
  Normal   Synced  3m    flagger  Scaling up podinfo.test
  Warning  Synced  3m    flagger  Waiting for podinfo.test rollout to finish: 0 of 1 updated replicas are available
  Normal   Synced  3m    flagger  Advance podinfo.test canary weight 5
  Normal   Synced  3m    flagger  Advance podinfo.test canary weight 10
  Normal   Synced  3m    flagger  Advance podinfo.test canary weight 15
  Normal   Synced  2m    flagger  Advance podinfo.test canary weight 20
  Normal   Synced  2m    flagger  Advance podinfo.test canary weight 25
  Normal   Synced  1m    flagger  Advance podinfo.test canary weight 30
  Normal   Synced  1m    flagger  Advance podinfo.test canary weight 35
  Normal   Synced  55s   flagger  Advance podinfo.test canary weight 40
  Normal   Synced  45s   flagger  Advance podinfo.test canary weight 45
  Normal   Synced  35s   flagger  Advance podinfo.test canary weight 50
  Normal   Synced  25s   flagger  Copying podinfo.test template spec to podinfo-primary.test
  Warning  Synced  15s   flagger  Waiting for podinfo-primary.test rollout to finish: 1 of 2 updated replicas are available
  Normal   Synced  5s    flagger  Promotion completed! Scaling down podinfo.test
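Note that if you apply new changes to the deployment during the canary analysis, Flagger restarts the analysis. A canary deployment is triggered by changes in any of the following objects:
- Deployment PodSpec (container image, command, ports, environment variables, resources, etc.)
- ConfigMaps mounted as volumes or mapped to environment variables
- Secrets mounted as volumes or mapped to environment variables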
Automated rollback
During the canary analysis you can generate HTTP 500 errors and high latency to test whether Flagger pauses the rollout.
Trigger another canary deployment:
kubectl -n test set image deployment/podinfo \
podinfod=ghcr.io/stefanprodan/podinfo:6.0.2
Exec into the load tester pod with:
kubectl -n test exec -it flagger-loadtester-xx-xx -- sh
Generate HTTP 500 errors:
watch curl http://podinfo-canary:9898/status/500
Generate latency:
watch curl http://podinfo-canary:9898/delay/1
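When the number of failed checks reaches the canary analysis threshold, the traffic is routed back to the primary, the canary is scaled to zero and the rollout is marked as failed.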
kubectl -n test describe canary/podinfo
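The rollback condition can be illustrated with a small simulation. This is a simplified sketch, not Flagger's implementation: it assumes both metrics are sampled once per analysis interval, that an interval counts as one failed check when either metric falls outside its thresholdRange, and that the sample values are made up. With interval 1m and threshold 5, sustained failures lead to a rollback after about five minutes (interval * threshold).

```python
from typing import Optional


def within_range(value: float, min_: Optional[float] = None,
                 max_: Optional[float] = None) -> bool:
    """True when value satisfies a thresholdRange (either bound may be absent)."""
    if min_ is not None and value < min_:
        return False
    if max_ is not None and value > max_:
        return False
    return True


def rollback_after(samples, threshold: int) -> Optional[int]:
    """Return the 1-based interval at which failed checks reach threshold, or None.

    Each sample is (success_rate_percent, p99_latency_ms) for one interval.
    """
    failed = 0
    for i, (success_rate, latency_ms) in enumerate(samples, start=1):
        # Thresholds from the Canary above: success rate >= 99 %, P99 <= 500 ms
        ok = within_range(success_rate, min_=99) and within_range(latency_ms, max_=500)
        if not ok:
            failed += 1
        if failed >= threshold:
            return i
    return None


# Hypothetical samples: every interval sees 5xx errors and ~1 s latency
bad = [(80.0, 1000.0)] * 10
print(rollback_after(bad, threshold=5))  # → 5
```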