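Argo Rollouts offers deployment and release strategies that are more advanced than the rolling update built into Kubernetes. One of them is "Progressive Delivery", in which an analysis runs after a deployment and its result is used to evaluate the release.
Argo Rollouts includes analysis-related CRDs such as AnalysisTemplate and AnalysisRun, and it can perform an automatic rollback based on their results.
In this post I try out the automatic rollback that Argo Rollouts provides. For an overview of Argo Rollouts, see the previous post.
Test environment
The environment used this time is as follows.
Deploying Argo Rollouts
First, deploy Argo Rollouts to the Kubernetes cluster so that it can be used.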
$ git clone https://github.com/argoproj/argo-rollouts.git
$ cd argo-rollouts/manifests/
$ kubectl create ns argo-rollouts
namespace/argo-rollouts created
$ kubectl get ns
NAME STATUS AGE
argo-rollouts Active 5s
default Active 63m
kube-node-lease Active 63m
kube-public Active 63m
kube-system Active 63m
$ kubectl apply -n argo-rollouts -f install.yaml
customresourcedefinition.apiextensions.k8s.io/analysisruns.argoproj.io created
customresourcedefinition.apiextensions.k8s.io/analysistemplates.argoproj.io created
customresourcedefinition.apiextensions.k8s.io/clusteranalysistemplates.argoproj.io created
customresourcedefinition.apiextensions.k8s.io/experiments.argoproj.io created
customresourcedefinition.apiextensions.k8s.io/rollouts.argoproj.io created
serviceaccount/argo-rollouts created
role.rbac.authorization.k8s.io/argo-rollouts-role created
clusterrole.rbac.authorization.k8s.io/argo-rollouts-aggregate-to-admin created
clusterrole.rbac.authorization.k8s.io/argo-rollouts-aggregate-to-edit created
clusterrole.rbac.authorization.k8s.io/argo-rollouts-aggregate-to-view created
clusterrole.rbac.authorization.k8s.io/argo-rollouts-clusterrole created
rolebinding.rbac.authorization.k8s.io/argo-rollouts-role-binding created
clusterrolebinding.rbac.authorization.k8s.io/argo-rollouts-clusterrolebinding created
service/argo-rollouts-metrics created
deployment.apps/argo-rollouts created
$ kubectl get all -n argo-rollouts
NAME READY STATUS RESTARTS AGE
pod/argo-rollouts-8454b64759-rhf47 1/1 Running 0 9s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/argo-rollouts-metrics ClusterIP 10.100.217.123 <none> 8090/TCP 10s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/argo-rollouts 1/1 1 1 9s
NAME DESIRED CURRENT READY AGE
replicaset.apps/argo-rollouts-8454b64759 1 1 1 9s
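The output above shows the controller Pod already Running, but a script cannot rely on timing. A minimal sketch for blocking until the controller is ready, assuming the namespace and Deployment name created by install.yaml above; the helper name wait_for_rollouts_controller and the 120s default are my own choices, not part of Argo Rollouts:

```shell
# Block until the Argo Rollouts controller Deployment has finished rolling out.
# Namespace and Deployment name come from the install.yaml output above;
# the 120s default timeout is an arbitrary value for this sketch.
wait_for_rollouts_controller() {
  wait_timeout="${1:-120s}"
  kubectl -n argo-rollouts rollout status deployment/argo-rollouts \
    --timeout="$wait_timeout"
}

# Usage:
#   wait_for_rollouts_controller 60s
```

The function returns a non-zero status if the Deployment is not ready within the timeout, so it can be used as a gate before applying any Rollout resources.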
Without an AnalysisTemplate
From here on we actually use Argo Rollouts. This post looks at how it behaves with a Blue/Green deployment. We try it both without and with an AnalysisTemplate to see what using an AnalysisTemplate changes.
First, let's look at the case without an AnalysisTemplate.
Deploying the Rollout
The following Rollout and Service manifests were used. To use a Blue/Green deployment with Argo Rollouts, you must specify a Service as activeService. If the specified Service does not exist, no Pods are created even after the Rollout itself has been created.
rollout-bg-test.yml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-bg-test
spec:
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollout-bg
  template:
    metadata:
      labels:
        app: rollout-bg
    spec:
      containers:
      - name: nginx-container
        image: nginx:latest
        ports:
        - containerPort: 80
  strategy:
    blueGreen:
      autoPromotionEnabled: true
      activeService: rollout-active-service
rollout-service.yml
apiVersion: v1
kind: Service
metadata:
  name: rollout-active-service
spec:
  ports:
  - port: 8080
    targetPort: 80
    protocol: TCP
  selector:
    app: rollout-bg
Deploy the two resources above.
$ kubectl apply -f rollout-service.yml
service/rollout-active-service created
$ kubectl apply -f rollout-bg-test.yml
rollout.argoproj.io/rollout-bg-test created
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 21h
rollout-active-service ClusterIP 10.100.121.64 <none> 8080/TCP 33s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
rollout-bg-test-797d88cdd8-4ww9s 1/1 Running 0 14s
rollout-bg-test-797d88cdd8-stc74 1/1 Running 0 14s
$ kubectl get rollout
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE
rollout-bg-test 2 2 2 2
$ kubectl argo rollouts get rollout rollout-bg-test
Name: rollout-bg-test
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: nginx:latest (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test Rollout ✔ Healthy 62s
└──# revision:1
└──⧉ rollout-bg-test-797d88cdd8 ReplicaSet ✔ Healthy 62s active
├──□ rollout-bg-test-797d88cdd8-4ww9s Pod ✔ Running 62s ready:1/1
└──□ rollout-bg-test-797d88cdd8-stc74 Pod ✔ Running 62s ready:1/1
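As noted earlier, a Rollout whose activeService does not exist creates no Pods, which is easy to miss when the two manifests are applied in the wrong order. The following is a small sketch of a guard for that case. The helper name apply_rollout_if_service_exists is hypothetical; it only wraps the kubectl get and kubectl apply commands used in this post.

```shell
# Apply a Rollout manifest only when its activeService already exists.
# $1: name of the Service referenced by activeService
# $2: path to the Rollout manifest
apply_rollout_if_service_exists() {
  service="$1"
  manifest="$2"
  if ! kubectl get service "$service" >/dev/null 2>&1; then
    echo "Service '$service' not found; create it before the Rollout" >&2
    return 1
  fi
  kubectl apply -f "$manifest"
}

# Usage:
#   apply_rollout_if_service_exists rollout-active-service rollout-bg-test.yml
```

The check looks in the current namespace only, which matches the default namespace used throughout this post.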
Updating the Rollout
Next, change the image tag of the deployed Rollout and watch how it is updated.
$ kubectl argo rollouts get rollout rollout-bg-test -w
$ kubectl argo rollouts set image rollout-bg-test nginx-container=nginx:stable
rollout "rollout-bg-test" image updated
Name: rollout-bg-test
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: nginx:latest (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test Rollout ✔ Healthy 2m49s
└──# revision:1
└──⧉ rollout-bg-test-797d88cdd8 ReplicaSet ✔ Healthy 2m49s active
├──□ rollout-bg-test-797d88cdd8-4ww9s Pod ✔ Running 2m49s ready:1/1
└──□ rollout-bg-test-797d88cdd8-stc74 Pod ✔ Running 2m49s ready:1/1
Name: rollout-bg-test
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: nginx:latest (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test Rollout ✔ Healthy 2m50s
├──# revision:2
│ └──⧉ rollout-bg-test-5fd48d44d ReplicaSet ◌ Progressing 0s
└──# revision:1
└──⧉ rollout-bg-test-797d88cdd8 ReplicaSet ✔ Healthy 2m50s active
├──□ rollout-bg-test-797d88cdd8-4ww9s Pod ✔ Running 2m50s ready:1/1
└──□ rollout-bg-test-797d88cdd8-stc74 Pod ✔ Running 2m50s ready:1/1
Name: rollout-bg-test
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: nginx:latest (active)
nginx:stable
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 0
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test Rollout ◌ Progressing 2m50s
├──# revision:2
│ └──⧉ rollout-bg-test-5fd48d44d ReplicaSet ◌ Progressing 0s
│ ├──□ rollout-bg-test-5fd48d44d-89nnw Pod ◌ ContainerCreating 0s ready:0/1
│ └──□ rollout-bg-test-5fd48d44d-s8ntr Pod ◌ ContainerCreating 0s ready:0/1
└──# revision:1
└──⧉ rollout-bg-test-797d88cdd8 ReplicaSet ✔ Healthy 2m50s active
├──□ rollout-bg-test-797d88cdd8-4ww9s Pod ✔ Running 2m50s ready:1/1
└──□ rollout-bg-test-797d88cdd8-stc74 Pod ✔ Running 2m50s ready:1/1
Name: rollout-bg-test
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: nginx:latest
nginx:stable (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 4
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test Rollout ◌ Progressing 2m52s
├──# revision:2
│ └──⧉ rollout-bg-test-5fd48d44d ReplicaSet ✔ Healthy 1s active
│ ├──□ rollout-bg-test-5fd48d44d-89nnw Pod ✔ Running 1s ready:1/1
│ └──□ rollout-bg-test-5fd48d44d-s8ntr Pod ✔ Running 1s ready:1/1
└──# revision:1
└──⧉ rollout-bg-test-797d88cdd8 ReplicaSet ✔ Healthy 2m52s delay:30s
├──□ rollout-bg-test-797d88cdd8-4ww9s Pod ✔ Running 2m52s ready:1/1
└──□ rollout-bg-test-797d88cdd8-stc74 Pod ✔ Running 2m52s ready:1/1
Name: rollout-bg-test
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: nginx:stable (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test Rollout ✔ Healthy 6m7s
├──# revision:2
│ └──⧉ rollout-bg-test-5fd48d44d ReplicaSet ✔ Healthy 3m16s active
│ ├──□ rollout-bg-test-5fd48d44d-89nnw Pod ✔ Running 3m16s ready:1/1
│ └──□ rollout-bg-test-5fd48d44d-s8ntr Pod ✔ Running 3m16s ready:1/1
└──# revision:1
└──⧉ rollout-bg-test-797d88cdd8 ReplicaSet • ScaledDown 6m7s
After the update completes, the resources look like this.
$ kubectl get rollout
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE
rollout-bg-test 2 2 2 2
$ kubectl describe rollout
Name: rollout-bg-test
Namespace: default
Labels: <none>
Annotations: rollout.argoproj.io/revision: 2
API Version: argoproj.io/v1alpha1
Kind: Rollout
Metadata:
Creation Timestamp: 2020-10-10T01:57:48Z
Generation: 14
Resource Version: 237409
Self Link: /apis/argoproj.io/v1alpha1/namespaces/default/rollouts/rollout-bg-test
UID: d5ba96ef-2292-4f3c-91f1-cd0427c50f59
Spec:
Replicas: 2
Revision History Limit: 2
Selector:
Match Labels:
App: rollout-bg
Strategy:
Blue Green:
Active Service: rollout-active-service
Auto Promotion Enabled: true
Template:
Metadata:
Creation Timestamp: <nil>
Labels:
App: rollout-bg
Spec:
Containers:
Image: nginx:stable
Name: nginx-container
Ports:
Container Port: 80
Resources:
Status:
HPA Replicas: 2
Available Replicas: 2
Blue Green:
Active Selector: 5fd48d44d
Canary:
Conditions:
Last Transition Time: 2020-10-10T01:57:52Z
Last Update Time: 2020-10-10T01:57:52Z
Message: Rollout has minimum availability
Reason: AvailableReason
Status: True
Type: Available
Last Transition Time: 2020-10-10T01:57:48Z
Last Update Time: 2020-10-10T02:00:41Z
Message: ReplicaSet "rollout-bg-test-5fd48d44d" has successfully progressed.
Reason: NewReplicaSetAvailable
Status: True
Type: Progressing
Current Pod Hash: 5fd48d44d
Observed Generation: 576f58fbb8
Ready Replicas: 2
Replicas: 2
Selector: app=rollout-bg,rollouts-pod-template-hash=5fd48d44d
Stable RS: 5fd48d44d
Updated Replicas: 2
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 7m52s rollouts-controller Scaled up replica set rollout-bg-test-797d88cdd8 to 2
Normal SwitchService 7m48s rollouts-controller Switched selector for service 'rollout-active-service' to value '797d88cdd8'
Normal ScalingReplicaSet 5m1s rollouts-controller Scaled up replica set rollout-bg-test-5fd48d44d to 2
Normal SwitchService 4m59s rollouts-controller Switched selector for service 'rollout-active-service' to value '5fd48d44d'
Normal ScalingReplicaSet 4m29s rollouts-controller Scaled down replica set rollout-bg-test-797d88cdd8 to 0
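The update above was followed interactively with get rollout -w. For scripts or CI, a sketch like the following could block until the update finishes instead. It assumes a version of the kubectl argo rollouts plugin that ships the status subcommand with its --timeout flag; the helper name update_image_and_wait and the 300s timeout are my own choices.

```shell
# Change the image of the nginx-container container and wait for the
# Rollout to finish. The container name matches the manifests in this post.
# $1: Rollout name, $2: new image (for example nginx:stable)
update_image_and_wait() {
  rollout="$1"
  image="$2"
  kubectl argo rollouts set image "$rollout" "nginx-container=$image" &&
    kubectl argo rollouts status --timeout 300s "$rollout"
}

# Usage:
#   update_image_and_wait rollout-bg-test nginx:stable
```

Because the two commands are chained with &&, a failed image update is reported without waiting on the status command.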
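With an AnalysisTemplate
From here, in addition to the Rollout and Service, we create an AnalysisTemplate resource so that an analysis runs when the rollout is performed.
To see both the success and the failure case, we first prepare an AnalysisTemplate that returns exit 0 (so the analysis always succeeds). After deploying it, we edit the AnalysisTemplate with kubectl edit so that it returns exit 1 (so the analysis always fails) and look at the failure case.
When an AnalysisTemplate is referenced in a Rollout resource, it is run whenever the Rollout is updated. More precisely, an AnalysisRun resource is created for each run, and it performs the analysis according to what is defined in the AnalysisTemplate.
An AnalysisTemplate can use various kinds of metrics for its analysis. This time I used a Job, one of the standard Kubernetes resources. When the AnalysisRun executes, a Job is created to perform the analysis, and the analysis succeeds if the Job completes successfully.
For a Blue/Green deployment, Argo Rollouts lets you choose when the analysis runs: prePromotionAnalysis runs it before the traffic switch and postPromotionAnalysis runs it after the switch. This time we use postPromotionAnalysis, so the analysis runs after the switch, and if it fails, traffic is switched back and the Rollout rolls back to the old version.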
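This post only exercises postPromotionAnalysis. For comparison, here is a sketch of how the same Rollout might look with prePromotionAnalysis, where the analysis is expected to gate the traffic switch rather than follow it. I did not run this variant in this post's environment, so treat it as an untested assumption. The helper name write_prepromotion_rollout and the output file name are hypothetical, and only the strategy block differs from the manifest used in this post.

```shell
# Write a variant of the Rollout manifest whose analysis runs BEFORE the
# active Service is switched to the new ReplicaSet.
# $1: path of the manifest file to write
write_prepromotion_rollout() {
  cat > "$1" <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-bg-test-analysis
spec:
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollout-bg-analysis
  template:
    metadata:
      labels:
        app: rollout-bg-analysis
    spec:
      containers:
      - name: nginx-container
        image: nginx:latest
        ports:
        - containerPort: 80
  strategy:
    blueGreen:
      activeService: rollout-active-service-analysis
      # The analysis must pass before the traffic switch happens.
      prePromotionAnalysis:
        templates:
        - templateName: test-analysis
EOF
}

# Usage (the file name is only an example):
#   write_prepromotion_rollout rollout-bg-test-pre.yml
#   kubectl apply -f rollout-bg-test-pre.yml
```

With this variant a failing analysis should leave the old version active the whole time, whereas postPromotionAnalysis briefly exposes the new version before switching back.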
Deploying the Rollout
Here we use the following three files.
rollout-bg-test-analysis.yml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-bg-test-analysis
spec:
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollout-bg-analysis
  template:
    metadata:
      labels:
        app: rollout-bg-analysis
    spec:
      containers:
      - name: nginx-container
        image: nginx:latest
        ports:
        - containerPort: 80
  strategy:
    blueGreen:
      activeService: rollout-active-service-analysis
      postPromotionAnalysis:
        templates:
        - templateName: test-analysis
rollout-active-service-analysis.yml
apiVersion: v1
kind: Service
metadata:
  name: rollout-active-service-analysis
spec:
  ports:
  - port: 8080
    targetPort: 80
    protocol: TCP
  selector:
    app: rollout-bg-analysis
test-analysistemp.yml
kind: AnalysisTemplate
apiVersion: argoproj.io/v1alpha1
metadata:
  name: test-analysis
spec:
  metrics:
  - name: test-analysis
    provider:
      job:
        spec:
          template:
            spec:
              containers:
              - name: sleep
                image: alpine:3.8
                command: [sh, -c]
                args: [exit 0]
              restartPolicy: Never
          backoffLimit: 1
Deploy the three files above.
$ kubectl apply -f test-analysistemp.yml
analysistemplate.argoproj.io/test-analysis created
$ kubectl apply -f rollout-active-service-analysis.yml
service/rollout-active-service-analysis created
$ kubectl apply -f rollout-bg-test-analysis.yml
rollout.argoproj.io/rollout-bg-test-analysis created
$ kubectl get analysistemplate
NAME AGE
test-analysis 23s
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 21h
rollout-active-service-analysis ClusterIP 10.100.255.31 <none> 8080/TCP 20s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
rollout-bg-test-analysis-766b7567dc-qgpzx 1/1 Running 0 15s
rollout-bg-test-analysis-766b7567dc-tzlft 1/1 Running 0 15s
$ kubectl get rollout
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE
rollout-bg-test-analysis 2 2 2 2
$ kubectl argo rollouts get rollout rollout-bg-test-analysis
Name: rollout-bg-test-analysis
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: nginx:latest (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ✔ Healthy 26s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet ✔ Healthy 26s active
├──□ rollout-bg-test-analysis-766b7567dc-qgpzx Pod ✔ Running 26s ready:1/1
└──□ rollout-bg-test-analysis-766b7567dc-tzlft Pod ✔ Running 26s ready:1/1
Updating the Rollout (when the analysis succeeds)
First, let's look at the case where the analysis succeeds. As before, update the image tag.
$ kubectl argo rollouts get rollout rollout-bg-test-analysis --watch
$ kubectl argo rollouts set image rollout-bg-test-analysis nginx-container=nginx:stable
rollout "rollout-bg-test-analysis" image updated
Name: rollout-bg-test-analysis
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: nginx:latest (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ✔ Healthy 2m9s
├──# revision:2
│ └──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ◌ Progressing 0s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet ✔ Healthy 2m9s active
├──□ rollout-bg-test-analysis-766b7567dc-qgpzx Pod ✔ Running 2m9s ready:1/1
└──□ rollout-bg-test-analysis-766b7567dc-tzlft Pod ✔ Running 2m9s ready:1/1
Name: rollout-bg-test-analysis
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: nginx:latest (active)
nginx:stable
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 3
Available: 1
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ◌ Progressing 2m10s
├──# revision:2
│ └──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ◌ Progressing 0s
│ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 0s ready:1/1
│ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 0s ready:1/1
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet ✔ Healthy 2m10s active
├──□ rollout-bg-test-analysis-766b7567dc-qgpzx Pod ✔ Running 2m10s ready:1/1
└──□ rollout-bg-test-analysis-766b7567dc-tzlft Pod ✔ Running 2m10s ready:1/1
Name: rollout-bg-test-analysis
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: nginx:latest
nginx:stable (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 4
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ◌ Progressing 2m10s
├──# revision:2
│ ├──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ✔ Healthy 0s active
│ │ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 0s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 0s ready:1/1
│ └──α rollout-bg-test-analysis-6bcfbc585f-2 AnalysisRun ◌ Running 0s
│ └──⊞ 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1 Job ◌ Running 0s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet ✔ Healthy 2m10s delay:30s
├──□ rollout-bg-test-analysis-766b7567dc-qgpzx Pod ✔ Running 2m10s ready:1/1
└──□ rollout-bg-test-analysis-766b7567dc-tzlft Pod ✔ Running 2m10s ready:1/1
Name: rollout-bg-test-analysis
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: nginx:latest
nginx:stable (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 4
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ◌ Progressing 2m11s
├──# revision:2
│ ├──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ✔ Healthy 1s active
│ │ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 1s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 1s ready:1/1
│ └──α rollout-bg-test-analysis-6bcfbc585f-2 AnalysisRun ✔ Successful 0s ✔ 1
│ └──⊞ 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1 Job ✔ Successful 0s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet ✔ Healthy 2m11s delay:29s
├──□ rollout-bg-test-analysis-766b7567dc-qgpzx Pod ✔ Running 2m11s ready:1/1
└──□ rollout-bg-test-analysis-766b7567dc-tzlft Pod ✔ Running 2m11s ready:1/1
Name: rollout-bg-test-analysis
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: nginx:stable (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ✔ Healthy 2m45s
├──# revision:2
│ ├──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ✔ Healthy 35s active
│ │ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 35s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 35s ready:1/1
│ └──α rollout-bg-test-analysis-6bcfbc585f-2 AnalysisRun ✔ Successful 34s ✔ 1
│ └──⊞ 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1 Job ✔ Successful 34s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet • ScaledDown 2m45s
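After the new version was deployed and traffic had been switched to it, the analysis ran. Because the analysis succeeded, the new version stayed active and the old version's ReplicaSet was scaled down.
The resources after completion are shown below. The rollout creates and runs an AnalysisRun resource, and its result can be inspected.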
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1-8hz2s 0/1 Completed 0 4m26s
rollout-bg-test-analysis-6bcfbc585f-cq2q5 1/1 Running 0 4m27s
rollout-bg-test-analysis-6bcfbc585f-dtxdc 1/1 Running 0 4m27s
$ kubectl get analysisrun
NAME STATUS
rollout-bg-test-analysis-6bcfbc585f-2 Successful
$ kubectl get rollout
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE
rollout-bg-test-analysis 2 2 2 2
$ kubectl describe analysisrun rollout-bg-test-analysis-6bcfbc585f-2
Name: rollout-bg-test-analysis-6bcfbc585f-2
Namespace: default
Labels: rollout-type=PostPromotion
rollouts-pod-template-hash=6bcfbc585f
Annotations: rollout.argoproj.io/revision: 2
API Version: argoproj.io/v1alpha1
Kind: AnalysisRun
Metadata:
Creation Timestamp: 2020-10-10T02:27:13Z
Generation: 3
Owner References:
API Version: argoproj.io/v1alpha1
Block Owner Deletion: true
Controller: true
Kind: Rollout
Name: rollout-bg-test-analysis
UID: e9ba7057-29db-491a-8618-13180c406fef
Resource Version: 242399
Self Link: /apis/argoproj.io/v1alpha1/namespaces/default/analysisruns/rollout-bg-test-analysis-6bcfbc585f-2
UID: 471f5e5b-553b-4f94-bae3-cefca88afcd6
Spec:
Metrics:
Name: test-analysis
Provider:
Job:
Metadata:
Creation Timestamp: <nil>
Spec:
Backoff Limit: 1
Template:
Metadata:
Creation Timestamp: <nil>
Spec:
Containers:
Args:
exit 0
Command:
sh
-c
Image: alpine:3.8
Name: sleep
Resources:
Restart Policy: Never
Status:
Metric Results:
Count: 1
Measurements:
Finished At: 2020-10-10T02:27:14Z
Metadata:
Job - Name: 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1
Phase: Successful
Started At: 2020-10-10T02:27:13Z
Name: test-analysis
Phase: Successful
Successful: 1
Phase: Successful
Started At: 2020-10-10T02:27:13Z
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Complete 20m rollouts-controller metric 'test-analysis' completed Successful
Normal Complete 20m rollouts-controller analysis completed Successful
$ kubectl describe rollout rollout-bg-test-analysis
Name: rollout-bg-test-analysis
Namespace: default
Labels: <none>
Annotations: rollout.argoproj.io/revision: 2
API Version: argoproj.io/v1alpha1
Kind: Rollout
Metadata:
Creation Timestamp: 2020-10-10T02:25:02Z
Generation: 16
Resource Version: 242503
Self Link: /apis/argoproj.io/v1alpha1/namespaces/default/rollouts/rollout-bg-test-analysis
UID: e9ba7057-29db-491a-8618-13180c406fef
Spec:
Replicas: 2
Revision History Limit: 2
Selector:
Match Labels:
App: rollout-bg-analysis
Strategy:
Blue Green:
Active Service: rollout-active-service-analysis
Post Promotion Analysis:
Templates:
Template Name: test-analysis
Template:
Metadata:
Creation Timestamp: <nil>
Labels:
App: rollout-bg-analysis
Spec:
Containers:
Image: nginx:stable
Name: nginx-container
Ports:
Container Port: 80
Resources:
Status:
HPA Replicas: 2
Available Replicas: 2
Blue Green:
Active Selector: 6bcfbc585f
Canary:
Conditions:
Last Transition Time: 2020-10-10T02:25:07Z
Last Update Time: 2020-10-10T02:25:07Z
Message: Rollout has minimum availability
Reason: AvailableReason
Status: True
Type: Available
Last Transition Time: 2020-10-10T02:25:02Z
Last Update Time: 2020-10-10T02:27:13Z
Message: ReplicaSet "rollout-bg-test-analysis-6bcfbc585f" has successfully progressed.
Reason: NewReplicaSetAvailable
Status: True
Type: Progressing
Current Pod Hash: 6bcfbc585f
Observed Generation: 786976f646
Ready Replicas: 2
Replicas: 2
Selector: app=rollout-bg-analysis,rollouts-pod-template-hash=6bcfbc585f
Stable RS: 6bcfbc585f
Updated Replicas: 2
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 7m6s rollouts-controller Scaled up replica set rollout-bg-test-analysis-766b7567dc to 2
Normal SwitchService 7m1s rollouts-controller Switched selector for service 'rollout-active-service-analysis' to value '766b7567dc'
Normal ScalingReplicaSet 4m56s rollouts-controller Scaled up replica set rollout-bg-test-analysis-6bcfbc585f to 2
Normal SwitchService 4m55s rollouts-controller Switched selector for service 'rollout-active-service-analysis' to value '6bcfbc585f'
Normal AnalysisRunStatusChange 4m55s rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6bcfbc585f-2' Status New: '' Previous: 'NoPreviousStatus'
Normal AnalysisRunStatusChange 4m55s rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6bcfbc585f-2' Status New: 'Running' Previous: ''
Normal AnalysisRunStatusChange 4m54s rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6bcfbc585f-2' Status New: 'Successful' Previous: 'Running'
Normal ScalingReplicaSet 4m25s rollouts-controller Scaled down replica set rollout-bg-test-analysis-766b7567dc to 0
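The describe output above shows the result under Status / Phase. A script can read the same field directly. This is a minimal sketch assuming the phase is exposed at .status.phase, as that output suggests; the helper names analysisrun_phase and require_successful_analysis are hypothetical.

```shell
# Print the phase of an AnalysisRun (for example Successful or Failed).
# $1: AnalysisRun name
analysisrun_phase() {
  kubectl get analysisrun "$1" -o jsonpath='{.status.phase}'
}

# Return non-zero unless the AnalysisRun finished in the Successful phase.
# $1: AnalysisRun name
require_successful_analysis() {
  phase=$(analysisrun_phase "$1")
  if [ "$phase" != "Successful" ]; then
    echo "AnalysisRun $1 is in phase '$phase'" >&2
    return 1
  fi
}

# Usage:
#   require_successful_analysis rollout-bg-test-analysis-6bcfbc585f-2
```

Note that a Running phase is also treated as a failure here, so the check is only meaningful once the run has finished.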
Updating the Rollout (when the analysis fails)
Next, let's look at the case where the analysis fails.
First, edit the deployed AnalysisTemplate so that the analysis fails when it runs.
$ kubectl edit analysistemplate test-analysis
analysistemplate.argoproj.io/test-analysis edited
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"argoproj.io/v1alpha1","kind":"AnalysisTemplate","metadata":{"annotations":{},"name":"test-analysis","namespace":"default"},"spec":{"metrics":[{"name":"test-analysis","provider":{"job":{"spec":{"backoffLimit":1,"template":{"spec":{"containers":[{"args":["exit 0"],"command":["sh","-c"],"image":"alpine:3.8","name":"sleep"}],"restartPolicy":"Never"}}}}}}]}}
  creationTimestamp: "2020-10-10T02:24:45Z"
  generation: 1
  name: test-analysis
  namespace: default
  resourceVersion: "241845"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/default/analysistemplates/test-analysis
  uid: 2fc98c6c-f40e-4903-a134-a6430aaa0b0d
spec:
  metrics:
  - name: test-analysis
    provider:
      job:
        spec:
          backoffLimit: 1
          template:
            spec:
              containers:
              - args:
                - exit 1
                command:
                - sh
                - -c
                image: alpine:3.8
                name: sleep
              restartPolicy: Never
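kubectl edit is interactive, which makes this step hard to repeat. The same change could probably be made non-interactively with a JSON patch. The sketch below assumes the metric and container are the first entries in their lists, as in the manifest above; the helper name set_analysis_exit_code is hypothetical.

```shell
# Replace the first argument of the analysis container with "exit <code>".
# $1: exit code the analysis Job should return (0 = succeed, 1 = fail)
set_analysis_exit_code() {
  kubectl patch analysistemplate test-analysis --type json \
    -p "[{\"op\":\"replace\",\"path\":\"/spec/metrics/0/provider/job/spec/template/spec/containers/0/args/0\",\"value\":\"exit $1\"}]"
}

# Usage:
#   set_analysis_exit_code 1   # make the analysis fail
#   set_analysis_exit_code 0   # make it succeed again
```

A JSON patch is used rather than a merge patch because the value to change sits inside list elements, which a merge patch would replace wholesale.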
Once the edit is done, update the image tag as before and watch the update.
$ kubectl argo rollouts get rollout rollout-bg-test-analysis --watch
$ kubectl argo rollouts set image rollout-bg-test-analysis nginx-container=nginx:1.19
rollout "rollout-bg-test-analysis" image updated
Name: rollout-bg-test-analysis
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: nginx:stable (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ✔ Healthy 10m
├──# revision:2
│ ├──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ✔ Healthy 7m53s active
│ │ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 7m53s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 7m53s ready:1/1
│ └──α rollout-bg-test-analysis-6bcfbc585f-2 AnalysisRun ✔ Successful 7m52s ✔ 1
│ └──⊞ 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1 Job ✔ Successful 7m52s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet • ScaledDown 10m
Name: rollout-bg-test-analysis
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: nginx:stable (active)
Replicas:
Desired: 2
Current: 2
Updated: 0
Ready: 2
Available: 0
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ◌ Progressing 10m
├──# revision:3
│ └──⧉ rollout-bg-test-analysis-6b7c8784cc ReplicaSet ◌ Progressing 0s
│ ├──□ rollout-bg-test-analysis-6b7c8784cc-b29n5 Pod ◌ Pending 0s ready:0/1
│ └──□ rollout-bg-test-analysis-6b7c8784cc-ztnjn Pod ◌ ContainerCreating 0s ready:0/1
├──# revision:2
│ ├──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ✔ Healthy 7m56s active
│ │ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 7m56s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 7m56s ready:1/1
│ └──α rollout-bg-test-analysis-6bcfbc585f-2 AnalysisRun ✔ Successful 7m55s ✔ 1
│ └──⊞ 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1 Job ✔ Successful 7m55s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet • ScaledDown 10m
Name: rollout-bg-test-analysis
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: nginx:1.19 (active)
nginx:stable
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 4
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ◌ Progressing 10m
├──# revision:3
│ ├──⧉ rollout-bg-test-analysis-6b7c8784cc ReplicaSet ✔ Healthy 0s active
│ │ ├──□ rollout-bg-test-analysis-6b7c8784cc-b29n5 Pod ✔ Running 0s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6b7c8784cc-ztnjn Pod ✔ Running 0s ready:1/1
│ └──α rollout-bg-test-analysis-6b7c8784cc-3 AnalysisRun ◌ Running 0s
│ └──⊞ 0208d4a6-ee21-4ec5-b969-4b75b6784a4b.test-analysis.1 Job ◌ Running 0s
├──# revision:2
│ ├──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ✔ Healthy 7m57s delay:30s
│ │ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 7m57s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 7m57s ready:1/1
│ └──α rollout-bg-test-analysis-6bcfbc585f-2 AnalysisRun ✔ Successful 7m56s ✔ 1
│ └──⊞ 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1 Job ✔ Successful 7m56s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet • ScaledDown 10m
Name: rollout-bg-test-analysis
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: nginx:1.19 (active)
nginx:stable
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 4
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ◌ Progressing 10m
├──# revision:3
│ ├──⧉ rollout-bg-test-analysis-6b7c8784cc ReplicaSet ✔ Healthy 12s active
│ │ ├──□ rollout-bg-test-analysis-6b7c8784cc-b29n5 Pod ✔ Running 12s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6b7c8784cc-ztnjn Pod ✔ Running 12s ready:1/1
│ └──α rollout-bg-test-analysis-6b7c8784cc-3 AnalysisRun ✖ Failed 11s ✖ 1
│ └──⊞ 0208d4a6-ee21-4ec5-b969-4b75b6784a4b.test-analysis.1 Job ✖ Failed 11s
├──# revision:2
│ ├──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ✔ Healthy 8m9s delay:18s
│ │ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 8m9s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 8m9s ready:1/1
│ └──α rollout-bg-test-analysis-6bcfbc585f-2 AnalysisRun ✔ Successful 8m8s ✔ 1
│ └──⊞ 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1 Job ✔ Successful 8m8s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet • ScaledDown 10m
Name: rollout-bg-test-analysis
Namespace: default
Status: ✖ Degraded
Strategy: BlueGreen
Images: nginx:1.19
nginx:stable (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 4
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-analysis Rollout ✖ Degraded 10m
├──# revision:3
│ ├──⧉ rollout-bg-test-analysis-6b7c8784cc ReplicaSet ✔ Healthy 12s
│ │ ├──□ rollout-bg-test-analysis-6b7c8784cc-b29n5 Pod ✔ Running 12s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6b7c8784cc-ztnjn Pod ✔ Running 12s ready:1/1
│ └──α rollout-bg-test-analysis-6b7c8784cc-3 AnalysisRun ✖ Failed 11s ✖ 1
│ └──⊞ 0208d4a6-ee21-4ec5-b969-4b75b6784a4b.test-analysis.1 Job ✖ Failed 11s
├──# revision:2
│ ├──⧉ rollout-bg-test-analysis-6bcfbc585f ReplicaSet ✔ Healthy 8m9s active
│ │ ├──□ rollout-bg-test-analysis-6bcfbc585f-cq2q5 Pod ✔ Running 8m9s ready:1/1
│ │ └──□ rollout-bg-test-analysis-6bcfbc585f-dtxdc Pod ✔ Running 8m9s ready:1/1
│ └──α rollout-bg-test-analysis-6bcfbc585f-2 AnalysisRun ✔ Successful 8m8s ✔ 1
│ └──⊞ 471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1 Job ✔ Successful 8m8s
└──# revision:1
└──⧉ rollout-bg-test-analysis-766b7567dc ReplicaSet • ScaledDown 10m
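As shown above, the new version was deployed and its ReplicaSet became active for a time, but because the AnalysisRun failed, the old version became active again and the Rollout's Status became Degraded.
To return the Rollout's Status from Degraded to Healthy, you need to redeploy so that the original configuration is restored (here, by fixing the AnalysisTemplate).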
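I did not go through the recovery in this post, so the following is only a sketch of two ways it might be done with the commands already used here plus the plugin's retry subcommand. Option 1 assumes the local test-analysistemp.yml still contains exit 0 (only the in-cluster copy was edited). Option 2 points the Rollout back at the image that is still active. Both helper names are hypothetical.

```shell
# Option 1: restore the passing AnalysisTemplate, then retry the aborted update.
retry_after_fixing_template() {
  kubectl apply -f test-analysistemp.yml &&
    kubectl argo rollouts retry rollout rollout-bg-test-analysis
}

# Option 2: give up on the new image and return to the one that is still active.
revert_to_stable_image() {
  kubectl argo rollouts set image rollout-bg-test-analysis \
    nginx-container=nginx:stable
}

# Usage (pick one):
#   retry_after_fixing_template
#   revert_to_stable_image
```

Option 1 ends with the new image active if the analysis now passes, while option 2 keeps the old image, so the choice depends on whether the failure was in the analysis or in the release itself.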
The resources after completion look like this. You can see that the AnalysisRun is Failed and that the Pods of the failed Job remain.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
0208d4a6-ee21-4ec5-b969-4b75b6784a4b.test-analysis.1-fk7hn 0/1 Error 0 7m23s
0208d4a6-ee21-4ec5-b969-4b75b6784a4b.test-analysis.1-mdfzs 0/1 Error 0 7m25s
471f5e5b-553b-4f94-bae3-cefca88afcd6.test-analysis.1-8hz2s 0/1 Completed 0 15m
rollout-bg-test-analysis-6b7c8784cc-b29n5 1/1 Running 0 7m26s
rollout-bg-test-analysis-6b7c8784cc-ztnjn 1/1 Running 0 7m26s
rollout-bg-test-analysis-6bcfbc585f-cq2q5 1/1 Running 0 15m
rollout-bg-test-analysis-6bcfbc585f-dtxdc 1/1 Running 0 15m
$ kubectl get analysisrun
NAME STATUS
rollout-bg-test-analysis-6b7c8784cc-3 Failed
rollout-bg-test-analysis-6bcfbc585f-2 Successful
$ kubectl get rollout
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE
rollout-bg-test-analysis 2 4 2 2
$ kubectl describe analysisrun rollout-bg-test-analysis-6b7c8784cc-3
Name: rollout-bg-test-analysis-6b7c8784cc-3
Namespace: default
Labels: rollout-type=PostPromotion
rollouts-pod-template-hash=6b7c8784cc
Annotations: rollout.argoproj.io/revision: 3
API Version: argoproj.io/v1alpha1
Kind: AnalysisRun
Metadata:
Creation Timestamp: 2020-10-10T02:35:10Z
Generation: 3
Owner References:
API Version: argoproj.io/v1alpha1
Block Owner Deletion: true
Controller: true
Kind: Rollout
Name: rollout-bg-test-analysis
UID: e9ba7057-29db-491a-8618-13180c406fef
Resource Version: 243985
Self Link: /apis/argoproj.io/v1alpha1/namespaces/default/analysisruns/rollout-bg-test-analysis-6b7c8784cc-3
UID: 0208d4a6-ee21-4ec5-b969-4b75b6784a4b
Spec:
Metrics:
Name: test-analysis
Provider:
Job:
Metadata:
Creation Timestamp: <nil>
Spec:
Backoff Limit: 1
Template:
Metadata:
Creation Timestamp: <nil>
Spec:
Containers:
Args:
exit 1
Command:
sh
-c
Image: alpine:3.8
Name: sleep
Resources:
Restart Policy: Never
Status:
Message: metric "test-analysis" assessed Failed due to failed (1) > failureLimit (0)
Metric Results:
Count: 1
Failed: 1
Measurements:
Finished At: 2020-10-10T02:35:22Z
Metadata:
Job - Name: 0208d4a6-ee21-4ec5-b969-4b75b6784a4b.test-analysis.1
Phase: Failed
Started At: 2020-10-10T02:35:10Z
Name: test-analysis
Phase: Failed
Phase: Failed
Started At: 2020-10-10T02:35:10Z
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Failed 8m27s rollouts-controller metric 'test-analysis' completed Failed
Warning Failed 8m27s rollouts-controller analysis completed Failed
$ kubectl describe rollout rollout-bg-test-analysis
Name: rollout-bg-test-analysis
Namespace: default
Labels: <none>
Annotations: rollout.argoproj.io/revision: 3
API Version: argoproj.io/v1alpha1
Kind: Rollout
Metadata:
Creation Timestamp: 2020-10-10T02:25:02Z
Generation: 27
Resource Version: 244302
Self Link: /apis/argoproj.io/v1alpha1/namespaces/default/rollouts/rollout-bg-test-analysis
UID: e9ba7057-29db-491a-8618-13180c406fef
Spec:
Replicas: 2
Revision History Limit: 2
Selector:
Match Labels:
App: rollout-bg-analysis
Strategy:
Blue Green:
Active Service: rollout-active-service-analysis
Post Promotion Analysis:
Templates:
Template Name: test-analysis
Template:
Metadata:
Creation Timestamp: <nil>
Labels:
App: rollout-bg-analysis
Spec:
Containers:
Image: nginx:1.19
Name: nginx-container
Ports:
Container Port: 80
Resources:
Status:
HPA Replicas: 2
Abort: true
Aborted At: 2020-10-10T02:37:02Z
Available Replicas: 2
Blue Green:
Active Selector: 6bcfbc585f
Post Promotion Analysis Run: rollout-bg-test-analysis-6b7c8784cc-3
Post Promotion Analysis Run Status:
Message: metric "test-analysis" assessed Failed due to failed (1) > failureLimit (0)
Name: rollout-bg-test-analysis-6b7c8784cc-3
Status: Failed
Canary:
Conditions:
Last Transition Time: 2020-10-10T02:25:07Z
Last Update Time: 2020-10-10T02:25:07Z
Message: Rollout has minimum availability
Reason: AvailableReason
Status: True
Type: Available
Last Transition Time: 2020-10-10T02:35:22Z
Last Update Time: 2020-10-10T02:35:22Z
Message: metric "test-analysis" assessed Failed due to failed (1) > failureLimit (0)
Reason: RolloutAborted
Status: False
Type: Progressing
Current Pod Hash: 6b7c8784cc
Observed Generation: 57cf5bd85b
Ready Replicas: 4
Replicas: 4
Selector: app=rollout-bg-analysis,rollouts-pod-template-hash=6bcfbc585f
Stable RS: 6bcfbc585f
Updated Replicas: 2
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 19m rollouts-controller Scaled up replica set rollout-bg-test-analysis-766b7567dc to 2
Normal SwitchService 19m rollouts-controller Switched selector for service 'rollout-active-service-analysis' to value '766b7567dc'
Normal ScalingReplicaSet 17m rollouts-controller Scaled up replica set rollout-bg-test-analysis-6bcfbc585f to 2
Normal AnalysisRunStatusChange 17m rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6bcfbc585f-2' Status New: 'Running' Previous: ''
Normal AnalysisRunStatusChange 17m rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6bcfbc585f-2' Status New: '' Previous: 'NoPreviousStatus'
Normal AnalysisRunStatusChange 17m rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6bcfbc585f-2' Status New: 'Successful' Previous: 'Running'
Normal ScalingReplicaSet 17m rollouts-controller Scaled down replica set rollout-bg-test-analysis-766b7567dc to 0
Normal ScalingReplicaSet 9m51s rollouts-controller Scaled up replica set rollout-bg-test-analysis-6b7c8784cc to 2
Normal SwitchService 9m50s rollouts-controller Switched selector for service 'rollout-active-service-analysis' to value '6b7c8784cc'
Normal AnalysisRunStatusChange 9m50s rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6b7c8784cc-3' Status New: '' Previous: 'NoPreviousStatus'
Normal AnalysisRunStatusChange 9m50s rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6b7c8784cc-3' Status New: 'Running' Previous: ''
Normal SwitchService 9m38s (x2 over 17m) rollouts-controller Switched selector for service 'rollout-active-service-analysis' to value '6bcfbc585f'
Warning AnalysisRunStatusChange 9m38s rollouts-controller PostPromotion Analysis Run 'rollout-bg-test-analysis-6b7c8784cc-3' Status New: 'Failed' Previous: 'Running'
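Cases likely to fail during a real update
The sections above used an AnalysisTemplate to show an automatic rollback when a condition is not met. From here on, I follow up with two cases that can cause an update to fail when you update an application that actually runs on Kubernetes.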
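The message `failed (1) > failureLimit (0)` in the output above reflects the metric's failure limit, which is 0 when nothing is specified. As a minimal sketch, a metric can be measured several times and allowed some failures before the AnalysisRun is marked Failed. The Job spec mirrors the one in the describe output; the template name and the `interval`, `count`, and `failureLimit` values are illustrative assumptions and were not part of the original test.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: test-analysis-tolerant   # hypothetical name
spec:
  metrics:
  - name: test-analysis
    interval: 10s      # illustrative: wait 10 seconds between measurements
    count: 3           # illustrative: take three measurements in total
    failureLimit: 1    # illustrative: the run fails only when failures exceed 1
    provider:
      job:
        spec:
          backoffLimit: 1
          template:
            spec:
              containers:
              - name: sleep
                image: alpine:3.8
                command: [sh, -c]
                args: [exit 1]   # always fails, as in the AnalysisRun shown above
              restartPolicy: Never
```

With these values every measurement fails, so the run would still end up Failed once the second failure exceeds the limit of 1; the point is only to show where the threshold is configured.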
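When the container fails to start
The first case is a container that fails to start. I deliberately prepared a container image that fails on startup, specified it in an image update, and observed the behavior.
To make the container fail to start, I used the following Dockerfile. It runs the head command against a file that does not exist.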
Dockerfile
FROM nginx:latest
CMD head /foo/bar
After building the image from this Dockerfile, I pushed it to Amazon ECR and used it from EKS for the test.
$ docker build -t test/test03:fail .
$ docker tag test/test03:fail 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
$ docker push 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
I also prepared a container image that starts successfully.
Dockerfile
FROM nginx:latest
$ docker build -t test/test03:success .
$ docker tag test/test03:success 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success
$ docker push 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success
The manifest files used in this test are as follows.
rollout-bg-test-start-fail.yml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-bg-test-start-fail
spec:
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollout-bg-analysis-start-fail
  template:
    metadata:
      labels:
        app: rollout-bg-analysis-start-fail
    spec:
      containers:
      - name: nginx-container
        image: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success
        ports:
        - containerPort: 80
  strategy:
    blueGreen:
      activeService: rollout-active-service-start-fail
      postPromotionAnalysis:
        templates:
        - templateName: test-analysis
rollout-active-service-start-fail.yml
apiVersion: v1
kind: Service
metadata:
  name: rollout-active-service-start-fail
spec:
  ports:
  - port: 8080
    targetPort: 80
    protocol: TCP
  selector:
    app: rollout-bg-analysis-start-fail
First, create the resources from the YAML files above, along with the AnalysisTemplate that defines test-analysis.
$ kubectl apply -f test-analysistemp.yml
analysistemplate.argoproj.io/test-analysis created
$ kubectl apply -f rollout-active-service-start-fail.yml
service/rollout-active-service-start-fail created
$ kubectl apply -f rollout-bg-test-start-fail.yml
rollout.argoproj.io/rollout-bg-test-start-fail created
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 85m
rollout-active-service-start-fail ClusterIP 10.100.32.113 <none> 8080/TCP 104s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
rollout-bg-test-start-fail-8cdb4dcc6-cnjlt 1/1 Running 0 13s
rollout-bg-test-start-fail-8cdb4dcc6-phsgq 1/1 Running 0 13s
$ kubectl get rollout
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE
rollout-bg-test-start-fail 2 2 2 2
$ kubectl argo rollouts get rollout rollout-bg-test-start-fail
Name: rollout-bg-test-start-fail
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-start-fail Rollout ✔ Healthy 11m
└──# revision:1
└──⧉ rollout-bg-test-start-fail-8cdb4dcc6 ReplicaSet ✔ Healthy 11m active
├──□ rollout-bg-test-start-fail-8cdb4dcc6-cnjlt Pod ✔ Running 11m ready:1/1
└──□ rollout-bg-test-start-fail-8cdb4dcc6-phsgq Pod ✔ Running 11m ready:1/1
Next, update the image and watch how the update proceeds.
$ kubectl argo rollouts get rollout rollout-bg-test-start-fail --watch
$ kubectl argo rollouts set image rollout-bg-test-start-fail nginx-container=111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
rollout "rollout-bg-test-start-fail" image updated
Name: rollout-bg-test-start-fail
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-start-fail Rollout ✔ Healthy 12m
└──# revision:1
└──⧉ rollout-bg-test-start-fail-8cdb4dcc6 ReplicaSet ✔ Healthy 12m active
├──□ rollout-bg-test-start-fail-8cdb4dcc6-cnjlt Pod ✔ Running 12m ready:1/1
└──□ rollout-bg-test-start-fail-8cdb4dcc6-phsgq Pod ✔ Running 12m ready:1/1
Name: rollout-bg-test-start-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-start-fail Rollout ◌ Progressing 12m
├──# revision:2
│ └──⧉ rollout-bg-test-start-fail-7fb74fd5f5 ReplicaSet ◌ Progressing 0s
│ ├──□ rollout-bg-test-start-fail-7fb74fd5f5-948tm Pod ◌ ContainerCreating 0s ready:0/1
│ └──□ rollout-bg-test-start-fail-7fb74fd5f5-mr4n4 Pod ◌ ContainerCreating 0s ready:0/1
└──# revision:1
└──⧉ rollout-bg-test-start-fail-8cdb4dcc6 ReplicaSet ✔ Healthy 12m active
├──□ rollout-bg-test-start-fail-8cdb4dcc6-cnjlt Pod ✔ Running 12m ready:1/1
└──□ rollout-bg-test-start-fail-8cdb4dcc6-phsgq Pod ✔ Running 12m ready:1/1
Name: rollout-bg-test-start-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-start-fail Rollout ◌ Progressing 12m
├──# revision:2
│ └──⧉ rollout-bg-test-start-fail-7fb74fd5f5 ReplicaSet ◌ Progressing 1s
│ ├──□ rollout-bg-test-start-fail-7fb74fd5f5-948tm Pod ⚠ Error 1s ready:0/1
│ └──□ rollout-bg-test-start-fail-7fb74fd5f5-mr4n4 Pod ◌ ContainerCreating 1s ready:0/1
└──# revision:1
└──⧉ rollout-bg-test-start-fail-8cdb4dcc6 ReplicaSet ✔ Healthy 12m active
├──□ rollout-bg-test-start-fail-8cdb4dcc6-cnjlt Pod ✔ Running 12m ready:1/1
└──□ rollout-bg-test-start-fail-8cdb4dcc6-phsgq Pod ✔ Running 12m ready:1/1
Name: rollout-bg-test-start-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-start-fail Rollout ◌ Progressing 12m
├──# revision:2
│ └──⧉ rollout-bg-test-start-fail-7fb74fd5f5 ReplicaSet ◌ Progressing 1s
│ ├──□ rollout-bg-test-start-fail-7fb74fd5f5-948tm Pod ⚠ Error 1s ready:0/1
│ └──□ rollout-bg-test-start-fail-7fb74fd5f5-mr4n4 Pod ⚠ Error 1s ready:0/1
└──# revision:1
└──⧉ rollout-bg-test-start-fail-8cdb4dcc6 ReplicaSet ✔ Healthy 12m active
├──□ rollout-bg-test-start-fail-8cdb4dcc6-cnjlt Pod ✔ Running 12m ready:1/1
└──□ rollout-bg-test-start-fail-8cdb4dcc6-phsgq Pod ✔ Running 12m ready:1/1
Name: rollout-bg-test-start-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-start-fail Rollout ◌ Progressing 12m
├──# revision:2
│ └──⧉ rollout-bg-test-start-fail-7fb74fd5f5 ReplicaSet ◌ Progressing 3s
│ ├──□ rollout-bg-test-start-fail-7fb74fd5f5-948tm Pod ✖ CrashLoopBackOff 3s ready:0/1,restarts:1
│ └──□ rollout-bg-test-start-fail-7fb74fd5f5-mr4n4 Pod ⚠ Error 3s ready:0/1,restarts:1
└──# revision:1
└──⧉ rollout-bg-test-start-fail-8cdb4dcc6 ReplicaSet ✔ Healthy 12m active
├──□ rollout-bg-test-start-fail-8cdb4dcc6-cnjlt Pod ✔ Running 12m ready:1/1
└──□ rollout-bg-test-start-fail-8cdb4dcc6-phsgq Pod ✔ Running 12m ready:1/1
Name: rollout-bg-test-start-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-start-fail Rollout ◌ Progressing 12m
├──# revision:2
│ └──⧉ rollout-bg-test-start-fail-7fb74fd5f5 ReplicaSet ◌ Progressing 3s
│ ├──□ rollout-bg-test-start-fail-7fb74fd5f5-948tm Pod ✖ CrashLoopBackOff 3s ready:0/1,restarts:1
│ └──□ rollout-bg-test-start-fail-7fb74fd5f5-mr4n4 Pod ✖ CrashLoopBackOff 3s ready:0/1,restarts:1
└──# revision:1
└──⧉ rollout-bg-test-start-fail-8cdb4dcc6 ReplicaSet ✔ Healthy 12m active
├──□ rollout-bg-test-start-fail-8cdb4dcc6-cnjlt Pod ✔ Running 12m ready:1/1
└──□ rollout-bg-test-start-fail-8cdb4dcc6-phsgq Pod ✔ Running 12m ready:1/1
Name: rollout-bg-test-start-fail
Namespace: default
Status: ✖ Degraded
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:fail
111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-start-fail Rollout ✖ Degraded 25m
├──# revision:2
│ └──⧉ rollout-bg-test-start-fail-7fb74fd5f5 ReplicaSet ◌ Progressing 13m
│ ├──□ rollout-bg-test-start-fail-7fb74fd5f5-948tm Pod ✖ CrashLoopBackOff 13m ready:0/1,restarts:7
│ └──□ rollout-bg-test-start-fail-7fb74fd5f5-mr4n4 Pod ✖ CrashLoopBackOff 13m ready:0/1,restarts:7
└──# revision:1
└──⧉ rollout-bg-test-start-fail-8cdb4dcc6 ReplicaSet ✔ Healthy 25m active
├──□ rollout-bg-test-start-fail-8cdb4dcc6-cnjlt Pod ✔ Running 25m ready:1/1
└──□ rollout-bg-test-start-fail-8cdb4dcc6-phsgq Pod ✔ Running 25m ready:1/1
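As shown above, when the container fails to start, the failure happens before the Analysis runs, so the switch to the new version never took place. The Status becomes Degraded, but a container startup failure does not appear to affect the application's uptime, because the active Service keeps pointing at the revision 1 Pods.
The Rollout remains Degraded, so redeploying with the original container image returns it to Healthy.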
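How long the Rollout waits before it reports Degraded depends on its progress deadline. The following is a hedged sketch, assuming that `spec.progressDeadlineSeconds` on a Rollout behaves like the Deployment field of the same name and defaults to 600 seconds. The value 120 is illustrative and was not used in the test above.

```yaml
# Fragment of rollout-bg-test-start-fail.yml; only progressDeadlineSeconds is new.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-bg-test-start-fail
spec:
  replicas: 2
  revisionHistoryLimit: 2
  # Assumption: seconds without progress before the Rollout is reported as Degraded.
  progressDeadlineSeconds: 120   # illustrative value
  # selector, template and strategy stay the same as in the manifest above
```

A shorter deadline would only surface the Degraded status sooner. It does not change the fact that the active Service stays on the old ReplicaSet.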
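When the Liveness Probe keeps failing
Next, let's look at the case where the Liveness Probe fails. I prepared the following manifest files and later changed the Liveness Probe condition to create a situation where the probe fails.
Note that with these Liveness Probe settings (initialDelaySeconds, periodSeconds), the AnalysisRun starts before the first Liveness Probe runs.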
rollout-bg-test-liveness-fail.yml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-bg-test-liveness-fail
spec:
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollout-bg-analysis-liveness-fail
  template:
    metadata:
      labels:
        app: rollout-bg-analysis-liveness-fail
    spec:
      containers:
      - name: nginx-container
        image: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success
        ports:
        - containerPort: 80
        livenessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 1
  strategy:
    blueGreen:
      activeService: rollout-active-service-liveness-fail
      postPromotionAnalysis:
        templates:
        - templateName: test-analysis
rollout-active-service-liveness-fail.yml
apiVersion: v1
kind: Service
metadata:
  name: rollout-active-service-liveness-fail
spec:
  ports:
  - port: 8080
    targetPort: 80
    protocol: TCP
  selector:
    app: rollout-bg-analysis-liveness-fail
Create the resources from the two YAML files above, along with the AnalysisTemplate that defines test-analysis.
$ kubectl apply -f rollout-active-service-liveness-fail.yml
service/rollout-active-service-liveness-fail created
$ kubectl apply -f rollout-bg-test-liveness-fail.yml
rollout.argoproj.io/rollout-bg-test-liveness-fail created
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 150m
rollout-active-service-liveness-fail ClusterIP 10.100.21.204 <none> 8080/TCP 9m1s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
rollout-bg-test-liveness-fail-54495f7df4-df2dk 1/1 Running 0 15s
rollout-bg-test-liveness-fail-54495f7df4-svb25 1/1 Running 0 15s
$ kubectl get rollout
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE
rollout-bg-test-liveness-fail 2 2 2 2
$ kubectl argo rollouts get rollout rollout-bg-test-liveness-fail
Name: rollout-bg-test-liveness-fail
Namespace: default
Status: ✔ Healthy
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-liveness-fail Rollout ✔ Healthy 2m37s
└──# revision:1
└──⧉ rollout-bg-test-liveness-fail-54495f7df4 ReplicaSet ✔ Healthy 2m36s active
├──□ rollout-bg-test-liveness-fail-54495f7df4-df2dk Pod ✔ Running 2m36s ready:1/1
└──□ rollout-bg-test-liveness-fail-54495f7df4-svb25 Pod ✔ Running 2m36s ready:1/1
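Next, change part of the Liveness Probe and watch how the update progresses. Here the probe's tcpSocket port is changed from 80 to 8080, a port the container does not listen on.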
$ kubectl argo rollouts get rollout rollout-bg-test-liveness-fail --watch
$ kubectl edit rollout rollout-bg-test-liveness-fail
rollout.argoproj.io/rollout-bg-test-liveness-fail edited
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"argoproj.io/v1alpha1","kind":"Rollout","metadata":{"annotations":{},"name":"rollout-bg-test-liveness-fail","namespace":"default"},"spec":{"replicas":2,"revisionHistoryLimit":2,"selector":{"matchLabels":{"app":"rollout-bg-analysis-liveness-fail"}},"strategy":{"blueGreen":{"activeService":"rollout-active-service-liveness-fail","postPromotionAnalysis":{"templates":[{"templateName":"test-analysis"}]}}},"template":{"metadata":{"labels":{"app":"rollout-bg-analysis-liveness-fail"}},"spec":{"containers":[{"image":"111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success","livenessProbe":{"failureThreshold":1,"initialDelaySeconds":5,"periodSeconds":5,"successThreshold":1,"tcpSocket":{"port":80},"timeoutSeconds":1},"name":"nginx-container","ports":[{"containerPort":80}]}]}}}}
    rollout.argoproj.io/revision: "1"
  creationTimestamp: "2020-10-12T06:52:16Z"
  generation: 6
  name: rollout-bg-test-liveness-fail
  namespace: default
  resourceVersion: "28459"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/default/rollouts/rollout-bg-test-liveness-fail
  uid: 70fa4d0b-fb3a-46c6-8bfd-321e2d829694
spec:
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollout-bg-analysis-liveness-fail
  strategy:
    blueGreen:
      activeService: rollout-active-service-liveness-fail
      postPromotionAnalysis:
        templates:
        - templateName: test-analysis
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: rollout-bg-analysis-liveness-fail
    spec:
      containers:
      - image: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success
        livenessProbe:
          failureThreshold: 1
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 1
          tcpSocket:
            port: 8080
          timeoutSeconds: 1
        name: nginx-container
        ports:
        - containerPort: 80
        resources: {}
status:
  HPAReplicas: 2
  availableReplicas: 2
  blueGreen:
    activeSelector: 54495f7df4
  canary: {}
  conditions:
  - lastTransitionTime: "2020-10-12T06:52:17Z"
    lastUpdateTime: "2020-10-12T06:52:18Z"
    message: ReplicaSet "rollout-bg-test-liveness-fail-54495f7df4" has successfully
      progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2020-10-12T06:52:18Z"
    lastUpdateTime: "2020-10-12T06:52:18Z"
    message: Rollout has minimum availability
    reason: AvailableReason
    status: "True"
    type: Available
  currentPodHash: 54495f7df4
  observedGeneration: 75d6d6f664
  readyReplicas: 2
  replicas: 2
  selector: app=rollout-bg-analysis-liveness-fail,rollouts-pod-template-hash=54495f7df4
  stableRS: 54495f7df4
  updatedReplicas: 2
Name: rollout-bg-test-liveness-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-liveness-fail Rollout ◌ Progressing 14m
├──# revision:2
│ └──⧉ rollout-bg-test-liveness-fail-7bb5898d6 ReplicaSet ◌ Progressing 0s
│ ├──□ rollout-bg-test-liveness-fail-7bb5898d6-tw8lt Pod ◌ ContainerCreating 0s ready:0/1
│ └──□ rollout-bg-test-liveness-fail-7bb5898d6-wb944 Pod ◌ ContainerCreating 0s ready:0/1
└──# revision:1
└──⧉ rollout-bg-test-liveness-fail-54495f7df4 ReplicaSet ✔ Healthy 14m active
├──□ rollout-bg-test-liveness-fail-54495f7df4-df2dk Pod ✔ Running 14m ready:1/1
└──□ rollout-bg-test-liveness-fail-54495f7df4-svb25 Pod ✔ Running 14m ready:1/1
Name: rollout-bg-test-liveness-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-liveness-fail Rollout ◌ Progressing 14m
├──# revision:2
│ ├──⧉ rollout-bg-test-liveness-fail-7bb5898d6 ReplicaSet ✔ Healthy 1s active
│ │ ├──□ rollout-bg-test-liveness-fail-7bb5898d6-tw8lt Pod ✔ Running 1s ready:1/1
│ │ └──□ rollout-bg-test-liveness-fail-7bb5898d6-wb944 Pod ✔ Running 1s ready:1/1
│ └──α rollout-bg-test-liveness-fail-7bb5898d6-2-post AnalysisRun ◌ Running 0s
│ └──⊞ 3bf49f08-e851-4185-8cbf-88886c0da2ec.test-analysis.1 Job ◌ Running 0s
└──# revision:1
└──⧉ rollout-bg-test-liveness-fail-54495f7df4 ReplicaSet ✔ Healthy 14m delay:29s
├──□ rollout-bg-test-liveness-fail-54495f7df4-df2dk Pod ✔ Running 14m ready:1/1
└──□ rollout-bg-test-liveness-fail-54495f7df4-svb25 Pod ✔ Running 14m ready:1/1
Name: rollout-bg-test-liveness-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-liveness-fail Rollout ◌ Progressing 14m
├──# revision:2
│ ├──⧉ rollout-bg-test-liveness-fail-7bb5898d6 ReplicaSet ✔ Healthy 5s active
│ │ ├──□ rollout-bg-test-liveness-fail-7bb5898d6-tw8lt Pod ✔ Running 5s ready:1/1
│ │ └──□ rollout-bg-test-liveness-fail-7bb5898d6-wb944 Pod ✔ Running 5s ready:1/1
│ └──α rollout-bg-test-liveness-fail-7bb5898d6-2-post AnalysisRun ✔ Successful 4s ✔ 1
│ └──⊞ 3bf49f08-e851-4185-8cbf-88886c0da2ec.test-analysis.1 Job ✔ Successful 4s
└──# revision:1
└──⧉ rollout-bg-test-liveness-fail-54495f7df4 ReplicaSet ✔ Healthy 14m delay:25s
├──□ rollout-bg-test-liveness-fail-54495f7df4-df2dk Pod ✔ Running 14m ready:1/1
└──□ rollout-bg-test-liveness-fail-54495f7df4-svb25 Pod ✔ Running 14m ready:1/1
Name: rollout-bg-test-liveness-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 4
Updated: 2
Ready: 2
Available: 2
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-liveness-fail Rollout ◌ Progressing 14m
├──# revision:2
│ ├──⧉ rollout-bg-test-liveness-fail-7bb5898d6 ReplicaSet ✔ Healthy 8s active
│ │ ├──□ rollout-bg-test-liveness-fail-7bb5898d6-tw8lt Pod ✔ Running 8s ready:1/1,restarts:1
│ │ └──□ rollout-bg-test-liveness-fail-7bb5898d6-wb944 Pod ✔ Running 8s ready:1/1,restarts:1
│ └──α rollout-bg-test-liveness-fail-7bb5898d6-2-post AnalysisRun ✔ Successful 7s ✔ 1
│ └──⊞ 3bf49f08-e851-4185-8cbf-88886c0da2ec.test-analysis.1 Job ✔ Successful 7s
└──# revision:1
└──⧉ rollout-bg-test-liveness-fail-54495f7df4 ReplicaSet ✔ Healthy 14m delay:22s
├──□ rollout-bg-test-liveness-fail-54495f7df4-df2dk Pod ✔ Running 14m ready:1/1
└──□ rollout-bg-test-liveness-fail-54495f7df4-svb25 Pod ✔ Running 14m ready:1/1
Name: rollout-bg-test-liveness-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 1
Available: 1
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-liveness-fail Rollout ◌ Progressing 14m
├──# revision:2
│ ├──⧉ rollout-bg-test-liveness-fail-7bb5898d6 ReplicaSet ◌ Progressing 31s active
│ │ ├──□ rollout-bg-test-liveness-fail-7bb5898d6-tw8lt Pod ✔ Running 31s ready:1/1,restarts:3
│ │ └──□ rollout-bg-test-liveness-fail-7bb5898d6-wb944 Pod ✖ CrashLoopBackOff 31s ready:0/1,restarts:2
│ └──α rollout-bg-test-liveness-fail-7bb5898d6-2-post AnalysisRun ✔ Successful 30s ✔ 1
│ └──⊞ 3bf49f08-e851-4185-8cbf-88886c0da2ec.test-analysis.1 Job ✔ Successful 30s
└──# revision:1
└──⧉ rollout-bg-test-liveness-fail-54495f7df4 ReplicaSet • ScaledDown 14m
├──□ rollout-bg-test-liveness-fail-54495f7df4-df2dk Pod ◌ Terminating 14m ready:0/1
└──□ rollout-bg-test-liveness-fail-54495f7df4-svb25 Pod ◌ Terminating 14m ready:0/1
Name: rollout-bg-test-liveness-fail
Namespace: default
Status: ◌ Progressing
Strategy: BlueGreen
Images: 111111111111.dkr.ecr.ap-northeast-1.amazonaws.com/test/test03:success (active)
Replicas:
Desired: 2
Current: 2
Updated: 2
Ready: 1
Available: 1
NAME KIND STATUS AGE INFO
⟳ rollout-bg-test-liveness-fail Rollout ◌ Progressing 15m
├──# revision:2
│ ├──⧉ rollout-bg-test-liveness-fail-7bb5898d6 ReplicaSet ◌ Progressing 43s active
│ │ ├──□ rollout-bg-test-liveness-fail-7bb5898d6-tw8lt Pod ✖ CrashLoopBackOff 43s ready:0/1,restarts:3
│ │ └──□ rollout-bg-test-liveness-fail-7bb5898d6-wb944 Pod ✖ CrashLoopBackOff 43s ready:0/1,restarts:4
│ └──α rollout-bg-test-liveness-fail-7bb5898d6-2-post AnalysisRun ✔ Successful 42s ✔ 1
│ └──⊞ 3bf49f08-e851-4185-8cbf-88886c0da2ec.test-analysis.1 Job ✔ Successful 42s
└──# revision:1
└──⧉ rollout-bg-test-liveness-fail-54495f7df4 ReplicaSet • ScaledDown 15m
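As shown above, when the Liveness Probe fails, the Pods are created once and the traffic switch also happens, but the containers keep restarting for as long as the probe fails. The Rollout Status also stays Progressing, and returning it to Healthy again requires redeploying a Rollout that works correctly (here, one with the Liveness Probe setting fixed).
The AnalysisTemplate used here ran before the Liveness Probe and always succeeded when executed, so it was not very meaningful. On the other hand, a better-designed Analysis (for example, checking connectivity to the Pods for a certain period after startup) might solve this problem. With Blue/Green Deployment, you can also set prePromotionAnalysis to run an analysis before the switch. I think this could be used to detect, before traffic is switched, repeated container restarts caused by a misconfigured Liveness Probe.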
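The idea above could be sketched roughly as follows. This is not something verified in this article: the template name `preview-connectivity`, the preview Service `rollout-preview-service-liveness-fail`, and the `interval` and `count` values are all hypothetical. The sketch assumes a preview Service with that name already exists and, like the active Service, forwards port 8080 to the container's port 80. Each measurement runs a Job that sends one HTTP request to the preview Service, and any failed request fails the run.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: preview-connectivity          # hypothetical name
spec:
  metrics:
  - name: preview-connectivity
    interval: 10s                     # illustrative: one check every 10 seconds
    count: 6                          # illustrative: about one minute of checks
    failureLimit: 0                   # a single failed check fails the analysis
    provider:
      job:
        spec:
          backoffLimit: 0
          template:
            spec:
              restartPolicy: Never
              containers:
              - name: check
                image: alpine:3.8
                command: [sh, -c]
                args:
                # hypothetical preview Service; -T 2 sets a 2-second timeout
                - wget -q -T 2 -O /dev/null http://rollout-preview-service-liveness-fail:8080/
---
# Fragment of the Rollout spec: run the check before switching the active Service.
strategy:
  blueGreen:
    activeService: rollout-active-service-liveness-fail
    previewService: rollout-preview-service-liveness-fail   # hypothetical Service
    prePromotionAnalysis:
      templates:
      - templateName: preview-connectivity
```

A container that still answers requests between restarts may pass some of these checks, so this is a starting point rather than a reliable detector of restart loops.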
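When a problem is detected through monitoring metrics such as Prometheus
I do not test this here, but Argo Rollouts can use Prometheus metrics and similar sources in an AnalysisTemplate. This makes it possible to roll back automatically when a problem is observed in the application before or after the new version is deployed and released.
The official documentation shows a manifest like the following. It defines the success criterion for the Analysis (successCondition) and the actual query (provider.prometheus.query), and the rollout is rolled back when the condition is not met.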
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  - name: prometheus-port
    value: 9090
  metrics:
  - name: success-rate
    successCondition: result[0] >= 0.95
    provider:
      prometheus:
        address: "http://prometheus.example.com:{{args.prometheus-port}}"
        query: |
          sum(irate(
            istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}",response_code!~"5.*"}[5m]
          )) /
          sum(irate(
            istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}"}[5m]
          ))
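To use such a template, the Rollout has to reference it and supply the argument that has no default value. The following is a minimal sketch for the Blue/Green setup used earlier in this article. The `service-name` value is a hypothetical example, and the sketch assumes that Istio request metrics for that service are actually collected by the Prometheus instance named in the template.

```yaml
# Fragment of a Rollout spec (sketch)
strategy:
  blueGreen:
    activeService: rollout-active-service-analysis
    postPromotionAnalysis:
      templates:
      - templateName: success-rate
      args:
      - name: service-name
        # hypothetical value; it has to match destination_service in the Istio metrics
        value: rollout-active-service-analysis.default.svc.cluster.local
```

`prometheus-port` is omitted here because the template already gives it a default of 9090.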
References