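A while back, I hit an error when connecting GitLab to Kubernetes with the GitLab agent for Kubernetes, so here is a brief note on how I dealt with it.
What happened
When I installed the GitLab agent with the Helm command, the Pods failed with an ImagePullBackOff error.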
    # Install command
    [cloudshell-user@ip-10-130-48-71 ~]$ helm upgrade --install eks-cluster gitlab/gitlab-agent \
    > --namespace gitlab-agent-eks-cluster \
    > --create-namespace \
    > --set image.tag=v17.4.0 \
    > --set config.token=glagent-atE9yeHA9x1YNyEVjsP4m_CNyB4t9sAR4iPvz9nJdrPxshSTvw \
    > --set config.kasAddress=wss://kas.gitlab.com

    # The error occurs
    [cloudshell-user@ip-10-130-48-71 ~]$ kubectl get pods -A
    NAMESPACE                  NAME                                           READY   STATUS             RESTARTS   AGE
    gitlab-agent-eks-cluster   eks-cluster-gitlab-agent-v2-7ff5f6ccbd-n4t7x   0/1     ImagePullBackOff   0          85s
    gitlab-agent-eks-cluster   eks-cluster-gitlab-agent-v2-7ff5f6ccbd-slrw8   0/1     ImagePullBackOff   0          85s
    kube-system                aws-node-9dp85                                 2/2     Running            0          7m26s
    kube-system                aws-node-ljj6n                                 2/2     Running            0          7m23s
    kube-system                coredns-676bf68468-m5gqm                       1/1     Running            0          11m
    kube-system                coredns-676bf68468-w74qt                       1/1     Running            0          11m
    kube-system                kube-proxy-cqp8n                               1/1     Running            0          7m23s
    kube-system                kube-proxy-mhxjh                               1/1     Running            0          7m26s

    [cloudshell-user@ip-10-130-48-71 ~]$ kubectl describe pod eks-cluster-gitlab-agent-v2-7ff5f6ccbd-n4t7x -n gitlab-agent-eks-cluster
    Name:             eks-cluster-gitlab-agent-v2-7ff5f6ccbd-n4t7x
    Namespace:        gitlab-agent-eks-cluster
    Priority:         0
    Service Account:  eks-cluster-gitlab-agent
    Node:             ip-192-168-117-1.ap-northeast-1.compute.internal/192.168.117.1
    Start Time:       Wed, 21 Aug 2024 23:34:37 +0000
    Labels:           app.kubernetes.io/instance=eks-cluster
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=gitlab-agent
                      app.kubernetes.io/version=v17.3.0
                      helm.sh/chart=gitlab-agent-2.6.1
                      pod-template-hash=7ff5f6ccbd
    Annotations:      checksum/token: 4b82d996ed8fdb09cce0b02eb5623a8eeae19fabff2cbdbfb7e2fcca02b7de76
                      prometheus.io/path: /metrics
                      prometheus.io/port: 8080
                      prometheus.io/scrape: true
    Status:           Pending
    IP:               192.168.112.157
    IPs:
      IP:           192.168.112.157
    Controlled By:  ReplicaSet/eks-cluster-gitlab-agent-v2-7ff5f6ccbd
    Containers:
      gitlab-agent:
        Container ID:
        Image:          registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:v17.4.0
        Image ID:
        Port:           8080/TCP
        Host Port:      0/TCP
        Args:
          --token-file=/etc/agentk/secrets/token
          --kas-address=wss://kas.gitlab.com
        State:          Waiting
          Reason:       ImagePullBackOff
        Ready:          False
        Restart Count:  0
        Liveness:       http-get http://:8080/liveness delay=15s timeout=1s period=20s #success=1 #failure=3
        Readiness:      http-get http://:8080/readiness delay=5s timeout=1s period=10s #success=1 #failure=3
        Environment:
          POD_NAMESPACE:             gitlab-agent-eks-cluster (v1:metadata.namespace)
          POD_NAME:                  eks-cluster-gitlab-agent-v2-7ff5f6ccbd-n4t7x (v1:metadata.name)
          SERVICE_ACCOUNT_NAME:       (v1:spec.serviceAccountName)
          OCS_ENABLED:               true
          OCS_SERVICE_ACCOUNT_NAME:  eks-cluster-gitlab-agent-ocs-scanning-pod-sa
        Mounts:
          /etc/agentk/secrets from secret-volume (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from service-account-token-volume (ro)
    Conditions:
      Type                        Status
      PodReadyToStartContainers   True
      Initialized                 True
      Ready                       False
      ContainersReady             False
      PodScheduled                True
    Volumes:
      service-account-token-volume:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
        TokenExpirationSeconds:  3600
      secret-volume:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  eks-cluster-gitlab-agent-token
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                     node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason       Age                 From               Message
      ----     ------       ----                ----               -------
      Normal   Scheduled    2m6s                default-scheduler  Successfully assigned gitlab-agent-eks-cluster/eks-cluster-gitlab-agent-v2-7ff5f6ccbd-n4t7x to ip-192-168-117-1.ap-northeast-1.compute.internal
      Warning  FailedMount  2m5s                kubelet            MountVolume.SetUp failed for volume "secret-volume" : failed to sync secret cache: timed out waiting for the condition
      Normal   Pulling      31s (x4 over 2m4s)  kubelet            Pulling image "registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:v17.4.0"
      Warning  Failed       30s (x4 over 2m3s)  kubelet            Failed to pull image "registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:v17.4.0": rpc error: code = NotFound desc = failed to pull and unpack image "registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:v17.4.0": failed to resolve reference "registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:v17.4.0": registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:v17.4.0: not found
      Warning  Failed       30s (x4 over 2m3s)  kubelet            Error: ErrImagePull
      Normal   BackOff      16s (x6 over 2m2s)  kubelet            Back-off pulling image "registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:v17.4.0"
      Warning  Failed       16s (x6 over 2m2s)  kubelet            Error: ImagePullBackOff
    [cloudshell-user@ip-10-130-48-71 ~]$
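As the message says, the error occurs because the image cannot be pulled. However, I had not put any network restrictions in place, so for a while I could not work out why it was happening.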
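On a cluster with many Pods, it can help to filter the listing down to the Pods that are stuck on an image pull. The following is a minimal sketch, not part of the original procedure: list_image_pull_failures is a hypothetical helper name, and it assumes the six-column layout that kubectl get pods -A printed here (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pods -A` style output on stdin and
# prints "namespace/pod STATUS" for each Pod whose STATUS column is exactly
# ImagePullBackOff or ErrImagePull. The header row (line 1) is skipped.
list_image_pull_failures() {
  awk 'NR > 1 && ($4 == "ImagePullBackOff" || $4 == "ErrImagePull") {
    print $1 "/" $2, $4
  }'
}

# Example usage against a live cluster (not executed here):
#   kubectl get pods -A | list_image_pull_failures
```

Statuses with a prefix, such as those reported for init containers, are not matched by this exact comparison and would need a looser pattern.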
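Cause
I went to the GitLab project that manages the GitLab agent container images and searched its container registry, and found that the image tag I had specified in the Helm command (v17.4.0) did not exist. This matches the "not found" at the end of the pull error. The Pod labels in the describe output also show app.kubernetes.io/version=v17.3.0 (chart gitlab-agent-2.6.1), a different version from the tag I requested.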
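Besides browsing the container registry, one quick sanity check is to compare the requested tag with the version the installed chart was built for. The sketch below is only an illustration under assumptions: chart_app_version is a hypothetical helper, it assumes the chart metadata has a top-level appVersion field on a single line, and it assumes that this field corresponds to the app.kubernetes.io/version label seen on the Pod. A mismatch does not prove that a tag is missing from the registry; it only flags that the tag differs from the chart's own version.

```shell
# Hypothetical helper: reads Chart.yaml content on stdin (for example the
# output of `helm show chart`) and prints the value of the top-level
# appVersion field with any surrounding quotes removed.
chart_app_version() {
  awk -F': *' '$1 == "appVersion" {
    gsub(/["\047]/, "", $2)   # strip double or single quotes
    print $2
    exit
  }'
}

# Example usage (not executed here; requires helm and the gitlab repo):
#   requested=v17.4.0
#   chart_default=$(helm show chart gitlab/gitlab-agent | chart_app_version)
#   [ "$requested" = "$chart_default" ] ||
#     echo "requested tag $requested differs from chart appVersion $chart_default"
```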
Solution
Fix the Helm command so that it specifies an image tag that actually exists in the registry.
    # Fix the image tag (specify latest)
    [cloudshell-user@ip-10-130-48-71 ~]$ helm upgrade --install eks-cluster gitlab/gitlab-agent \
    > --namespace gitlab-agent-eks-cluster \
    > --create-namespace \
    > --set image.tag=latest \
    > --set config.token=glagent-atE9yeHA9x1YNyEVjsP4m_CNyB4t9sAR4iPvz9nJdrPxshSTvw \
    > --set config.kasAddress=wss://kas.gitlab.com

    # Confirm that the Pods start
    [cloudshell-user@ip-10-130-48-71 ~]$ kubectl get pods -A
    NAMESPACE                  NAME                                           READY   STATUS    RESTARTS   AGE
    gitlab-agent-eks-cluster   eks-cluster-gitlab-agent-v2-6dddd7d8d5-bj7gf   1/1     Running   0          96s
    gitlab-agent-eks-cluster   eks-cluster-gitlab-agent-v2-6dddd7d8d5-wzq4q   1/1     Running   0          116s
    kube-system                aws-node-9dp85                                 2/2     Running   0          21m
    kube-system                aws-node-ljj6n                                 2/2     Running   0          21m
    kube-system                coredns-676bf68468-m5gqm                       1/1     Running   0          25m
    kube-system                coredns-676bf68468-w74qt                       1/1     Running   0          25m
    kube-system                kube-proxy-cqp8n                               1/1     Running   0          21m
    kube-system                kube-proxy-mhxjh                               1/1     Running   0          21m
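The container image was pulled successfully, and I confirmed that the Pods started.
Addendum
Later, on a separate occasion, I installed the GitLab agent again, and the image tag option was no longer included in the command shown when registering the agent.
So this kind of error may no longer trip people up going forward.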