Kubernetes API Server Error

mingming_96 2024. 12. 27. 23:47

When I booted a practice VM that I had not used for a while, kubectl could not reach kube-apiserver, so no kubectl command would run.

kubectl get pods

E1224 22:11:19.766017   23306 memcache.go:265] couldn't get current server API group list: Get "https://192.168.56.103:6443/api?timeout=32s": dial tcp 192.168.56.103:6443: connect: connection refused
E1224 22:11:19.767835   23306 memcache.go:265] couldn't get current server API group list: Get "https://192.168.56.103:6443/api?timeout=32s": dial tcp 192.168.56.103:6443: connect: connection refused
E1224 22:11:19.768570   23306 memcache.go:265] couldn't get current server API group list: Get "https://192.168.56.103:6443/api?timeout=32s": dial tcp 192.168.56.103:6443: connect: connection refused
E1224 22:11:19.770982   23306 memcache.go:265] couldn't get current server API group list: Get "https://192.168.56.103:6443/api?timeout=32s": dial tcp 192.168.56.103:6443: connect: connection refused
E1224 22:11:19.772862   23306 memcache.go:265] couldn't get current server API group list: Get "https://192.168.56.103:6443/api?timeout=32s": dial tcp 192.168.56.103:6443: connect: connection refused

 

 

Checking the kubelet daemon showed the following error being logged repeatedly.

pod_workers.go:1300] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-master_kube-system(45240f852f58e4472000c22ec0dffbb7)\"" pod="kube-system/kube-apiserver-master" podUID="45240f852f58e4472000c22ec0dffbb7"

 

The kube-apiserver pod keeps failing and is in CrashLoopBackOff. The kubelet tries to sync (start) the kube-apiserver pod, fails, and skips the attempt until the back-off timer (5m0s here) expires.
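The kubelet log is noisy, so it can help to reduce it to the pods that are actually in CrashLoopBackOff. The following is a minimal sketch and not part of the original troubleshooting session: `crashloop_pods` is a hypothetical helper name, and the usage line assumes the kubelet runs as a systemd unit called `kubelet`.

```shell
# Reads kubelet log lines on stdin and prints each distinct pod
# (namespace/name) that is reported with CrashLoopBackOff.
crashloop_pods() {
  grep -F 'CrashLoopBackOff' \
    | sed -n 's/.* pod="\([^"]*\)".*/\1/p' \
    | sort -u
}

# Usage on the node (assumption: kubelet is a systemd unit named "kubelet"):
#   journalctl -u kubelet --no-pager --since "10 minutes ago" | crashloop_pods
```

For the log line above this would print kube-system/kube-apiserver-master, which narrows the problem down to a single static pod.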

 

Checking container status

This cluster uses containerd as its container runtime. crictl, the command-line client for the Container Runtime Interface (CRI), needs a runtime-endpoint setting in order to talk to containerd.

 

Checking the container runtime

cat /var/lib/kubelet/kubeadm-flags.env

KUBELET_KUBEADM_ARGS="--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --pod-infra-container-image=registry.k8s.io/pause:3.9"

 

Setting the container runtime endpoint

## Set the container runtime endpoint
/etc/crictl.yaml

runtime-endpoint: "unix:///var/run/containerd/containerd.sock"
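The socket URI is easy to mistype (it needs three slashes after `unix:`), so one option is to read it from the kubelet flags file instead of retyping it. This is only a sketch: `runtime_endpoint` is a hypothetical helper, and it assumes the `--container-runtime-endpoint` flag is present in `/var/lib/kubelet/kubeadm-flags.env` as it is on this node. Other kubeadm versions may store the endpoint elsewhere.

```shell
# Prints the value of --container-runtime-endpoint found in a kubelet flags file.
# The default path is the one used on this node; pass another file as $1 if needed.
runtime_endpoint() {
  flags_file="${1:-/var/lib/kubelet/kubeadm-flags.env}"
  sed -n 's/.*--container-runtime-endpoint=\([^ "]*\).*/\1/p' "$flags_file"
}

# Usage on the node (not executed here):
#   crictl --runtime-endpoint "$(runtime_endpoint)" ps -a
```

Passing `--runtime-endpoint` on the command line overrides whatever is in /etc/crictl.yaml, which is handy for a quick check before editing the file.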

 

crictl (command-line client for the Container Runtime Interface)

## List containers
crictl ps -a

CONTAINER           IMAGE               CREATED             STATE               NAME                      ATTEMPT             POD ID              POD
5c8cf109cc722       7fe0e6f37db33       3 minutes ago       Exited              kube-apiserver            96                  522975fbb500d       kube-apiserver-master
d50c8e87b3e7d       d058aa5ab969c       About an hour ago   Running             kube-controller-manager   176                 fe3a2b1488d06       kube-controller-manager-master
9e31945b46e27       e3db313c6dbc0       About an hour ago   Running             kube-scheduler            175                 a3dfc2a451144       kube-scheduler-master
c224d0a9a6017       73deb9a3f7025       About an hour ago   Running             etcd                      72                  24638b64d9ac9       etcd-master
e8d54cc3332d8       ead0a4a53df89       6 weeks ago         Exited              coredns                   58                  dbc26ed9ec26c       coredns-5dd5756b68-xw4gr
92feaaf8b807b       ead0a4a53df89       6 weeks ago         Exited              coredns                   58                  46c40def642cf       coredns-5dd5756b68-2sstm
f7a2af209ff7f       690c3345cc9c3       6 weeks ago         Exited              weave-npc                 64                  7a721f4c2530c       weave-net-6jk75
a177d162f2256       62fea85d60522       6 weeks ago         Exited              weave                     78                  7a721f4c2530c       weave-net-6jk75
ee6e775c09573       62fea85d60522       6 weeks ago         Exited              weave-init                0                   7a721f4c2530c       weave-net-6jk75
bdc3301b4aa50       83f6cc407eed8       6 weeks ago         Exited              kube-proxy                65                  7d98279bb1266       kube-proxy-tr22z
dde37a063002a       73deb9a3f7025       6 weeks ago         Exited              etcd                      71                  f5b7168c7390e       etcd-master
f66a9435e179b       d058aa5ab969c       6 weeks ago         Exited              kube-controller-manager   175                 7eab50ed5d574       kube-controller-manager-master
0b9d27dbef5fd       e3db313c6dbc0       6 weeks ago         Exited              kube-scheduler            174                 989995e8e97fa       kube-scheduler-master

 

 

The kube-apiserver container logs showed a TLS authentication error. A certificate had expired, so the TLS handshake failed when kube-apiserver tried to connect to etcd.

crictl logs <container ID>
W1224 13:21:37.234598       1 logging.go:59] [core] [Channel #5 SubChannel #6] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1", }. Err: connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-12-24T13:21:37Z is after 2024-11-19T12:59:23Z"
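The same warning is repeated many times in the container log, so a small filter makes the expiry time easier to spot. This is a sketch under assumptions: `x509_expiry_from_log` is a hypothetical helper, it only matches the "is after" (already expired) form of the message shown above, and the usage lines assume crictl lists the newest container first, as in the listing earlier in this post.

```shell
# Reads log lines on stdin and prints each distinct x509 expiry error
# as "expired=<notAfter> now=<time the error was logged>".
x509_expiry_from_log() {
  sed -n 's/.*certificate has expired or is not yet valid: current time \([^ ]*\) is after \([^ "]*\).*/expired=\2 now=\1/p' \
    | sort -u
}

# Usage on the node (not executed here):
#   id=$(crictl ps -a --name kube-apiserver -q | head -n 1)
#   crictl logs --tail 50 "$id" 2>&1 | x509_expiry_from_log
```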

 

kube-apiserver uses the following certificates.

 

## Check the API server manifest
cat /etc/kubernetes/manifests/kube-apiserver.yaml

    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key

 

Moving to the certificate directory (/etc/kubernetes/pki) and inspecting the certificates confirmed that they had expired.

openssl x509 -in apiserver-etcd-client.crt -noout -text | grep -i not
Not Before: Nov 20 12:54:22 2023 GMT
Not After : Nov 19 12:59:25 2024 GMT

openssl x509 -in apiserver-kubelet-client.crt -noout -text | grep -i not
Not Before: Nov 20 12:54:19 2023 GMT
Not After : Nov 19 12:59:20 2024 GMT
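Checking the files one by one gets tedious, so the loop below is a rough sketch that prints the expiry date of every certificate under the PKI directory in one pass. `list_cert_expiry` is a hypothetical helper name, and the default directory is the kubeadm layout seen in the manifest above. On kubeadm clusters, `kubeadm certs check-expiration` reports similar information.

```shell
# Prints "<status> <notAfter> <path>" for every .crt file in the given
# directory and its etcd/ subdirectory. Status is "valid" or "EXPIRED".
list_cert_expiry() {
  dir="${1:-/etc/kubernetes/pki}"
  for crt in "$dir"/*.crt "$dir"/etcd/*.crt; do
    [ -f "$crt" ] || continue   # skip unmatched glob patterns
    # -checkend 0 exits non-zero when the certificate has already expired
    if openssl x509 -in "$crt" -noout -checkend 0 >/dev/null 2>&1; then
      status=valid
    else
      status=EXPIRED
    fi
    end=$(openssl x509 -in "$crt" -noout -enddate | cut -d= -f2)
    printf '%-8s %s  %s\n' "$status" "$end" "$crt"
  done
}

# Usage on the control plane node (not executed here):
#   list_cert_expiry /etc/kubernetes/pki
```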

 

Each certificate can be renewed individually, but it is simpler to renew all of them at once.

## Renew all certificates
kubeadm certs renew all

 

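After the certificates are reissued, confirm that kube-apiserver is running normally. Renewing everything with the command above also renews the client certificates, so the kubeconfig files under /etc/kubernetes, including admin.conf, are regenerated as well. The newly generated config file therefore has to be copied again.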
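The kubeadm documentation notes that control plane components pick up renewed certificates only after their pods restart. Here kube-apiserver was already crash-looping, so it restarted on its own, but etcd, kube-controller-manager and kube-scheduler were still running with the old certificates. One documented way to restart static pods is to move their manifests out of the manifest directory for a moment and then put them back. The helper below is a sketch of that idea, not what was run in this post: `restart_static_pods` is a hypothetical name, and the 20 second default is the kubelet's default fileCheckFrequency.

```shell
# Restarts kubelet-managed static pods by moving their manifests away,
# waiting for the kubelet to stop the pods, and moving the manifests back.
#   $1: manifest directory (default: kubeadm's /etc/kubernetes/manifests)
#   $2: seconds to wait (default: 20, the kubelet's default fileCheckFrequency)
restart_static_pods() {
  manifests="${1:-/etc/kubernetes/manifests}"
  wait_s="${2:-20}"
  stash=$(mktemp -d) || return 1
  mv "$manifests"/*.yaml "$stash"/ || return 1
  sleep "$wait_s"   # give the kubelet time to notice and stop the pods
  mv "$stash"/*.yaml "$manifests"/ && rmdir "$stash"
}

# Usage on the control plane node, as root (not executed here):
#   kubeadm certs renew all && restart_static_pods
```

The whole control plane is unavailable while the manifests are stashed, so this fits a single-node lab cluster like this one better than a production cluster.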

 

Copying the Kubernetes config file

cp /etc/kubernetes/admin.conf ~/.kube/config
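A plain cp overwrites the old kubeconfig, including any extra contexts it may contain. A slightly more careful variant could look like the sketch below. `install_kubeconfig` is a hypothetical helper, and the `.bak` suffix and the 600 file mode are my own choices, not something the original steps used.

```shell
# Copies a kubeconfig into place, keeping a backup of the previous file.
#   $1: source kubeconfig (default: /etc/kubernetes/admin.conf)
#   $2: destination       (default: ~/.kube/config)
install_kubeconfig() {
  src="${1:-/etc/kubernetes/admin.conf}"
  dest="${2:-$HOME/.kube/config}"
  mkdir -p "$(dirname "$dest")" || return 1
  if [ -f "$dest" ]; then
    cp -p "$dest" "$dest.bak" || return 1   # keep the old config for reference
  fi
  cp "$src" "$dest" && chmod 600 "$dest"    # the file contains a client key
}

# Usage (not executed here). When copying with sudo as a regular user, the
# file's owner also has to be changed back to that user afterwards:
#   install_kubeconfig
#   kubectl get pods
```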

 

 
