How to check Network Problems
Problem
We added the metrics-server pod earlier, but you may find, as in the metrics-server logs below, that routing between pods on different nodes does not work.
By default, once a k8s CNI addon is installed, traffic should route between nodes, between pods, and between a node and a pod.
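A quick way to sanity-check cross-node pod reachability is to ping a pod running on another node; a minimal sketch, assuming a flannel pod IP taken from kubectl get pods -owide (10.244.1.40 is an example; substitute one from your own cluster):
# From a shell on k8s-01, ping a pod that runs on k8s-02.
$ ping -c 3 10.244.1.40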
# Check the logs of pods labeled k8s-app=metrics-server.
$ kubectl logs --tail=20 -n kube-system -l k8s-app=metrics-server
...
E0811 02:12:20.843207 1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:k8s-02: unable to fetch metrics from Kubelet k8s-02 (10.0.2.6): Get https://10.0.2.6:10250/stats/summary?only_cpu_and_memory=true: dial tcp 10.0.2.6:10250: connect: no route to host
E0811 02:13:16.345348 1 reststorage.go:135] unable to fetch node metrics for node "k8s-02": no metrics known for node
E0811 02:13:16.352394 1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/netbox-86cdd5bdc6-jsbhn: no metrics known for pod
E0811 02:13:16.352406 1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/kube-proxy-cmp85: no metrics known for pod
E0811 02:13:16.352411 1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/kube-flannel-ds-amd64-5nktb: no metrics known for pod
E0811 05:23:16.347050 1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/netbox-z6nbz: no metrics known for pod
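In the first error above, 10.0.2.6 is the node IP of k8s-02 and 10250 is the kubelet port. To confirm which node an IP belongs to:
# The INTERNAL-IP column maps node names to node IPs (here 10.0.2.5 = k8s-01, 10.0.2.6 = k8s-02).
$ kubectl get nodes -owide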
Deploy netbox
Most well-optimized application containers ship without any networking utilities, so even if you can get a shell inside the container there is usually nothing to diagnose with (often not even bash or ping).
So, deploying the netbox k8s DaemonSet
below makes it easy to check, with all sorts of tools, whether queries such as curl -k -X GET https://10.0.2.6:10250/stats/summary?only_cpu_and_memory=true
work. (There are many other troubleshooting containers as well, e.g. ones that bundle tcpdump. Once you are used to this, you will likely docker build your own image containing the utilities you use most.)
# For convenience, we use the same namespace and serviceAccount as metrics-server.
# This way, the pod can reuse the role and token that metrics-server uses.
$ vim netbox.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: netbox
  name: netbox
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: netbox
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: netbox
    spec:
      serviceAccountName: metrics-server
      serviceAccount: metrics-server
      containers:
      - image: quay.io/gravitational/netbox:latest
        imagePullPolicy: Always
        name: netbox
        securityContext:
          runAsUser: 0
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
$ kubectl apply -f netbox.yaml
$ kubectl get pods -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
...
kube-system netbox-kvlsf 1/1 Running 0 3m16s 10.244.0.21 k8s-01 <none> <none>
kube-system netbox-svqdr 1/1 Running 0 3m16s 10.244.1.40 k8s-02 <none> <none>
...
# Exec into netbox-kvlsf.
$ kubectl exec -n kube-system -it netbox-kvlsf -- /bin/bash
# Inside netbox-kvlsf.
# Tools like telnet are not installed, so test port connectivity with a short python snippet.
$ python
Python 3.3.6 (default, Sep 14 2017, 23:28:12)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.create_connection(('10.0.2.5', 10250))
<socket.socket object, fd=3, family=2, type=1, proto=6> # Success!
>>>
>>> socket.create_connection(('10.0.2.6', 10250))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.3/socket.py", line 435, in create_connection
raise err
File "/usr/local/lib/python3.3/socket.py", line 426, in create_connection
sock.connect(sa)
OSError: [Errno 113] No route to host # Failure!
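Since the image does include bash, the same check can also be done without python; a minimal sketch using bash's built-in /dev/tcp pseudo-device (assumes the timeout utility from coreutils is present):
# Exits 0 (prints "open") if the TCP connection succeeds, non-zero ("closed") otherwise.
$ timeout 3 bash -c '</dev/tcp/10.0.2.5/10250' && echo open || echo closed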
# If the connection succeeds, also try calling the actual kubelet API as below.
# The token used by metrics-server is mounted at the path below.
$ cat /run/secrets/kubernetes.io/serviceaccount/token
eyJhbGciOiJSUzI1NiIsImtpZCI6IjNqSm8xaXJ0MDZsaGxjdzVndWozY1A5VXBGbTdwX3VDUzBpd0J2a3ItR0EifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJtZXRyaWNzLXNlcnZlci10b2tlbi01bTdtdCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJtZXRyaWNzLXNlcnZlciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImI4NGZjMmRjLTQ1MDItNGJhNi1iNWE5LWEwMjA0NmVhOTdjNiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTptZXRyaWNzLXNlcnZlciJ9.LPzyvfQiT294NE-53AaVDkR9SV-AKJhs62g0LAX3iril2H3wvqfF2w6h0vz5SpZhVSLC9rKEHClbDSF1w88rdGr6bn3R4dlmogzb6nw2N1dcCHR8LnDlA2AbZsSBYAYrIWpYIV1mxu4r60HFPoGE3JbpnRxKeC3KKXEfhnOILDulox_xNyvLd46_T4wZqglwqJvo-Ogkl8GBlw8-kRr04_TXB1hrTuDCGfRnNpb7RGcBVHlsIq_qZFXMsWEGp_pGf24_nYQ5w-dOWlKPMeoZ44BfVS_mas6ZFdraFoiCdPXlNC3GeeN0t1n4fbix1VTxxJtsLCcwcY8aG3THCC0PHw
$ curl -k https://10.0.2.5:10250/stats/summary?only_cpu_and_memory=true -H 'Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IjNqSm8xaXJ0MDZsaGxjdzVndWozY1A5VXBGbTdwX3VDUzBpd0J2a3ItR0EifQ.eyJpc3MiOiJdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJtZXRyaWNzLXNlcnZlci10b2tlbi01bTdtdCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJtZXRyaWNzLXNlcnZlciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImI4NGZjMmRjLTQ1MDItNGJhNi1iNWE5LWEwMjA0NmVhOTdjNiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTptZXRyaWNzLXNlcnZlciJ9.LPzyvfQiT294NE-53AaVDkR9SV-AKJhs62g0LAX3iril2H3wvqfF2w6h0vz5SpZhVSLC9rKEHClbDSF1w88rdGr6bn3R4dlmogzb6nw2N1dcCHR8LnDlA2AbZsSBYAYrIWpYIV1mxu4r60HFPoGE3JbpnRxKeC3KKXEfhnOILDulox_xNyvLd46_T4wZqglwqJvo-Ogkl8GBlw8-kRr04_TXB1hrTuDCGfRnNpb7RGcBVHlsIq_qZFXMsWEGp_pGf24_nYQ5w-dOWlKPMeoZ44BfVS_mas6ZFdraFoiCdPXlNC3GeeN0t1n4fbix1VTxxJtsLCcwcY8aG3THCC0PHw'
{
  "node": {
    "nodeName": "k8s-01",
    "systemContainers": [
      {
        "name": "pods",
        "startTime": "2020-08-06T23:38:19Z",
        "cpu": {
          "time": "2020-08-11T06:07:17Z",
          "usageNanoCores": 66319281,
          "usageCoreNanoSeconds": 34786278186285
        },
        "memory": {
          "time": "2020-08-11T06:07:17Z",
          "availableBytes": 3032084480,
...
# Output like this means everything is working.
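For convenience, the token can be read into a shell variable instead of being pasted inline; a minimal sketch, run inside the netbox pod:
# Read the mounted serviceaccount token and pass it in the Authorization header.
$ TOKEN=$(cat /run/secrets/kubernetes.io/serviceaccount/token)
$ curl -k -H "Authorization: Bearer $TOKEN" "https://10.0.2.5:10250/stats/summary?only_cpu_and_memory=true"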
Resolve
If you have opened all the ports in use in firewalld and the symptom persists, in most cases it is because masquerade
has not been added/applied in firewalld.
Resolve it by enabling masquerade, for example as below.
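A minimal sketch of enabling masquerade with firewalld; run it on every node, and note that it assumes the default zone:
# Permanently enable masquerading, then reload to apply.
$ sudo firewall-cmd --permanent --add-masquerade
$ sudo firewall-cmd --reload
# Should print "yes" once applied.
$ sudo firewall-cmd --query-masquerade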