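In our company's Kubernetes deployment, scripts launch a large number of pods, including the monitoring stack that every environment needs. For Kubernetes, Prometheus is the most natural monitoring choice, with Grafana used to display the dashboards.
Grafana does not ship with a Kubernetes plugin by default, so it has to be downloaded manually.
First, find the grafana pod in the project, then enter the pod to download the plugin.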
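Proxy issue
In our company environment, the external network is reached through a proxy. The pod IP had been added to no_proxy in the proxy settings, which means traffic for that pod_ip does not go through the proxy, so the pod cannot reach the internet. Downloading the Kubernetes plugin requires internet access, so the pod_ip has to be removed from no_proxy in the host's proxy settings.
Find the subnet of the pod_ip
The following command shows that the grafana pod_ip is in the 10.16.0.0/16 subnet: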
[root@controller1-gtc126 ~]# kubectl get pods -n {namespace} -o wide
NAME READY STATUS RESTARTS AGE IP NODE
cadvisor-p58h6 2/2 Running 2 51d 10.16.0.21 worker1-gtc4
collectd-r6j7w 2/2 Running 2 51d 10.238.149.4 worker1-gtc4
custom-metrics-apiserver-54699b845f-twspx 1/1 Running 1 51d 10.16.0.13 controller1-gtc126
grafana-6b79c984b-6bl7w 2/2 Running 0 8m12s 10.16.0.35 controller1-gtc126
otel-collector-7d5b75bbdf-6mzld 2/2 Running 2 51d 10.16.0.11 worker1-gtc4
prometheus-node-exporter-5hzg5 1/1 Running 5 51d 10.16.0.20 worker1-gtc4
prometheus-server-76c96b9497-bbnk2 3/3 Running 3 51d 10.16.0.10 controller1-gtc126
telemetry-aware-scheduling-68467c4ccd-plclb 2/2 Running 3 51d 10.16.0.14 controller1-gtc126
telemetry-collector-certs-xrqtg 0/1 Completed 0 51d 10.16.0.12 worker1-gtc4
telemetry-node-certs-dvnw8 1/1 Running 1 51d 10.16.0.23 worker1-gtc4
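To avoid reading the IP off the table by eye, the column can be picked out with awk. This is only a sketch: pod_ip_from_wide is a hypothetical helper name, and it assumes the column layout shown above, where IP is the sixth whitespace-separated field. The sample rows are copied from the listing so the snippet runs without a cluster.

```shell
# pod_ip_from_wide is a hypothetical helper: it reads `kubectl get pods -o wide`
# output on stdin and prints the IP of every pod whose name starts with $1.
# Assumption: IP is the sixth field, as in the listing above.
pod_ip_from_wide() {
    awk -v prefix="$1" 'NR > 1 && index($1, prefix) == 1 { print $6 }'
}

# Header plus two rows copied from the listing above.
sample='NAME READY STATUS RESTARTS AGE IP NODE
grafana-6b79c984b-6bl7w 2/2 Running 0 8m12s 10.16.0.35 controller1-gtc126
prometheus-server-76c96b9497-bbnk2 3/3 Running 3 51d 10.16.0.10 controller1-gtc126'

# Prints 10.16.0.35 for the sample data.
printf '%s\n' "$sample" | pod_ip_from_wide grafana
```

Against a live cluster the same helper would be fed from `kubectl get pods -n {namespace} -o wide | pod_ip_from_wide grafana`.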
Then simply remove this subnet from the no_proxy setting on the host.
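As a rough sketch of that edit, the snippet below drops one entry from a comma-separated no_proxy value in the current shell. The helper name remove_no_proxy_entry and the sample list are illustrative assumptions; where no_proxy is actually defined on the host (shell profile, container runtime or kubelet service configuration) depends on the environment and is not covered here.

```shell
# remove_no_proxy_entry is a hypothetical helper, not a standard tool.
# It prints the comma-separated list $1 with every exact match of $2 removed.
remove_no_proxy_entry() {
    local list="$1" entry="$2"
    printf '%s' "$list" | tr ',' '\n' | grep -Fxv -- "$entry" | paste -sd, -
}

# Example value only; substitute the host's real setting.
no_proxy="localhost,127.0.0.1,10.16.0.0/16,.example.com"
no_proxy="$(remove_no_proxy_entry "$no_proxy" "10.16.0.0/16")"

# Many tools read the upper-case variant, so keep both in sync.
export no_proxy NO_PROXY="$no_proxy"
echo "$no_proxy"
```

This only changes the running shell. Services that were started with the old value keep it until they are reconfigured and restarted.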
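Find the container_name
A pod can contain one container or several. With a single container you can enter the pod directly; with several you need to specify which container to enter, otherwise you land in the default (first) container.
In our environment the pod has multiple containers, so first check how many containers it has and what they are called.
Use the following command. The output is long, but what to look for is simple (I have omitted much of the irrelevant output): start reading from the Init Containers: section, and the line directly above each Container ID is a container_name.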
[root@controller1-gtc126 ~]# kubectl describe -n {namespace} pod {grafana_pod_name}
...
Controlled By: ReplicaSet/grafana-6b79c984b
Init Containers:
init-chown-data:
Container ID: docker://680f077ed9cd9e8564ec1c335e368db4ec593b7b7df286c3c886315cf4a93fec
Image: busybox:1.31.1
Image ID: docker-pullable://busybox@sha256:95cf004f559831017cdf4628aaf1bb30133677be8702a8c5f2994629f637a209
Port: <none>
Host Port: <none>
Command:
chown
-R
472:472
/var/lib/grafana
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 10 Oct 2020 14:07:31 +0800
Finished: Sat, 10 Oct 2020 14:07:31 +0800
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/lib/grafana from storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from grafana-token-2bd8f (ro)
grafana-sc-datasources:
Container ID: docker://7ffe874f5e51fbf974443c0bc073b1d3c9e182137e5977b9a9e67bebdbc960df
Image: kiwigrid/k8s-sidecar:0.1.151
Image ID: docker-pullable://kiwigrid/k8s-sidecar@sha256:7b98eecdf6d117b053622e9f317c632a4b2b97636e8b2e96b311a5fd5c68d211
Port: <none>
Host Port: <none>
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 10 Oct 2020 14:07:33 +0800
Finished: Sat, 10 Oct 2020 14:07:33 +0800
Ready: True
Restart Count: 0
Environment:
METHOD: LIST
LABEL: grafana_datasource
FOLDER: /etc/grafana/provisioning/datasources
RESOURCE: both
Mounts:
/etc/grafana/provisioning/datasources from sc-datasources-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from grafana-token-2bd8f (ro)
Containers:
grafana-sc-dashboard:
Container ID: docker://7ab8d7072125c63686deee6bdca2092de162318fbb8f2a81603d8c7c9fa78998
Image: kiwigrid/k8s-sidecar:0.1.151
Image ID: docker-pullable://kiwigrid/k8s-sidecar@sha256:7b98eecdf6d117b053622e9f317c632a4b2b97636e8b2e96b311a5fd5c68d211
Port: <none>
Host Port: <none>
State: Running
Started: Sat, 10 Oct 2020 14:07:34 +0800
Ready: True
Restart Count: 0
Environment:
METHOD:
LABEL: grafana_dashboard
FOLDER: /tmp/dashboards
RESOURCE: both
NAMESPACE: true
Mounts:
/tmp/dashboards from sc-dashboard-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from grafana-token-2bd8f (ro)
grafana:
Container ID: docker://65a9a255ea8ef2ccd14fdb0dc21c4245d14995b5981d0182824510cbcd8e0c67
Image: grafana/grafana:7.0.3
Image ID: docker-pullable://grafana/grafana@sha256:d72946c8e5d57a9a121bcc3ae8e4a8ccab96960d81031d18a4c31ad1f7aea03e
Ports: 80/TCP, 3000/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Sat, 10 Oct 2020 14:07:34 +0800
Ready: True
Restart Count: 0
Liveness: http-get http://:3000/api/health delay=60s timeout=30s period=10s #success=1 #failure=10
Readiness: http-get http://:3000/api/health delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
GF_SECURITY_ADMIN_USER: <set to the key 'admin-user' in secret 'grafana'> Optional: false
GF_SECURITY_ADMIN_PASSWORD: <set to the key 'admin-password' in secret 'grafana'> Optional: false
Mounts:
/etc/grafana/grafana.ini from config (rw,path="grafana.ini")
/etc/grafana/provisioning/dashboards/sc-dashboardproviders.yaml from sc-dashboard-provider (rw,path="provider.yaml")
/etc/grafana/provisioning/datasources from sc-datasources-volume (rw)
/tmp/dashboards from sc-dashboard-volume (rw)
/var/lib/grafana from storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from grafana-token-2bd8f (ro)
Conditions:
...
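The container_names obtained from the output above are: init-chown-data and grafana-sc-datasources (init containers), and grafana-sc-dashboard and grafana (regular containers).
Because the grafana service listens on port 3000, just look for the entry that carries port information, Ports: 80/TCP, 3000/TCP. The container_name of that entry is the grafana container.
The Events section also conveniently shows the containers being created, four in total: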
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned telemetry/grafana-6b79c984b-6bl7w to controller1-gtc126
Normal Pulled 17m kubelet, controller1-gtc126 Container image "busybox:1.31.1" already present on machine
Normal Created 17m kubelet, controller1-gtc126 Created container init-chown-data
Normal Started 17m kubelet, controller1-gtc126 Started container init-chown-data
Normal Pulled 17m kubelet, controller1-gtc126 Container image "kiwigrid/k8s-sidecar:0.1.151" already present on machine
Normal Created 17m kubelet, controller1-gtc126 Created container grafana-sc-datasources
Normal Started 17m kubelet, controller1-gtc126 Started container grafana-sc-datasources
Normal Pulled 17m kubelet, controller1-gtc126 Container image "kiwigrid/k8s-sidecar:0.1.151" already present on machine
Normal Created 17m kubelet, controller1-gtc126 Created container grafana-sc-dashboard
Normal Started 17m kubelet, controller1-gtc126 Started container grafana-sc-dashboard
Normal Pulled 17m kubelet, controller1-gtc126 Container image "grafana/grafana:7.0.3" already present on machine
Normal Created 17m kubelet, controller1-gtc126 Created container grafana
Normal Started 17m kubelet, controller1-gtc126 Started container grafana
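A shorter way to get the same names is to query the pod spec with kubectl's jsonpath output instead of reading the describe output. This is a sketch, not the method used above: list_containers is a hypothetical helper name, while .spec.initContainers and .spec.containers are standard Pod fields.

```shell
# list_containers is a hypothetical helper: it prints the init containers and
# the regular containers of one pod, one name per line.
list_containers() {
    local ns="$1" pod="$2"
    echo "init containers:"
    kubectl get pod "$pod" -n "$ns" \
        -o jsonpath='{range .spec.initContainers[*]}{.name}{"\n"}{end}'
    echo "containers:"
    kubectl get pod "$pod" -n "$ns" \
        -o jsonpath='{range .spec.containers[*]}{.name}{"\n"}{end}'
}
```

It would be called as `list_containers telemetry grafana-6b79c984b-6bl7w`, using the namespace and pod name that appear in the Events output.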
Enter grafana and download the plugin
[root@controller1-gtc126 ~]# kubectl exec -it -n {namespace} {pod_name} --container {container_name} -- bash
bash-5.0$ grafana-cli plugins install grafana-kubernetes-app
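# If the download times out here, check the proxy setup: has the pod_ip been removed from no_proxy, or does a proxy need to be configured inside the container?
# The following output means the download succeeded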
installing grafana-kubernetes-app @ 1.0.1
from: https://grafana.com/api/plugins/grafana-kubernetes-app/versions/1.0.1/download
into: /var/lib/grafana/plugins
✔ Installed grafana-kubernetes-app successfully
Restart grafana after installing plugins . <service grafana-server restart>
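If the container itself needs a proxy, one option is to pass the proxy variables only to the install command. The sketch below assumes that grafana-cli honors the standard http_proxy/https_proxy environment variables and that the image provides `env`; neither is confirmed by the steps above. install_plugin_via_proxy is a hypothetical helper name, and the proxy URL is site-specific.

```shell
# install_plugin_via_proxy is a hypothetical helper.
# Arguments: namespace, pod name, container name, proxy URL.
install_plugin_via_proxy() {
    local ns="$1" pod="$2" container="$3" proxy="$4"
    # env sets the proxy variables for this single grafana-cli invocation only.
    kubectl exec -n "$ns" "$pod" -c "$container" -- \
        env http_proxy="$proxy" https_proxy="$proxy" \
        grafana-cli plugins install grafana-kubernetes-app
}
```

A call might look like `install_plugin_via_proxy telemetry grafana-6b79c984b-6bl7w grafana http://proxy.example.com:3128`, where the proxy address is a placeholder.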
Restart the grafana container
# Restarting the pod is enough
kubectl get pods -n {namespace} {pod_name} -o yaml | kubectl replace --force -f -
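Because the describe output shows the pod is controlled by a ReplicaSet, deleting the pod should also lead to a replacement being created. A minimal sketch, assuming the plugin directory under /var/lib/grafana sits on a volume that survives pod recreation (the `storage` mount in the describe output); restart_pod is a hypothetical helper name.

```shell
# restart_pod is a hypothetical helper. Arguments: namespace, pod name.
restart_pod() {
    local ns="$1" pod="$2"
    # The owning ReplicaSet is expected to start a replacement pod under a new name.
    kubectl delete pod "$pod" -n "$ns"
    # List the pods so the replacement and its status can be checked.
    kubectl get pods -n "$ns" -o wide
}
```

If the volume is not persistent, the installed plugin would be lost with either restart method.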
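Verify the plugin
Access grafana through the svc used in the project and check whether the plugin was imported successfully.
Settings > Plugins
View the plugin
The plugin has to be enabled manually: open the plugin page and click Enable.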
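The installation can also be checked from the command line with `grafana-cli plugins ls`, which lists the installed plugins. The wrapper below is only a sketch; list_plugins is a hypothetical helper name.

```shell
# list_plugins is a hypothetical helper.
# Arguments: namespace, pod name, container name.
list_plugins() {
    local ns="$1" pod="$2" container="$3"
    kubectl exec -n "$ns" "$pod" -c "$container" -- grafana-cli plugins ls
}
```

After a restart the pod has a new name, so look it up again with `kubectl get pods -n {namespace}` before calling `list_plugins {namespace} {pod_name} grafana`.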