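Which metrics are needed to monitor Kubernetes with Prometheus
Node resource metrics
Number of nodes
Number of pods
Resource object status
Container resource utilization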
…
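Node resource metrics can be collected with node_exporter; the other metrics come from cAdvisor and kube-state-metrics. cAdvisor is already built into the kubelet, so its metrics can be fetched directly from the kubelet's API address. For details, see the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config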
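Authorize Prometheus to access the Kubernetes API
Prometheus has to query the Kubernetes API to obtain the metrics.
I already have a 3-node Kubernetes cluster set up, so I go straight to the authorization step.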
$ vim prom_rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
secrets:
  - name: prometheus-secrets
---
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-secrets
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: prometheus # name of the ServiceAccount this token belongs to
type: kubernetes.io/service-account-token
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - nodes/metrics
      - services
      - endpoints
      - pods
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: kube-system
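After applying the manifest with kubectl apply -f prom_rbac.yaml, it can help to confirm that the binding took effect before moving on. The following is a minimal sketch, assuming kubectl is already configured against the cluster with an account that is allowed to impersonate service accounts. The function name check_prom_rbac is hypothetical and only used for illustration.

```shell
# Identity of the ServiceAccount created in prom_rbac.yaml
SA_USER="system:serviceaccount:kube-system:prometheus"

# Ask the API server whether the ServiceAccount holds the permissions
# granted by the ClusterRole. Each command prints "yes" or "no".
check_prom_rbac() {
  kubectl auth can-i list nodes --as="$SA_USER"
  kubectl auth can-i get nodes --subresource=metrics --as="$SA_USER"
  kubectl auth can-i get /metrics --as="$SA_USER"
}

# Usage (after kubectl apply -f prom_rbac.yaml):
#   check_prom_rbac
```

If any line prints "no", the ClusterRoleBinding subject (name and namespace) is the first thing worth checking.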
Get the Kubernetes cluster token
$ kubectl describe secrets -n kube-system prometheus-secrets | grep "token:"
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IklpUm9iemdBVXpiX3EyajRvT0Y2QzluUGo4aVZzUVdPMUs4Vm1Sa2VCWU0ifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJwcm9tZXRoZXVzLXNlY3JldHMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoicHJvbWV0aGV1cyIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjFkNTA2MTZiLTIzZmMtNGU0Yy1hMDM2LWU3ZmJhNTA3ODU0ZSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTpwcm9tZXRoZXVzIn0.zwmhJ3HZg3U0wfoQQFibqBvNhP7n7qpHiYQutid847kkgHQsKwJ62Cam5PeuYqTgvi8ctkPCj1WhzN6njkr_j9mv9_qGpqYBdRNi0rIOUULB-8BtD7xjnUX1qAa6YOIZzkse81EX1pOw5oSxkwK4y8Fb4BwJg3KRFmfyMowjZC-vAmtBKyxkZToBvyKCB6cNuECX6L2ZbtjgNQ18vFFAuWVqDwENlieEbMWwKc8kJeq_pt0qsh9_TVriKZPLz8jPunHfaP-bM0JAJaj3jTVuGdgSBZ8VouGHWFQWUoWexmxLAR9Mk5msfwy1K25bOPACbM7M_bEpoQDgA-a-cMfJVA
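Prometheus can now use this token to access the Kubernetes cluster.
Save the token value (without the "token:" prefix) to a file on the server where Prometheus runs:
/opt/monitoring/prometheus/prom_k8s_token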
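As an alternative to copying the token out of the describe output by hand, the token can be read from the Secret and written to the file in one step. This is a sketch, assuming kubectl access to the cluster and a GNU-style base64 that accepts -d. The helper names save_prom_token and jwt_claims are hypothetical. The second helper decodes the claims of a token, which is a quick way to confirm that it belongs to the prometheus ServiceAccount.

```shell
SECRET_NAME=prometheus-secrets
SECRET_NS=kube-system

# Write the raw token (no "token:" prefix, no trailing newline) to file $1.
save_prom_token() {
  kubectl get secret -n "$SECRET_NS" "$SECRET_NAME" -o jsonpath='{.data.token}' \
    | base64 -d > "$1"
  chmod 600 "$1"
}

# Print the JSON claims (second dot-separated segment) of the JWT in $1.
jwt_claims() {
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the padding; restore it so base64 -d accepts the input
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Usage:
#   save_prom_token /opt/monitoring/prometheus/prom_k8s_token
#   jwt_claims "$(cat /opt/monitoring/prometheus/prom_k8s_token)"
```

For the token created above, the "sub" claim in the decoded output should read system:serviceaccount:kube-system:prometheus.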
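Kubernetes service discovery
Write a Kubernetes-based service discovery job in the Prometheus configuration.
Scrape the metrics of the cAdvisor built into the kubelet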
# prometheus.yml (under scrape_configs)
  - job_name: "kubernetes_container"
    metrics_path: /metrics
    scheme: https
    bearer_token_file: /opt/monitoring/prometheus/prom_k8s_token
    tls_config:
      insecure_skip_verify: true
    kubernetes_sd_configs:
      #### node ####
      - role: node
        api_server: https://192.168.1.11:6443
        bearer_token_file: /opt/monitoring/prometheus/prom_k8s_token
        tls_config:
          insecure_skip_verify: true
    relabel_configs:
      # Use the part captured by (.+) as the new label name; the label value is unchanged
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      # Scrape the kubelet's /metrics/cadvisor endpoint directly
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /metrics/cadvisor
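To make the labelmap rule above more concrete, the renaming it performs can be imitated with sed. This is only an illustration of the regex capture, not how Prometheus implements relabeling, and labelmap_name is a hypothetical helper. A node label such as kubernetes.io/hostname is exposed by service discovery as __meta_kubernetes_node_label_kubernetes_io_hostname, and labelmap copies its value to a label named after the captured group.

```shell
# Print the target label name that the labelmap rule would produce for the
# discovered label name in $1; print nothing if the regex does not match.
labelmap_name() {
  printf '%s\n' "$1" \
    | sed -n 's/^__meta_kubernetes_node_label_\(..*\)$/\1/p'
}

# Usage:
#   labelmap_name __meta_kubernetes_node_label_kubernetes_io_hostname
#   labelmap_name __address__
```

The first call prints kubernetes_io_hostname; the second prints nothing because __address__ does not match the regex and is therefore not copied.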
Then restart Prometheus.
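Since a syntax error in prometheus.yml would keep Prometheus from starting, validating the file first can save a failed restart. The sketch below uses promtool, which ships with Prometheus, and the HTTP reload endpoint, which only works if Prometheus was started with --web.enable-lifecycle. The address http://localhost:9090 and the function name validate_and_reload are assumptions for illustration; without the lifecycle flag, run only the promtool step and then restart as usual.

```shell
# Assumed Prometheus address; override by exporting PROM_URL
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Validate the config file in $1 and trigger a reload only if it is valid.
validate_and_reload() {
  promtool check config "$1" || return 1
  curl -fsS -X POST "${PROM_URL}/-/reload"
}

# Usage (the config path is an assumption):
#   validate_and_reload /opt/monitoring/prometheus/prometheus.yml
```

Afterwards the kubernetes_container job should list one target per node on the Targets page of the Prometheus UI.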
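Deploy kube-state-metrics
Reference: https://github.com/starsliao/Prometheus/tree/master/kubernetes
The dashboards in that repository can be used as well.
I changed the Service to type NodePort here, because its address has to be written into the Prometheus configuration.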
# prometheus.yml (under scrape_configs)
  - job_name: 'kube-state-metrics'
    static_configs:
      - targets: ['192.168.1.11:32222'] # address of the kube-state-metrics NodePort service
Then import the dashboard from the reference above.
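Before relying on the dashboard, a quick look at the raw endpoint shows whether kube-state-metrics is reachable through the NodePort. The following sketch counts the series of one metric in the text exposition format read from stdin. count_series is a hypothetical helper; kube_node_info and kube_pod_info are metrics exported by kube-state-metrics with one series per node and per pod, which lines up with the node and pod counts listed at the top.

```shell
# Count the sample lines of metric $1 in Prometheus text format read from
# stdin. Comment lines (# HELP, # TYPE) are not counted because they start
# with "#".
count_series() {
  grep -c "^$1[{ ]"
}

# Usage:
#   curl -s http://192.168.1.11:32222/metrics | count_series kube_node_info
#   curl -s http://192.168.1.11:32222/metrics | count_series kube_pod_info
```

On the 3-node cluster used here, the first command would be expected to report 3.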