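Chapter 3: Application Monitoring
About application monitoring
Prometheus collects all of its metrics by scraping a metrics endpoint over HTTP, so an application only needs to expose such an endpoint and Prometheus will pull the data from it at a regular interval.
With the rise of containers and Kubernetes, many services now ship with a built-in metrics endpoint. For applications that do not provide one, Prometheus also lists many ready-to-use exporters that collect the metrics on their behalf, such as redis_exporter and mysqld_exporter.
Monitoring applications with a built-in /metrics endpoint
CoreDNS in Kubernetes ships with its own metrics endpoint, so it makes a good first target. Its configuration shows that the port exposed for Prometheus scraping is 9153.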
[root@node1 prom]# kubectl -n kube-system describe cm coredns
Name:         coredns
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>
Data
====
Corefile:
----
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
       ttl 30
    }
    prometheus :9153    # built-in endpoint that exposes metrics to Prometheus
    forward . /etc/resolv.conf {
       max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
Events:  <none>
Look up the CoreDNS Pod addresses
[root@node1 prom]# kubectl -n kube-system get pod -l k8s-app=kube-dns -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP          NODE    NOMINATED NODE   READINESS GATES
coredns-6d56c8448f-ckwhg   1/1     Running   5          10d   10.2.0.17   node1   <none>           <none>
coredns-6d56c8448f-rvmdf   1/1     Running   5          10d   10.2.0.16   node1   <none>           <none>
Access the /metrics endpoint directly
[root@node1 prom]# curl -I 10.2.0.17:9153/metrics
HTTP/1.1 200 OK
Content-Type: text/plain; version=0.0.4; charset=utf-8
Date: Wed, 11 Aug 2021 07:14:25 GMT
[root@node1 prom]# curl 10.2.0.17:9153/metrics
# HELP coredns_build_info A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built.
# TYPE coredns_build_info gauge
coredns_build_info{goversion="go1.14.4",revision="f59c03d",version="1.7.0"} 1
# HELP coredns_cache_entries The number of elements in the cache.
# TYPE coredns_cache_entries gauge
coredns_cache_entries{server="dns://:53",type="denial"} 12
coredns_cache_entries{server="dns://:53",type="success"} 1
# HELP coredns_cache_misses_total The count of cache misses.
# TYPE coredns_cache_misses_total counter
coredns_cache_misses_total{server="dns://:53"} 13
# HELP coredns_dns_request_duration_seconds Histogram of the time (in seconds) each request took.
# TYPE coredns_dns_request_duration_seconds histogram
...........................................
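The output above is the Prometheus text exposition format: "# HELP" and "# TYPE" comment lines followed by sample lines of the form name{labels} value. As a minimal sketch of how to pull one metric family out of that text on the command line, the helper below filters stdin by metric name. The function name metric_samples is hypothetical and not part of any tool.

```shell
# Print only the sample lines of one metric family from exposition text on stdin.
# metric_samples is a hypothetical helper name used for illustration.
metric_samples() {
    # $1 = exact metric name to keep
    awk -v name="$1" '
        /^#/ { next }                      # skip HELP/TYPE comment lines
        {
            split($0, parts, /[{ ]/)       # metric name ends at "{" or a space
            if (parts[1] == name) print
        }
    '
}
```

A possible use against the Pod shown above: curl -s 10.2.0.17:9153/metrics | metric_samples coredns_cache_entries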
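Now that we know the port and have confirmed that it is reachable, we can edit the Prometheus configuration so that it scrapes this service.
Edit the prom-cm configuration file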
[root@node1 prom]# cat prom-cm.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prom
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']
    - job_name: 'coredns'    # job name
      static_configs:        # static target list
      - targets: ['10.2.0.16:9153','10.2.0.17:9153']    # the CoreDNS Pod IPs (not a ClusterIP), taken from the kubectl output above
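Hard-coded Pod IPs stop working as soon as the CoreDNS Pods are recreated with new addresses. One possible alternative, shown here only as a sketch, is to let Prometheus discover the targets through the Kubernetes API. It rests on assumptions that are not part of the setup above: that the CoreDNS Service is named kube-dns with a port named metrics, and that the Prometheus ServiceAccount is allowed to list and watch endpoints, services and pods in kube-system. The file name coredns-sd-job.yml is hypothetical.

```shell
# Write an alternative 'coredns' scrape job that discovers its targets through
# the Kubernetes API instead of fixed Pod IPs.
# Assumptions: Service name "kube-dns", port name "metrics", and suitable RBAC
# for the Prometheus ServiceAccount. The output file name is hypothetical.
cat > coredns-sd-job.yml <<'EOF'
- job_name: 'coredns'
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names: ['kube-system']
  relabel_configs:
  # Keep only the endpoints of the kube-dns Service on its metrics port.
  - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: kube-dns;metrics
    action: keep
EOF
```

The fragment would replace the static 'coredns' job under scrape_configs, indented to match the rest of prometheus.yml.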
Apply the updated prom-cm resource
[root@node1 prom]# kubectl apply -f prom-cm.yml
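Because hot reload was enabled in our Prometheus setup (the /-/reload endpoint only works when Prometheus runs with the --web.enable-lifecycle flag), the new configuration can be loaded online without restarting the Pod.
Hot-reload the Prometheus configuration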
[root@node1 prom]# kubectl -n prom get pod -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP          NODE    NOMINATED NODE   READINESS GATES
prometheus-796566c67c-lhrns   1/1     Running   0          22m   10.2.2.86   node2   <none>           <none>
# Note: wait a moment first, because a ConfigMap update takes some time to propagate into the Pod
[root@node1 prom]# curl -X POST "http://10.2.2.86:9090/-/reload"
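Reloading too early simply re-reads the old file, so instead of guessing how long to wait, the reload can be gated on a check that the new configuration has arrived. The helper below is a minimal sketch of such a polling loop; wait_for is a hypothetical name.

```shell
# wait_for TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 1 if CMD never succeeds.
wait_for() {
    tries=$1
    delay=$2
    shift 2
    n=1
    until "$@"; do
        if [ "$n" -ge "$tries" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

A possible use, assuming the Deployment is named prometheus and the ConfigMap is mounted at /etc/prometheus/prometheus.yml (neither is stated above): wait_for 24 5 kubectl -n prom exec deploy/prometheus -- grep -q coredns /etc/prometheus/prometheus.yml && curl -X POST "http://10.2.2.86:9090/-/reload"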
Check the targets discovered by Prometheus (the Targets page under Status in the web UI)
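The same check can be made from the command line through the Prometheus HTTP API, whose /api/v1/targets endpoint returns every active target with a health field. The helper below is a rough sketch that counts healthy targets in such a response; count_up_targets is a hypothetical name, and the text match assumes the compact JSON that Prometheus returns rather than parsing it properly.

```shell
# Count targets reported as healthy in a /api/v1/targets response read from stdin.
# count_up_targets is a hypothetical helper. It matches the literal text
# "health":"up", which assumes compact JSON without spaces around the colon.
count_up_targets() {
    awk '{ n += gsub(/"health":"up"/, "&") } END { print n + 0 }'
}
```

A possible use: curl -s http://10.2.2.86:9090/api/v1/targets | count_up_targets. With the configuration above there are three targets in total: Prometheus itself and the two CoreDNS Pods.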

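Monitoring with an exporter
As mentioned above, some applications ship with their own metrics endpoint. For those that do not, we can use an exporter. A large number of exporters are already available; the official site lists them at the following address:
https://prometheus.io/docs/instrumenting/exporters/
The following uses the MySQL exporter as an example. The approach is to run an exporter as an additional container in every MySQL Pod, where it collects MySQL's metrics.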
cat >mysql-prom.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-dp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "123456"
      - name: mysql-exporter
        # Pinned: releases from v0.15.0 on no longer read DATA_SOURCE_NAME
        image: prom/mysqld-exporter:v0.14.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9104
        env:
        - name: DATA_SOURCE_NAME
          value: "root:123456@(localhost:3306)/"
---
kind: Service
apiVersion: v1
metadata:
  name: mysql-svc
spec:
  selector:
    app: mysql
  ports:
  - name: mysql
    port: 3306
    targetPort: 3306
  - name: mysql-prom
    port: 9104
    targetPort: 9104
EOF
Apply the manifest and check the result
[root@node1 prom]# kubectl apply -f mysql-prom.yaml
deployment.apps/mysql-dp created
service/mysql-svc created
[root@node1 prom]# kubectl get pod
NAME                          READY   STATUS    RESTARTS   AGE
mysql-dp-79b48cff96-m96bz     2/2     Running   0          73s
[root@node1 prom]# kubectl get svc
NAME         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)             AGE
mysql-svc    ClusterIP   10.1.213.31   <none>        3306/TCP,9104/TCP   4m9s
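Before adding the exporter to Prometheus, it is worth confirming that it can actually reach MySQL. mysqld_exporter exposes a mysql_up gauge that is 1 when the database answered the last scrape. The helper below is a small sketch that checks for it in exporter output; exporter_sees_mysql is a hypothetical name.

```shell
# Succeed only if the exporter output on stdin reports a working MySQL connection.
# exporter_sees_mysql is a hypothetical helper; mysql_up is the exporter's gauge
# that is 1 when MySQL was reachable during the scrape.
exporter_sees_mysql() {
    grep -q '^mysql_up 1$'
}
```

A possible use against the Service shown above: curl -s 10.1.213.31:9104/metrics | exporter_sees_mysql && echo "exporter can reach MySQL"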
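Update the Prometheus configuration. Note: Prometheus and MySQL run in different namespaces (prom and default), so the scrape target must be given as the Service name plus the namespace, here mysql-svc.default.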
cat > prom-cm.yml << 'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prom
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']
    - job_name: 'coredns'
      static_configs:
      - targets: ['10.2.0.16:9153','10.2.0.17:9153']
    - job_name: 'mysql'
      static_configs:
      - targets: ['mysql-svc.default:9104']
EOF
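The target mysql-svc.default in the configuration above is the short service.namespace form, which resolves through the DNS search path of the Prometheus Pod. The fully qualified form appends svc and the cluster domain, which is cluster.local in the Corefile shown earlier. As a small illustration, the hypothetical helper below builds that full name.

```shell
# svc_fqdn SERVICE NAMESPACE PORT [CLUSTER_DOMAIN]
# Print the fully qualified scrape target for a Service.
# svc_fqdn is a hypothetical helper; the cluster domain defaults to cluster.local.
svc_fqdn() {
    printf '%s.%s.svc.%s:%s\n' "$1" "$2" "${4:-cluster.local}" "$3"
}
```

For example, svc_fqdn mysql-svc default 9104 yields a name that could be used as the target in place of the short form.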
Apply the updated configuration
[root@node1 prom]# kubectl apply -f prom-cm.yml
configmap/prometheus-config configured
[root@node1 prom]# curl -X POST "http://10.2.2.86:9090/-/reload"
Check the result in Prometheus
Updated: 2024-09-21 15:52:29