2019-07-26-Kubernetes Applications
Applications
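1. Installing metrics-server for monitoring
1.1 Introduction
metrics-server is a component that collects resource metrics for the nodes and containers in a Kubernetes cluster. It serves them through the API server as the Metrics API (metrics.k8s.io), which is what kubectl top reads. It reports CPU and memory usage only; disk and network usage are not part of the Metrics API and need a fuller monitoring stack such as Prometheus.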
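As a rough illustration of what sits behind kubectl top, the sketch below reads the Metrics API directly and converts a memory value. It assumes metrics-server is already running and kubectl points at the cluster; the helper names raw_node_metrics and ki_to_mi are invented for this example and are not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Fetch raw node metrics (JSON) from the Metrics API via the API server.
raw_node_metrics() {
  kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
}

# Convert a memory quantity given in Ki (e.g. "204800Ki") to whole MiB.
ki_to_mi() {
  ki="${1%Ki}"            # strip the "Ki" suffix
  echo $(( ki / 1024 ))   # integer division, rounds down
}
```

Calling raw_node_metrics prints one JSON item per node with a usage block; memory values in that block are often expressed in Ki, which is what ki_to_mi is meant for.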
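1.2 Installation steps
    # Download the upstream manifest
    wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml -O metrics-server.yaml

    # Skip kubelet certificate verification; the text after a\ must be indented
    # to match the existing entries under "args:" in the manifest
    sed -i '/- args:/a\        - --kubelet-insecure-tls' metrics-server.yaml

    # Pull the image through the DaoCloud mirror instead of registry.k8s.io
    sed -i 's#registry.k8s.io/metrics-server#docker.m.daocloud.io/rancher/mirrored-metrics-server#g' metrics-server.yaml

    # Lower the memory request from 200Mi to 80Mi
    sed -i 's#memory: 200Mi#memory: 80Mi#g' metrics-server.yaml

    kubectl apply -f metrics-server.yaml

    # Check that metrics are being served
    kubectl top nodes
    kubectl top pods -A --sort-by=memory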
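Right after kubectl apply, kubectl top usually fails for a short while because no metrics have been collected yet. A minimal sketch of a retry wrapper for that waiting period follows; retry is a hypothetical helper written for this post, not a kubectl feature.

```shell
#!/usr/bin/env bash
# retry <attempts> <delay-seconds> <command...>
# Hypothetical helper: re-runs a command until it succeeds or attempts run out.
retry() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0                            # stop at the first success
    i=$(( i + 1 ))
    [ "$i" -le "$attempts" ] && sleep "$delay"  # no sleep after the last try
  done
  return 1
}
```

For example, retry 12 10 kubectl top nodes polls for up to about two minutes. Running kubectl -n kube-system rollout status deployment/metrics-server beforehand shows whether the pod itself has come up, assuming the manifest's default namespace and name were left unchanged.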
1.3 Verification
    # Baseline memory on the host and per pod
    free -h
    kubectl top pods -A --sort-by=memory
    kubectl get pods -A

    # Create a throwaway deployment and scale it up to generate some load
    kubectl create deployment stress --image=nginx:alpine
    kubectl scale deployment stress --replicas=10
    sleep 60
    kubectl get pods

    # Clean up
    kubectl delete deployment stress
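2. Prometheus + Grafana
Full tutorial
2.1 Introduction
prometheus is a monitoring system and time-series database. It exposes metric data through an HTTP API and supports querying and processing that data with a query language (PromQL).
Its architecture centers on the Prometheus server, which periodically scrapes (pulls) metrics over HTTP from targets such as exporters (for example node-exporter and kube-state-metrics), stores them locally, and answers queries. Targets do not send data to the server on their own; short-lived jobs can go through the optional Pushgateway instead.
Coverage grows by adding exporters and scrape targets across the cluster, and high availability is usually achieved by running more than one Prometheus server against the same targets.
grafana is an open-source monitoring and analytics platform. It provides a user-friendly interface for visualizing metrics and building custom dashboards.
It reads from multiple data sources (including Prometheus) and offers a rich set of chart types and alerting features.
Grafana does not collect or store metrics itself: the Grafana server queries the configured data sources when a dashboard is rendered and keeps only its own state (dashboards, users, data source settings).
It can be scaled out for availability by running several Grafana instances that share the same configuration database.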
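2.2 Installation steps
| Option | Characteristics | Resource usage | Recommendation |
| --- | --- | --- | --- |
| kube-prometheus-stack (Helm) | All-in-one bundle: Prometheus Operator, Grafana, various exporters, alerting rules | 1.5G+ | ⭐⭐⭐ (full-featured but heavy) |
| Standalone Prometheus + Grafana | Components combined as needed; lightweight | 500-800M | ⭐⭐⭐⭐⭐ (recommended when resources are tight) |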
2.2.1 Prerequisites
    # Label the worker nodes that should run the monitoring workloads
    kubectl label nodes node01 workload=apps --overwrite
    kubectl label nodes node02 workload=apps --overwrite
    kubectl get nodes --show-labels | grep workload

    # Install Helm 3
    curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
    helm version

    # Namespace for everything below
    kubectl create namespace monitoring
2.2.2.1 Option 1: Install kube-prometheus-stack
    helm install kube-prometheus-stack kube-prometheus-stack \
      --repo https://prometheus-community.github.io/helm-charts --namespace monitoring --create-namespace
2.2.2.2 Option 2: Install Prometheus + Grafana
Install Prometheus
The install command reads prometheus-values.yaml, so create that file (listed after the commands) first.
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update

    helm install prometheus prometheus-community/prometheus \
      -f prometheus-values.yaml \
      -n monitoring
    cat > prometheus-values.yaml <<EOF
    server:
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 256Mi
          cpu: 100m
        limits:
          memory: 512Mi
          cpu: 500m
      retention: "7d"
      persistentVolume:
        enabled: false
      service:
        type: NodePort
        nodePort: 30090

    alertmanager:
      enabled: false

    prometheus-pushgateway:
      enabled: false

    prometheus-node-exporter:
      resources:
        requests:
          memory: 32Mi
          cpu: 50m
        limits:
          memory: 64Mi
          cpu: 100m

    kube-state-metrics:
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 64Mi
          cpu: 50m
        limits:
          memory: 128Mi
          cpu: 200m

    configmapReload:
      prometheus:
        nodeSelector:
          workload: apps
    EOF
Verify Prometheus
    [root@master01 monitor]# kubectl get pods -n monitoring
    NAME                                             READY   STATUS    RESTARTS   AGE
    prometheus-kube-state-metrics-6d7d7d9b78-f6rl9   1/1     Running   0          2m40s
    prometheus-prometheus-node-exporter-25knx        1/1     Running   0          21m
    prometheus-prometheus-node-exporter-lmvpp        1/1     Running   0          21m
    prometheus-prometheus-node-exporter-ndqqw        1/1     Running   0          21m
    prometheus-server-59d754bf78-hbwk8               2/2     Running   0          21m
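Beyond checking pod status, a quick PromQL query over the HTTP API shows whether scraping actually works. The sketch below assumes the NodePort 30090 from prometheus-values.yaml and that curl is available; prom_query_url and prom_query are hypothetical helper names, and NODE_IP stands for the address of any cluster node.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Build the instant-query endpoint from a base URL (trailing slash tolerated).
prom_query_url() {
  echo "${1%/}/api/v1/query"
}

# Run an instant PromQL query and print the JSON response.
# -G sends the URL-encoded data as a GET query string.
prom_query() {
  curl -sG "$(prom_query_url "$1")" --data-urlencode "query=$2"
}
```

For instance, prom_query "http://$NODE_IP:30090" 'up' should return one series per scrape target, where a value of 1 means the last scrape succeeded.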
Install Grafana
The install command reads grafana-values.yaml, so create that file (listed after the commands) first.
    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update

    helm install grafana grafana/grafana \
      -f grafana-values.yaml \
      -n monitoring

    # Alternative to the install above (run one or the other, not both):
    # pull a pinned chart version and install from the local archive
    helm pull grafana/grafana --version 10.5.15
    helm install grafana ./grafana-10.5.15.tgz \
      -f grafana-values.yaml -n monitoring

    kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana

    # Read back the admin password
    kubectl get secret grafana -n monitoring -o jsonpath='{.data.admin-password}' | base64 -d; echo
    cat > grafana-values.yaml <<'EOF'
    nodeSelector:
      workload: apps

    resources:
      requests:
        memory: 150Mi
        cpu: 50m
      limits:
        memory: 400Mi
        cpu: 300m

    persistence:
      enabled: false

    adminPassword: admin123

    service:
      type: NodePort
      nodePort: 30030

    datasources:
      datasources.yaml:
        apiVersion: 1
        datasources:
          - name: Prometheus
            type: prometheus
            url: http://prometheus-server.monitoring.svc.cluster.local
            access: proxy
            isDefault: true

    dashboardProviders:
      dashboardproviders.yaml:
        apiVersion: 1
        providers:
          - name: 'default'
            orgId: 1
            folder: ''
            type: file
            disableDeletion: false
            editable: true
            options:
              path: /var/lib/grafana/dashboards/default

    dashboards:
      default:
        k8s-cluster:
          gnetId: 7249
          revision: 1
          datasource: Prometheus
        node-exporter:
          gnetId: 1860
          revision: 37
          datasource: Prometheus
        k8s-pods:
          gnetId: 6417
          revision: 1
          datasource: Prometheus
    EOF
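Once the Grafana pod is running, its HTTP API offers a quick way to confirm that the server is healthy and that the provisioned Prometheus data source was picked up. This is only a sketch: it assumes the NodePort 30030 and the admin credentials from grafana-values.yaml, and the helper names are made up for this post.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Build a Grafana API URL from a base URL and an API path.
grafana_api_url() {
  echo "${1%/}/api/$2"
}

# Health endpoint; needs no authentication.
grafana_health() {
  curl -s "$(grafana_api_url "$1" health)"
}

# List configured data sources using basic auth (user and password as arguments).
grafana_datasources() {
  curl -s -u "$2:$3" "$(grafana_api_url "$1" datasources)"
}
```

With NODE_IP set to any node's address, grafana_health "http://$NODE_IP:30030" returns a small JSON status document, and grafana_datasources "http://$NODE_IP:30030" admin admin123 should list the data source named Prometheus.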
2.3 Troubleshooting
2.3.1 Problem description
Image pull failure
    [root@master01 monitor]# kubectl get pods -n monitoring
    NAME                                             READY   STATUS             RESTARTS   AGE
    prometheus-kube-state-metrics-5579fcf48f-g7xss   0/1     ErrImagePull       0          17s
    prometheus-kube-state-metrics-6856dd4f55-dnhld   0/1     ImagePullBackOff   0          16m

    # Check which image the Deployment is using
    kubectl get deploy -n monitoring prometheus-kube-state-metrics -o jsonpath='{.spec.template.spec.containers[0].image}'

    [root@master01 monitor]# kubectl describe pod -n monitoring prometheus-kube-state-metrics-5579fcf48f-g7xss
    Events:
      Type     Reason     Age                  From               Message
      ----     ------     ----                 ----               -------
      Normal   Scheduled  11m                  default-scheduler  Successfully assigned monitoring/prometheus-kube-state-metrics-6856dd4f55-dnhld to node02
      Warning  Failed     9m53s (x2 over 10m)  kubelet            Failed to pull image "registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0": failed to pull and unpack image "registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0": failed to resolve reference "registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0": failed to do request: Head "https://europe-west4-docker.pkg.dev/v2/k8s-artifacts-prod/images/kube-state-metrics/kube-state-metrics/manifests/v2.18.0": dial tcp 74.125.20.82:443: connect: connection refused

    # Fix: point the Deployment at the DaoCloud mirror
    [root@master01 monitor]# kubectl set image deployment/prometheus-kube-state-metrics -n monitoring kube-state-metrics=k8s.m.daocloud.io/kube-state-metrics/kube-state-metrics:v2.18.0
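The same registry swap can be written as a small helper, which is handy when several images from registry.k8s.io fail to pull. This is a sketch under the assumption that the mirror keeps the same repository path and tag, as it does for the kube-state-metrics image above; mirror_image is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite a registry.k8s.io image reference to the
# DaoCloud mirror, keeping repository path and tag unchanged.
mirror_image() {
  printf '%s\n' "$1" | sed 's#^registry\.k8s\.io/#k8s.m.daocloud.io/#'
}
```

For example, kubectl set image deployment/prometheus-kube-state-metrics -n monitoring kube-state-metrics="$(mirror_image registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0)" reproduces the fix shown above. Note that kubectl set image only patches the live Deployment, so a later helm upgrade will probably restore the chart's default image; putting the override into prometheus-values.yaml is more durable, with the exact value keys best checked via helm show values prometheus-community/prometheus.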
3. To be continued