2019-07-26-Kubernetes Applications
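1. Installing metrics-server for monitoring
1.1 Introduction
metrics-server is a component that collects resource metrics for the nodes and containers in a Kubernetes cluster. It exposes node and pod CPU and memory usage through the API server (the Metrics API), which is the data kubectl top reads. Disk and network usage are outside its scope and need a fuller monitoring stack such as Prometheus.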
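The Metrics API can also be queried directly, which helps when checking whether metrics-server is serving data at all. The sketch below is a minimal illustration: the run helper and the DRY_RUN variable are hypothetical conveniences (not part of kubectl), and by default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: read raw metrics from the Metrics API served by metrics-server.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build the Metrics API path for a resource kind ("nodes" or "pods").
metrics_api_path() {
  echo "/apis/metrics.k8s.io/v1beta1/$1"
}

run kubectl get --raw "$(metrics_api_path nodes)"
run kubectl get --raw "$(metrics_api_path pods)"
```

With DRY_RUN=0 the responses are JSON documents containing per-node and per-pod CPU and memory usage.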
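1.2 Installation steps
    # Download the official manifest
    wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml -O metrics-server.yaml
    # Skip kubelet certificate verification (test clusters only); the indentation
    # of the inserted line must match the existing args entries
    sed -i '/- args:/a\        - --kubelet-insecure-tls' metrics-server.yaml
    # Pull from a mirror; replace the full image path so the result is
    # rancher/mirrored-metrics-server:<tag>
    sed -i 's#registry.k8s.io/metrics-server/metrics-server#docker.m.daocloud.io/rancher/mirrored-metrics-server#g' metrics-server.yaml
    # Lower the memory request
    sed -i 's#memory: 200Mi#memory: 80Mi#g' metrics-server.yaml
    kubectl apply -f metrics-server.yaml
    # Once the pod is ready, metrics become available
    kubectl top nodes
    kubectl top pods -A --sort-by=memory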
1.3 Verification
    # Check memory on the host and per pod
    free -h
    kubectl top pods -A --sort-by=memory
    kubectl get pods -A
    # Create a throwaway workload and scale it up
    kubectl create deployment stress --image=nginx:alpine
    kubectl scale deployment stress --replicas=10
    sleep 60
    kubectl get pods
    # Clean up
    kubectl delete deployment stress
2. Prometheus + Grafana
Full tutorial
2.1 Introduction
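Prometheus is a monitoring system and time-series database. It exposes metric data through an HTTP API and provides a query language, PromQL, for querying and processing that data.
Its architecture is pull-based: the Prometheus server periodically scrapes metrics over HTTP from targets (exporters such as node-exporter and kube-state-metrics, or instrumented applications), stores them in its local time-series database, and answers queries.
High availability is usually achieved by running several identical Prometheus servers that scrape the same targets.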
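As an illustration of the HTTP API and PromQL, the sketch below sends instant queries to the /api/v1/query endpoint with curl. The address is an assumption (one node IP from this setup plus the NodePort 30090 configured in the values file later in this post), and the run helper and DRY_RUN variable are hypothetical; by default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: run instant PromQL queries against the Prometheus HTTP API.
# PROM_URL is an assumption: a node IP plus NodePort 30090.
PROM_URL="${PROM_URL:-http://192.168.56.129:30090}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Send one PromQL expression; curl -G with --data-urlencode handles URL encoding.
prom_query() {
  run curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# "up" is 1 for every target Prometheus scraped successfully.
prom_query 'up'
# Available memory per node in MiB, as reported by node-exporter.
prom_query 'node_memory_MemAvailable_bytes / 1024 / 1024'
```

With DRY_RUN=0 each call returns a JSON document whose data.result array holds one entry per matching time series.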
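Grafana is an open-source monitoring and analytics platform. It provides a user-friendly interface for visualizing metrics and building custom dashboards.
It can read from many data sources, including Prometheus, and offers a rich set of charts as well as alerting.
Grafana does not collect or store metrics itself: the Grafana server queries the configured data sources when a dashboard is rendered and keeps only its own state (users, dashboards, data source definitions) in a database.
For high availability, several Grafana instances can run behind a load balancer and share that database.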
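2.2 Installation steps
| Option | Characteristics | Resource usage | Recommendation |
| --- | --- | --- | --- |
| kube-prometheus-stack (Helm) | All-in-one bundle: Prometheus Operator, Grafana, assorted exporters, alerting rules | 1.5G+ | ⭐⭐⭐ (full-featured but heavy) |
| Prometheus + Grafana installed separately | Components combined as needed; lightweight | 500-800M | ⭐⭐⭐⭐⭐ (recommended when resources are tight) |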
2.2.1 Prerequisites
    # Label the worker nodes that will run the applications
    kubectl label nodes node01 workload=apps --overwrite
    kubectl label nodes node02 workload=apps --overwrite
    kubectl get nodes --show-labels | grep workload
    # Install Helm 3
    curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
    helm version
    # Create the namespace for the monitoring stack
    kubectl create namespace monitoring
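2.2.2.1 Option 1: Install kube-prometheus-stack
The chart lives in the prometheus-community repository, so that repository has to be added first (helm repo add prometheus-community https://prometheus-community.github.io/helm-charts).
    helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
      --namespace monitoring --create-namespace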
2.2.2.2 Option 2: Install Prometheus + Grafana
Install Prometheus
    # Add the chart repository
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update
    # Install with the values file (create prometheus-values.yaml first)
    helm install prometheus prometheus-community/prometheus \
      -f prometheus-values.yaml \
      -n monitoring
    cat > prometheus-values.yaml <<EOF
    server:
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 256Mi
          cpu: 100m
        limits:
          memory: 512Mi
          cpu: 500m
      retention: "7d"
      persistentVolume:
        enabled: false
      service:
        type: NodePort
        nodePort: 30090

    alertmanager:
      enabled: false

    prometheus-pushgateway:
      enabled: false

    prometheus-node-exporter:
      resources:
        requests:
          memory: 32Mi
          cpu: 50m
        limits:
          memory: 64Mi
          cpu: 100m

    kube-state-metrics:
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 64Mi
          cpu: 50m
        limits:
          memory: 128Mi
          cpu: 200m

    configmapReload:
      prometheus:
        nodeSelector:
          workload: apps
    EOF
Verify Prometheus
    [root@master01 monitor]# kubectl get pods -n monitoring
    NAME                                             READY   STATUS    RESTARTS   AGE
    prometheus-kube-state-metrics-6d7d7d9b78-f6rl9   1/1     Running   0          2m40s
    prometheus-prometheus-node-exporter-25knx        1/1     Running   0          21m
    prometheus-prometheus-node-exporter-lmvpp        1/1     Running   0          21m
    prometheus-prometheus-node-exporter-ndqqw        1/1     Running   0          21m
    prometheus-server-59d754bf78-hbwk8               2/2     Running   0          21m
Install Grafana
    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update
    # Either install the latest chart straight from the repository ...
    helm install grafana grafana/grafana \
      -f grafana-values.yaml \
      -n monitoring
    # ... or pull a pinned chart version and install from the local archive
    helm pull grafana/grafana --version 10.5.15
    helm install grafana ./grafana-10.5.15.tgz \
      -f grafana-values.yaml -n monitoring
    # Check the pod and read back the admin password
    kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana
    kubectl get secret grafana -n monitoring -o jsonpath='{.data.admin-password}' | base64 -d; echo
grafana-values.yaml
    cat > grafana-values.yaml <<'EOF'
    nodeSelector:
      workload: apps

    resources:
      requests:
        memory: 150Mi
        cpu: 50m
      limits:
        memory: 400Mi
        cpu: 300m

    persistence:
      enabled: false

    adminPassword: admin123

    service:
      type: NodePort
      nodePort: 30030

    datasources:
      datasources.yaml:
        apiVersion: 1
        datasources:
          - name: Prometheus
            type: prometheus
            url: http://prometheus-server.monitoring.svc.cluster.local
            access: proxy
            isDefault: true

    dashboardProviders:
      dashboardproviders.yaml:
        apiVersion: 1
        providers:
          - name: 'default'
            orgId: 1
            folder: ''
            type: file
            disableDeletion: false
            editable: true
            options:
              path: /var/lib/grafana/dashboards/default

    dashboards:
      default:
        k8s-cluster:
          gnetId: 7249
          revision: 1
          datasource: Prometheus
        node-exporter:
          gnetId: 1860
          revision: 37
          datasource: Prometheus
        k8s-pods:
          gnetId: 6417
          revision: 1
          datasource: Prometheus
    EOF
3. Deploying Harbor
    ┌────────────────────────────────────────────────────────────────────┐
    │ Option A: NodePort (default Helm installation)                     │
    │                                                                    │
    │ External ──► NodeIP:30080 ──► ingress-nginx-controller ──► Harbor  │
    │                                                                    │
    │ URL:  http://harbor.k8s.local:30080                                │
    │ Pros: simple, available by default                                 │
    │ Cons: awkward port; docker login must include the port             │
    └────────────────────────────────────────────────────────────────────┘

    ┌────────────────────────────────────────────────────────────────────┐
    │ Option B: hostNetwork (recommended)                                │
    │                                                                    │
    │ External ──► NodeIP:80 ──► ingress-nginx-controller ──► Harbor     │
    │              (uses the host network directly)                      │
    │                                                                    │
    │ URL:  http://harbor.k8s.local                                      │
    │ Pros: clean port; docker login needs no port                       │
    │ Cons: occupies ports 80/443 on the host                            │
    └────────────────────────────────────────────────────────────────────┘

    External request
        │
        ▼
    Node IP:80 (hostNetwork)
        │
        ▼
    Ingress-Nginx Controller   ← core: routes by Host header and maintains the Nginx config automatically
        │  Host: harbor.k8s.local
        ▼
    harbor-nginx (Harbor's internal proxy)
        │
        ├── /v2/       → harbor-registry (image push/pull)
        ├── /service/  → harbor-core (authentication)
        ├── /api/      → harbor-core (management API)
        └── /          → harbor-portal (Web UI)
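3.1 Introduction to Harbor
Harbor is an open-source container image registry that can be deployed on Kubernetes. It helps users store, distribute, and manage images.
Generating the certificate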
    What mkcert does:
    1. Creates a local CA on this machine
    2. Adds that CA to the machine's system/browser trust stores
    3. Issues certificates signed by that CA

    Problems it solves:
    ├── No browser warning       → the browser runs locally → the local machine trusts the CA → install mkcert locally ✅
    ├── No docker error          → docker runs locally → the local machine trusts the CA → install mkcert locally ✅
    └── K8s nodes pulling images → copy the CA over manually and configure trust (mkcert not needed)

    Local machine (workstation)
    ├── Install mkcert            ← only here
    ├── mkcert -install           ← adds the CA to the local trust stores
    ├── mkcert issues the cert    ← the certificate is generated locally
    │
    ├── Certificate files → scp → K8s master → kubectl create secret
    └── CA file → scp → all three nodes → system trust + containerd trust

    K8s nodes
    ├── mkcert is not installed
    ├── Only receive the CA file and configure trust
    └── containerd uses the CA to verify Harbor's certificate

    # PowerShell (the backtick continues the line)
    mkcert `
      harbor.k8s.local `
      192.168.56.129 `
      192.168.56.130 `
      192.168.56.131 `
      localhost `
      127.0.0.1 `
      harbor

    Created a new certificate valid for the following names 📜
     - "harbor.k8s.local"
     - "192.168.56.129"
     - "192.168.56.130"
     - "192.168.56.131"
     - "localhost"
     - "127.0.0.1"
     - "harbor"

    The certificate is at "./harbor.k8s.local+6.pem" and the key at "./harbor.k8s.local+6-key.pem" ✅

    It will expire on 14 August 2028 🗓
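The two hand-off steps in the outline above (certificate into a Kubernetes secret, CA onto the nodes) could look roughly like the sketch below. Several things are assumptions: the secret name harbor-tls is taken from the Harbor values file used later, the harbor namespace must already exist, the CA path assumes a RHEL/CentOS-style trust store, and CA_FILE is a copy of rootCA.pem from the directory printed by mkcert -CAROOT. The run helper and DRY_RUN variable are hypothetical; by default nothing is executed.

```shell
#!/usr/bin/env bash
# Sketch: load the mkcert certificate into Kubernetes and trust the CA on a node.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
CERT="${CERT:-./harbor.k8s.local+6.pem}"
KEY="${KEY:-./harbor.k8s.local+6-key.pem}"
CA_FILE="${CA_FILE:-./rootCA.pem}"   # assumption: copied from "$(mkcert -CAROOT)/rootCA.pem"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# On the master: create the TLS secret (the harbor namespace must exist).
create_tls_secret() {
  run kubectl create secret tls harbor-tls --cert="$CERT" --key="$KEY" -n harbor
}

# On every node: add the CA to the system trust store and restart containerd
# so it reloads the trusted roots. Paths assume a RHEL/CentOS-style system.
trust_ca() {
  run cp "$CA_FILE" /etc/pki/ca-trust/source/anchors/mkcert-rootCA.pem
  run update-ca-trust
  run systemctl restart containerd
}

create_tls_secret
trust_ca
```

On Debian or Ubuntu nodes the equivalent would be copying the CA to /usr/local/share/ca-certificates/ with a .crt extension and running update-ca-certificates.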
3.2 Deploying the Ingress Controller and StorageClass
Kubernetes does not ship with a storage provisioner (StorageClass) or an Ingress controller by default, so both have to be installed.
    # StorageClass: install local-path-provisioner and make it the default
    kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.30/deploy/local-path-storage.yaml
    kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

    # Ingress controller: install ingress-nginx (Option A, NodePort)
    helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
    helm repo update
    helm install ingress-nginx ingress-nginx/ingress-nginx \
      --namespace ingress-nginx \
      --create-namespace \
      --set controller.nodeSelector.workload=apps \
      --set controller.image.registry=k8s.m.daocloud.io \
      --set controller.image.image=ingress-nginx/controller \
      --set controller.image.digest="" \
      --set controller.admissionWebhooks.patch.image.registry=k8s.m.daocloud.io \
      --set controller.admissionWebhooks.patch.image.image=ingress-nginx/kube-webhook-certgen \
      --set controller.admissionWebhooks.patch.image.digest="" \
      --set controller.service.type=NodePort \
      --set controller.service.nodePorts.http=30080 \
      --set controller.service.nodePorts.https=30443

    # Option B (hostNetwork): replace the three controller.service.* flags above with
    #   --set controller.hostNetwork=true \
    #   --set controller.service.type=ClusterIP

    kubectl get svc -n ingress-nginx
    kubectl get pods -n ingress-nginx -w
3.3 Deploying Harbor
    helm repo add harbor https://helm.goharbor.io
    helm repo update
    helm search repo harbor/harbor
    # Dump the chart defaults for reference
    helm show values harbor/harbor > harbor-default-values.yaml
    kubectl create namespace harbor

    # Optional: render the release first without installing anything
    helm install harbor harbor/harbor \
      -f harbor-https-values-k8s-mkcert.yaml \
      -n harbor \
      --dry-run \
      --debug 2>&1 | head -50

    # Install
    helm install harbor harbor/harbor \
      -f harbor-https-values-k8s-mkcert.yaml \
      -n harbor \
      --timeout 15m

    kubectl get pods -n harbor -w
    NAME                                 READY   STATUS    RESTARTS      AGE
    harbor-core-59844d55df-mfgwf         1/1     Running   7 (23m ago)   38m
    harbor-database-0                    1/1     Running   0             38m
    harbor-jobservice-6cb854f9d6-gs6vk   1/1     Running   5 (18m ago)   38m
    harbor-portal-65ffc56476-4c72t       1/1     Running   0             38m
    harbor-redis-0                       1/1     Running   0             38m
    harbor-registry-75bb8d47df-pldkp     2/2     Running   0             38m
    harbor-trivy-0                       1/1     Running   0             38m
harbor-values.yaml
    # ">" overwrites the file, so rerunning this does not append a second copy
    cat > harbor-https-values-k8s-mkcert.yaml <<EOF
    expose:
      type: ingress
      tls:
        enabled: true
        certSource: secret
        secret:
          secretName: harbor-tls
      ingress:
        hosts:
          core: harbor.k8s.local
        ingressClassName: nginx
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/proxy-body-size: "0"
          nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
          nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
          nginx.ingress.kubernetes.io/ssl-redirect: "false"
          nginx.ingress.kubernetes.io/force-ssl-redirect: "false"
          ingress.kubernetes.io/ssl-redirect: "false"

    externalURL: https://harbor.k8s.local

    internalTLS:
      enabled: false

    harborAdminPassword: "Harbor12345"

    persistence:
      enabled: true
      resourcePolicy: "keep"
      persistentVolumeClaim:
        registry:
          storageClass: "local-path"
          accessMode: ReadWriteOnce
          size: 10Gi
        jobservice:
          jobLog:
            storageClass: "local-path"
            accessMode: ReadWriteOnce
            size: 1Gi
        database:
          storageClass: "local-path"
          accessMode: ReadWriteOnce
          size: 2Gi
        redis:
          storageClass: "local-path"
          accessMode: ReadWriteOnce
          size: 1Gi
        trivy:
          storageClass: "local-path"
          accessMode: ReadWriteOnce
          size: 5Gi

    nginx:
      image:
        repository: docker.m.daocloud.io/goharbor/nginx-photon
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 64Mi
          cpu: 50m
        limits:
          memory: 128Mi
          cpu: 100m

    portal:
      image:
        repository: docker.m.daocloud.io/goharbor/harbor-portal
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 64Mi
          cpu: 50m
        limits:
          memory: 128Mi
          cpu: 100m

    core:
      image:
        repository: docker.m.daocloud.io/goharbor/harbor-core
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 256Mi
          cpu: 100m
        limits:
          memory: 512Mi
          cpu: 500m

    jobservice:
      image:
        repository: docker.m.daocloud.io/goharbor/harbor-jobservice
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 128Mi
          cpu: 50m
        limits:
          memory: 256Mi
          cpu: 200m

    registry:
      registry:
        image:
          repository: docker.m.daocloud.io/goharbor/registry-photon
      controller:
        image:
          repository: docker.m.daocloud.io/goharbor/harbor-registryctl
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 128Mi
          cpu: 50m
        limits:
          memory: 256Mi
          cpu: 200m

    trivy:
      image:
        repository: docker.m.daocloud.io/goharbor/trivy-adapter-photon
      nodeSelector:
        workload: apps
      resources:
        requests:
          memory: 128Mi
          cpu: 50m
        limits:
          memory: 256Mi
          cpu: 200m

    database:
      internal:
        image:
          repository: docker.m.daocloud.io/goharbor/harbor-db
        nodeSelector:
          workload: apps
        resources:
          requests:
            memory: 128Mi
            cpu: 100m
          limits:
            memory: 256Mi
            cpu: 300m

    redis:
      internal:
        image:
          repository: docker.m.daocloud.io/goharbor/redis-photon
        nodeSelector:
          workload: apps
        resources:
          requests:
            memory: 64Mi
            cpu: 50m
          limits:
            memory: 128Mi
            cpu: 100m

    exporter:
      image:
        repository: docker.m.daocloud.io/goharbor/harbor-exporter
      nodeSelector:
        workload: apps
    EOF
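Once the pods are running, a quick end-to-end check is to push a small image. The sketch below assumes the registry is reachable as harbor.k8s.local without a port (the hostNetwork setup), that the local docker client trusts the mkcert CA, and that Harbor's default library project is used. The run helper and DRY_RUN variable are hypothetical; by default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: log in to Harbor and push a test image into the "library" project.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
REGISTRY="${REGISTRY:-harbor.k8s.local}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

push_test_image() {
  # docker login prompts for the admin password set in the values file
  run docker login "$REGISTRY" -u admin
  run docker pull nginx:alpine
  run docker tag nginx:alpine "$REGISTRY/library/nginx:alpine"
  run docker push "$REGISTRY/library/nginx:alpine"
}

push_test_image
```

If the push succeeds, the image should appear under the library project in the Harbor web UI.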
4. Troubleshooting
4.1 Problem descriptions
4.1.1 Image pull failure
    [root@master01 monitor]# kubectl get pods -n monitoring
    NAME                                             READY   STATUS             RESTARTS   AGE
    prometheus-kube-state-metrics-5579fcf48f-g7xss   0/1     ErrImagePull       0          17s
    prometheus-kube-state-metrics-6856dd4f55-dnhld   0/1     ImagePullBackOff   0          16m

    # Show the image the deployment is configured with
    kubectl get deploy -n monitoring prometheus-kube-state-metrics -o jsonpath='{.spec.template.spec.containers[0].image}'

    [root@master01 monitor]# kubectl describe pod -n monitoring prometheus-kube-state-metrics-5579fcf48f-g7xss
    Events:
      Type     Reason     Age                  From               Message
      ----     ------     ----                 ----               -------
      Normal   Scheduled  11m                  default-scheduler  Successfully assigned monitoring/prometheus-kube-state-metrics-6856dd4f55-dnhld to node02
      Warning  Failed     9m53s (x2 over 10m)  kubelet            Failed to pull image "registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0": failed to pull and unpack image "registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0": failed to resolve reference "registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0": failed to do request: Head "https://europe-west4-docker.pkg.dev/v2/k8s-artifacts-prod/images/kube-state-metrics/kube-state-metrics/manifests/v2.18.0": dial tcp 74.125.20.82:443: connect: connection refused

    # registry.k8s.io is unreachable, so point the deployment at a mirror
    [root@master01 monitor]# kubectl set image deployment/prometheus-kube-state-metrics -n monitoring kube-state-metrics=k8s.m.daocloud.io/kube-state-metrics/kube-state-metrics:v2.18.0
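The same substitution applies to any image hosted on registry.k8s.io, so it can be wrapped in a small helper. This is a sketch: mirror_image, run, and DRY_RUN are hypothetical names, and the mapping to k8s.m.daocloud.io simply mirrors the manual fix above. By default the kubectl command is only printed.

```shell
#!/usr/bin/env bash
# Sketch: rewrite registry.k8s.io image references to the mirror and apply them.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Map registry.k8s.io/<path> to k8s.m.daocloud.io/<path>; leave other images alone.
mirror_image() {
  case "$1" in
    registry.k8s.io/*) echo "k8s.m.daocloud.io/${1#registry.k8s.io/}" ;;
    *) echo "$1" ;;
  esac
}

current="registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.18.0"
run kubectl set image deployment/prometheus-kube-state-metrics -n monitoring \
  kube-state-metrics="$(mirror_image "$current")"
```

A kubectl set image change lives only in the cluster; a later helm upgrade restores whatever image the chart values specify, so the mirror is better recorded in the values file as well.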
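4.1.2 Harbor issues
docker login fails because of how Harbor's authentication flow works.
Root cause: externalURL is set to http://harbor.k8s.local, but Harbor is actually reached on port 30080. The token address Harbor returns carries no port, so docker requests the token on port 80 and fails.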
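In the registry token authentication scheme, the token address is carried in the realm field of the WWW-Authenticate header on the 401 response, so the mismatch can be confirmed by inspecting that header. The sketch below parses a sample header; the header string is illustrative (shaped like a response when externalURL has no port), and extract_realm is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: pull the token endpoint (realm) out of a WWW-Authenticate header.

# Print the value of realm="..." from a header line.
extract_realm() {
  printf '%s\n' "$1" | sed -n 's/.*realm="\([^"]*\)".*/\1/p'
}

# Illustrative header, not captured from a live system.
header='WWW-Authenticate: Bearer realm="http://harbor.k8s.local/service/token",service="harbor-registry"'

extract_realm "$header"   # prints http://harbor.k8s.local/service/token

# Against a live registry the header can be fetched with:
#   curl -sI http://harbor.k8s.local:30080/v2/ | grep -i '^www-authenticate'
```

If the realm printed for a live registry lacks the port that clients actually use, docker login will follow it to the wrong port.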
Solutions:
Option A: add the port to Harbor's externalURL. The drawback is the ugly port (:30080 has to be typed every time).
    # In harbor-values.yaml
    externalURL: http://harbor.k8s.local:30080

    # Apply the change
    helm upgrade harbor harbor/harbor \
      -f harbor-values.yaml \
      -n harbor \
      --timeout 10m
    kubectl rollout status deployment harbor-core -n harbor

    docker login harbor.k8s.local:30080 -u admin -p Harbor12345
Option B: have the Ingress controller listen on port 80. This is cleaner and is the recommended fix.
    # Inspect the current state
    kubectl get all -n ingress-nginx
    helm get values ingress-nginx -n ingress-nginx
    kubectl get deployment ingress-nginx-controller \
      -n ingress-nginx \
      -o jsonpath='{.spec.template.spec.hostNetwork}'

    # If the controller runs as a Deployment
    kubectl patch deployment ingress-nginx-controller -n ingress-nginx \
      --type='json' \
      -p='[
        {"op": "add", "path": "/spec/template/spec/hostNetwork", "value": true},
        {"op": "add", "path": "/spec/template/spec/dnsPolicy", "value": "ClusterFirstWithHostNet"}
      ]'

    # If the controller runs as a DaemonSet
    kubectl patch daemonset ingress-nginx-controller -n ingress-nginx \
      --type='json' \
      -p='[
        {"op": "add", "path": "/spec/template/spec/hostNetwork", "value": true},
        {"op": "add", "path": "/spec/template/spec/dnsPolicy", "value": "ClusterFirstWithHostNet"}
      ]'

    kubectl rollout status deployment ingress-nginx-controller -n ingress-nginx
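A patch applied with kubectl is overwritten by the next helm upgrade of the release, so the same two settings can instead be recorded in the release values. This sketch assumes the release and chart names used earlier in this post (ingress-nginx, ingress-nginx/ingress-nginx). The run helper and DRY_RUN variable are hypothetical; by default the command is only printed.

```shell
#!/usr/bin/env bash
# Sketch: enable hostNetwork through Helm values instead of a manual patch.
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to execute it.

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

enable_host_network() {
  # --reuse-values keeps the settings from the original install
  run helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
    -n ingress-nginx --reuse-values \
    --set controller.hostNetwork=true \
    --set controller.dnsPolicy=ClusterFirstWithHostNet
}

enable_host_network
```

After the upgrade, kubectl rollout status on the controller shows when the new pods are bound to ports 80 and 443 on the host.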
    docker login harbor.k8s.local:30080
        │
        ▼
    Request http://harbor.k8s.local:30080/v2/
        │
        ▼
    Harbor returns 401 and tells docker where to get a token:
    WWW-Authenticate: Bearer realm="http://harbor.k8s.local/service/token",...
                                    ^^^^^^^^^^^^^^^^^^^^^^^
                                    Note: no port here, so the default port 80 is used
        │
        ▼
    docker requests http://harbor.k8s.local:80/service/token
        │
        ▼
    ❌ Nothing listens on port 80 → connection refused