
Deploying the full kube-prometheus stack with Helm

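Background:

Clusters running vanilla Kubernetes are usually monitored with Prometheus. The overall architecture is shown in the diagram below.

The kube-prometheus stack repository collects Kubernetes manifests, Grafana dashboards, and Prometheus rules, combined with documentation and scripts, to provide easy-to-operate, end-to-end Kubernetes cluster monitoring based on the Prometheus Operator.

    The project is written in jsonnet and can be described both as a package and as a library. It mainly includes the following components: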

  • Grafana
  • kube-state-metrics
  • node-exporter
  • Prometheus
  • Prometheus Adapter for Kubernetes Metrics APIs
  • Prometheus Operator
  • Alertmanager

1. Environment

1.1 Kubernetes version:

[root@kubemaster01 system]# kubectl version 
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:07:13Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}

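This Kubernetes version (v1.17.3) is fairly old, so we pick the matching kube-prometheus branch, release-0.4.

For the full compatibility matrix between kube-prometheus and Kubernetes versions, see GitHub: https://github.com/prometheus-operator/kube-prometheus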
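
The version check above can be scripted. The following is a minimal sketch, assuming `kubectl version` prints the `version.Info{...}` format shown earlier (newer kubectl releases print a different format). The function names `server_minor` and `branch_for_minor` are hypothetical, and only the 1.17 to release-0.4 pairing comes from this article; the remaining rows would have to be filled in from the compatibility matrix in the kube-prometheus README.

```shell
#!/bin/sh
# Read `kubectl version` output on stdin and print the server's minor version,
# e.g. a line containing Minor:"17" yields 17.
server_minor() {
  sed -n 's/^Server Version:.*Minor:"\([0-9][0-9]*\).*/\1/p'
}

# Map a Kubernetes minor version to a kube-prometheus branch.
# Only the 17 -> release-0.4 row is taken from this article; add further
# rows from the README compatibility matrix as needed.
branch_for_minor() {
  case "$1" in
    17) echo "release-0.4" ;;
    *)  echo "unknown"; return 1 ;;
  esac
}

# Example usage against a live cluster (not run here):
#   branch_for_minor "$(kubectl version | server_minor)"
```

Keeping the mapping in a `case` statement makes an unsupported version fail loudly instead of silently cloning the wrong branch.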

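1.2 Helm:

Pick the Helm release that matches the Kubernetes version, Helm 3.5.4 in this case. Prefer Helm 3 whenever possible.

2. Installation

Create the monitoring namespace, which is the name used by default, then add the Helm repo: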

[leonli@Leon minikube ] % kubectl create ns monitoring
namespace/monitoring created
[leonli@Leon minikube ] % helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
[leonli@Leon minikube ] % helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "traefik" chart repository
...Successfully got an update from the "komodorio" chart repository
...Successfully got an update from the "traefik-hub" chart repository
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈Happy Helming!⎈
[leonli@Leon minikube ] % helm install prometheus-community/kube-prometheus-stack --namespace monitoring --generate-name 
Error: INSTALLATION FAILED: failed to download "prometheus-community/kube-prometheus-stack"

Installing directly with Helm fails, so we pull the manifests from GitHub instead:

[root@kubemaster01 ~]#  git clone https://github.com/prometheus-operator/kube-prometheus.git -b release-0.4 
Cloning into 'kube-prometheus'...
remote: Enumerating objects: 17291, done.
remote: Counting objects: 100% (197/197), done.
remote: Compressing objects: 100% (99/99), done.
remote: Total 17291 (delta 126), reused 146 (delta 91), pack-reused 17094
Receiving objects: 100% (17291/17291), 9.18 MiB | 6.19 MiB/s, done.
Resolving deltas: 100% (11319/11319), done.

Change into the kube-prometheus directory and apply all the YAML files under manifests/setup:

[root@kubemaster01 ~]#  cd kube-prometheus
[root@kubemaster01 ~]#  kubectl apply --server-side -f manifests/setup --force-conflicts
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com serverside-applied

Note: depending on the Kubernetes version, some of the YAML files may need minor adjustments.

[root@kubemaster01 ~]# cd manifests/setup
[root@kubemaster01 ~]#  ls -l
total 3040
-rw-r--r--  1 leonli  admin  169131 Dec  2 14:53 0alertmanagerConfigCustomResourceDefinition.yaml
-rw-r--r--  1 leonli  admin  377495 Dec  2 14:53 0alertmanagerCustomResourceDefinition.yaml
-rw-r--r--  1 leonli  admin   30361 Dec  2 14:53 0podmonitorCustomResourceDefinition.yaml
-rw-r--r--  1 leonli  admin   31477 Dec  2 14:53 0probeCustomResourceDefinition.yaml
-rw-r--r--  1 leonli  admin  502646 Dec  2 14:53 0prometheusCustomResourceDefinition.yaml
-rw-r--r--  1 leonli  admin    4101 Dec  2 14:53 0prometheusruleCustomResourceDefinition.yaml
-rw-r--r--  1 leonli  admin   31881 Dec  2 14:53 0servicemonitorCustomResourceDefinition.yaml
-rw-r--r--  1 leonli  admin  385790 Dec  2 14:53 0thanosrulerCustomResourceDefinition.yaml
-rw-r--r--  1 leonli  admin      60 Dec  2 14:53 namespace.yaml
[root@kubemaster01 ~]#  until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
[root@kubemaster01 ~]#  cd ../..
[root@kubemaster01 ~]#  kubectl apply -f manifests/
alertmanager.monitoring.coreos.com/main created
poddisruptionbudget.policy/alertmanager-main created
prometheusrule.monitoring.coreos.com/alertmanager-main-rules created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager-main created
clusterrole.rbac.authorization.k8s.io/blackbox-exporter created
clusterrolebinding.rbac.authorization.k8s.io/blackbox-exporter created
configmap/blackbox-exporter-configuration created
deployment.apps/blackbox-exporter created
service/blackbox-exporter created
serviceaccount/blackbox-exporter created
servicemonitor.monitoring.coreos.com/blackbox-exporter created
secret/grafana-config created
secret/grafana-datasources created
configmap/grafana-dashboard-alertmanager-overview created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created
prometheusrule.monitoring.coreos.com/kube-prometheus-rules created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
prometheusrule.monitoring.coreos.com/kube-state-metrics-rules created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
prometheusrule.monitoring.coreos.com/kubernetes-monitoring-rules created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
prometheusrule.monitoring.coreos.com/node-exporter-rules created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
poddisruptionbudget.policy/prometheus-k8s created
prometheus.monitoring.coreos.com/k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-prometheus-rules created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-k8s created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
poddisruptionbudget.policy/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
servicemonitor.monitoring.coreos.com/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
prometheusrule.monitoring.coreos.com/prometheus-operator-rules created
service/prometheus-operator created
serviceaccount/prometheus-operator created
servicemonitor.monitoring.coreos.com/prometheus-operator created
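
The `until` loop used above waits forever if the ServiceMonitor CRD never becomes available. A bounded variant could look like the sketch below. The helper name `retry_until` and the values 60 and 1 are assumptions chosen for illustration, not part of kube-prometheus.

```shell
#!/bin/sh
# retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_TRIES attempts have been made.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
  max_tries=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Example usage from the kube-prometheus directory (not run here):
#   kubectl apply --server-side -f manifests/setup --force-conflicts
#   retry_until 60 1 kubectl get servicemonitors --all-namespaces || exit 1
#   kubectl apply -f manifests/
```

With a retry limit, a broken CRD install stops the script after about a minute rather than hanging.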

At this point all components are installed. Check that they are running.
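
One way to run that check is to list the pods in the monitoring namespace and flag anything that is not running. The filter below is a rough sketch: it assumes the default column layout of `kubectl get pods -n monitoring` (NAME, READY, STATUS, RESTARTS, AGE) and looks only at the STATUS column, so a Running pod whose containers are not all ready is not reported. The function name `unhealthy_pods` is hypothetical.

```shell
#!/bin/sh
# Read `kubectl get pods -n <namespace>` output on stdin and print the names
# of pods whose STATUS column is neither Running nor Completed.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example usage (not run here):
#   kubectl get pods -n monitoring | unhealthy_pods
```

Empty output means every pod reports Running or Completed.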

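Because I only need Prometheus here, I expose the Prometheus svc separately on port 30050 so that an external Grafana can query it. The Grafana deployed in this article is not used.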
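
The article does not show how the port was exposed. One possible approach, sketched here under assumptions, is to patch the existing prometheus-k8s Service to type NodePort. The sketch assumes the Service listens on port 9090 (the Prometheus default) and that 30050 falls inside the cluster's NodePort range. The helper name `nodeport_patch` is hypothetical.

```shell
#!/bin/sh
# nodeport_patch SERVICE_PORT NODE_PORT
# Print a JSON patch body that switches a Service to type NodePort and pins
# the given service port to the given node port.
nodeport_patch() {
  printf '{"spec":{"type":"NodePort","ports":[{"port":%d,"nodePort":%d}]}}\n' "$1" "$2"
}

# Example usage (not run here); 9090 is an assumption about the Service port:
#   kubectl -n monitoring patch svc prometheus-k8s -p "$(nodeport_patch 9090 30050)"
```

`kubectl patch` defaults to a strategic merge patch, which merges Service ports by their `port` value, so other ports on the Service should be left as they are. Note that re-applying manifests/ later would revert the change.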

3. Checking the installation

3.1 Check the Prometheus targets and service discovery
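
Besides the Targets page in the web UI, target health can be read from the Prometheus HTTP API at /api/v1/targets. The sketch below counts targets by health state with plain grep. It assumes the API returns compact JSON containing `"health":"up"` style fields; a JSON-aware tool such as jq would be more robust. The function name `count_health` and the `<node-ip>` placeholder are hypothetical.

```shell
#!/bin/sh
# count_health STATE
# Read the JSON body of GET /api/v1/targets on stdin and print how many
# targets report the given health state (for example up, down or unknown).
count_health() {
  { grep -o "\"health\":\"$1\"" || true; } | wc -l | tr -d ' '
}

# Example usage through the exposed port (not run here):
#   curl -s "http://<node-ip>:30050/api/v1/targets" | count_health up
```

A non-zero count for `down` points at a scrape target worth inspecting in the UI.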

 

 

3.2 Check the data in Grafana
