
Deploying Metrics Server on Kubernetes

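Kubernetes Metrics Server is a cluster-wide aggregator of resource usage data. It collects metrics from the Summary API that the Kubelet exposes on each node. Resource usage metrics, such as container CPU and memory usage, are helpful when troubleshooting unexpected resource consumption. All of these metrics are available in Kubernetes through the Metrics API.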

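The Metrics API reports the amount of resources currently used by a given node or pod. The API does not store metric values itself, so Metrics Server is the component that collects and serves them. The Metrics Server project publishes deployment YAML manifests for installation.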

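Requirements

Metrics Server has specific requirements for cluster and network configuration.

Not every cluster distribution meets these requirements by default. Before using Metrics Server, make sure the cluster satisfies them:

  • Metrics Server must be reachable from kube-apiserver
  • kube-apiserver must be configured correctly with the aggregation layer enabled
  • Nodes must have kubelet authorization configured to match the Metrics Server configuration
  • The container runtime must implement the container metrics RPCs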

Deployment

Download the deployment manifest.

wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml -O metrics-server-components.yaml

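Change the image registry. The pattern below matches k8s.gcr.io; if the downloaded manifest references a different registry, adjust the pattern accordingly: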

sed -i 's/k8s.gcr.io\/metrics-server/registry.cn-hangzhou.aliyuncs.com\/google_containers/g' metrics-server-components.yaml
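The same substitution can be written as a small filter that tolerates more than one upstream registry. This is only a sketch: the function name rewrite_image_registry is hypothetical, and the second registry name (registry.k8s.io) is an assumption about what newer manifests may reference, so check the image line in your downloaded file first.

```shell
# Hypothetical helper: rewrite the metrics-server image registry to the mirror
# used in this article. Reads a manifest on stdin, writes the result to stdout.
# Assumption: the upstream image is <registry>/metrics-server/metrics-server:<tag>.
rewrite_image_registry() {
  sed -E 's#(k8s\.gcr\.io|registry\.k8s\.io)/metrics-server/#registry.cn-hangzhou.aliyuncs.com/google_containers/#g'
}

# Usage (not run here, needs the downloaded file):
#   rewrite_image_registry < metrics-server-components.yaml > metrics-server-components.mirror.yaml
```

Using # as the sed delimiter avoids escaping every slash, and escaping the dots keeps the pattern from matching unrelated strings.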

Disable kubelet TLS verification by adding --kubelet-insecure-tls to the container args:

$ cat metrics-server-components.yaml
......
      containers:
      - args:
        ...
        - --kubelet-insecure-tls
......
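Instead of editing the file by hand, the flag can be added with a short script. The following is a minimal sketch, assuming the manifest contains a single "- args:" line laid out as in the excerpt above; the function name add_insecure_tls_flag is hypothetical.

```shell
# Hypothetical helper: print the manifest with --kubelet-insecure-tls added
# as the first entry under "- args:". If the flag is already present, the
# manifest is printed unchanged, so running it twice is harmless.
add_insecure_tls_flag() {
  manifest="$1"
  if grep -q -e '--kubelet-insecure-tls' "$manifest"; then
    cat "$manifest"
    return 0
  fi
  awk '
    { print }
    /^ *- args: *$/ {
      # Reuse the indentation of "- args:" and indent list items two more spaces.
      match($0, /^ */)
      print substr($0, 1, RLENGTH) "  - --kubelet-insecure-tls"
    }
  ' "$manifest"
}

# Usage (not run here, needs the downloaded file):
#   add_insecure_tls_flag metrics-server-components.yaml > patched.yaml
```

The order of entries under args does not matter here, so the flag is inserted directly after the "- args:" line rather than at the end of the list.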

Deploy:

$ kubectl apply -f metrics-server-components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Use the following commands to verify that the metrics-server deployment is running the desired number of pods:

$ kubectl get deployment metrics-server -n kube-system
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   1/1     1            1           52s

$ kubectl get pods -n kube-system | grep metrics
metrics-server-68fbbb47dc-d4gbl         1/1     Running   0              78s
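For scripted checks, the READY column can be parsed rather than read by eye. This is a sketch under the assumption that the input has the column layout shown above; is_deployment_ready is a hypothetical name.

```shell
# Hypothetical helper: succeed only if every deployment row on stdin has
# READY equal to n/n with n > 0. Expects `kubectl get deployment` output.
is_deployment_ready() {
  awk '
    NR > 1 {
      seen = 1
      split($2, ready, "/")
      if (ready[1] + 0 != ready[2] + 0 || ready[2] + 0 == 0) bad = 1
    }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}

# Usage (not run here, needs a cluster):
#   kubectl get deployment metrics-server -n kube-system | is_deployment_ready && echo ready
```

If you only need to block until the rollout finishes, kubectl rollout status deployment/metrics-server -n kube-system does that without any parsing.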

Testing

The Metrics API can be accessed with the kubectl top command:

$ kubectl top --help
Display Resource (CPU/Memory) usage.

 The top command allows you to see the resource consumption for nodes or pods.

 This command requires Metrics Server to be correctly configured and working on the server.

Available Commands:
  node        Display resource (CPU/memory) usage of nodes
  pod         Display resource (CPU/memory) usage of pods

Usage:
  kubectl top [flags] [options]

Use "kubectl <command> --help" for more information about a given command.
Use "kubectl options" for a list of global command-line options (applies to all commands).

To display the resource usage (CPU and memory) of the cluster nodes, run:

$ kubectl top nodes
NAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
k8s-master-01   142m         3%     1291Mi          33%       
k8s-worker-01   54m          1%     729Mi           19%       
k8s-worker-02   42m          1%     685Mi           17%       
k8s-worker-03   35m          0%     1191Mi          31%
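The tabular output is easy to post-process. As an illustration, the sketch below lists nodes whose memory usage is above a given percentage; the function name nodes_over_memory is hypothetical, and it assumes the five-column layout shown above.

```shell
# Hypothetical helper: print the names of nodes whose MEMORY% exceeds the
# threshold given as the first argument. Reads `kubectl top nodes` on stdin.
nodes_over_memory() {
  awk -v limit="$1" '
    NR > 1 {
      pct = $5
      sub(/%$/, "", pct)            # "33%" -> "33"
      if (pct + 0 > limit + 0) print $1
    }
  '
}

# Usage (not run here, needs a working Metrics Server):
#   kubectl top nodes | nodes_over_memory 30
```

With the sample output above and a threshold of 30, this prints k8s-master-01 and k8s-worker-03.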

A similar command works for pods:

$ kubectl top pods -A 
NAMESPACE              NAME                                         CPU(cores)   MEMORY(bytes)   
calico-apiserver       calico-apiserver-64cd47df68-nfmmq            4m           31Mi            
calico-apiserver       calico-apiserver-64cd47df68-q7g94            4m           31Mi            
calico-system          calico-kube-controllers-7dddfdd6c9-lw29c     3m           21Mi            
calico-system          calico-node-5kzxc                            16m          152Mi           
calico-system          calico-node-9n4dh                            18m          151Mi           
calico-system          calico-node-bzwdk                            17m          151Mi           
calico-system          calico-node-s6kx2                            16m          93Mi            
calico-system          calico-typha-865568fbb8-gm8vj                2m           22Mi            
calico-system          calico-typha-865568fbb8-zc672                1m           21Mi            
kube-system            coredns-6d8c4cb4d-rlsr8                      2m           13Mi            
kube-system            coredns-6d8c4cb4d-v667h                      1m           13Mi            
kube-system            etcd-k8s-master-01                           13m          61Mi            
kube-system            kube-apiserver-k8s-master-01                 41m          365Mi           
kube-system            kube-controller-manager-k8s-master-01        8m           55Mi            
kube-system            kube-proxy-2xj99                             1m           13Mi            
kube-system            kube-proxy-pnv4b                             1m           19Mi            
kube-system            kube-proxy-xh4nl                             1m           19Mi            
kube-system            kube-proxy-xqxhw                             1m           19Mi            
kube-system            kube-scheduler-k8s-master-01                 2m           19Mi            
kube-system            metrics-server-75ff6bdcc-2k2mk               3m           16Mi            
kubernetes-dashboard   dashboard-metrics-scraper-799d786dbf-bxd9q   1m           6Mi             
kubernetes-dashboard   kubernetes-dashboard-7577b7545-gd7zm         1m           14Mi            
tigera-operator        tigera-operator-768d489967-jq9v7             2m           25Mi
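Per-pod figures can also be rolled up, for example to see which namespace uses the most memory. The sketch below sums the MEMORY(bytes) column per namespace; memory_by_namespace is a hypothetical name, and the helper assumes every value is reported in Mi, as in the output above (rows with other units are skipped).

```shell
# Hypothetical helper: sum pod memory per namespace.
# Reads `kubectl top pods -A` on stdin; only rows with a Mi value are counted.
memory_by_namespace() {
  awk '
    NR > 1 && $4 ~ /Mi$/ { sum[$1] += $4 + 0 }   # "31Mi" + 0 evaluates to 31
    END { for (ns in sum) print ns, sum[ns] "Mi" }
  ' | sort
}

# Usage (not run here, needs a working Metrics Server):
#   kubectl top pods -A | memory_by_namespace
```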

You can also use kubectl get --raw to fetch the raw resource usage metrics for all nodes in the cluster:

$ apt install jq -y
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metadata": {
        "name": "k8s-master-01",
        "creationTimestamp": "2022-01-22T09:25:28Z",
        "labels": {
          "beta.kubernetes.io/arch": "amd64",
          "beta.kubernetes.io/os": "linux",
          "kubernetes.io/arch": "amd64",
          "kubernetes.io/hostname": "k8s-master-01",
          "kubernetes.io/os": "linux",
          "node-role.kubernetes.io/control-plane": "",
          "node-role.kubernetes.io/master": "",
          "node.kubernetes.io/exclude-from-external-load-balancers": ""
        }
      },
      "timestamp": "2022-01-22T09:25:09Z",
      "window": "10.018s",
      "usage": {
        "cpu": "167263108n",
        "memory": "1345468Ki"
      }
    },
    {
      "metadata": {
        "name": "k8s-worker-01",
        "creationTimestamp": "2022-01-22T09:25:28Z",
        "labels": {
          "beta.kubernetes.io/arch": "amd64",
          "beta.kubernetes.io/os": "linux",
          "kubernetes.io/arch": "amd64",
          "kubernetes.io/hostname": "k8s-worker-01",
          "kubernetes.io/os": "linux"
        }
      },
      "timestamp": "2022-01-22T09:25:15Z",
      "window": "20.027s",
      "usage": {
        "cpu": "47595856n",
        "memory": "746892Ki"
      }
    },
    {
      "metadata": {
        "name": "k8s-worker-02",
        "creationTimestamp": "2022-01-22T09:25:28Z",
        "labels": {
          "beta.kubernetes.io/arch": "amd64",
          "beta.kubernetes.io/os": "linux",
          "kubernetes.io/arch": "amd64",
          "kubernetes.io/hostname": "k8s-worker-02",
          "kubernetes.io/os": "linux"
        }
      },
      "timestamp": "2022-01-22T09:25:10Z",
      "window": "10.014s",
      "usage": {
        "cpu": "44446150n",
        "memory": "702516Ki"
      }
    },
    {
      "metadata": {
        "name": "k8s-worker-03",
        "creationTimestamp": "2022-01-22T09:25:28Z",
        "labels": {
          "beta.kubernetes.io/arch": "amd64",
          "beta.kubernetes.io/os": "linux",
          "kubernetes.io/arch": "amd64",
          "kubernetes.io/hostname": "k8s-worker-03",
          "kubernetes.io/os": "linux"
        }
      },
      "timestamp": "2022-01-22T09:25:12Z",
      "window": "10.012s",
      "usage": {
        "cpu": "33563159n",
        "memory": "1219948Ki"
      }
    }
  ]
}
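The raw API reports CPU in nanocores (suffix n) and memory in kibibytes (suffix Ki), whereas kubectl top prints millicores and Mi. The conversion is plain arithmetic, sketched below with two hypothetical helper names. Integer division truncates, so the results may differ by one unit from what kubectl top displays.

```shell
# Hypothetical helpers: convert raw Metrics API quantities to the units
# printed by `kubectl top`. Both assume the suffix shown in the JSON above.
nanocores_to_millicores() {
  n="${1%n}"                  # "167263108n" -> "167263108"
  echo $(( n / 1000000 ))     # 1 millicore = 1,000,000 nanocores
}

kib_to_mib() {
  k="${1%Ki}"                 # "1345468Ki" -> "1345468"
  echo $(( k / 1024 ))        # 1 Mi = 1024 Ki
}

# To list name, CPU and memory per node from the raw API (not run here):
#   kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" \
#     | jq -r '.items[] | [.metadata.name, .usage.cpu, .usage.memory] | @tsv'
```

For the k8s-master-01 entry above, 167263108n is about 167 millicores and 1345468Ki is about 1313 Mi.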

Appendix

Flags

Metrics Server supports all the standard Kubernetes API server flags, as well as the standard Kubernetes glog logging flags. The most commonly used ones are:

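  • --logtostderr: log to standard error instead of to files in the container.
  • --v=<X>: set the log verbosity. Log level 1 or 2 is usually sufficient unless you are troubleshooting an error. At log level 10, a large amount of diagnostic information is reported, including API request and response bodies and the raw metric results from the Kubelet.
  • --secure-port=<port>: set the secure port. If Metrics Server does not run as root, set this to something other than the default (port 443).
  • --tls-cert-file, --tls-private-key-file: the serving certificate and key files. If not specified, a self-signed certificate is generated. Use certificates that are not self-signed in production.
  • --kubelet-certificate-authority: path to the CA certificate used to validate the Kubelet serving certificates.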

Other flags that change Metrics Server behavior:

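  • --metric-resolution=<duration>: the interval at which metrics are scraped from the Kubelet (defaults to 60 seconds).
  • --kubelet-insecure-tls: skip verification of the CA of the Kubelet serving certificates.
  • --kubelet-port: the port used to connect to the Kubelet (defaults to the default secure Kubelet port, 10250).
  • --kubelet-preferred-address-types: the order in which Kubelet node address types are considered when connecting to the Kubelet.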