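Kubernetes Metrics Server is a cluster-wide aggregator of resource usage data. It collects metrics from the Summary API that the kubelet exposes on each node. Resource usage metrics, such as container CPU and memory usage, are helpful when troubleshooting unusual resource consumption. All of these metrics are available in Kubernetes through the Metrics API.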
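The Metrics API reports the amount of resources currently used by a given node or pod. The API only exposes current values and does not store them, so a separate component has to collect and serve the data; Metrics Server fills that role. The YAML manifests needed to install it are provided in the Metrics Server project source.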
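As a rough illustration of how the Metrics API is laid out, the sketch below builds the REST paths under /apis/metrics.k8s.io/v1beta1 for a single node or a single pod. The helper name metrics_path is made up for this example; the node, namespace, and pod names come from the sample outputs later in this article.

```shell
#!/bin/sh
# metrics_path is a hypothetical helper, not part of kubectl.
# It prints the Metrics API path for a node or for a namespaced pod.
metrics_path() {
  base="/apis/metrics.k8s.io/v1beta1"
  case "$1" in
    node) printf '%s/nodes/%s\n' "$base" "$2" ;;
    pod)  printf '%s/namespaces/%s/pods/%s\n' "$base" "$2" "$3" ;;
    *)    echo "usage: metrics_path node NAME | pod NAMESPACE NAME" >&2
          return 1 ;;
  esac
}

metrics_path node k8s-worker-01
metrics_path pod kube-system etcd-k8s-master-01
```

Once Metrics Server is running, such a path can be handed to kubectl, for example kubectl get --raw "$(metrics_path node k8s-worker-01)".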
Requirements
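Metrics Server has specific requirements for cluster and network configuration. These requirements are not met by default on every cluster distribution. Before using Metrics Server, make sure your cluster satisfies them: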
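- Metrics Server must be reachable from kube-apiserver.
- kube-apiserver must be correctly configured to enable the aggregation layer.
- Nodes must have kubelet authorization configured to match the Metrics Server configuration.
- The container runtime must implement the container metrics RPCs.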
Deployment
Download the deployment manifest:
$ wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml -O metrics-server-components.yaml
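Point the image at a mirror registry. Check the image: line in the downloaded file first, because the registry prefix can differ between releases and the pattern below only matches k8s.gcr.io/metrics-server: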
$ sed -i 's/k8s.gcr.io\/metrics-server/registry.cn-hangzhou.aliyuncs.com\/google_containers/g' metrics-server-components.yaml
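Disable kubelet TLS verification by adding --kubelet-insecure-tls to the container arguments. This flag skips verification of the kubelet certificate, so use it only where that is acceptable, such as a test cluster: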
$ cat metrics-server-components.yaml
......
containers:
- args:
...
- --kubelet-insecure-tls
......
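The listing above shows the result but not the edit itself. One possible way to make the change non-interactively is sketched below with awk: it adds "- --kubelet-insecure-tls" directly under every "- args:" line, reusing that line's indentation. The function name and the sample fragment are assumptions for illustration; the real manifest may be laid out differently, so inspect the result before applying it.

```shell
#!/bin/sh
# add_insecure_tls is a hypothetical helper: it reads YAML on stdin and
# prints it with "- --kubelet-insecure-tls" added as the first args entry.
# Running it twice on the same file would add the flag twice.
add_insecure_tls() {
  awk '
    { print }
    /^ *- args: *$/ {
      match($0, /^ */)                  # measure the leading spaces
      indent = substr($0, 1, RLENGTH)
      print indent "  - --kubelet-insecure-tls"
    }
  '
}

# Made-up fragment standing in for the container section of the manifest.
printf '%s\n' \
  '      containers:' \
  '      - args:' \
  '        - --cert-dir=/tmp' | add_insecure_tls
```

Against the downloaded file this could be run as add_insecure_tls < metrics-server-components.yaml > patched.yaml, where patched.yaml is just an example output name.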
Apply the manifest:
$ kubectl apply -f metrics-server-components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
Use the following commands to verify that the metrics-server Deployment is running the desired number of pods:
$ kubectl get deployment metrics-server -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
metrics-server 1/1 1 1 52s
$ kubectl get pods -n kube-system | grep metrics
metrics-server-68fbbb47dc-d4gbl 1/1 Running 0 78s
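Besides the pods being ready, the aggregated API has to be reported as available before kubectl top can work. The sketch below polls the Available condition of the v1beta1.metrics.k8s.io APIService created above. The helper names, the attempt count, and the WAIT_INTERVAL variable are assumptions made for this example.

```shell
#!/bin/sh
# wait_until_true (hypothetical helper) runs the given command up to
# ATTEMPTS times and succeeds as soon as it prints exactly "True".
wait_until_true() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$("$@" 2>/dev/null)" = "True" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_INTERVAL:-2}"       # seconds between attempts
  done
  return 1
}

# Prints the status of the APIService "Available" condition (True/False).
apiservice_available() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Example call (needs access to a cluster):
#   wait_until_true 30 apiservice_available && echo "Metrics API is available"
```

Keeping the retry loop separate from the kubectl call means the same loop can wrap any other command that prints True when a condition is met.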
Testing
You can query the Metrics API with the kubectl top command.
$ kubectl top --help
Display Resource (CPU/Memory) usage.
The top command allows you to see the resource consumption for nodes or pods.
This command requires Metrics Server to be correctly configured and working on the server.
Available Commands:
node Display resource (CPU/memory) usage of nodes
pod Display resource (CPU/memory) usage of pods
Usage:
kubectl top [flags] [options]
Use "kubectl <command> --help" for more information about a given command.
Use "kubectl options" for a list of global command-line options (applies to all commands).
To display resource usage (CPU and memory) for the cluster nodes, run the following command:
$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master-01 142m 3% 1291Mi 33%
k8s-worker-01 54m 1% 729Mi 19%
k8s-worker-02 42m 1% 685Mi 17%
k8s-worker-03 35m 0% 1191Mi 31%
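Because kubectl top prints plain columns, its output is easy to post-process. As a small sketch, the function below lists the nodes whose MEMORY% exceeds a threshold. The function name and the 30% threshold are arbitrary choices for illustration, and the input is the sample output shown above.

```shell
#!/bin/sh
# nodes_over_memory (hypothetical helper) reads `kubectl top nodes` output
# on stdin and prints the names of nodes whose MEMORY% is above $1.
nodes_over_memory() {
  awk -v limit="$1" '
    NR > 1 {                          # skip the header row
      pct = $5
      sub(/%$/, "", pct)              # "33%" -> "33"
      if (pct + 0 > limit + 0) print $1
    }
  '
}

# Sample rows copied from the output above.
nodes_over_memory 30 <<'EOF'
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master-01 142m 3% 1291Mi 33%
k8s-worker-01 54m 1% 729Mi 19%
k8s-worker-02 42m 1% 685Mi 17%
k8s-worker-03 35m 0% 1191Mi 31%
EOF
```

For the sample rows this prints k8s-master-01 and k8s-worker-03. On a live cluster the same filter could be fed with kubectl top nodes | nodes_over_memory 30.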
A similar command is available for pods:
$ kubectl top pods -A
NAMESPACE NAME CPU(cores) MEMORY(bytes)
calico-apiserver calico-apiserver-64cd47df68-nfmmq 4m 31Mi
calico-apiserver calico-apiserver-64cd47df68-q7g94 4m 31Mi
calico-system calico-kube-controllers-7dddfdd6c9-lw29c 3m 21Mi
calico-system calico-node-5kzxc 16m 152Mi
calico-system calico-node-9n4dh 18m 151Mi
calico-system calico-node-bzwdk 17m 151Mi
calico-system calico-node-s6kx2 16m 93Mi
calico-system calico-typha-865568fbb8-gm8vj 2m 22Mi
calico-system calico-typha-865568fbb8-zc672 1m 21Mi
kube-system coredns-6d8c4cb4d-rlsr8 2m 13Mi
kube-system coredns-6d8c4cb4d-v667h 1m 13Mi
kube-system etcd-k8s-master-01 13m 61Mi
kube-system kube-apiserver-k8s-master-01 41m 365Mi
kube-system kube-controller-manager-k8s-master-01 8m 55Mi
kube-system kube-proxy-2xj99 1m 13Mi
kube-system kube-proxy-pnv4b 1m 19Mi
kube-system kube-proxy-xh4nl 1m 19Mi
kube-system kube-proxy-xqxhw 1m 19Mi
kube-system kube-scheduler-k8s-master-01 2m 19Mi
kube-system metrics-server-75ff6bdcc-2k2mk 3m 16Mi
kubernetes-dashboard dashboard-metrics-scraper-799d786dbf-bxd9q 1m 6Mi
kubernetes-dashboard kubernetes-dashboard-7577b7545-gd7zm 1m 14Mi
tigera-operator tigera-operator-768d489967-jq9v7 2m 25Mi
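With many pods, a per-namespace total is often easier to read than the full list. The sketch below sums the MEMORY(bytes) column by namespace. The function name is invented for this example, and the code assumes every value carries the Mi suffix, as in the output above.

```shell
#!/bin/sh
# memory_by_namespace (hypothetical helper) reads `kubectl top pods -A`
# output on stdin and prints total memory per namespace, largest first.
memory_by_namespace() {
  awk '
    NR > 1 {                          # skip the header row
      mem = $4
      sub(/Mi$/, "", mem)             # "31Mi" -> "31"
      total[$1] += mem
    }
    END { for (ns in total) printf "%s %dMi\n", ns, total[ns] }
  ' | sort -k2,2nr
}

# A few rows copied from the output above.
memory_by_namespace <<'EOF'
NAMESPACE NAME CPU(cores) MEMORY(bytes)
calico-apiserver calico-apiserver-64cd47df68-nfmmq 4m 31Mi
calico-apiserver calico-apiserver-64cd47df68-q7g94 4m 31Mi
kube-system coredns-6d8c4cb4d-rlsr8 2m 13Mi
kube-system etcd-k8s-master-01 13m 61Mi
kube-system kube-apiserver-k8s-master-01 41m 365Mi
EOF
```

For these five rows the result is kube-system 439Mi followed by calico-apiserver 62Mi.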
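You can use kubectl get --raw to fetch the raw resource usage metrics for every node in the cluster. The example below pipes the response through jq for readability, so jq is installed first: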
$ apt install jq -y
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq
{
"kind": "NodeMetricsList",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {},
"items": [
{
"metadata": {
"name": "k8s-master-01",
"creationTimestamp": "2022-01-22T09:25:28Z",
"labels": {
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "k8s-master-01",
"kubernetes.io/os": "linux",
"node-role.kubernetes.io/control-plane": "",
"node-role.kubernetes.io/master": "",
"node.kubernetes.io/exclude-from-external-load-balancers": ""
}
},
"timestamp": "2022-01-22T09:25:09Z",
"window": "10.018s",
"usage": {
"cpu": "167263108n",
"memory": "1345468Ki"
}
},
{
"metadata": {
"name": "k8s-worker-01",
"creationTimestamp": "2022-01-22T09:25:28Z",
"labels": {
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "k8s-worker-01",
"kubernetes.io/os": "linux"
}
},
"timestamp": "2022-01-22T09:25:15Z",
"window": "20.027s",
"usage": {
"cpu": "47595856n",
"memory": "746892Ki"
}
},
{
"metadata": {
"name": "k8s-worker-02",
"creationTimestamp": "2022-01-22T09:25:28Z",
"labels": {
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "k8s-worker-02",
"kubernetes.io/os": "linux"
}
},
"timestamp": "2022-01-22T09:25:10Z",
"window": "10.014s",
"usage": {
"cpu": "44446150n",
"memory": "702516Ki"
}
},
{
"metadata": {
"name": "k8s-worker-03",
"creationTimestamp": "2022-01-22T09:25:28Z",
"labels": {
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "k8s-worker-03",
"kubernetes.io/os": "linux"
}
},
"timestamp": "2022-01-22T09:25:12Z",
"window": "10.012s",
"usage": {
"cpu": "33563159n",
"memory": "1219948Ki"
}
}
]
}
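The raw values use Kubernetes quantity suffixes: n for nanocores (billionths of a CPU core) and Ki for kibibytes. To relate them to the m and Mi units printed by kubectl top, divide by 1,000,000 and by 1,024 respectively. A minimal sketch using shell integer arithmetic follows. The function names are invented here, the code assumes the n and Ki suffixes seen above, and the results are truncated, so they may differ slightly from how kubectl rounds.

```shell
#!/bin/sh
# Hypothetical helpers: convert raw Metrics API quantities to the units
# used by `kubectl top`. Integer division truncates the result.
to_millicores() {
  nano=${1%n}                         # strip the trailing "n"
  echo "$((nano / 1000000))m"
}

to_mebibytes() {
  kib=${1%Ki}                         # strip the trailing "Ki"
  echo "$((kib / 1024))Mi"
}

# Values reported for k8s-master-01 in the JSON above.
to_millicores 167263108n
to_mebibytes 1345468Ki
```

For k8s-master-01 this gives 167m and 1313Mi. The figures do not match the earlier kubectl top output exactly because the two samples were taken at different times.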
Appendix
Configuration flags
Metrics Server supports all the standard Kubernetes API server flags, as well as the standard Kubernetes glog logging flags. The most commonly used are:
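- --logtostderr: log to standard error instead of to files in the container.
- --v=<X>: set the log verbosity. Unless you are troubleshooting an error, run at log level 1 or 2. At log level 10, a large amount of diagnostic information is reported, including API request and response bodies and the raw metric results from the kubelet.
- --secure-port=<port>: set the secure port. When not running as root, set this to something other than the default (port 443).
- --tls-cert-file, --tls-private-key-file: the serving certificate and key files. If they are not specified, self-signed certificates are generated. Use non-self-signed certificates in production.
- --kubelet-certificate-authority: path to the CA certificate used to validate the kubelet serving certificates.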
Additional flags that change the behavior of Metrics Server:
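- --metric-resolution=<duration>: the interval at which metrics are scraped from the kubelets (default 60 seconds).
- --kubelet-insecure-tls: skip verification of the kubelet CA certificate.
- --kubelet-port: the port used to connect to the kubelet (defaults to the default secure kubelet port, 10250).
- --kubelet-preferred-address-types: the order in which kubelet node address types are considered when connecting to the kubelet.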
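These flags are passed as container arguments of the metrics-server Deployment. One way to add a flag to an instance that is already deployed is a JSON patch, sketched below. The helper name args_patch is hypothetical, the sketch assumes metrics-server is the first container in the pod template, and it does not escape special characters in the flag value.

```shell
#!/bin/sh
# args_patch (hypothetical helper) prints a JSON patch that appends one
# flag to the args list of the first container in a Deployment.
args_patch() {
  printf '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"%s"}]\n' "$1"
}

args_patch --v=2

# Example call (needs access to a cluster):
#   kubectl -n kube-system patch deployment metrics-server \
#     --type=json -p "$(args_patch --v=2)"
```

Patching the Deployment triggers a rollout, so the new flag takes effect once the replacement pod is running. Editing the manifest and re-applying it achieves the same result and keeps the file as the source of truth.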