1. Introduction to Prometheus Exporters
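The Prometheus client libraries provide four core metric types:
// Counter: a cumulative count
// Gauge: a value that can go up and down
// Histogram: observations counted into buckets
// Summary: observations summarized as quantiles
// Prometheus metric format
// Format: <metric name>{<label name>=<label value>, ...}
// Example: api_http_requests_total{method="POST", handler="/messages"}
// metric name: a unique identifier that must match [a-zA-Z_:][a-zA-Z0-9_:]*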
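To make the naming rule concrete, here is a minimal sketch that checks a candidate name against the pattern quoted above. The helper name `isValidMetricName` is my own and is not part of any Prometheus library.

```go
package main

import (
	"fmt"
	"regexp"
)

// metricNameRE is the naming rule quoted above, anchored so that the
// whole string has to match.
var metricNameRE = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// isValidMetricName (hypothetical helper) reports whether name follows the rule.
func isValidMetricName(name string) bool {
	return metricNameRE.MatchString(name)
}

func main() {
	for _, name := range []string{"api_http_requests_total", "2xx_responses", "rpc:latency"} {
		fmt.Printf("%-26s valid=%v\n", name, isValidMetricName(name))
	}
}
```

A name may not start with a digit, which is why `2xx_responses` is rejected.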
Counter
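A Counter works like a tally. It fits values such as CPU time, the total number of API requests, or the number of errors: values that only ever increase.
To get something like CPU utilization out of such a counter, apply the rate() function, which computes the per-second average rate of increase of each time series over a past time range.
A Counter represents a cumulative metric: it only goes up, and it resets to 0 when the process restarts.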
Gauge
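The literal Chinese translation of Gauge (计量器, "measuring device") is too close to the translation of Counter (计数器), so I prefer to call it a "dashboard dial" (仪表盘). A Gauge is a value that can go up or down, which makes it a good fit for current memory usage, current CPU usage, current temperature, current speed, and similar metrics.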
Histogram
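A Histogram records how observed values are distributed: it counts the observations that fall into each configured range and also tracks the sum of all observed values.
For example, it can count how many memory-usage samples fell into 0%-30% and how many into 30%-70%.
A histogram with the base name <basename> exposes three kinds of series:
<basename>_count: the total number of observations
<basename>_sum: the sum of all observed values
<basename>_bucket{le="<upper bound>"}: the cumulative number of observations less than or equal to each upper bound; the le="+Inf" bucket always equals <basename>_count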
Summary
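A Summary is similar to a Histogram: it also tracks the number of observations and the sum of their values.
In addition, it reports configurable quantiles, which the client computes over a sliding time window.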
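Histogram vs. Summary
Aspect | Histogram | Summary |
---|---|---|
Configuration | Bucket boundaries | Quantiles and sliding window |
Client-side cost | Low: only increments counters | High: streaming quantile calculation |
Server-side cost | High: quantiles are computed at query time, which can be slow | Low: nothing to compute |
Time series | _sum, _count, _bucket | _sum, _count, quantile |
Quantile error | Depends on the bucket widths | Depends on the error configured for each φ |
Choice of φ and sliding window | Set in the Prometheus expression | Set in the client |
Aggregation | Aggregatable with expressions | Generally not aggregatable |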
2. Prometheus Exporter Development Workflow
Define the metrics struct -> register the metrics -> Describe -> Collect
2.1 Collecting with Describe and Collect
2.1.1 Define the metrics struct
// Metrics is the custom collector. The struct was not shown originally;
// its fields are inferred from how NewMetrics, Describe, and Collect use them.
type Metrics struct {
	metrics         map[string]*prometheus.Desc
	summaryDuration prometheus.Summary
	histogram       *prometheus.Desc
	countDuration   prometheus.Counter
	mutex           sync.Mutex
}

// NewMetrics builds the descriptors and metrics exposed by the collector.
func NewMetrics(namespace string) *Metrics {
	return &Metrics{
		metrics: map[string]*prometheus.Desc{
			"my_counter_metric": newGlobalMetric(namespace, "my_counter_metric", "The description of my_counter_metric", []string{"host"}),
			"my_gauge_metric":   newGlobalMetric(namespace, "my_gauge_metric", "The description of my_gauge_metric", []string{"host"}),
			"hello":             newGlobalMetric(namespace, "hello", "dd", []string{"dddddd"}),
		},
		summaryDuration: prometheus.NewSummary(prometheus.SummaryOpts{
			Namespace:  namespace,
			Name:       "summary_durations",
			Help:       "Durations of summary by the exporter",
			Objectives: map[float64]float64{0.5: 0, 0.9: 0, 0.99: 0},
		}),
		histogram: newGlobalMetric(namespace, "histogram_duration", "histogram_duration", []string{"host"}),
		countDuration: prometheus.NewCounter(prometheus.CounterOpts{
			Namespace: namespace,
			Name:      "count_durations",
			Help:      "Count Durations",
		}),
	}
}

// newGlobalMetric creates a descriptor named <namespace>_<metricName>.
func newGlobalMetric(namespace string, metricName string, docString string, labels []string) *prometheus.Desc {
	return prometheus.NewDesc(namespace+"_"+metricName, docString, labels, nil)
}
2.1.2 Register the metrics
metrics := collector.NewMetrics(*metricsNamespace)
registry := prometheus.NewRegistry()
registry.MustRegister(metrics)
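2.1.3 Implement Describe
Describe only runs for a registered collector: the registry calls it during registration, and it sends the descriptor of every metric into the channel.
func (c *Metrics) Describe(ch chan<- *prometheus.Desc) {
	for _, m := range c.metrics {
		ch <- m
	}
	ch <- c.summaryDuration.Desc()
	ch <- c.histogram
	ch <- c.countDuration.Desc()
}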
2.1.4 Implement Collect
Collect gathers the current metric values and sends each metric into the channel.
func (c *Metrics) Collect(ch chan<- prometheus.Metric) {
	c.mutex.Lock() // serialize concurrent scrapes
	defer c.mutex.Unlock()

	// The mock values are random on every scrape, so the mock "counter"
	// is not actually monotonic.
	mockCounterMetricData, mockGaugeMetricData, hello := c.GenerateMockData()
	for host, currentValue := range mockCounterMetricData {
		ch <- prometheus.MustNewConstMetric(c.metrics["my_counter_metric"], prometheus.CounterValue, float64(currentValue), host)
	}
	for host, currentValue := range mockGaugeMetricData {
		ch <- prometheus.MustNewConstMetric(c.metrics["my_gauge_metric"], prometheus.GaugeValue, float64(currentValue), host)
	}
	for host, currentValue := range hello {
		ch <- prometheus.MustNewConstMetric(c.metrics["hello"], prometheus.GaugeValue, float64(currentValue), host)
	}

	c.countDuration.Inc()
	ch <- c.countDuration
	c.summaryDuration.Observe(float64(rand.Int31n(10)))
	ch <- c.summaryDuration
	//ch <- prometheus.MustNewConstSummary(c.summaryDuration.Desc(), 4711, 403.34, map[float64]float64{0.5: 42.3, 0.9: 323.3})
	ch <- prometheus.MustNewConstHistogram(c.histogram, 4711, 403.34, map[float64]uint64{25: 121, 50: 2403, 100: 3221, 200: 4233}, "200")
}

// GenerateMockData returns random per-host values for the three demo metrics.
func (c *Metrics) GenerateMockData() (mockCounterMetricData map[string]int, mockGaugeMetricData map[string]int, hello map[string]int) {
	mockCounterMetricData = map[string]int{
		"yahoo.com":  int(rand.Int31n(1000)),
		"google.com": int(rand.Int31n(1000)),
	}
	mockGaugeMetricData = map[string]int{
		"yahoo.com":  int(rand.Int31n(10)),
		"google.com": int(rand.Int31n(10)),
	}
	hello = map[string]int{
		"yahoo.com":  int(rand.Int31n(10)),
		"google.com": int(rand.Int31n(10)),
	}
	return
}
2.1.5 Expose the scrape URL
http.Handle(*metricsPath, promhttp.HandlerFor(registry, promhttp.HandlerOpts{}))
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte(`<html>
<head><title>A Prometheus Exporter</title></head>
<body>
<h1>A Prometheus Exporter</h1>
<p><a href='/metrics'>Metrics</a></p>
</body>
</html>`))
})
log.Printf("Starting Server at http://localhost:%s%s", *listenAddr, *metricsPath)
log.Fatal(http.ListenAndServe(":"+*listenAddr, nil))
2.2 Collecting with Vec metrics
2.2.1 Define the metric
rpcDurations = prometheus.NewSummaryVec(
	prometheus.SummaryOpts{
		Name:       "rpc_durations_seconds",
		Help:       "RPC latency distributions.",
		Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
	},
	[]string{"service"},
)
2.2.2 Register the metric
prometheus.MustRegister(rpcDurations)
2.2.3 Record metric data
oscillationFactor := func() float64 {
	return 2 + math.Sin(math.Sin(2*math.Pi*float64(time.Since(start))/float64(*oscillationPeriod)))
}

// Periodically record sample latencies for the "uniform" service.
go func() {
	for {
		v := rand.Float64() * *uniformDomain
		rpcDurations.WithLabelValues("uniform").Observe(v)
		time.Sleep(time.Duration(100*oscillationFactor()) * time.Millisecond)
	}
}()
2.2.4 Expose the URL
http.Handle("/metrics", promhttp.HandlerFor(
	prometheus.DefaultGatherer,
	promhttp.HandlerOpts{
		// Opt into OpenMetrics to support exemplars.
		EnableOpenMetrics: true,
	},
))
log.Fatal(http.ListenAndServe(*addr, nil))
2.3 Collecting with promauto
package main

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"log"
	"net/http"
	"strconv"
)

var (
	MsgRecvTotalCounter = promauto.NewCounter(prometheus.CounterOpts{
		Name: "msg_recv_total",
		Help: "The number of msg received",
	})
)

func main() {
	go func() {
		MsgRecvTotalCounter.Inc()
	}()
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":"+strconv.Itoa(8081), nil))
}