Foreword
As Prometheus monitors more and more components, jobs, and metrics, its demands on compute grow and its storage usage keeps climbing.
At that point you need to tune Prometheus performance and optimize its storage usage. The first thought is often one of the Prometheus-compatible storage solutions, such as Thanos, VictoriaMetrics, or Mimir. In reality, although centralized storage, long-term storage, downsampling, and compression can relieve the problem to some extent, they treat the symptoms rather than the root cause.
- The real problem is that the volume of metrics (series) is simply too large.
The fundamental solution is to reduce the number of metrics. There are two ways to do this:
- Prometheus performance tuning: solving high-cardinality problems
- Based on actual usage, keep only the metrics that are actually used for display (Grafana Dashboards) and alerting (Prometheus rules).
This article focuses on the second approach: how do we trim Prometheus metrics and storage usage based on actual usage?
Approach
- Analyze all metric names currently stored in Prometheus;
- Analyze all metric names used for display, i.e. all metrics used by Grafana's Dashboards;
- Analyze all metric names used for alerting, i.e. all metrics used in the Prometheus Rule configuration;
- (Optional) Analyze all metric names used for diagnostics, i.e. metrics that are frequently queried in the Prometheus UI;
- Use relabeling, via metric_relabel_configs or write_relabel_configs, to keep only the metrics found in steps 2-4, drastically reducing the number of metrics Prometheus needs to store.
To put this idea into practice, you can use Grafana Labs' mimirtool.
Here is a before-and-after comparison from my environment, to show how dramatic the effect can be:
- Before trimming: 270,336 active series
- After trimming: 61,055 active series
- Result: a reduction of nearly 5x!
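If you want to measure the same numbers in your own environment, one quick check is Prometheus's own prometheus_tsdb_head_series metric. A minimal sketch, assuming the same Prometheus NodePort address used later in this article and that curl and jq are available:

```sh
# Query the current number of in-memory (head) series; run before and after trimming
curl -s 'http://172.16.0.20:30090/api/v1/query?query=prometheus_tsdb_head_series' \
  | jq -r '.data.result[0].value[1]'
```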
Grafana Mimirtool
Grafana Mimir is a Prometheus long-term storage solution based on object storage that evolved from Cortex. It officially claims to support writing, storing, and querying on the order of a billion series.
Grafana Mimirtool is a utility shipped with Mimir that can also be used on its own.
Grafana Mimirtool supports extracting metrics from:
- Grafana Dashboards in a Grafana instance (via Grafana API)
- Prometheus alerting and recording rules in a Mimir instance
- Grafana Dashboards JSON file
- YAML files of Prometheus alerting and recording rules
Grafana Mimirtool can then compare these extracted metrics with the active series in a Prometheus or Grafana Cloud Prometheus instance, and output a list of used and unused metrics.
Trimming Prometheus Metrics in Practice
Assumptions
This article assumes the following:
- Prometheus is installed via kube-prometheus-stack
- Grafana is installed and used for dashboards
- The relevant alerting rules have been configured
- Beyond that, no other metrics need to be retained
Prerequisites
- Grafana Mimirtool: find the mimirtool build for your platform on the releases page and download it (see the sketch after this list);
- A Grafana API token has already been created;
- Prometheus is installed and configured.
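For reference, a hedged sketch of the mimirtool download step, assuming a Linux amd64 host; pick the asset that matches your platform from the grafana/mimir releases page:

```sh
# Download the mimirtool binary from the grafana/mimir GitHub releases page
curl -fLo mimirtool \
  https://github.com/grafana/mimir/releases/latest/download/mimirtool-linux-amd64
chmod +x mimirtool
./mimirtool --help   # sanity check
```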
Step 1: Analyze the metrics used by Grafana Dashboards
Via the Grafana API
The details are as follows:
```sh
# Analyze the metrics used by Grafana via the Grafana API
# Prerequisite: an API key has already been created in Grafana
mimirtool analyze grafana --address http://172.16.0.20:32651 --key=eyJrIjoiYjBWMGVoTHZTY3BnM3V5UzNVem9iWDBDSG5sdFRxRVoiLCJuIjoibWltaXJ0b29sIiwiaWQiOjF9
```
Description:
- http://172.16.0.20:32651 is the Grafana address
- --key=eyJr... is the Grafana API token, created beforehand in Grafana (one command-line way to create it is sketched below).
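If you prefer to create the token from the command line, here is a hedged sketch using Grafana's legacy API-keys endpoint (newer Grafana versions use service-account tokens instead); the admin credentials shown are placeholders:

```sh
# Create a Viewer-role API key named "mimirtool"; the JSON response contains a
# "key" field whose value is what you pass to mimirtool via --key.
curl -s -X POST http://172.16.0.20:32651/api/auth/keys \
  -u admin:admin \
  -H 'Content-Type: application/json' \
  -d '{"name": "mimirtool", "role": "Viewer"}'
```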
The mimirtool analyze grafana command produces a metrics-in-grafana.json file, whose content is summarized as follows:
{ "metricsUsed": [ ":node_memory_MemAvailable_bytes:sum", "alertmanager_alerts", "alertmanager_alerts_invalid_total", "alertmanager_alerts_received_total", "alertmanager_notification_latency_seconds_bucket", "alertmanager_notification_latency_seconds_count", "alertmanager_notification_latency_seconds_sum", "alertmanager_notifications_failed_total", "alertmanager_notifications_total", "cluster", "cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits", "cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests", "cluster:namespace:pod_memory:active:kube_pod_container_resource_limits", "cluster:namespace:pod_memory:active:kube_pod_container_resource_requests", "cluster:node_cpu:ratio_rate5m", "container_cpu_cfs_periods_total", "container_cpu_cfs_throttled_periods_total", "..." ], "dashboards": [ { "slug": "", "uid": "alertmanager-overview", "title": "Alertmanager / Overview", "metrics": [ "alertmanager_alerts", "alertmanager_alerts_invalid_total", "alertmanager_alerts_received_total", "alertmanager_notification_latency_seconds_bucket", "alertmanager_notification_latency_seconds_count", "alertmanager_notification_latency_seconds_sum", "alertmanager_notifications_failed_total", "alertmanager_notifications_total" ], "parse_errors": null }, { "slug": "", "uid": "c2f4e12cdf69feb95caa41a5a1b423d9", "title": "etcd", "metrics": [ "etcd_disk_backend_commit_duration_seconds_bucket", "etcd_disk_wal_fsync_duration_seconds_bucket", "etcd_mvcc_db_total_size_in_bytes", "etcd_network_client_grpc_received_bytes_total", "etcd_network_client_grpc_sent_bytes_total", "etcd_network_peer_received_bytes_total", "etcd_network_peer_sent_bytes_total", "etcd_server_has_leader", "etcd_server_leader_changes_seen_total", "etcd_server_proposals_applied_total", "etcd_server_proposals_committed_total", "etcd_server_proposals_failed_total", "etcd_server_proposals_pending", "grpc_server_handled_total", "grpc_server_started_total", "process_resident_memory_bytes" ], "parse_errors": null }, {...} ] }
(Optional) Via Grafana dashboard JSON files
If a Grafana API token cannot be created, you can still run the analysis as long as you have the Grafana dashboard JSON files. For example:
```sh
# Analyze the metrics used by Grafana through dashboard JSON files
mimirtool analyze dashboard grafana_dashboards/blackboxexporter-probe.json
mimirtool analyze dashboard grafana_dashboards/es.json
```
The resulting JSON structure is similar to the previous section, so it is not repeated here.
Step 2: Analyze the metrics used by Prometheus Alerting and Recording Rules
The steps are as follows:
```sh
# (Optional) Copy the rule files in use to the local machine via kubectl cp
kubectl cp <prompod>:/etc/prometheus/rules/<releasename>-kube-prometheus-st-prometheus-rulefiles-0 -c prometheus ./kube-prometheus-stack/rulefiles/

# Analyze the metrics used by the Prometheus rule files (covers both recording rules and alerting rules)
mimirtool analyze rule-file ./kube-prometheus-stack/rulefiles/*
```
The result is a metrics-in-ruler.json file like the following:
{ "metricsUsed": [ "ALERTS", "aggregator_unavailable_apiservice", "aggregator_unavailable_apiservice_total", "apiserver_client_certificate_expiration_seconds_bucket", "apiserver_client_certificate_expiration_seconds_count", "apiserver_request_terminations_total", "apiserver_request_total", "blackbox_exporter_config_last_reload_successful", "..." ], "ruleGroups": [ { "namspace": "default-monitor-kube-prometheus-st-kubernetes-apps-ae2b16e5-41d8-4069-9297-075c28c6969e", "name": "kubernetes-apps", "metrics": [ "kube_daemonset_status_current_number_scheduled", "kube_daemonset_status_desired_number_scheduled", "kube_daemonset_status_number_available", "kube_daemonset_status_number_misscheduled", "kube_daemonset_status_updated_number_scheduled", "..." ] "parse_errors": null }, { "namspace": "default-monitor-kube-prometheus-st-kubernetes-resources-ccb4a7bc-f2a0-4fe4-87f7-0b000468f18f", "name": "kubernetes-resources", "metrics": [ "container_cpu_cfs_periods_total", "container_cpu_cfs_throttled_periods_total", "kube_node_status_allocatable", "kube_resourcequota", "namespace_cpu:kube_pod_container_resource_requests:sum", "namespace_memory:kube_pod_container_resource_requests:sum" ], "parse_errors": null }, {...} ] }
Step 3: Analyze Unused Metrics
The details are as follows:
```sh
# Compare everything Prometheus actually scrapes against what is used for
# display (Grafana Dashboards) and for recording/alerting (rule files)
mimirtool analyze prometheus \
  --address=http://172.16.0.20:30090/ \
  --grafana-metrics-file="metrics-in-grafana.json" \
  --ruler-metrics-file="metrics-in-ruler.json"
```
Description:
- --address=http://172.16.0.20:30090/ is the Prometheus address
- --grafana-metrics-file="metrics-in-grafana.json" is the JSON file obtained in step 1
- --ruler-metrics-file="metrics-in-ruler.json" is the JSON file obtained in step 2
The output prometheus-metrics.json is as follows:
{ "total_active_series": 270336, "in_use_active_series": 61055, "additional_active_series": 209281, "in_use_metric_counts": [ { "metric": "rest_client_request_duration_seconds_bucket", "count": 8855, "job_counts": [ { "job": "kubelet", "count": 4840 }, { "job": "kube-controller-manager", "count": 1958 }, {...} ] }, { "metric": "grpc_server_handled_total", "count": 4394, "job_counts": [ { "job": "kube-etcd", "count": 4386 }, { "job": "default/kubernetes-ebao-ebaoops-pods", "count": 8 } ] }, {...} ], "additional_metric_counts": [ { "metric": "rest_client_rate_limiter_duration_seconds_bucket", "count": 81917, "job_counts": [ { "job": "kubelet", "count": 53966 }, { "job": "kube-proxy", "count": 23595 }, { "job": "kube-scheduler", "count": 2398 }, { "job": "kube-controller-manager", "count": 1958 } ] }, { "metric": "rest_client_rate_limiter_duration_seconds_count", "count": 7447, "job_counts": [ { "job": "kubelet", "count": 4906 }, { "job": "kube-proxy", "count": 2145 }, { "job": "kube-scheduler", "count": 218 }, { "job": "kube-controller-manager", "count": 178 } ] }, {...} ] }
Step 4: Keep only the metrics that are used
Configuring it in write_relabel_configs
If you use remote_write, configure a keep relabel rule directly under write_relabel_configs. It is simple and crude, but effective.
You can first use the jq command to get all the metric names that need to be kept:
```sh
jq '.metricsUsed' metrics-in-grafana.json \
  | tr -d '", ' \
  | sed '1d;$d' \
  | grep -v 'grafanacloud*' \
  | paste -s -d '|' -
```
The output is similar to the following:
```
instance:node_cpu_utilisation:rate1m|instance:node_load1_per_cpu:ratio|instance:node_memory_utilisation:ratio|instance:node_network_receive_bytes_excluding_lo:rate1m|instance:node_network_receive_drop_excluding_lo:rate1m|instance:node_network_transmit_bytes_excluding_lo:rate1m|instance:node_network_transmit_drop_excluding_lo:rate1m|instance:node_vmstat_pgmajfault:rate1m|instance_device:node_disk_io_time_seconds:rate1m|instance_device:node_disk_io_time_weighted_seconds:rate1m|node_cpu_seconds_total|node_disk_io_time_seconds_total|node_disk_read_bytes_total|node_disk_written_bytes_total|node_filesystem_avail_bytes|node_filesystem_size_bytes|node_load1|node_load15|node_load5|node_memory_Buffers_bytes|node_memory_Cached_bytes|node_memory_MemAvailable_bytes|node_memory_MemFree_bytes|node_memory_MemTotal_bytes|node_network_receive_bytes_total|node_network_transmit_bytes_total|node_uname_info|up
```
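Note that the pipeline above only reads metrics-in-grafana.json. A hedged variant that also merges in metrics-in-ruler.json, so metrics used only by alerting and recording rules are not accidentally dropped:

```sh
# Build the keep regex from both the dashboard and the rule-file analysis results
jq -r '.metricsUsed[]' metrics-in-grafana.json metrics-in-ruler.json \
  | sort -u \
  | grep -v '^grafanacloud' \
  | paste -s -d '|' -
```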
Then configure the keep relabel rule directly under write_relabel_configs:
```yaml
remote_write:
  - url: <remote_write endpoint>
    basic_auth:
      username: <on demand>
      password: <on demand>
    write_relabel_configs:
      - source_labels: [__name__]
        regex: instance:node_cpu_utilisation:rate1m|instance:node_load1_per_cpu:ratio|instance:node_memory_utilisation:ratio|instance:node_network_receive_bytes_excluding_lo:rate1m|instance:node_network_receive_drop_excluding_lo:rate1m|instance:node_network_transmit_bytes_excluding_lo:rate1m|instance:node_network_transmit_drop_excluding_lo:rate1m|instance:node_vmstat_pgmajfault:rate1m|instance_device:node_disk_io_time_seconds:rate1m|instance_device:node_disk_io_time_weighted_seconds:rate1m|node_cpu_seconds_total|node_disk_io_time_seconds_total|node_disk_read_bytes_total|node_disk_written_bytes_total|node_filesystem_avail_bytes|node_filesystem_size_bytes|node_load1|node_load15|node_load5|node_memory_Buffers_bytes|node_memory_Cached_bytes|node_memory_MemAvailable_bytes|node_memory_MemFree_bytes|node_memory_MemTotal_bytes|node_network_receive_bytes_total|node_network_transmit_bytes_total|node_uname_info|up
        action: keep
```
Configuring it in metric_relabel_configs
If remote_write is not used, the rule can only be configured under metric_relabel_configs.
Take the etcd job as an example (shown as raw Prometheus configuration; if you use the Prometheus Operator, adjust accordingly, as sketched after the block below):
```yaml
- job_name: serviceMonitor/default/monitor-kube-prometheus-st-kube-etcd/0
  honor_labels: false
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - kube-system
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /etc/prometheus/secrets/etcd-certs/ca.crt
    cert_file: /etc/prometheus/secrets/etcd-certs/healthcheck-client.crt
    key_file: /etc/prometheus/secrets/etcd-certs/healthcheck-client.key
  relabel_configs:
    - source_labels:
        - job
      target_label: __tmp_prometheus_job_name
    - ...
  metric_relabel_configs:
    - source_labels: [__name__]
      regex: etcd_disk_backend_commit_duration_seconds_bucket|etcd_disk_wal_fsync_duration_seconds_bucket|etcd_mvcc_db_total_size_in_bytes|etcd_network_client_grpc_received_bytes_total|etcd_network_client_grpc_sent_bytes_total|etcd_network_peer_received_bytes_total|etcd_network_peer_sent_bytes_total|etcd_server_has_leader|etcd_server_leader_changes_seen_total|etcd_server_proposals_applied_total|etcd_server_proposals_committed_total|etcd_server_proposals_failed_total|etcd_server_proposals_pending|grpc_server_handled_total|grpc_server_started_total|process_resident_memory_bytes|etcd_http_failed_total|etcd_http_received_total|etcd_http_successful_duration_seconds_bucket|etcd_network_peer_round_trip_time_seconds_bucket|grpc_server_handling_seconds_bucket|up
      action: keep
```
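If you manage scrape configs through the Prometheus Operator rather than editing raw Prometheus configuration, the equivalent lives in the ServiceMonitor's metricRelabelings field, which the Operator renders into the metric_relabel_configs shown above. A hedged excerpt (not a complete manifest); the resource name and port name are assumptions based on kube-prometheus-stack defaults:

```yaml
# Relevant excerpt of a ServiceMonitor; adjust names to your own release.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: monitor-kube-prometheus-st-kube-etcd   # assumption: match your release name
  namespace: default
spec:
  endpoints:
    - port: http-metrics                       # assumption: the etcd metrics port name
      scheme: https
      metricRelabelings:
        - sourceLabels: [__name__]
          regex: etcd_disk_backend_commit_duration_seconds_bucket|...|up   # same full regex as above
          action: keep
```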
Use drop instead of keep
Similarly, you can use drop instead of keep to discard only the metrics you know you do not need; the mechanics are the same. For example:
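A minimal sketch of the opposite approach, dropping the two heaviest unused metrics found in step 3 (taken from additional_metric_counts above) and keeping everything else:

```yaml
metric_relabel_configs:
  - source_labels: [__name__]
    # The two largest unused metrics from the step 3 analysis in this article
    regex: rest_client_rate_limiter_duration_seconds_bucket|rest_client_rate_limiter_duration_seconds_count
    action: drop
```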
Summary
This article introduced the motivation for trimming Prometheus metrics, showed how to use the mimirtool analyze commands to determine which metrics are used by Grafana Dashboards and Prometheus rules, then used mimirtool analyze prometheus to compare the used and unused active series, and finally configured Prometheus to keep only the metrics that are actually used.
In this hands-on example, the reduction was roughly 5x, which is a very noticeable effect. It is well worth trying.
Reference Documentation
- grafana/mimir: Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus. (github.com)
- Analyzing and reducing metrics usage with Grafana Mimirtool | Grafana Cloud documentation
Among any three people walking together, there is always one I can learn from; knowledge is meant to be shared. This article was written by the Dongfeng Weiming technology blog, EWhisper.cn.