Use Prometheus to monitor web sites and certificate expiration


In this article, web monitoring means monitoring specific URLs or API endpoints. The examples below show how to configure Prometheus, blackbox_exporter, and Grafana to monitor the following aspects of a site:

  1. Status Code
  2. Response time
  3. Certificate expiration time

The final result looks like this (the screenshots are best viewed on a computer):


Detailed view of single site monitoring

Enterprise WeChat alert screenshot

Prometheus web monitoring relies on blackbox_exporter

Of course, blackbox_exporter can do much more than monitor web sites. It can also probe TCP ports, DNS, ICMP, and so on. Here is the official description:

The blackbox exporter allows blackbox probing of endpoints over HTTP, HTTPS, DNS, TCP and ICMP.

We're only focusing on web monitoring here. Other monitoring is not covered in this article.

Configuration details

Configuration is roughly divided into the following steps:

  1. Install blackbox_exporter
  2. Configure the monitoring target addresses
  3. Configure alarm rules
  4. Configure the Grafana dashboards

Install and configure blackbox_exporter

I deploy with docker-compose here (my Prometheus, Grafana, Alertmanager, and so on are all defined in one compose file; only blackbox_exporter is shown here):

version: '3.7'
services:
  blackbox_exporter:
    container_name: blackbox_exporter
    image: prom/blackbox-exporter:master
    volumes:
      - /data/monitor/blackbox_exporter/config/config.yml:/etc/blackbox_exporter/config.yml
    ports:
      - 9115:9115

The contents of the configuration file /data/monitor/blackbox_exporter/config/config.yml are as follows:

modules:
  http_2xx:      # Module name, referenced later in the Prometheus configuration file
    prober: http # Prober type; available probers include http, tcp, icmp, and dns, each with different capabilities
    timeout: 5s  # Probe timeout
    http:
      valid_status_codes: [] # Accepted status codes; an empty list defaults to 2xx. Define your own if, say, 304 is a normal response for your site
      method: GET            # Probe with an HTTP GET request
      fail_if_body_not_matches_regexp: [] # Regular expressions matched against the response body; the probe fails if any of them does not match
      tls_config:
        insecure_skip_verify: true        # Skip certificate verification for HTTPS, for example when a certificate is invalid or expired. A browser would ask you to confirm before continuing; this is the equivalent.
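
To make the valid_status_codes behaviour concrete, here is a minimal Python sketch of the check as described in the comments above. The function name is_valid_status is hypothetical and is not part of blackbox_exporter; it only mirrors the rule that an empty list accepts any 2xx code.

```python
def is_valid_status(code, valid_status_codes=()):
    """Return True if a status code would be accepted by the probe.

    Hypothetical helper for illustration only: an empty list falls back
    to accepting any 2xx code, otherwise the code must be listed.
    """
    if not valid_status_codes:
        return 200 <= code <= 299
    return code in valid_status_codes


print(is_valid_status(200))              # accepted by the 2xx default
print(is_valid_status(304))              # rejected by the 2xx default
print(is_valid_status(304, [200, 304]))  # accepted once 304 is listed
```

So a site that normally answers 304 needs its own module with an explicit list, or it will be reported as failing.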

Configure Prometheus to use blackbox_exporter

Add the following to the prometheus.yml configuration file:

  - job_name: 'http_status'  # Job name
    metrics_path: /probe     # Path to scrape metrics from
    params:
      module: [http_2xx]     # The module name we defined in blackbox_exporter
    file_sd_configs:         # There are many addresses to monitor, so they live in a separate file, described later
      - files:
        - '/etc/prometheus/etc.d/job_web.yaml'
        refresh_interval: 30s # Re-read every 30 seconds; new monitoring addresses are loaded automatically without a restart
    relabel_configs:
      - source_labels: [__address__]  # The address of the current target, e.g. https://baidu.com when monitoring Baidu
        target_label: __param_target  # __param_ is the prefix for URL parameters and target is the parameter name; in effect the value of __address__ is assigned to __param_target, so when monitoring Baidu, target=https://baidu.com
      - source_labels: [__param_target]
        target_label: instance        # In effect, the value of __param_target is assigned to the instance label
      - target_label: __address__
        replacement: 172.33.0.33:9115 # The target was originally the site's own address, but Prometheus does not request the site directly; it requests blackbox_exporter, so the target address must be replaced with the address of blackbox_exporter
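
The three relabeling steps can be easier to follow as plain assignments. The Python sketch below only simulates them on a dictionary of labels; apply_relabeling is a hypothetical name, and Prometheus itself does this internally rather than through any such function.

```python
def apply_relabeling(labels, exporter_address):
    """Simulate the three relabel_configs steps on a copy of the labels."""
    result = dict(labels)
    # Step 1: copy the site address into __param_target (sent as ?target=...).
    result["__param_target"] = result["__address__"]
    # Step 2: copy __param_target into instance, so metrics carry the site URL.
    result["instance"] = result["__param_target"]
    # Step 3: point the scrape itself at blackbox_exporter.
    result["__address__"] = exporter_address
    return result


discovered = {"__address__": "https://www.baidu.com/", "env": "pro"}
relabeled = apply_relabeling(discovered, "172.33.0.33:9115")
for name, value in sorted(relabeled.items()):
    print(f"{name} = {value}")
```

The order matters: the site address has to be copied out of __address__ before the last step overwrites it.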

job_web.yaml sample

---
- targets:
  - https://www.baidu.com/
  labels:
    env: pro
    app: web
    project: Baidu
    desc: Baidu Production
- targets:
  - https://blog.csdn.net/
  labels:
    env: test
    app: web
    project: CSDN
    desc: Have a test
    not_200: "yes" # This custom label marks addresses that normally do not return a 200 status code (quoted so it is always read as a string)

Once the configuration is complete, you will see targets like these in Prometheus:

Click into any one of them and you will see that the final probe URL looks like this:

http://172.33.0.33:9115/probe?module=http_2xx&target=https%3A%2F%2Fwww.baidu.com
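
As a rough illustration of how that URL is put together from the module name and the target, the following Python sketch rebuilds it with the standard library. build_probe_url is a hypothetical helper, and the exporter address is the one used in this article.

```python
from urllib.parse import urlencode


def build_probe_url(exporter, module, target):
    """Build the /probe URL that Prometheus ends up requesting."""
    # urlencode percent-encodes the target, so ':' and '/' become %3A and %2F.
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter}/probe?{query}"


url = build_probe_url("172.33.0.33:9115", "http_2xx", "https://www.baidu.com")
print(url)
```

Opening such a URL by hand in a browser is a quick way to check a single target.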

Screenshot of the metrics
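
The probe page returns plain text with one metric per line. The Python sketch below parses such text into a dictionary. parse_probe_metrics is a hypothetical helper, and the values in SAMPLE are made up for illustration, not real probe output; only the metric names come from this article.

```python
def parse_probe_metrics(text):
    """Parse 'name value' lines into a dict, skipping comments and blanks."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is whatever follows the last space on the line.
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


# Made-up sample values, for illustration only.
SAMPLE = """\
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.42
probe_http_status_code 200
probe_ssl_earliest_cert_expiry 1.6561e+09
"""

metrics = parse_probe_metrics(SAMPLE)
print(metrics["probe_http_status_code"])
```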

Configure alarm rules

Alert Rule File

groups:
- name: web
  rules:
  - alert: Web Access Exception
    expr: probe_http_status_code{not_200!="yes"} != 200
    for: 30s
    annotations:
      summary: "Web Access Exception: {{ $labels.instance }}"
    labels:
      Severity: 'serious'
  - alert: Web Access Response Time>3s
    expr: probe_duration_seconds >= 3
    for: 30s
    annotations:
      summary: "Web Response Exception: {{ $labels.instance }}"
    labels:
      Severity: 'warning'
  - alert: Certificate expiration time<30 day
    expr: probe_ssl_earliest_cert_expiry - time() < 3600*24*30
    annotations:
      summary: "Web Certificate will expire in 30 days: {{ $labels.instance }}"
    labels:
      Severity: 'remind'
  - alert: Certificate expiration time<7 day
    expr: probe_ssl_earliest_cert_expiry - time() < 3600*24*7
    annotations:
      summary: "Web Certificate will expire in 7 days: {{ $labels.instance }}"
    labels:
      Severity: 'serious'
  - alert: Certificate expiration time<1 day
    expr: probe_ssl_earliest_cert_expiry - time() < 3600*24*1
    annotations:
      summary: "Web Certificate will expire in 1 day: {{ $labels.instance }}"
    labels:
      Severity: 'disaster'
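
The certificate rules all use the same arithmetic: the expiry timestamp minus the current time, compared against a number of seconds. The Python sketch below restates that with the same thresholds and Severity values. cert_severity is a hypothetical helper, and unlike the rules, which can fire at the same time, it returns only the most severe match.

```python
SECONDS_PER_DAY = 24 * 3600


def cert_severity(expiry_ts, now):
    """Return the most severe Severity label that applies, or None."""
    remaining = expiry_ts - now  # same as probe_ssl_earliest_cert_expiry - time()
    if remaining < 1 * SECONDS_PER_DAY:
        return "disaster"
    if remaining < 7 * SECONDS_PER_DAY:
        return "serious"
    if remaining < 30 * SECONDS_PER_DAY:
        return "remind"
    return None


now = 1_700_000_000  # fixed timestamp so the example is reproducible
for days in (0.5, 3, 20, 60):
    print(days, cert_severity(now + days * SECONDS_PER_DAY, now))
```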

Configure Grafana

For web monitoring in Grafana I made two dashboards, shown in the screenshots above:

  • Site Availability Observing Center
  • Site Availability - Single Site

One is an overview of all sites; the other shows the details of a single site.

Clicking in the overview takes you to the details page.


Configuring Grafana is tedious, so the dashboard from this article is available for download on the Grafana website: Web Monitor Center dashboard for Grafana | Grafana Labs

Note: in my experience, Grafana's compatibility across versions is not very good. Many dashboards downloaded from the Grafana website turn out to be incompatible, so you may run into compatibility problems with this one too. My current Grafana version is v7.5.3.

Tags: Docker DevOps monitor and control

Posted by grantf on Tue, 22 Jun 2021 01:55:15 +0930