How To Manage Prometheus Counters

Better Stack Team
Updated on November 29, 2024

Prometheus counters are cumulative metrics whose value only increases, or returns to zero when the process restarts. They are ideal for tracking values like requests, errors, or completed tasks. Defining, incrementing, and querying them correctly keeps the resulting metrics accurate and meaningful.

1. Define counters in your code

Use Prometheus client libraries to define counters. Below is an example in Python:

 
from prometheus_client import Counter

# Define a counter
request_counter = Counter('http_requests_total', 'Total HTTP requests', ['method', 'endpoint'])

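The third argument (labelnames in the Python client) declares the label dimensions. Each distinct combination of label values becomes its own time series, so keep the set of possible values small.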

2. Increment counters

Increment counters using the .inc() method whenever an event occurs. You can increment by one or by a specific non-negative value; the Python client raises a ValueError if you pass a negative amount:

 
# Increment by 1
request_counter.labels(method='GET', endpoint='/home').inc()

# Increment by a specific value
request_counter.labels(method='POST', endpoint='/submit').inc(5)
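
To see the definition and increment steps working together, here is a minimal sketch. The handle_request function and the private registry are assumptions added for illustration; they are not part of the original example. A private CollectorRegistry keeps the sketch isolated from the global default registry, and get_sample_value reads the current value back, which is handy in unit tests.

```python
from prometheus_client import CollectorRegistry, Counter

# A private registry keeps this sketch isolated from the global default registry.
registry = CollectorRegistry()

request_counter = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint"],
    registry=registry,
)


def handle_request(method: str, endpoint: str) -> None:
    """Hypothetical handler: count the request, then do the real work."""
    request_counter.labels(method=method, endpoint=endpoint).inc()


for _ in range(3):
    handle_request("GET", "/home")
handle_request("POST", "/submit")

# Read the current values back from the registry.
home_gets = registry.get_sample_value(
    "http_requests_total", {"method": "GET", "endpoint": "/home"}
)
submit_posts = registry.get_sample_value(
    "http_requests_total", {"method": "POST", "endpoint": "/submit"}
)
print(home_gets, submit_posts)  # 3.0 1.0
```

Each label combination is tracked independently, so the GET and POST counts never mix.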

3. Reset counters

A counter's value normally returns to zero only when the application restarts. Resetting a counter by hand during normal operation is discouraged, because every artificial reset looks like a restart to Prometheus and misrepresents the data. If you need reset-like behavior, or a value that can go down, use a gauge instead.
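
As a rough illustration of the gauge alternative, the sketch below tracks a value that can rise, fall, and be set back to zero. The metric name jobs_in_queue and the private registry are hypothetical choices made for this example.

```python
from prometheus_client import CollectorRegistry, Gauge

# Private registry so the sketch does not touch the global default registry.
registry = CollectorRegistry()

# Hypothetical metric: number of jobs currently waiting in a queue.
queue_depth = Gauge(
    "jobs_in_queue",
    "Jobs currently waiting in the queue",
    registry=registry,
)

queue_depth.inc(4)  # four jobs enqueued
queue_depth.dec()   # one job picked up by a worker
depth_before = registry.get_sample_value("jobs_in_queue")

queue_depth.set(0)  # queue drained: an explicit reset is fine for a gauge
depth_after = registry.get_sample_value("jobs_in_queue")

print(depth_before, depth_after)  # 3.0 0.0
```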

4. Use PromQL for counter analysis

Prometheus counters track cumulative values. Use PromQL functions like rate() or irate() to compute the rate of increase over time.

Example queries:

  • Current cumulative request count (one series per label combination):
 
  http_requests_total
  • Requests per second (average over 5 minutes):
 
  rate(http_requests_total[5m])
  • Requests per second (instantaneous rate, based on the last two samples in the window):
 
  irate(http_requests_total[1m])
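
The difference between the two rate functions can be sketched in plain Python. This is a simplified model, not Prometheus's actual implementation: it ignores counter resets and the extrapolation to window boundaries that rate() performs. The sample data and function names are made up for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def average_rate(samples: List[Sample]) -> float:
    """Per-second increase between the first and last sample (rate-like)."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def instant_rate(samples: List[Sample]) -> float:
    """Per-second increase between the last two samples (irate-like)."""
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# Made-up scrape data: steady traffic, then a burst in the final 15 seconds.
samples = [(0.0, 100.0), (15.0, 115.0), (30.0, 130.0), (45.0, 145.0), (60.0, 220.0)]

avg = average_rate(samples)
inst = instant_rate(samples)
print(avg, inst)  # 2.0 5.0
```

The averaged value smooths the burst across the whole window, while the instantaneous value reacts to it immediately. That is why averaged rates tend to suit alerts and instantaneous rates tend to suit zoomed-in graphs.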

5. Handle counter resets in PromQL

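When a counter resets (e.g., due to a restart), its raw value drops back to zero. The stored samples are not adjusted; instead, functions such as rate(), irate(), and increase() detect the drop and compensate for it. Always query counters through one of these functions rather than subtracting raw values: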

 
rate(http_requests_total[1m])

This handles resets and calculates the rate based on increasing values.
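
The idea behind this reset handling can be sketched as follows. It is a simplified model that assumes any drop between consecutive samples is a restart from zero; the function name and sample values are hypothetical.

```python
from typing import List


def increase_with_resets(values: List[float]) -> float:
    """Total growth across consecutive samples, treating any drop as a reset."""
    total = 0.0
    for prev, curr in zip(values, values[1:]):
        if curr < prev:
            # The counter restarted from zero, so everything it shows now is new growth.
            total += curr
        else:
            total += curr - prev
    return total


# Made-up samples: the process restarts between the third and fourth scrape.
values = [100.0, 120.0, 140.0, 10.0, 30.0]

naive = values[-1] - values[0]            # subtracting raw values goes negative
corrected = increase_with_resets(values)  # 20 + 20 + 10 + 20
print(naive, corrected)  # -70.0 70.0
```

Any increments that happened after the last scrape before the restart are lost, so the corrected figure is a lower bound rather than an exact count.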

6. Monitor counter behavior

Set up alerts to notify you if counters behave unexpectedly.

Example alert rule:

 
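# Assumes http_requests_total carries a status label.
alert: HighErrorRate
expr: sum(rate(http_requests_total{status="500"}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
for: 2m
labels:
  severity: warning
annotations:
  summary: "High error rate detected"
  description: "More than 5% of requests returned HTTP 500, measured over a 5-minute window."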

Best practices

  • Use meaningful names and labels for counters to ensure clear insights.
  • Avoid using counters for metrics that decrease (use gauges for such cases).
  • Use rate() or irate() functions for time-based analysis.
  • Combine counters with histograms if you need latency or distribution data.