How to Calculate Containers' CPU Usage in Kubernetes With Prometheus as Monitoring?
To calculate CPU usage for containers in a Kubernetes cluster using Prometheus, you need to set up Prometheus to scrape metrics from your Kubernetes nodes and pods. Here’s a step-by-step guide on how to achieve this, including a sample query to calculate CPU usage.
Step 1: Set Up Prometheus
If you haven't already set up Prometheus in your Kubernetes cluster, you can use the Prometheus Operator or a Helm chart for an easier installation. The following steps assume that you have Prometheus running and collecting metrics from your Kubernetes cluster.
Step 2: Scrape Metrics from kubelet
Ensure that your Prometheus configuration is set up to scrape metrics from the Kubernetes kubelet. The kubelet exposes per-container metrics for every container running on its node through its embedded cAdvisor, served at the /metrics/cadvisor path.
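Here’s a snippet of what your prometheus.yml might look like:
scrape_configs:
  - job_name: 'kubelet'
    kubernetes_sd_configs:
      - role: node
    # Container metrics such as container_cpu_usage_seconds_total come from
    # the kubelet's cAdvisor endpoint, not from the plain /metrics path.
    metrics_path: /metrics/cadvisor
    scheme: https
    tls_config:
      # Skips certificate verification; acceptable for testing only.
      insecure_skip_verify: true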
Step 3: Use the Right Metrics
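Prometheus collects various metrics from the containers. The most relevant one for CPU usage is:
container_cpu_usage_seconds_total: This metric is a counter of the cumulative CPU time, in seconds, consumed by the container. Because it is a counter, its raw value only ever increases, so it has to be turned into a rate before it says anything about current usage.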
Step 4: Calculate CPU Usage
To calculate CPU usage for containers, you generally want to look at the rate of CPU usage over a specific time interval. This can be done using the rate function in PromQL.
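To make the idea of a per-second rate over a counter concrete, here is a simplified Python sketch. It is not how Prometheus implements rate (the real function also extrapolates to the edges of the time window); the name simple_rate and the sample values are made up for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase of a counter across the given samples.

    Simplified: a drop in value is treated as a counter reset (the counter
    restarted from zero), and no extrapolation to window edges is done.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")

    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # After a reset the counter starts again from 0, so the whole
        # current value counts as increase.
        increase += curr - prev if curr >= prev else curr

    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    return increase / elapsed


# Hypothetical samples: 150 CPU-seconds consumed over 300 s of wall time.
samples = [(0.0, 1000.0), (150.0, 1075.0), (300.0, 1150.0)]
print(simple_rate(samples))  # 0.5, i.e. half a CPU core on average
```

Since the counter is measured in CPU-seconds, dividing by elapsed seconds gives a value in CPU cores.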
Here’s a sample query to calculate CPU usage per pod, averaged over the last 5 minutes:
sum(rate(container_cpu_usage_seconds_total{job="kubelet", cluster="", container!="POD"}[5m])) by (pod, namespace)
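If you want to run this query from a script rather than the Prometheus UI, one option is the instant-query endpoint of the Prometheus HTTP API (/api/v1/query). The sketch below only builds the request URL. The base URL http://localhost:9090 and the helper name instant_query_url are assumptions for illustration, so substitute the address of your own Prometheus server.

```python
from urllib.parse import urlencode

QUERY = (
    'sum(rate(container_cpu_usage_seconds_total'
    '{job="kubelet", cluster="", container!="POD"}[5m])) by (pod, namespace)'
)


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    # urlencode escapes the braces, quotes and brackets in the PromQL text.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Assumed address of a locally reachable Prometheus server.
url = instant_query_url("http://localhost:9090", QUERY)
print(url)
```

Fetching that URL (for example with urllib.request.urlopen) returns JSON in which data.result holds one entry per pod and namespace, each with its labels and the current value.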
Breakdown of the Query
sum(rate(...[5m])): This calculates the per-second rate of CPU usage over the last 5 minutes, expressed in CPU cores. The sum function aggregates this usage across all containers matching the labels.
container_cpu_usage_seconds_total{job="kubelet", cluster="", container!="POD"}: This specifies the metric to be used. The label filter container!="POD" excludes the pause (infrastructure) containers that are not part of your application, and cluster="" matches only series whose cluster label is empty or absent. Some setups also expose a pod-level series with an empty container label; if yours does, adding container!="" avoids counting that usage twice.
by (pod, namespace): This groups the results by pod and namespace, allowing you to see CPU usage per pod in each namespace.
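The grouping step can be pictured with a small Python sketch of what sum ... by (pod, namespace) does to a set of per-container rates. The helper sum_by and the pod and container names are hypothetical; this illustrates the aggregation semantics only, not Prometheus internals.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Series = Tuple[Dict[str, str], float]  # (labels, per-second CPU rate in cores)


def sum_by(series: List[Series], keys: Tuple[str, ...]) -> Dict[Tuple[str, ...], float]:
    """Add up values of all series that share the same values for `keys`."""
    totals: Dict[Tuple[str, ...], float] = defaultdict(float)
    for labels, value in series:
        # Labels not listed in `keys` (here: container) are dropped.
        group = tuple(labels.get(k, "") for k in keys)
        totals[group] += value
    return dict(totals)


# Made-up per-container rates, as rate(...) might return them.
rates = [
    ({"namespace": "shop", "pod": "web-1", "container": "app"}, 0.25),
    ({"namespace": "shop", "pod": "web-1", "container": "sidecar"}, 0.5),
    ({"namespace": "shop", "pod": "web-2", "container": "app"}, 0.125),
]
per_pod = sum_by(rates, ("pod", "namespace"))
print(per_pod)
```

The two containers of web-1 collapse into a single per-pod value, while web-2 keeps its own.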
Example: Displaying CPU Usage in Percentage
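If you want to display CPU usage as a percentage of the CPU limits set for the containers, you can combine it with the kube_pod_container_resource_limits_cpu_cores metric from kube-state-metrics, which represents those limits in cores. Note that kube-state-metrics v2.0 and later expose this value as kube_pod_container_resource_limits{resource="cpu"} instead, so adjust the metric name to match your version.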
Here's an example of how to do that:
sum(rate(container_cpu_usage_seconds_total{job="kubelet", cluster="", container!="POD"}[5m])) by (pod, namespace) /
sum(kube_pod_container_resource_limits_cpu_cores{job="kube-state-metrics"}) by (pod, namespace) * 100
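The arithmetic behind this query can be sketched in Python as follows. The function name percent_of_limit and the numbers are hypothetical. As with the PromQL division, a pod that has no matching limit produces no result; skipping a zero limit is a choice made for this sketch.

```python
from typing import Dict, Tuple

PodKey = Tuple[str, str]  # (pod, namespace)


def percent_of_limit(
    usage: Dict[PodKey, float], limits: Dict[PodKey, float]
) -> Dict[PodKey, float]:
    """CPU usage in cores divided by the CPU limit in cores, times 100."""
    result: Dict[PodKey, float] = {}
    for key, cores_used in usage.items():
        limit = limits.get(key)
        if not limit:
            # No limit recorded for this pod (or a zero limit): skip it.
            continue
        result[key] = cores_used / limit * 100
    return result


# Made-up values: web-1 uses 0.75 cores of a 1.5-core limit; web-2 has no limit.
usage = {("web-1", "shop"): 0.75, ("web-2", "shop"): 0.125}
limits = {("web-1", "shop"): 1.5}
print(percent_of_limit(usage, limits))  # {('web-1', 'shop'): 50.0}
```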