Prometheus - Convert cpu_user_seconds to CPU Usage %?
To convert cpu_user_seconds (or a similar metric that represents CPU time) to a CPU usage percentage in Prometheus, calculate the per-second rate of CPU time over a defined window and then normalize it by the number of available CPU cores. The result is the share of the available CPU capacity that is in use.
Step 1: Understanding the Metric
Assuming you have a metric called container_cpu_user_seconds_total, which tracks the total user CPU time consumed by the containers, you can calculate the CPU usage percentage as follows:
- Rate Calculation: Use the rate() function to get the per-second rate of CPU usage.
- Normalization: Divide the CPU usage rate by the total number of available CPU cores, then multiply by 100 to get the percentage.
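The two steps above can be checked by hand with a little arithmetic. The following is a minimal Python sketch, not what Prometheus runs internally: the function name and the sample values are hypothetical, and unlike rate() it does not handle counter resets or extrapolate to the window edges.

```python
def cpu_usage_percent(t0, v0, t1, v1, cores):
    """Approximate CPU usage % from two samples of a CPU-seconds counter.

    Simplification: Prometheus' rate() also handles counter resets and
    extrapolates to the window edges; this sketch does neither.
    """
    if t1 <= t0:
        raise ValueError("t1 must be later than t0")
    if cores <= 0:
        raise ValueError("cores must be positive")
    # CPU-seconds consumed per wall-clock second (1.0 means one full core).
    per_second = (v1 - v0) / (t1 - t0)
    # Normalize by the number of cores and express as a percentage.
    return 100.0 * per_second / cores


# Hypothetical samples: the counter grew by 150 CPU-seconds
# over a 300-second window on a 4-core machine.
usage = cpu_usage_percent(t0=0.0, v0=1000.0, t1=300.0, v1=1150.0, cores=4)
print(f"{usage:.1f}%")  # prints 12.5%
```

A rate of 0.5 means half a core was busy on average, and on four cores that is 12.5% of the total capacity.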
Step 2: Sample Query
Here's a sample query that calculates the CPU usage percentage based on container_cpu_user_seconds_total:
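100 * sum(rate(container_cpu_user_seconds_total[5m])) by (pod, namespace) / scalar(count(node_cpu_seconds_total{mode="user"}))
Breakdown of the Query
- rate(container_cpu_user_seconds_total[5m]): This computes the per-second rate of CPU time used over the last 5 minutes. You can adjust the duration as needed.
- sum(...) by (pod, namespace): This aggregates the CPU usage for all containers grouped by pod and namespace.
- count(node_cpu_seconds_total{mode="user"}): This counts the number of CPU cores available. The metric has one series per core per mode, so filtering on a single mode leaves exactly one series per core. The count covers every node that Prometheus scrapes, so the result is a share of the total capacity across those nodes.
- scalar(...): This turns the single-element count into a plain number. Without it, the left-hand side carries pod and namespace labels while the count carries none, so the two sides have no matching label sets and the division returns no data.
- 100 * ...: This converts the ratio into a percentage.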
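To make the aggregation and normalization steps concrete, here is a small Python sketch that mimics grouping per-container rates by pod and namespace and dividing by a core count. All labels, rates, and the core count are made-up illustration values, and the helper name is hypothetical.

```python
from collections import defaultdict

# Hypothetical per-container rates (CPU-seconds per second), keyed by labels.
container_rates = [
    ({"namespace": "shop", "pod": "web-1", "container": "app"}, 0.25),
    ({"namespace": "shop", "pod": "web-1", "container": "sidecar"}, 0.125),
    ({"namespace": "shop", "pod": "db-0", "container": "postgres"}, 0.75),
]
total_cores = 8  # hypothetical core count, standing in for the count() term


def usage_percent_by_pod(rates, cores):
    """Sum rates per (pod, namespace), then express each as a % of all cores."""
    totals = defaultdict(float)
    for labels, rate in rates:
        # Equivalent in spirit to: sum(...) by (pod, namespace)
        totals[(labels["pod"], labels["namespace"])] += rate
    # Dividing every group by the same plain number mirrors vector / scalar.
    return {key: 100.0 * value / cores for key, value in totals.items()}


result = usage_percent_by_pod(container_rates, total_cores)
for (pod, namespace), pct in sorted(result.items()):
    print(f"{namespace}/{pod}: {pct:.4f}%")
```

The two containers of web-1 collapse into one value per pod, and every group is divided by the same number, which is why the denominator needs to behave like a scalar rather than a labeled series.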
Example with Total CPU Cores
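If you want to make the per-node core count explicit, you can count the node_cpu_seconds_total series for each instance and then sum those counts. Keep the filter on a single mode: without it, every mode of every core is counted and the core total is inflated by the number of modes.
100 * sum(rate(container_cpu_user_seconds_total[5m])) by (pod, namespace) / scalar(sum(count(node_cpu_seconds_total{mode="idle"}) by (instance)))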
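Whether the core count is correct depends on how many series are being counted. The Python sketch below models the label sets of node_cpu_seconds_total for one hypothetical 2-core node, assuming the eight modes commonly reported on Linux, and compares counting every series with counting a single mode. The node name and helper function are illustrative only.

```python
from itertools import product

# Assumed set of CPU modes for a Linux node; other platforms may differ.
modes = ["idle", "iowait", "irq", "nice", "softirq", "steal", "system", "user"]

# One series per (cpu, mode) pair for a hypothetical 2-core node.
series = [
    {"instance": "node-a", "cpu": str(cpu), "mode": mode}
    for cpu, mode in product(range(2), modes)
]


def count_series(series, **match):
    """Count series whose labels equal every key=value pair in match."""
    return sum(all(s.get(k) == v for k, v in match.items()) for s in series)


all_series = count_series(series)             # every mode of every core
one_mode = count_series(series, mode="idle")  # one series per core

print(all_series, one_mode)  # prints 16 2
```

Counting without a mode filter gives 16 for a 2-core node, which would make the computed percentage eight times too small.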
Step 3: Adjusting for Other Modes
If you want to include other CPU modes such as system, you can modify the query accordingly. Idle time is not usage, so it should not be added to the numerator. For example:
100 * sum(rate(container_cpu_user_seconds_total[5m]) + rate(container_cpu_system_seconds_total[5m])) by (pod, namespace) / scalar(sum(count(node_cpu_seconds_total{mode="idle"}) by (instance)))
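One detail of adding two rates is worth illustrating: when PromQL adds two vectors, it pairs up series with identical label sets and leaves out any series that has no partner on the other side. The Python sketch below imitates that behavior with dictionaries. The label tuples and values are hypothetical, and the helper is an illustration rather than Prometheus code.

```python
def add_matching(left, right):
    """Add two label-keyed vectors, keeping only label sets present in both."""
    return {key: left[key] + right[key] for key in left.keys() & right.keys()}


# Hypothetical per-container rates keyed by (namespace, pod, container).
user = {
    ("shop", "web-1", "app"): 0.25,
    ("shop", "db-0", "postgres"): 0.5,
}
system = {
    ("shop", "web-1", "app"): 0.125,
    # No system-time series for db-0 in this made-up data set.
}

combined = add_matching(user, system)
print(combined)
```

Only web-1 appears in the result, with a combined rate of 0.375. If a container were missing one of the two metrics, it would silently drop out of the sum, so it is worth checking that both metrics exist for every container of interest.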
Conclusion
By using the rate function along with aggregation and normalization, you can convert cumulative CPU seconds into a usage percentage in Prometheus. This gives better visibility into resource utilization in Kubernetes or other environments.