sum(builtin:tech.generic.cpu.usage) versus builtin:host.cpu.usage or builtin:host.cpu.user

PierreLamonzie
Visitor

Hello all,

I wanted to check something very basic: that the sum of the CPU usage of my processes equals my host CPU usage.

I was not expecting exactly the same values; a difference of a few percent would be fine.

But I see huge differences, quite often 30% to 50%, in both directions.

The figures are closer when I compare sum(builtin:tech.generic.cpu.usage) with builtin:host.cpu.user rather than builtin:host.cpu.usage, but the results are still far from "good".

When I coarsen the resolution, from 1m to 15m for example, the differences shrink, but I still see differences higher than 20% between the metrics in 20% of the cases.
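To illustrate why a coarser resolution narrows the gap, here is a minimal Python sketch. The sample values are hypothetical, not taken from my environment, and the idea that the two metrics fluctuate around a similar level without lining up minute by minute is only an assumption. It averages two 1-minute series into 3-minute buckets and compares the relative differences before and after.

```python
def bucket_means(values, size):
    """Average consecutive, non-overlapping groups of `size` samples."""
    return [sum(values[i:i + size]) / size for i in range(0, len(values), size)]


def relative_differences(host, processes):
    """Relative difference (in %) of each process-sum sample versus the host sample."""
    return [abs(h - p) / h * 100.0 for h, p in zip(host, processes)]


# Hypothetical 1-minute samples, in percent CPU (illustrative values only).
host_1m = [4.0, 6.0, 5.0, 5.0, 3.0, 7.0]
process_sum_1m = [6.0, 4.0, 5.5, 7.0, 5.0, 3.5]

rel_1m = relative_differences(host_1m, process_sum_1m)
rel_3m = relative_differences(bucket_means(host_1m, 3),
                              bucket_means(process_sum_1m, 3))

print("worst 1m relative difference: %.1f%%" % max(rel_1m))
print("worst 3m relative difference: %.1f%%" % max(rel_3m))
```

With these made-up numbers the worst per-minute gap is large while the bucketed gap is small, which matches the trend I see, though it does not explain the remaining 20% cases.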

I was wondering where my approach goes wrong, and how I can create/use:

* problems based on builtin:tech.generic.cpu.usage thresholds

* problems based on builtin:host.cpu.user thresholds

while being sure everything is consistent.

With the data I have, I could generate a problem saying "this process's CPU usage is higher than x%" while the host CPU usage is lower than x%, which is clearly not good.

(Working on-premises)

Thanks and regards.

Pierre

2 REPLIES

Eric_Yu
Dynatrace Advisor

Hi Pierre,

Regarding the description of each metric, you can get an idea by checking the metrics tab:

Eric_Yu_2-1713804815818.png

For example, for CPU user vs CPU usage %, here's the difference:

Eric_Yu_0-1713804686796.png     Eric_Yu_1-1713804724370.png

The one you're looking for should be builtin:host.cpu.usage. As for the comparison between it and builtin:tech.generic.cpu.usage per process, the two should be around the same when aggregated. From my tests:

Eric_Yu_3-1713804973970.png

Can you share your metric selectors? It may help to see what's going on there.

Regards,

Eric Yu

Hello Eric,

thank you so much for your answer.

I was using the same selector as you, but the discrepancies look higher than yours:

Capture d'écran 2024-04-22 231312.png

The absolute difference between the two values is low, but the relative difference can be very high.

This uncertainty may be intrinsic to the way the data is captured: since the usage values are quite low, it is hard to get perfect accuracy from the Linux OS itself.
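As a quick worked example of absolute versus relative difference, here is a small Python sketch. The function names and the sample values are hypothetical and chosen only for illustration; the relative difference is computed against the host value, which is one possible convention.

```python
def absolute_difference(host_pct, process_sum_pct):
    """Gap between the two readings, in percentage points."""
    return abs(host_pct - process_sum_pct)


def relative_difference(host_pct, process_sum_pct):
    """Gap between the two readings, as a percentage of the host reading."""
    if host_pct == 0:
        raise ValueError("relative difference is undefined for a host value of 0")
    return abs(host_pct - process_sum_pct) / host_pct * 100.0


# Hypothetical readings: the same 1-point gap at low and at high CPU usage.
for host, processes in [(2.0, 3.0), (60.0, 61.0)]:
    print("host=%.1f%% processes=%.1f%% absolute=%.1f points relative=%.2f%%" % (
        host,
        processes,
        absolute_difference(host, processes),
        relative_difference(host, processes),
    ))
```

The same 1-point gap is a 50% relative difference at 2% host usage but under 2% at 60% host usage, so a threshold on relative difference is very sensitive when usage is low.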

 

Thank you again for your feedback. Let's consider this topic closed.

Best regards.

Pierre