
SLO Alerts not merged

AurelienGravier
DynaMight Pro

Hello,

 

I have created an SLO covering several services that share the same specific tag (for example, multiple controllers carrying the same business action):

AurelienGravier_0-1714389001640.png

I have also created a burn rate alert with the "Allow merge" option enabled.

 

But it seems that Davis is not able to group the SLO burn-rate alert and the other issues together when the SLO covers multiple services:

AurelienGravier_1-1714389250520.png

 

However, when I open the issues, they are linked to my SLO:

AurelienGravier_3-1714389361990.png

The goal is to add the information below to my previous problem:

AurelienGravier_4-1714389429486.png

 

Have you already identified this limitation?
Is it a best practice to define 1 SLO per service?

Defining one SLO for multiple services reduces the effort of implementing SLOs.

 

Thank you in advance.

Regards Aurelien

Observability consultant - Dynatrace Associate/Pro/Services certified

Gerhard-K
Dynatrace Helper

Hi @AurelienGravier 

The challenge in this situation is that the SLO, and hence its status, error budget, and error budget burn rate, is calculated as an aggregate over all contributing entities (in your example, 3 services). Because of the aggregation, the impact of each dimension or entity is lost. As a result, we can't automatically and easily tell which entity caused the performance degradation or the error budget burn rate increase if more than one entity contributes to the result.

In other words, the metric event (SLO alert) is based on an aggregated value, but the contribution of the different underlying entities might differ based on their characteristics. Because of the aggregation transformation, the individual dimensions are removed. If a burn-rate alert is triggered, it is currently not possible to determine which entity caused the issue. Imagine two independent problems on different underlying entities: which problem would you expect the burn-rate alert to be merged into? That's why we went for the reference in the opposite direction and show whether a problem impacts an SLO, as the other way round is not unambiguously resolvable at the moment.
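
To make the ambiguity concrete, here is a minimal Python sketch. It is not Dynatrace's actual calculation: it assumes a simplified, generic burn rate (observed error rate divided by the error rate the target allows), and the service names, request counts, and the 99% target are all made up for illustration. Two different failure patterns produce exactly the same aggregated burn rate, so the aggregated value alone cannot point to the entity that caused it.

```python
from dataclasses import dataclass


@dataclass
class ServiceWindow:
    """Request counts for one service in the evaluation window (hypothetical data)."""
    name: str
    total: int
    failed: int


def burn_rate(total: int, failed: int, target: float) -> float:
    """Simplified burn rate: observed error rate / error rate allowed by the target."""
    allowed = 1.0 - target
    observed = failed / total if total else 0.0
    return observed / allowed


def aggregated_burn_rate(services: list[ServiceWindow], target: float) -> float:
    """Sum the counts over all services first, as a multi-service SLO would."""
    total = sum(s.total for s in services)
    failed = sum(s.failed for s in services)
    return burn_rate(total, failed, target)


TARGET = 0.99  # assumed SLO target of 99%

# Scenario A: one service carries all failures.
scenario_a = [
    ServiceWindow("svc-1", 1000, 60),
    ServiceWindow("svc-2", 1000, 0),
    ServiceWindow("svc-3", 1000, 0),
]

# Scenario B: two independent problems on two different services.
scenario_b = [
    ServiceWindow("svc-1", 1000, 30),
    ServiceWindow("svc-2", 1000, 30),
    ServiceWindow("svc-3", 1000, 0),
]

rate_a = aggregated_burn_rate(scenario_a, TARGET)
rate_b = aggregated_burn_rate(scenario_b, TARGET)

# Per-service view, only possible when each service has its own SLO.
per_service_a = {s.name: burn_rate(s.total, s.failed, TARGET) for s in scenario_a}
per_service_b = {s.name: burn_rate(s.total, s.failed, TARGET) for s in scenario_b}

print(f"aggregated A: {rate_a:.2f}, aggregated B: {rate_b:.2f}")
print("per service A:", {k: round(v, 2) for k, v in per_service_a.items()})
print("per service B:", {k: round(v, 2) for k, v in per_service_b.items()})
```

Both scenarios yield the same aggregated value, while the per-service numbers differ. With one SLO per service, each alert maps to exactly one entity, which is what makes a merge into that entity's problem unambiguous.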

@anton_freyberg wrote a great blog post on SLO alerting and combining them with Davis AI - if you haven't seen it yet - https://www.dynatrace.com/news/blog/slo-monitoring-alerting-on-slos-error-budget-burn-rates/ 

We certainly understand the advantages and value of combining several entities into one SLO status, and we are evaluating how to improve the problem <> SLO impact analysis. At the moment, if you want to directly merge a custom SLO alert (metric event), you need a one-to-one relationship between the SLO and the entity.

 

Kind regards,

Gerhard

AurelienGravier
DynaMight Pro

Thank you @Gerhard-K,
As confirmed with @anton_freyberg too, the best practice is indeed to have only one service entity per SLO so that the alert is properly grouped by Davis.

It is also recommended to use SLO alerts only for service and service-method SLOs, and not for SLOs based on user experience or synthetic monitoring.

Regards Aurélien.

Observability consultant - Dynatrace Associate/Pro/Services certified
