Is there any way to influence how long problems stay open?
We frequently run into situations where requests fail over a period of a few minutes, but the resulting problem stays open for almost 20 minutes after the last failed request.
Because of this, some environments generate alerts constantly, and the alerts have become almost useless due to the number of "false positives". I have tried setting a 15-minute delay under problem notifications, but since the problem stays open for nearly 20 minutes almost every time, the delay setting does not help; I would need to raise the limit to over 30 minutes.
Is this the default behavior of the problem engine, or could there be an issue with our Managed cluster that causes these delays?
Answer by Julius L.
When automatic baselining (anomaly detection) is used, failure detection is performed over sliding windows (5 min and 15 min), which cannot be changed. For details, consult the docs here: https://www.dynatrace.com/support/help/how-to-use-dynatrace/problem-detection-and-analysis/problem-detection/automated-multi-dimensional-baselining/
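The sliding windows also explain the ~20-minute gap you are seeing: a problem cannot close while any failed request still falls inside the longer 15-minute window. Here is a minimal sketch of that idea (a simplified model for illustration, not Dynatrace's actual internal logic; the function name and window constant are my own):

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)  # the longer of the two sliding windows

def problem_open(failure_times, now):
    """A problem stays open while any past failure still falls inside the window."""
    return any(now - t <= WINDOW for t in failure_times if t <= now)

# Failed requests over a few-minute period, as described in the question
failures = [datetime(2022, 1, 1, 10, 0), datetime(2022, 1, 1, 10, 3)]

# 14 minutes after the last failure: still inside the window, so still open
assert problem_open(failures, datetime(2022, 1, 1, 10, 17))

# 16 minutes after the last failure: the window has cleared, so it can close
assert not problem_open(failures, datetime(2022, 1, 1, 10, 19))
```

Under this model, even a brief burst of failures keeps the problem open for at least 15 minutes after the last failed request, which is why a 15-minute notification delay is not long enough to suppress these alerts.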
Before changing or fine-tuning anomaly detection, I would strongly suggest examining the detected errors to verify that they really are failed requests. If they are not genuine service failures, first configure error detection for the service to ignore those errors so the requests are not marked as failed.
If the requests really are failures, then you will have to fine-tune failure detection in the anomaly detection configuration of your service.