Chris N. asked

Detecting processes that don't come back after a maintenance window

Hello:

I'm asking the community if they have found a solution to the problem I'm describing below.

We have maintenance windows and sometimes processes do not come back after patching or a reboot.

My first question is this:

1) If a process is "UP" before a maintenance window starts, I set Dynatrace to "Disable problem detection during maintenance window", and the process is "DOWN" after the maintenance window, will Dynatrace alert that it is down, given that it never detected the failure during the window?


2) If that does not work, does anyone know of an easier workaround than the one below?

As an option, I can set the maintenance window to "Detect but don't alert".

Then, at the end of the maintenance window, I can query the API for OPEN problems that match specific tags I have set up. I can parse the JSON data for the OPEN problems of concern, then send an API request to CLOSE those problems.
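A minimal sketch of the filtering step, assuming the problems JSON has already been fetched from the API and parsed into a list of dictionaries. The field names `status`, `entityTags`, `key` and `problemId` are assumptions modeled on a typical problems response and should be checked against the actual output; the tag `patch-group-a` and the sample data are hypothetical.

```python
def open_problems_with_tags(problems, wanted_tags):
    """Return the IDs of OPEN problems that carry at least one wanted tag."""
    wanted = set(wanted_tags)
    matches = []
    for problem in problems:
        # Skip anything that is not currently open.
        if problem.get("status") != "OPEN":
            continue
        # Assumed shape: each tag is a dict with a "key" entry.
        tag_keys = {tag.get("key") for tag in problem.get("entityTags", [])}
        if tag_keys & wanted:
            matches.append(problem["problemId"])
    return matches


# Hypothetical sample data, not real API output.
sample_problems = [
    {"problemId": "P-1", "status": "OPEN", "entityTags": [{"key": "patch-group-a"}]},
    {"problemId": "P-2", "status": "CLOSED", "entityTags": [{"key": "patch-group-a"}]},
    {"problemId": "P-3", "status": "OPEN", "entityTags": [{"key": "other"}]},
]

print(open_problems_with_tags(sample_problems, ["patch-group-a"]))
```

The returned IDs are what a follow-up close request would be sent for; that request is deliberately left out here.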

The idea is this:

If the problem is closed and the condition occurs again, Dynatrace will detect a "new" or "fresh" problem for the same condition.

Is there anything in this idea that will not work?

Please advise and thank you DT community!

rest api, problem detection, notifications, maintenance window

1 Answer

Sebastian K. answered ·

On the first point: no, Dynatrace will not alert on this, because the change happened during the maintenance window. Dynatrace does not alert on a condition that was already present in the environment during the window.

In general, you can always query the Dynatrace API from a script and count the monitored processes before and after the window. That way you will know whether everything is being monitored properly. Dynatrace does not raise a problem in such a case.
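A rough sketch of the before/after comparison, assuming two snapshots of monitored processes have already been pulled from the API and parsed into lists of dictionaries. The field names `entityId` and `displayName` are assumptions, the sample data is hypothetical, and the sketch only works if the identifier stays the same across a restart.

```python
def missing_after_window(before, after):
    """Return the names of processes seen before the window but not after it."""
    after_ids = {entity["entityId"] for entity in after}
    return sorted(
        entity["displayName"]
        for entity in before
        if entity["entityId"] not in after_ids
    )


# Hypothetical snapshots taken before and after the maintenance window.
before_snapshot = [
    {"entityId": "PROC-1", "displayName": "nginx"},
    {"entityId": "PROC-2", "displayName": "tomcat"},
    {"entityId": "PROC-3", "displayName": "postgres"},
]
after_snapshot = [
    {"entityId": "PROC-1", "displayName": "nginx"},
    {"entityId": "PROC-3", "displayName": "postgres"},
]

print(missing_after_window(before_snapshot, after_snapshot))
```

Comparing identifiers rather than raw counts also catches the case where one process disappears and an unrelated one appears, which would leave the count unchanged.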

sebastian

3 comments

So I tried #2 (using the API to close a problem raised during a maintenance window) as a workaround, and it failed spectacularly.

Let me tell you what I tried to do.

We know that during a maintenance window Dynatrace has the option "detect problems but don't alert".


So I tried the following:

1- Create a maintenance window.

2- Force a "process unavailable" problem to occur during the maintenance window.

3- Leave the problem open after the maintenance window expires.

This is where I got surprised...

4- Go to the API and find the exact problem ID. I sent an API close request after the maintenance window had expired.

5- What I expected to happen was that, since the maintenance window was over, DT would detect that the process was down (since the old problem was closed) and generate a new problem.

6- THIS DID NOT HAPPEN. DT went along behaving as if the process was green, and it appears not to be checking the state.


What's going on here?

Why would closing a problem not result in a check of the state? (I guess that depends on how it checks; I don't know how that is handled.)

I waited 15 minutes and did not see a change. Does anyone know whether the down state would eventually have been detected?

You can see what I am trying to do here. What options are available?


Hi Chris,

How did you force the process unavailability problem? Did you configure it to 'alert on any process becomes unavailable' or 'alert on a minimum threshold'?

In general, we avoid immediately opening a fresh problem right after the maintenance window if the condition is still the same. I will check whether we can improve this by automatically force-closing the suppressed problems and opening fresh ones after the MW.

Best greetings,

Wolfgang

Chris N. (in reply to Wolfgang B.)

Hi Wolfgang.

We are configured for 'alert on any process becomes unavailable'.

However, the only thing I have been able to find that may help us currently is the "Detect but don't alert" setting for a maintenance window.

Then, after the maintenance window is over, I use cron or some other scheduler to make an API call for any problems that are "OPEN".

The implication is that if they are still open after the maintenance window, then they are still unresolved.

I take that JSON output and do an API push to an external third-party notification system.

Not elegant by any description, but functional.
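The forwarding step described above can be sketched roughly as follows, assuming the problems JSON has already been fetched and parsed. The field names `status`, `startTime` (epoch milliseconds), `problemId` and `title` are assumptions about the response shape, the payload layout is hypothetical, and restricting to problems that started inside the window is an extra filter added for illustration. Sending the payloads to the notification system is left out.

```python
import json


def build_notifications(problems, window_start_ms, window_end_ms):
    """Build one JSON payload per problem that opened in the window and is still open."""
    payloads = []
    for problem in problems:
        started_in_window = window_start_ms <= problem["startTime"] <= window_end_ms
        if problem["status"] == "OPEN" and started_in_window:
            payloads.append(json.dumps({
                "source": "dynatrace",  # hypothetical payload field
                "problemId": problem["problemId"],
                "title": problem["title"],
            }, sort_keys=True))
    return payloads


# Hypothetical sample data, not real API output.
sample_open_check = [
    {"problemId": "P-1", "title": "Process unavailable", "status": "OPEN", "startTime": 1500},
    {"problemId": "P-2", "title": "Process unavailable", "status": "CLOSED", "startTime": 1600},
    {"problemId": "P-3", "title": "High CPU", "status": "OPEN", "startTime": 500},
]

for payload in build_notifications(sample_open_check, 1000, 2000):
    print(payload)
```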

My hope is that Dynatrace will include functionality like what I describe here:

https://answers.dynatrace.com/idea/232518/view.html


-Chris

