There are several ways to validate SSL certificates and alert on expiry. I've taken a look at all of them and found that they lack automation. I therefore created another one that hopefully overcomes some of the limitations and is easier to use in large environments.
As this feature has not been available out of the box for a long time, this might be useful.
Summarizing the various attempts and threads from:
SSL Certification expiration checks out of the box - Details? (@Larry R.)
Does Dynatrace monitor SSL certificate validation (@Akshay S.)
Monitor SSL certificate expiry and generate alert (@Dario C.)
(also the contributors @Július L., @Leon Van Z.)
What is different in this plugin?
Where to find it?
You can find the plugin at my personal github repository.
Thanks for sharing! Our services team created a similar one, but instead of using synthetic tests as input it took a CSV and also automatically discovered all HTTPS endpoints from incoming/outgoing service calls.
That one does require some configuration though and does use DDUs to track the endpoints.
Mike
Using the synthetic monitor configuration seemed logical. I can't rely too much on the services, as they are more likely to change; eventually there may be no services at all while one would still perform synthetic tests (or test something that is not even covered by DT on the backend).
Though I use the detected services approach to automatically configure RUM applications at scale...waiting for DT to bring back automatic application detection :-)
Hello @Reinhard W.
In the how-to-use section, the following sentence is written:
"that is able to access the sites you want to monitor."
I am a bit confused about this and need your help to clarify my understanding before using the plugin. We have a few eChannel applications monitored with the Synthetic Browser.
Do you mean the Environment AG should be able to reach that publicly available DNS?
Regards,
Babar
Hi @Babar Q.,
yes, the AG that is executing the plugin must be able to reach the publicly available DNS/hosts/sites to check the certificates.
This is done independently of the synthetic monitors (that you probably let execute on Dynatrace's infrastructure).
Reinhard
Hello @Reinhard W.
Thank you for the confirmation. When uploading the extension, I received a message that it will consume DDUs.
Is this true or was it just a default message?
Regards,
Babar
This plugin doesn't create any custom metrics, only events so it will not consume any DDUs.
Answer by Reinhard W. ·
Thanks for the feedback (@Aymeric B.). I've added functionality to the plugin so that it now also checks previously created problems/events and whether their state is still satisfied (outside of the normal long cert check interval). It will do so by fetching the event/problem and checking if it is close to expiry (the max. 120 minutes). If this is the case, it will check those hosts and make sure the problem is refreshed or, if the failure condition doesn't exist anymore, close the problem.
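The refresh logic described above could be sketched roughly like this in Python. This is only an illustration of the decision, not the plugin's actual code: OpenEvent, due_for_recheck, resolve and the 15-minute margin are hypothetical names and values, and still_failing stands in for whatever certificate check is used.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OpenEvent:
    """Hypothetical record of an event that was sent earlier."""
    host: str
    started_at: float      # epoch seconds when the event was last sent or refreshed
    timeout_minutes: int   # event timeout (the thread mentions a maximum of 120)


def due_for_recheck(event: OpenEvent, now: float, margin_minutes: int = 15) -> bool:
    """True when the event is about to time out (assumed margin: 15 minutes)."""
    times_out_at = event.started_at + event.timeout_minutes * 60
    return times_out_at - now <= margin_minutes * 60


def resolve(event: OpenEvent, now: float,
            still_failing: Callable[[str], bool],
            margin_minutes: int = 15) -> str:
    """Decide what to do with an open event: 'wait', 'refresh' or 'close'."""
    if not due_for_recheck(event, now, margin_minutes):
        return "wait"      # plenty of time left, skip the certificate check
    if still_failing(event.host):
        return "refresh"   # re-send the event so the problem stays open
    return "close"         # certificate was renewed, let the problem close
```

With a 120-minute timeout, an event that started 110 minutes ago would be rechecked, while one that started 60 minutes ago would be left alone until a later run.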
Additionally I added proxy support for the plugin. This can be useful in cases where direct access to the sites to check isn't possible. The plugin will only use TLSv1.2 for security reasons.
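For illustration, restricting a client to TLSv1.2 can be done with Python's standard ssl module roughly as follows. This is a minimal sketch, not taken from the plugin; the function name tls12_only_context is made up for the example, and only the protocol pinning is shown.

```python
import ssl


def tls12_only_context() -> ssl.SSLContext:
    """Build a client context that negotiates TLSv1.2 and nothing else.

    Hypothetical helper for illustration; the plugin may do this differently.
    """
    ctx = ssl.create_default_context()
    # Pin both ends of the allowed range to the same protocol version.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A server that only offers older protocol versions would then fail the handshake instead of being checked over a weaker protocol.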
Answer by Aymeric B. ·
Thanks for sharing!
I will try it.
The best solution (in my opinion) was to develop our own AG plugin (based on ssl and openssl library) in order to be able to manage our own groups of certificates and the associated alert thresholds. (I don't really like to use synthetics for this kind of monitoring).
My other concern with using events is that the problem will be automatically closed after 15 minutes (max 120 minutes) and therefore would not be compatible with an execution schedule longer than 120 minutes (or we have to manage this in the script, and many events will be created every day until the certificate is renewed).
Hi @Aymeric B.,
actually this AG plugin uses OpenSSL in the background to fetch the certificates. IMO, any solution that gets certificates from remote servers is some kind of synthetic monitoring, unless you do cert checks locally on the filesystem (which is hardly controllable in large heterogeneous environments).
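To show what fetching a certificate from a remote server looks like, here is a minimal Python sketch using the standard ssl module (which is built on OpenSSL in CPython). It is not the plugin's code; fetch_not_after and days_until_expiry are hypothetical helper names.

```python
import socket
import ssl


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter string.

    `not_after` uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect to host:port and return the notAfter field of its certificate.

    Note: with the default context the certificate is verified, so an
    already expired certificate raises ssl.SSLCertVerificationError here
    rather than returning a date.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling days_until_expiry(fetch_not_after("example.com"), time.time()) after importing time would give the remaining lifetime, which can then be compared against an alert threshold.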
There is no issue with problems closing after 15 minutes. You can actually set the timeoutduration higher and also simply refresh the problem when needed. So my approach is to set the timeout to longer than the check interval, then the problem will be simply refreshed and no additional ones will be created.
Hi @Reinhard W.
We had specific needs for this AG plugin (management of assignment groups for the ticketing tool, different thresholds according to the type of certificates, ...).
Regarding event management, the documentation indicated that the maximum timeout for an event was 120 minutes, so I decided to configure a custom event in order to avoid handling the situation where the execution interval would be greater than the maximum timeout.
(but you're absolutely right, it's indeed possible to manage the refresh of the event in the script, maybe I've been a little lazy.^^)
Hi @Aymeric B.
you just gave me a great idea on how to do the refresh better, will include that in my plugin.
The different thresholds for different groups of certificates could be covered with different instances of the plugin. In case you know on which sites (synthetic monitors) you have which certificates, you could assign different tags in DT, and the different instances of the plugin would then pick up those sites with separate thresholds.
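As a rough Python sketch of that idea: each plugin instance would be configured with one tag and one threshold and only look at the monitors carrying that tag. The data layout, the tag names and hosts_to_alert are all made up for this example and do not reflect the plugin's real configuration.

```python
from typing import Dict, List


def hosts_to_alert(monitors: List[dict], tag: str, threshold_days: float,
                   days_left: Dict[str, float]) -> List[str]:
    """Return hosts carrying `tag` whose certificate expires within the threshold.

    `monitors` is a hypothetical list of {"host": ..., "tags": [...]} entries;
    `days_left` maps each host to the remaining certificate lifetime in days.
    """
    return [
        m["host"]
        for m in monitors
        if tag in m["tags"] and days_left[m["host"]] < threshold_days
    ]


# Two instances with different tags and thresholds (illustrative values only).
monitors = [
    {"host": "shop.example.com", "tags": ["cert-public"]},
    {"host": "intranet.example.com", "tags": ["cert-internal"]},
]
days_left = {"shop.example.com": 20, "intranet.example.com": 20}

public_alerts = hosts_to_alert(monitors, "cert-public", 30, days_left)
internal_alerts = hosts_to_alert(monitors, "cert-internal", 14, days_left)
```

Here the public site would alert at 20 days because its instance uses a 30-day threshold, while the internal site with a 14-day threshold would not.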