I'm trying to find the best way to track the response time of a certain URI (and many others, this is just the start). It shows up in our application/BT list, and if I go to the response time graph and put it in a dashboard, I get the top image below. Great, this LOOKS somewhat accurate, with response times around 1s each.
Now if I create a different chart using BT, Built-in, Web Page Requests, PurePath response time, and select the same URI in the split (see below), I end up with a graph with a couple of spikes and while most are around 1s, it does not match up with the BT from the applications dashboard at all.
Resulting graph:
So then a few questions from this. First, why are these seemingly so different? Second, is either a better way to monitor a specific URI pattern? And finally, how else can I specify an alternative URI to track that is not created as part of these default BTs? I have also tried the approach below, but it only gives me the maximum values; I cannot change it to average.
Answer by Andreas G. ·
Hi Michael
The Response Time you see on the Application Overview (top image) is actually the 50th Percentile of the Response Time of that URL. The AppOverview baselines Response Time for the 50th and 90th percentile. The Baseline algorithm calculates these two values from the actual Response Time Result of that BT - but it calculates these two percentile values just for the baseline algorithm.
Also - the out-of-the-box BT that is called "Web Page Request" excludes Robots and Synthetic requests. Not sure if you have requests like this - but - just an FYI.
If you put the same measure on a chart you will by default see the "Average" - not the 50th Percentile. That's why you will see a difference between the Chart and the Baseline Chart of the AppOverview: the Average is not the same as the 50th Percentile, and a few slow requests pull the Average up while barely moving the 50th Percentile.
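To illustrate why the two charts disagree, here is a minimal sketch in Python. The response times are made-up sample values (mostly around 1s with two spikes), not data from this thread, and the calculation is plain statistics rather than dynaTrace's own baselining code.

```python
from statistics import mean, median

# Hypothetical response times in seconds: mostly ~1s, plus two spikes.
response_times = [1.0, 0.9, 1.1, 1.0, 8.0, 1.0, 0.95, 12.0, 1.05, 1.0]

avg = mean(response_times)    # what a default chart would plot
p50 = median(response_times)  # the 50th percentile used for the baseline

print(f"average: {avg:.2f}s, 50th percentile: {p50:.2f}s")
```

With these sample values the two spikes push the average well above 1s, while the 50th percentile stays at the typical request time.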
Is there a better way to monitor individual URLs?
Well - if you want to "Baseline" these URLs you have to create a BT for each of them. One Best Practice is to create individual BTs for e.g: "pricingAvailabilty", "search", "home", ... With that you always make sure that this URL is captured by that BT and doesn't end up in the generic "..." splitting value in case it is not called often enough.
If you don't need the baselining you can go ahead and create different Web Request Response Time Measures for your individual URLs.
Hope this helps
Answer by Andreas G. ·
You are correct - our baselining will automatically trigger one of our two out-of-the-box incidents called "Response time degraded" and "Response time degraded for slow requests". These incidents will be triggered if you have 2 significant violations of the baseline. If you want to have a static baseline or change the way we handle incidents on these baselines you can do this on the response time dashboard that you posted as the first screenshot. There is a "cog wheel" at the top that allows you to configure the baseline.
As for synthetic/robot requests: some of our out-of-the-box BTs use out-of-the-box measures called "No Robot Requests" and "No Synthetic Requests". Synthetic requests are detected by looking at the dynaTrace HTTP Header we use to integrate with synthetic tools. Robots are detected by our own mechanism, e.g: looking at particular User-Agents that are well-known robots.
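As a rough illustration of the "2 significant violations" rule, the sketch below fires only when two intervals in a row exceed the baseline. It is a simplification under stated assumptions: the function name, the fixed 50% tolerance and the requirement that violations be consecutive are all hypothetical, and the real baselining decides statistically what counts as "significant".

```python
def incident_intervals(values, baseline, tolerance=0.5, required=2):
    """Return the indexes at which an incident would fire.

    Simplified, hypothetical rule: a value violates the baseline when it
    exceeds baseline * (1 + tolerance); an incident fires once `required`
    violations occur in a row.
    """
    fired = []
    streak = 0
    for i, value in enumerate(values):
        if value > baseline * (1 + tolerance):
            streak += 1
            if streak == required:
                fired.append(i)
        else:
            streak = 0  # a normal interval resets the count
    return fired


# A single spike does not fire; two violating intervals in a row do.
single_spike = incident_intervals([1.0, 5.0, 1.0], baseline=1.0)
sustained = incident_intervals([1.0, 5.0, 1.0, 1.1, 4.0, 4.5, 1.0], baseline=1.0)
print(single_spike, sustained)
```

Under this simplified rule an isolated one-interval spike would not raise an incident, which is the behavior the follow-up question asks about.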
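The detection described above could be sketched roughly like this. The header name `x-dynatrace` and the list of robot markers are assumptions for illustration only; the actual header and the User-Agent list used by the product may differ.

```python
# Assumed values for illustration, not the product's real configuration.
SYNTHETIC_HEADER = "x-dynatrace"
KNOWN_ROBOT_MARKERS = ("googlebot", "bingbot", "slurp")


def classify_request(headers):
    """Classify a request as 'synthetic', 'robot' or 'regular'."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    if SYNTHETIC_HEADER in normalized:
        return "synthetic"
    user_agent = normalized.get("user-agent", "").lower()
    if any(marker in user_agent for marker in KNOWN_ROBOT_MARKERS):
        return "robot"
    return "regular"


print(classify_request({"User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1)"}))
print(classify_request({"User-Agent": "curl/7.58.0"}))
```

Note that under a scheme like this, an in-house script with an ordinary User-Agent and no tagging header would be counted as a regular request.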
Hope this helps
Answer by Michael C. ·
Told you it was a stupid question. The follow-up to that, then: if we were to set an incident off of deviation from the baseline, it sounds like that would not capture spikes such as the ones in the second graph above unless they lasted for two iterations above the baseline? I guess what I am asking is, what's the best practice for monitoring a certain type of request we consider business critical via the application BT?
Also, how does it determine a robotic/synthetic request? We do indeed have various scripts hitting various components; however, from our point of view they would show up on the webserver like any other request.
Answer by Andreas G. ·
There are two Response Time charts. One says Response Time -> this is the 50th percentile chart. The other one says Response Time (slowest 10%) -> that's the 90th percentile.
You also see a small icon in the header of that chart. Hover your mouse over it and you get a description of it.
Answer by Michael C. ·
Thanks for all the help yesterday BTW.
It does help, just a couple questions then.
Reading through documentation on baseline vs. actual (DOCDT42/Baseline+and+Smart+Alerting+Explained), since there is only one line on the graph, how do we know which it is at any point in time, the 50% median or the slowest 90% (probably a stupid question!).
That leads to another question but depends on the answer to the first!