We are building out new infrastructure to handle the new dynaTrace installs we are planning. Our company has a huge push to put everything on virtual infrastructure (VMware, PureFlex, etc.). I know dynaTrace "will work" on a virtual server, but I'm concerned that time measurements calculated by the dynaTrace server will be off. Can someone explain whether dynaTrace can calculate and report exact CPU time when running in a virtualized environment? I'm concerned that if the physical host hosting the virtual server is under duress, the time calculated by dynaTrace running on that virtual server will get skewed. If dynaTrace can calculate times exactly, can you provide some detail as to how?
Thanks,
Frank
Answer by Michael K. ·
Hi Frank,
We have a general mechanism called "Agent Time" that ensures that clock drift on monitored machines doesn't impact our PurePaths. It's a kind of global time sync management. It works in physical and virtual environments and ensures that data is not assembled out of order due to "normal" clock drift issues.
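To make the idea of a global time sync concrete, here is a minimal sketch of how a per-agent clock offset could be estimated from a single request/response exchange, in the style of NTP. This is an illustration only and not dynaTrace's actual Agent Time implementation; the function names, the four-timestamp exchange, and the symmetric-delay assumption are all hypothetical.

```python
def estimate_offset(t0, t1, t2, t3):
    """Estimate (agent clock - server clock), in the same unit as the inputs.

    t0: request sent     (server clock)
    t1: request received (agent clock)
    t2: reply sent       (agent clock)
    t3: reply received   (server clock)

    Assumption: network delay is roughly the same in both directions.
    """
    return ((t1 - t0) + (t2 - t3)) / 2


def to_server_time(agent_timestamp, offset):
    """Map a timestamp taken on the agent onto the server's timeline."""
    return agent_timestamp - offset
```

With an offset like this per agent, timestamps from different machines can be placed on one timeline before events are ordered, which is the effect described above.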
As for timekeeping issues introduced by VMware: on tickless systems we don't have to do anything special. A tickless system does not suffer timekeeping issues, even under high utilization, because it's not based on a constant-rate interrupt. We typically use the high-precision time available from the OS to measure execution and response times. In a tickless system the measurement is real time (no timekeeping issues, no apparent time). The only caveat is that steal time is of course included in that time, and thus in execution time. You can identify this by correlating guest steal times with your execution times. You should use the VMware plugin to capture CPU ready time (VMware's name for steal time) for your VMs. As long as steal time is within normal bounds this should not matter much; if it's big, you will see it. Think of it as similar to GC suspension: it's always there, but as long as it's small it doesn't matter.
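As a rough illustration of the guest-side half of that correlation, the sketch below computes the share of CPU time reported as steal between two snapshots of Linux's `/proc/stat`. It assumes a Linux guest on a hypervisor that exposes steal time to the guest; on VMware, the CPU ready metric mentioned above would be the figure to look at instead. The function names are made up for this example.

```python
def parse_cpu_ticks(proc_stat_text):
    """Return (steal_ticks, total_ticks) from the aggregate 'cpu' line."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Order: user nice system idle iowait irq softirq steal guest guest_nice
            steal = values[7] if len(values) > 7 else 0
            # guest/guest_nice are already counted in user/nice, so leave them out.
            total = sum(values[:8])
            return steal, total
    raise ValueError("no aggregate 'cpu' line found")


def steal_fraction(before_text, after_text):
    """Fraction of CPU time reported as steal between two snapshots."""
    steal0, total0 = parse_cpu_ticks(before_text)
    steal1, total1 = parse_cpu_ticks(after_text)
    elapsed = total1 - total0
    return (steal1 - steal0) / elapsed if elapsed > 0 else 0.0
```

On a live guest the two snapshots would come from reading `/proc/stat` at the start and end of the interval of interest; a high fraction during a slow transaction suggests the time was lost to the hypervisor rather than to the application.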
In a VMware system you should install the guest tools so that steal time is not inadvertently accounted as CPU time by the guest; there is no such issue with Xen or other paravirtualized systems.
On tick-counting systems on VMware we suggest using the VMware pseudo counters to avoid clock issues. This timer acts similarly to clocks in tickless systems in that it provides the system with a real-time clock. This clock is provided by the hypervisor and therefore does not suffer timekeeping issues. Just as with tickless systems, steal time and other suspension events (e.g. VMotion) are included in measured times, so you need to correlate these events with your transaction time measurements to be aware of them.
Tick-counting systems on paravirtualized systems (Xen or KVM) should not have any timekeeping issues, as they use the real physical clock.
In all cases a VMotion event will not affect the clock, but of course method and response times measured "during" a VMotion event will be much longer than they usually are, even though the system didn't do anything.
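The correlation described above can be sketched as a simple interval-overlap check: given the start and end of a measured transaction and a list of known suspension windows (VMotion events, for example), compute how much of the measured time fell inside a suspension. The function names and the shape of the data are assumptions for this example, and all timestamps are assumed to be on the same clock and in the same unit.

```python
def overlap(start_a, end_a, start_b, end_b):
    """Length of the overlap between two [start, end) intervals, 0 if none."""
    return max(0, min(end_a, end_b) - max(start_a, start_b))


def suspended_time(txn_start, txn_end, suspension_windows):
    """Total time a transaction spent inside known suspension windows.

    Assumption: the windows do not overlap one another.
    """
    return sum(overlap(txn_start, txn_end, s, e) for s, e in suspension_windows)


def adjusted_duration(txn_start, txn_end, suspension_windows):
    """Measured duration minus the part that coincided with a suspension."""
    measured = txn_end - txn_start
    return measured - suspended_time(txn_start, txn_end, suspension_windows)
```

A transaction whose measured and adjusted durations differ noticeably is one that was slow because the VM was paused, not because the application did more work.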
I hope this answers all your questions; if you need more information, please don't hesitate to ask.
Best
Michael
Answer by Frank P. ·
Ping... I need to make a case one way or another to my management. Can someone answer this question either here or send me a private email ...
"We have both tick counting OS's and tickless counting OS's here. I would be interested in any additional detail you can provide on how you're avoiding clock drift type issues. Especially on agents and dt servers that are on highly utilized virtual hosts, hosts that may be undergoing vmotion etc"
Thanks,
Frank
Answer by Bernd G. ·
Also, our native dynaTrace agents do have built-in virtualization detection and high-resolution timing code that has been specially built and optimized for accurate nanosecond timing in virtualized environments.
Bernd,
Thank you for your response. I'm very pleasantly surprised to see that you are active on the community. We have always been very impressed with the level of involvement senior management has with your customers and this is a great example of that. It's great to see that's continuing. It's also good to see that you have implemented something to counter clock time issues inherent in virtual environments. We have both tick counting OS's and tickless counting OS's here. I would be interested in any additional detail you can provide on how you're avoiding clock drift type issues. Especially on agents and dt servers that are on highly utilized virtual hosts, hosts that may be undergoing vmotion etc.
Answer by Andreas G. ·
Good Morning
To add to Rob's comment and answer your question: running the dT Server on a VM won't have any impact on time calculations. We get the timings from the Agents, so the VM that hosts the dT Server doesn't have an impact.
So - just make sure that you have enough resources allocated to this VM. Use the dynaTrace Health Dashboards to monitor the dynaTrace Server / Collector health and performance, and follow the Deployment Guide to make sure everything is properly sized. Then you should be OK.
Thanks for the replies, Rob and Andreas,
To answer your question, I'm actually concerned about both the times coming from virtualized agents and the times calculated on the server itself for health measures etc. If we are unable to dedicate CPUs to a virtualized server, would it then be considered unwise to install the dynaTrace server on virtualized infrastructure?
It would be unwise to put a dT server in an active environment if you cannot commit to the CPU, RAM, and I/O requirements. As mentioned, we need to run at the speed of your app to keep up with all transactions, plus our post-processing of them. If we're starved for resources you can imagine that causing performance data loss.
Answer by Rob V. ·
Hi Frank,
I can't answer your question on computations as I don't have access to the code, but I want to ask a clarifying question to help anyone from the lab who steps in to answer.
When you're talking about the "time measurements calculated by the dynaTrace server..." - are you talking about its own self-health measures (CPU etc) or are there other measurements you are concerned about?
I can comment in general on hosting the dT server on virtual gear, from experience:
Rob
Answer by Andreas G. ·
Hi Frank
Just to verify. Are you talking about running the dynaTrace server on a VM Infrastructure or your application (and with that our agents) on a virtualized infrastructure?