Question by Victor B. · Apr 15, 2015 at 04:59 AM

DT 6.1 on Windows: kernel CPU usage high in instrumented applications

Hello there. On a Windows server with ~30 instrumented .NET applications running (a mix of Windows services and IIS pools), privileged CPU usage is high: the machine icon is constantly red because privileged CPU usage stays above 15%. With DT 5.6 there was no such problem. The problem also goes away if debug info (full PDB) in the instrumented binaries is switched off. This has been observed on 2 different machines.

The Windows Performance Monitor shows that each of the instrumented processes has exactly one thread whose privileged CPU % spikes every 10 seconds. Processes that have more threads exhibit higher CPU usage, and many spike up to 100%. Viewing threads in SysInternals Process Explorer confirms that one thread spikes every 10 seconds. This thread's stack is 35 frames deep, with dtagentcore.dll!setAgentCorePath near the top and no frames belonging to our apps. Also on the stack are perfproc.dll!CollectSysProcessObjectData, ntoskrnl.exe!PsResumeProcess, etc. Again, the spike is observed in all processes, but release builds have much shorter and smaller spikes.

In effect, this high CPU usage makes the affected machines nearly unusable: they are test-environment machines, and we have to run debug builds there. Fiddling with sensors, even disabling all sensors on the system profile and agent group, has no effect on the spikes. I suspect that this periodic activity has to do with gathering per-process CPU/memory/etc. metrics.

If the suspicion is correct, is it possible to reduce that polling frequency e.g. 10-fold?

Has anyone experienced the same issue? There was no such problem with DT 5.6. We're running v.6.1 at patch level 8105 (fixpack).

Thanks, Victor.
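For anyone who wants to confirm the 10-second cadence from exported counter data rather than by eye, here is a minimal Python sketch. It assumes you have already exported (timestamp in seconds, privileged CPU %) samples for a single thread, for example from Performance Monitor. The function names and the threshold parameter are made up for illustration and are not part of any Dynatrace or Windows API.

```python
from statistics import median


def spike_onsets(samples, threshold):
    """Return timestamps where the value rises from below threshold to at or above it.

    samples: iterable of (timestamp_seconds, privileged_cpu_percent) pairs,
    ordered by time. The export format is an assumption, not a perfmon API.
    """
    onsets = []
    above = False
    for timestamp, value in samples:
        if value >= threshold and not above:
            onsets.append(timestamp)
        above = value >= threshold
    return onsets


def spike_period(samples, threshold):
    """Median gap in seconds between consecutive spike onsets, or None if fewer than two spikes."""
    onsets = spike_onsets(samples, threshold)
    gaps = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    return median(gaps) if gaps else None
```

A result close to 10 for every instrumented process would be consistent with a fixed polling timer in the agent rather than with application load.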


6 Replies


Answer by Adam R. · Apr 25, 2015 at 09:52 PM

Victor, thanks for the improved work around. We'll give this a shot.

I suspect that with several hundred processes, we may need to back off even further than 60 seconds.
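As a rough back-of-the-envelope check on how far to back off, here is a small Python sketch. It assumes that each instrumented process polls once per interval and that every poll costs about the same, so total load scales with process count divided by interval. That linear model is an assumption, not something confirmed by Dynatrace; if a single poll gets more expensive as the number of processes on the host grows, the real picture is worse. The function names are illustrative only.

```python
def polls_per_minute(process_count, interval_seconds):
    """Total counter polls per minute across all instrumented processes."""
    return process_count * 60 / interval_seconds


def interval_for_same_load(process_count, reference_count, reference_interval):
    """Interval (seconds) at which process_count processes generate the same
    polls per minute as reference_count processes at reference_interval."""
    return reference_interval * process_count / reference_count


# Figures taken from this thread: ~30 processes, 10 s default, 60 s workaround.
default_load = polls_per_minute(30, 10)     # 180.0 polls per minute
workaround_load = polls_per_minute(30, 60)  # 30.0 polls per minute

# A host with 200 processes needs a longer interval to match the 30-process, 60 s load.
matching_interval = interval_for_same_load(200, 30, 60)  # 400.0 seconds
```

Under these assumptions, a host with 200 instrumented processes would need an interval of roughly 400 seconds to see the same polling load that 30 processes produce at 60 seconds.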

Andreas, if it helps, the support ticket I submitted was 00752352. It was opened April 23, 2013, and closed May 17 with the resolution "disable performance counters".


Answer by Victor B. · Apr 18, 2015 at 05:00 AM

UPDATE

1) There is a different workaround, suggested by DT tech support: setting the environment variable DT_PERFCOUNTERINTERVAL to 60 (60 seconds instead of the default 10 seconds) dramatically reduced kernel CPU usage. It is now at an acceptable level, so we can still take advantage of per-process .NET performance metrics, albeit at a slightly lower resolution (which matters only for a short period of time, as the data is not warehoused at the higher resolution);

2) There is actually no difference between debug and release builds: a new series of tests on a single machine shows that the build type is not a factor;

3) The issue seems new (affecting Dynatrace 6), and hopefully will be addressed in a future fixpack.
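To try the interval on a single process before changing it machine-wide, one option is to launch that process with the variable set in its own environment. The Python sketch below shows the idea. Only the variable name DT_PERFCOUNTERINTERVAL and the value 60 come from this thread; the helper names are hypothetical, and it is an assumption that the agent reads the variable from the process environment at startup.

```python
import os
import subprocess
import sys


def env_with_perfcounter_interval(seconds, base_env=None):
    """Return a copy of the environment with DT_PERFCOUNTERINTERVAL set.

    Assumption: the agent picks this variable up from the process environment.
    """
    if not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("interval must be a positive whole number of seconds")
    env = dict(os.environ if base_env is None else base_env)
    env["DT_PERFCOUNTERINTERVAL"] = str(seconds)
    return env


def child_sees_interval(seconds):
    """Start a child Python process and return the value it inherits."""
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ['DT_PERFCOUNTERINTERVAL'])"],
        env=env_with_perfcounter_interval(seconds),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

The same `env` dictionary could be passed to `subprocess.Popen` for an instrumented executable started by hand. Windows services and IIS pools are not launched this way, so for those the value presumably has to be set as a machine-level environment variable (for example with `setx DT_PERFCOUNTERINTERVAL 60 /M`, by analogy with the `setx` command posted elsewhere in this thread) and the instrumented processes restarted.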


Answer by Victor B. · Apr 17, 2015 at 03:56 AM

Yep, submitted a ticket: SUPDT-8362. Thanks.


Answer by Victor B. · Apr 16, 2015 at 02:21 AM

Hi Adam. Your fix worked: CPU dropped from 50-100% to 10%! Thanks a lot for sharing this.

Regards, Victor.

 


Andreas G. ♦ · Apr 16, 2015 at 09:43 AM

Hi Victor. I would still open a support ticket for this. If this is a known issue on certain Windows/.NET versions, then we should get an official statement from our support team. If it is not a well-known issue, then support needs to address this with engineering.

Andi


Answer by Adam R. · Apr 15, 2015 at 09:33 AM

We experienced a similar issue with Dynatrace 5.x ... except with 200+ processes, instead of 30. As you can imagine, our host was not responsive.

YMMV, but we were OK after simply disabling the per-process performance counter monitoring by setting the appropriate environment variable, i.e.:

setx DT_DISABLEPERFCOUNTERS true /M

FWIW, we opened a ticket with DT support for this issue for other workarounds, but I can't recall the outcome, and unfortunately can't find the ticket in the DT JIRA support system now (it was a SF support ticket).


Answer by Victor B. · Apr 15, 2015 at 05:03 AM
