Hi,
We are seeing the following messages in the Dynatrace (v5.0.0) server log, saying "Collection Memory Threshold exceeded" and "Aborting Analyzer". Has anybody else faced similar issues? Any solutions?
2014-01-21 10:22:12 INFO [com.dynatrace.diagnostics.server.datacenter.ae] Memory shortage after java.management.memory.collection.threshold.exceeded usage notification received! Limit=85%
2014-01-21 10:22:12 INFO [com.dynatrace.diagnostics.server.datacenter.ae] Total heap status:
2014-01-21 10:22:12 INFO [com.dynatrace.diagnostics.server.datacenter.ae] committed Memory: 12,555,264kb (100% of max)
2014-01-21 10:22:12 INFO [com.dynatrace.diagnostics.server.datacenter.ae] free Memory: 1,777,450kb (14.16% of max)
2014-01-21 10:22:23 INFO [com.dynatrace.diagnostics.server.datacenter.ae] Removing analyzed PurePaths did not free enough memory! Aborting running jobs ... (buffer size = 694854)
2014-01-21 10:22:23 WARNING [com.dynatrace.diagnostics.server.datacenter.ae] Collection Memory Threshold exceeded. Trying to abort action.
2014-01-21 10:22:23 WARNING [AbstractAnalyzer] Aborting Analyzer: memory limit reached
2014-01-21 10:22:24 WARNING [com.dynatrace.diagnostics.server.datacenter.ae] Collection Memory Threshold exceeded. Trying to abort action.
2014-01-21 10:22:24 WARNING [AbstractAnalyzer] Aborting Analyzer: memory limit reached
2014-01-21 10:22:32 INFO [com.dynatrace.diagnostics.server.datacenter.ae] Memory shortage after java.management.memory.collection.threshold.exceeded usage notification received! Limit=85%
2014-01-21 10:22:32 INFO [com.dynatrace.diagnostics.server.datacenter.ae] Total heap status:
2014-01-21 10:22:32 INFO [com.dynatrace.diagnostics.server.datacenter.ae] committed Memory: 12,555,264kb (100% of max)
2014-01-21 10:22:32 INFO [com.dynatrace.diagnostics.server.datacenter.ae] free Memory: 1,848,145kb (14.72% of max)
2014-01-21 10:22:33 INFO [com.dynatrace.diagnostics.server.datacenter.ae] Memory usage statistics:
2014-01-21 10:22:33 INFO [com.dynatrace.diagnostics.server.datacenter.ae] Memory Pool (CMS Old Gen) status:
2014-01-21 10:22:33 INFO [com.dynatrace.diagnostics.server.datacenter.ae] committed Memory: 12,306,048kb (100% of max)
2014-01-21 10:22:33 INFO [com.dynatrace.diagnostics.server.datacenter.ae] free Memory: 1,761,079kb (14.31% of max)
2014-01-21 10:22:33 INFO [com.dynatrace.diagnostics.server.datacenter.ae] * PurePath count: 700348, PurePathNode count: 21322469
2014-01-21 10:22:33 INFO [com.dynatrace.diagnostics.server.datacenter.ae] Aborting jobs did not free enough memory! Continuing with the removal of PurePaths ...
2014-01-21 10:22:33 WARNING [com.dynatrace.diagnostics.server.datacenter.ae] Collection Memory Threshold exceeded. Trying to abort action.
2014-01-21 10:22:33 WARNING [AbstractAnalyzer] Aborting Analyzer: memory limit reached
2014-01-21 10:22:34 WARNING [com.dynatrace.diagnostics.server.datacenter.ae] Collection Memory Threshold exceeded. Trying to abort action.
2014-01-21 10:22:34 WARNING [AbstractAnalyzer] Aborting Analyzer: memory limit reached
-Sreerag
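For readers unfamiliar with the notification type named in the log: "java.management.memory.collection.threshold.exceeded" is the standard JMX notification a JVM emits when a memory pool is still above a configured collection usage threshold after a garbage collection. The following is only a minimal sketch of that JVM mechanism, assuming an 85% limit as shown in the log. The class name CollectionThresholdSketch and the helper thresholdBytes are hypothetical, and this is not the dynaTrace server's actual implementation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

public class CollectionThresholdSketch {
    // Assumption: 85% of the pool maximum, mirroring "Limit=85%" in the log.
    static final int LIMIT_PERCENT = 85;

    // Hypothetical helper: bytes that correspond to the given percentage of the pool maximum.
    static long thresholdBytes(long maxBytes, int limitPercent) {
        return maxBytes / 100 * limitPercent;
    }

    public static void main(String[] args) {
        // React to the standard "collection threshold exceeded" notification.
        NotificationListener listener = (notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED
                    .equals(notification.getType())) {
                System.out.println("Memory shortage: " + notification.getMessage());
            }
        };
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ((NotificationEmitter) memory).addNotificationListener(listener, null, null);

        // Arm the threshold on every heap pool that supports it (for example an old generation pool).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool is invalid or has no defined maximum
            }
            if (pool.getType() == MemoryType.HEAP && pool.isCollectionUsageThresholdSupported()) {
                pool.setCollectionUsageThreshold(thresholdBytes(usage.getMax(), LIMIT_PERCENT));
                System.out.println(pool.getName() + ": threshold set to "
                        + pool.getCollectionUsageThreshold() + " bytes");
            }
        }
    }
}
```

Because the threshold is evaluated after a collection, a notification like this means the GC ran and the old generation was still more than 85% full, which is consistent with the "free Memory: ... (14.31% of max)" line for CMS Old Gen above.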
Answer by Savas T. ·
Hi Andreas.
The customer got "Low server memory" even though there were just 2 clients connected, at night, with low traffic. When I check the memory usage of the frontend server, it is always increasing even though GC works. The backend process seems OK, but the frontend does not. I wonder whether there is a known bug in the frontend server in dT 6.1.
By the way, I have already opened a case for this issue.
Thanks,
Savas
Answer by Savas T. ·
Frontend server health when dT has the problem:
Is it possible that you have a lot of dynaTrace clients with dashboards open that constantly refresh? And those dashboards may have PurePath-based dashlets on them (e.g. database, exception, ...) that show a long timeframe? This may explain some of your high GC times. But - the backend server should have no impact on the frontend. So - the spike in PP Length is a pure backend measure you want to focus on.
Answer by Savas T. ·
Is this the measure we are talking about? If so, it is about 5000 (the screenshot is not from the system I mentioned).
Is it possible to decrease it easily by tuning the exception and JDBC sensors?
Have a look at the PurePaths that came in during that timeframe. The good news is that this is just a short spike of the Max value - not the average. Still - you should look at your PurePaths in that timeframe to figure out why they are that long. Most likely it has to do with exceptions or database statements. If you are on 6.1 already you can use Exception Aggregation. Also - you should use Database Aggregation for the JDBC Sensor. Or you may have some custom sensors that cause this.
Answer by Savas T. ·
I have the same issue with a customer. It is an xlarge environment. I got this error in the frontend server log. Heap memory is 22 GB (split between the frontend and dynaTrace server processes). I have installed just 120 agents so far, and I need to install 700 agents on the same server. PurePath length is about 5000 (I checked it in the dynaTrace server monitoring dashboard), even with the stack trace sensor disabled. I do not think one dT server will handle the remaining agents. So do you suggest that I increase the heap size? The server already has 126 GB of RAM.
You need to get your PP Length down if this is really your average size. That's the first thing I would do, as this consumes most of the CPU and memory on your server. If you get this to the recommended production length of about 100-200 on average, you have enough headroom for your additional agents.
Answer by Srikar M. ·
In the interim you can also try to reduce overhead as follows (if this applies to you):
These are some basic tricks I have learned by experience, from the forum, from my peers and experts in the field. Hope this helps.
Thanks Mohan. We do most of these now; we are dealing with high volume, 600+ agents on one dynaTrace server, and the server goes up right when there is an increase in traffic.
Answer by Andreas G. ·
It seems you are exhausting your dynaTrace server. You are already running on 12GB of heap space - so - I would recommend that you get your PP size down so that they don't need that much space in the heap before the analyzer can analyze them.
What's your average PP Size?
The Avg PurePath Length from the dynaTrace Server Health dashboard is ~2000. We have a lot of database calls which we hope will reduce once we upgrade to 5.5 or 5.6 and enable the database aggregation.
That will definitely help as we have discussed here and in the past when we met.
This will greatly reduce the memory the dT server needs to process these PPs and, with that, reduce a lot of these memory- and GC-related problems.
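The figures in the log excerpt from the original question can be cross-checked with a little arithmetic. The sketch below parses the "committed Memory", "free Memory" and "PurePath count" lines, computes the used share of the heap, and divides the node count by the PurePath count. The class and method names are hypothetical, and the nodes-per-PurePath ratio is only a snapshot of the buffer at that moment, so it is not necessarily the same as the Avg PurePath Length measure on the Server Health dashboard.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeapLogSketch {
    // Matches e.g. "committed Memory: 12,555,264kb" or "free Memory: 1,777,450kb".
    private static final Pattern KB =
            Pattern.compile("(committed|free) Memory: ([\\d,]+)kb");
    // Matches e.g. "PurePath count: 700348, PurePathNode count: 21322469".
    private static final Pattern COUNTS =
            Pattern.compile("PurePath count: (\\d+), PurePathNode count: (\\d+)");

    // Extracts the kilobyte figure from a memory line of the server log.
    static long kilobytes(String line) {
        Matcher m = KB.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no memory figure in: " + line);
        }
        return Long.parseLong(m.group(2).replace(",", ""));
    }

    // Used share of the committed heap, in percent.
    static double usedPercent(long committedKb, long freeKb) {
        return 100.0 * (committedKb - freeKb) / committedKb;
    }

    // Ratio of PurePathNode count to PurePath count from a statistics line.
    static double nodesPerPurePath(String line) {
        Matcher m = COUNTS.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no PurePath counts in: " + line);
        }
        return Double.parseDouble(m.group(2)) / Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        long committed = kilobytes("committed Memory: 12,555,264kb (100% of max)");
        long free = kilobytes("free Memory: 1,777,450kb (14.16% of max)");
        System.out.printf("used: %.2f%% of max (limit is 85%%)%n", usedPercent(committed, free));
        System.out.printf("nodes per PurePath in buffer: %.1f%n",
                nodesPerPurePath("* PurePath count: 700348, PurePathNode count: 21322469"));
    }
}
```

With the values from the log, the heap is a little under 86% used after the collection, which is just over the 85% limit and explains why the analyzer jobs are aborted.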