Question by Keshav S. · Mar 31, 2014 at 02:27 AM ·

AMD Server Restarting at specific time regularly

Hi,

Has anyone encountered this situation? Our AMD server reboots at around 00:50 EST almost every day (it skips some days).

We have investigated the hardware side extensively: the RAM and the motherboard have been replaced, and hardware resources do not seem to be an issue at that particular time; everything stays within limits.

The AMD is running version 12.0 on Red Hat 5.8 with the native driver.

Any help or opinions are welcome!


8 Replies


Answer by Erik S. · Apr 09, 2014 at 01:09 PM

The very restricted threading is largely due to the small number of CPU cores the AMD has; I believe the AMD's cut-off for using enhanced threading is more than 4 cores. With 4 or fewer cores, a very restricted threading model must be used.

However, that is normally only a performance issue; you are getting OS-level crashes, which need to be addressed by Red Hat support.

--- Erik


Answer by przemek.tafelski@compuware.com · Apr 03, 2014 at 04:18 AM

Yes, I can see your support case now. In fact, you already have post-SP3 code on the AMD (12.0.4), but the latest findings point to a RHEL OS issue, since you are experiencing OS crashes (kernel panics). Because the AMD's custom drivers are not in use (due to the mix of PCI-X and PCIe cards in your AMD), it is unlikely that the AMD software is causing the kernel panic, so this appears to be a purely OS-related issue. It needs to be examined by RHEL OS support; an OS/kernel upgrade may be sufficient. You are currently running a 2.6.18 kernel, while the last 2.6 release was 2.6.39.

Please also note that your AMD has only 4 cores, and at least 8 cores are needed to run the 64-bit version of the AMD (in testing, the best performance-to-cost ratio was achieved with 12 cores).

P.S. Applying SP3 to the EUE Console won't have any impact on the AMD's operation.
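The core counts mentioned in this thread can be checked quickly on the AMD host. The following is only a minimal sketch, assuming Python 3 is available on the machine; the two thresholds are the figures quoted in the posts here (more than 4 cores for enhanced threading, at least 8 for the 64-bit AMD), not values taken from product documentation, and the function name `assess_cores` is made up for illustration.

```python
import os

# Thresholds as quoted in this thread (assumptions, not documented limits).
ENHANCED_THREADING_MIN_CORES = 5  # "more than 4 cores"
AMD_64BIT_MIN_CORES = 8           # "at least 8 cores"


def assess_cores(cores):
    """Return a list of findings for the given core count."""
    findings = []
    if cores < ENHANCED_THREADING_MIN_CORES:
        findings.append("restricted threading model likely")
    if cores < AMD_64BIT_MIN_CORES:
        findings.append("below the 8 cores quoted for the 64-bit AMD")
    return findings


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs and may return None.
    cores = os.cpu_count() or 1
    print(cores, "logical CPUs:", assess_cores(cores) or "no concerns")
```

Note that `os.cpu_count()` counts logical CPUs, so a hyper-threaded 4-core machine would report 8; compare against the physical core count if that distinction matters.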


Comment by Keshav S. · Apr 04, 2014 at 01:18 AM

Hi Przemek,

I only see activity on one CPU thread at a time.

Adding more CPUs would therefore seem to be a waste; please advise how it would help. RTM always uses cpu0 for packet processing.


Answer by Keshav S. · Apr 02, 2014 at 12:17 PM

Hello Przemek,

I applied SP3 on the AMD last week, as recommended by Compuware support. That did not help. I could not apply SP3 on the RUM Console; would doing so make it work properly?


Answer by przemek.tafelski@compuware.com · Apr 02, 2014 at 12:12 PM

First of all, make sure your AMD is running the latest AMD code for the major release you have.

For the 12.0 release, that is AMD 12.0.3 (SP3). There may be different reasons for an AMD restart, but either way, development will not investigate the root cause unless you are running the latest AMD version for your release. Any eventual AMD fix (if still needed) will be based on the latest service pack anyway. Here is a link to SP3 for version 12.0. We recommend deploying SP3 on all DC RUM components together (CAS, EUE Console, AMD) to maintain the best level of compatibility. Alternatively, consider upgrading to a newer version. Release 12.0 is two releases behind our latest GA release, 12.2.


Answer by Harshal P. · Apr 01, 2014 at 09:35 AM

Hi Keshav,

As Ulf suggested, please post the rtm.log here (usr/adlex/log/rtm.log). You may also need to post some earlier rtm logs so that we can see what happens during the restart.

The earlier logs are named rtm.log.1, rtm.log.2.gz, etc.
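Before posting the logs, it may help to pull out only the lines around the restart. Below is a rough sketch, assuming Python 3 on the AMD; it reads the current log plus the rotated ones, including the gzip-compressed files. The directory `/usr/adlex/log` is assumed from the path quoted above, and the search string is a placeholder, since the rtm.log timestamp format is not shown in this thread. The helper names `open_log` and `find_lines` are made up for illustration.

```python
import glob
import gzip
import os


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")


def find_lines(log_dir, needle, base_name="rtm.log"):
    """Yield (file name, line) for every line containing `needle`."""
    pattern = os.path.join(log_dir, base_name + "*")
    for path in sorted(glob.glob(pattern)):
        with open_log(path) as handle:
            for line in handle:
                if needle in line:
                    yield os.path.basename(path), line.rstrip("\n")


if __name__ == "__main__":
    # "00:5" is a hypothetical search string; adjust it to the real
    # timestamp format used in rtm.log.
    for name, line in find_lines("/usr/adlex/log", "00:5"):
        print(name + ": " + line)
```

A plain substring match is used on purpose so that the script does not depend on any particular log format.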


Answer by Ulf T. · Apr 01, 2014 at 09:18 AM

Hi

I would first check the server log to see whether the same thing is executing each time the AMD restarts.

Secondly, if it is occurring at the same time every day, I would set up the AMD to log/capture network traffic around this time, to understand whether something specific happens in the network at that time or whether it is something internal to the AMD.

After that, I would open a ticket. I tried to find the post about how to set up the capture, but I cannot find it right now :)


Answer by Keshav S. · Apr 01, 2014 at 07:14 AM

In our case, that does not appear to be the cause. At night (Eastern time) we still see 30-40% of our daytime traffic, so that is not what is happening here.

Does anyone know of any other scheduled activity or task that the AMD might run? The AMD service restart happens regularly at 00:50 EST, lasts about 15 minutes, and then the service comes back up automatically.

Do you use the SNMP driver supplied by Compuware (adlex) or the Red Hat one?
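One way to rule scheduled tasks in or out is to look for cron entries that fire close to the restart time. This is only a sketch under simple assumptions: it takes crontab-format text, looks only at entries whose minute and hour fields are plain numbers, and skips wildcards, ranges, steps and variable assignments. The function name `jobs_near` and the 15-minute window are made up for illustration, and nothing here is specific to the AMD.

```python
def jobs_near(crontab_text, hour, minute, window_minutes=15):
    """Return crontab lines whose fixed hour/minute fall within the window."""
    target = hour * 60 + minute
    hits = []
    for raw in crontab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Five time fields plus a command are expected; only plain numeric
        # minute and hour fields are handled in this sketch.
        if len(fields) < 6 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        scheduled = int(fields[1]) * 60 + int(fields[0])
        gap = abs(scheduled - target)
        gap = min(gap, 1440 - gap)  # handle wrap-around at midnight
        if gap <= window_minutes:
            hits.append(line)
    return hits
```

It could be run against the contents of `/etc/crontab` or the output of `crontab -l`, for example `jobs_near(open("/etc/crontab").read(), 0, 50)`. Keep in mind that cron uses the server's local time zone, which may differ from EST.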


 


Answer by Stefan D. · Mar 31, 2014 at 08:07 AM

We had something similar at one customer: at night, the traffic to the AMD dropped to zero, which caused the AMD's self-monitoring to restart the processes in order to recover from what it assumed was an error condition. Compuware Support provided a workaround, and this should be fixed in general as of 12.1.1.
