Symptom
When you check the Alerts tab in HANA, you see an alert called 'Unexpected state STOPPING found when starting StatisticsServer'.
- In HANA Studio, you would find this alert by going to Administration Console -> Alerts -> Show: all alerts
- In Solution Manager, you would find this alert using transaction DBACOCKPIT -> choose HANA System -> Expand Current Status -> Alerts
Environment
All tests have been performed on HANA 1.0 revision 45
Cause
The following prerequisite is met:
The affected service is not included in the daemon configuration file or the services are set to be restarted automatically according to the configuration.
Normally there are 3 reasons for this alert:
1. Services have been manually stopped or killed for a specific purpose. The DB administrator could have triggered the HANA alert by stopping or killing services from HANA Studio.
You can check the Daemon Trace File for detailed information. In the trace file, you can find which service is inactive and the reason why it is inactive.
a. Check Daemon Trace File in HANA Studio
b. Check Daemon Trace File on OS level:
The file is usually located in directory /usr/sap/<SID>/HDB<InstanceNumber>/<hostname>/trace
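As a minimal sketch of locating the file on OS level, the path template above can be filled in programmatically. The function names, the sample values "HDB", "00" and "hanahost", and the file name pattern daemon_*.trc are assumptions made for illustration and are not taken from this note; adjust them to your system.

```python
from pathlib import Path


def daemon_trace_dir(sid: str, instance_number: str, hostname: str) -> Path:
    """Build /usr/sap/<SID>/HDB<InstanceNumber>/<hostname>/trace."""
    return Path("/usr/sap") / sid / f"HDB{instance_number}" / hostname / "trace"


def list_daemon_traces(trace_dir: Path, pattern: str = "daemon_*.trc") -> list[Path]:
    """Return matching trace files, newest first.

    The default pattern is an assumption about how daemon trace files
    are named; an empty list is returned if the directory is missing.
    """
    if not trace_dir.is_dir():
        return []
    return sorted(
        trace_dir.glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )


# Hypothetical values for SID, instance number and host name.
trace_dir = daemon_trace_dir("HDB", "00", "hanahost")
print(trace_dir)
for trace_file in list_daemon_traces(trace_dir):
    print(trace_file.name)
```

Sorting by modification time puts the most recently written trace file first, which is usually the one that covers the time of the alert.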
For example, if the nameserver is killed from HANA Studio, the daemon trace file contains the following information:
TrexDaemon.cpp(10226) : process hdbnameserver with pid 43929 exited because it caught signal 9
[43906]{0}[0] 2013-01-28 08:05:10.377704 w Basis ProcessExecution.cpp(00099) : Active Context before fork ID: 43908 Name: NetworkChannelCompletionThread State: Inactive
[43906]{0}[0] 2013-01-28 08:05:10.377718 w Basis ProcessExecution.cpp(00099) : Active Context before fork ID: 43909 Name: NetworkChannelCompletionThread State: Inactive
[43906]{0}[0] 2013-01-28 08:05:10.377725 i Daemon TrexDaemon.cpp(08656) : start 'hdbnameserver' as process 76615
[43906]{0}[0] 2013-01-28 08:05:17.727685 i Daemon TrexDaemon.cpp(10341) : program hdbnameserver with pid 76615 is started
[43906]{0}[0] 2013-01-28 08:05:17.727720 i Daemon TrexDaemon.cpp(10355) : runlevel 5 completely started
2. HANA host crashed or restarted due to some unexpected reason.
3. The revision of the HANA DB is not at the latest version.
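To tell reason 1 apart from the others, it can help to scan the daemon trace for services that exited on a signal. The following is a minimal sketch that relies only on the message wording shown in the trace excerpt above; the regular expression and the function name find_signal_exits are illustrative assumptions, and other revisions may word the message differently.

```python
import re

# Matches the message text from the sample trace, e.g.
# "process hdbnameserver with pid 43929 exited because it caught signal 9"
EXIT_RE = re.compile(
    r"process (?P<service>\S+) with pid (?P<pid>\d+) "
    r"exited because it caught signal (?P<signal>\d+)"
)


def find_signal_exits(lines):
    """Return (service, pid, signal) for every signal-caused exit found."""
    events = []
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            events.append(
                (match["service"], int(match["pid"]), int(match["signal"]))
            )
    return events


# Two lines copied from the trace excerpt in this note.
sample = [
    "TrexDaemon.cpp(10226) : process hdbnameserver with pid 43929 "
    "exited because it caught signal 9",
    "[43906]{0}[0] 2013-01-28 08:05:10.377725 i Daemon "
    "TrexDaemon.cpp(08656) : start 'hdbnameserver' as process 76615",
]
print(find_signal_exits(sample))  # [('hdbnameserver', 43929, 9)]
```

To run this against a real trace, pass an open file object instead of the sample list, since the function only iterates over lines.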
Resolution
1. If the service has been killed manually for some specific purpose, you can ignore the alert.
2. For any inactive service alert, check the Daemon Trace File to find out why the service was inactive:
- Check Daemon Trace File in HANA Studio:
- Check Daemon Trace File on OS level:
The file is usually located in directory /usr/sap/<SID>/HDB<InstanceNumber>/<hostname>/trace:
Example:
TrexDaemon.cpp(10226) : process hdbnameserver with pid 43929 exited because it caught signal 9
[43906]{0}[0] 2013-01-28 08:05:10.377704 w Basis ProcessExecution.cpp(00099) : Active Context before fork ID: 43908 Name: NetworkChannelCompletionThread State: Inactive
[43906]{0}[0] 2013-01-28 08:05:10.377718 w Basis ProcessExecution.cpp(00099) : Active Context before fork ID: 43909 Name: NetworkChannelCompletionThread State: Inactive
[43906]{0}[0] 2013-01-28 08:05:10.377725 i Daemon TrexDaemon.cpp(08656) : start 'hdbnameserver' as process 76615
[43906]{0}[0] 2013-01-28 08:05:17.727685 i Daemon TrexDaemon.cpp(10341) : program hdbnameserver with pid 76615 is started
[43906]{0}[0] 2013-01-28 08:05:17.727720 i Daemon TrexDaemon.cpp(10355) : runlevel 5 completely started
Here you can see that hdbnameserver with the old pid 43929 became inactive because it caught signal 9 (SIGKILL), which means it was killed, for example with kill -9. The daemon then restarted it as process 76615.
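The signal number in the trace can be translated into its symbolic name with Python's standard signal module, as a small illustrative sketch. The helper name signal_name is hypothetical, and the numbering shown applies to the operating system on which the script runs, so run it on the same kind of Linux host as the HANA system.

```python
import signal


def signal_name(number: int) -> str:
    """Translate a signal number from the trace into its symbolic name."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not defined on the platform running this script.
        return f"unknown signal {number}"


# On Linux, signal 9 is SIGKILL, which a process cannot catch or ignore.
print(signal_name(9))
```

A signal such as SIGKILL only shows that the process was terminated forcibly; the trace line by itself does not say who sent the signal.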
3. If the HANA DB is not on the latest revision, it is strongly recommended to upgrade the HANA DB to the latest revision.
See Also
- Are there any Functional Constraints?
- While the services are inactive, the HANA system is unavailable.
- Are there any Non-functional Constraints? No
- Are there any Side-effects? No
- Is there any suggestion to avoid this alert?
- For reason 1, never use kill against HANA processes in a production environment.
- For reason 3, update HANA to the latest available revision.
Keywords
Inactive, kill services, restarted, Operations Recommendation, #OpsRec-HANA