Monday, February 15, 2016

1846872 - "No space left on device" error reported from HANA

Symptom
1. HANA fails to start with the following error, which can be found in the indexserver or nameserver trace:
Error during asynchronous file transfer, rc=28: No space left on device;

2. The df command shows that there is still space left on the device or mount point.

3. The file system is GPFS.


Other Terms
GPFS, No Space Left, HANA crashes


Reason and Prerequisites
A bug in GPFS versions prior to 3.4.0.23 causes GPFS to occasionally switch into read-only mode. Any subsequent write then fails with a "No space left" error.
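
The version boundary matters because dotted versions do not compare correctly as plain strings ("3.4.0.9" sorts after "3.4.0.23" as text). The following is a minimal sketch of a numeric comparison against 3.4.0.23. The function names are hypothetical, and it assumes you have already obtained the installed GPFS level as a bare version string; accepting "-" as a separator is an assumption for strings copied from a package name.

```python
import re

# First GPFS level that contains the fix, according to this note.
FIXED_VERSION = (3, 4, 0, 23)


def parse_version(text):
    """Turn a version string such as '3.4.0.11' into a tuple of integers.

    Both '.' and '-' are accepted as separators (assumption: the string may
    have been copied from a package name such as '3.4.0-11').
    """
    return tuple(int(part) for part in re.split(r"[.-]", text.strip()))


def is_affected(version_text):
    """Return True if the given GPFS level is earlier than 3.4.0.23."""
    return parse_version(version_text) < FIXED_VERSION
```

For example, is_affected("3.4.0.9") evaluates to True, while is_affected("3.4.0.23") and is_affected("3.5.0.7") evaluate to False.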

The df command shows free space because it does not have the complete picture of the whole clustered file system.
For GPFS, use mmdf instead of the df command.
For more information on mmdf:
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=%2Fcom.ibm.cluster.gpfs.v3r5.0.7.gpfs100.doc%2Fbl1adm_mmdf.htm
There is also an SAP Note that explains GPFS in detail:
1650046 - IBM SAP HANA Appliance Operations Guide


To quickly verify that this situation applies to you:
1. Make sure that you are running GPFS and that the version is earlier than 3.4.0.23.
2. Copy any file larger than 4K in size directly into the directory where HANA reported the error, and check whether the copy fails with "No space left".
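
The write test in step 2 can also be scripted. The following is a minimal sketch, not an SAP or IBM tool: the function name probe_write and the default size of 8192 bytes are assumptions chosen for illustration. It writes a temporary file larger than 4K into the given directory and reports whether the operating system returns error 28 (ENOSPC), the same return code that appears in the HANA trace.

```python
import errno
import os
import tempfile


def probe_write(directory, size=8192):
    """Write `size` bytes to a temporary file in `directory`.

    Returns "ok" if the write succeeds and "no_space" if the operating
    system reports ENOSPC ("No space left on device"). Any other error
    is re-raised unchanged. The temporary file is removed automatically.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory, prefix="gpfs_probe_") as handle:
            handle.write(os.urandom(size))
            handle.flush()
            # Force the data to the file system so that a failure surfaces here.
            os.fsync(handle.fileno())
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return "no_space"
        raise
    return "ok"
```

Run it as the sidadm user against the directory named in the trace. A result of "no_space" while df still shows free space matches the symptom described in this note.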



Solution
1. Shut down all HANA nodes, either by issuing the shutdown command from the studio or from an SSH session logged on as the sidadm user.
Run "HDB stop" if you have one node. Run "sapcontrol -nr instance_number -function StopSystem" to stop the entire cluster.
After several minutes, run:
HDB info
in an SSH session as the sidadm user to see whether any HANA processes are still running.
If there are, use kill -9 to terminate them.

2. Download and apply a GPFS patch level of at least 3.4.0.23.

Information about the patch is here:
http://www-01.ibm.com/support/docview.wss?uid=isg400001565
Unfortunately, IBM provides little information about that APAR.

We highly recommend that you run the uniqueChecker.py script after patching GPFS to make sure that your database is consistent.

Submit tickets to the queue BC-OP-LNX-IBM for any outstanding questions about applying the patch or about GPFS in general.



Header Data

Released On 02.12.2013 21:50:07
Release Status Released for Customer
Component HAN-DB SAP HANA Database
Priority Recommendations / Additional Info
Category External error
