Tuesday, November 1, 2016

1900257 - How to handle HANA Alert 43: 'Memory Usage of Services'

Symptom

When checking the Alerts tab in HANA Studio and Solution Manager, there is an alert called “indexserver (pid 45281) on host suse-hanajl01 is running out of memory! Used memory: 180500 MB, effective allocation limit for this service: 534219 MB”.
For HANA Studio, you would find the alert by going to Administration Console -> Alert -> Show: all alert.


For Solution Manager, you would find the alert by transaction DBACOCKPIT -> choose HANA system -> expand Current Status -> Alerts.


Note that this “Check used memory against allocation limit” alert in Solution Manager is auto-refreshed every 15 minutes, so the time stamp on the alert may not be the exact time when the alert occurred.
To check the exact time, please go to Administration Console -> Alerts -> double-click on the alert.


Environment

  • SAP HANA DATABASE
  • BW on HANA

Cause

Normally there are 3 reasons for this alert:
  1. The threshold value of "Check used memory against allocation limit" in "Configure Check Settings" is improper. The default values are Low: 80, Medium: 90, High: 95.

  2. An out of memory (OOM) issue occurs on the HANA DB
Possible triggers of such an OOM issue on the HANA system include, but are not limited to:
  • Execution of certain transactions
  • Data loading
  • Hanging of jobs
  • Long running SQL statements
These generate a high workload on memory, either through frequent database access or through joins between huge tables.
  3. Bug in some revisions
As experienced in previous customer cases, used memory cannot be automatically released in some older revisions. This bug can lead to memory issues and trigger the alert.

Resolution

  1. For the alert triggered due to an improper threshold value setting, it can be resolved by resetting the threshold to its default or a more reasonable value.
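As a quick check before adjusting thresholds, the current usage ratio per service can be compared against the alert levels. This is a sketch based on the monitoring view M_SERVICE_MEMORY used elsewhere in this note; the EFFECTIVE_ALLOCATION_LIMIT column is assumed to be available in your revision:

```sql
SELECT HOST, PORT, SERVICE_NAME,
       ROUND(TOTAL_MEMORY_USED_SIZE / 1024 / 1024) AS "Used MB",
       ROUND(EFFECTIVE_ALLOCATION_LIMIT / 1024 / 1024) AS "Limit MB",
       ROUND(TOTAL_MEMORY_USED_SIZE / EFFECTIVE_ALLOCATION_LIMIT * 100, 1) AS "Used %"
FROM M_SERVICE_MEMORY
ORDER BY "Used %" DESC
```

If the top services are well below the Low threshold (80 by default), the alert was most likely caused by an overly strict threshold setting rather than by real memory pressure.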


  2. For the alert triggered due to an OOM issue

    1. Due to the various possible causes, the best way is to raise a message to SAP on component BC-DB-HDB, and open a remote connection to the HANA DB so the process can be optimized.
    2. If we know which process runs out of memory, we can trace this process and send the trace file to SAP for further analysis. The trace includes the SQL trace and the performance trace. Both can be configured in the Trace Configuration tab.



For SQL trace, after changing it to “Active” and giving a new file name, click “Finish”.
For performance trace, give a new file name and a duration time, then click “Finish”.
    3. If we are not sure which process caused this alert, we need to figure it out with the following steps.
- Go to Landscape tab to check which server encounters the high memory consumption issue. E.g: the memory consumption is very high in indexserver with Port 30003.


- Go to Performance -> Threads, check which threads are running. E.g: a SQL statement is executed in the figure.


- Run the SQL again; if the issue is reproduced, we can be sure that the statement is (one of) the main causes of the alert.
- Do the trace as previously mentioned, and enclose the trace files in the customer message.
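The manual steps above can also be approximated in SQL. A sketch, assuming the standard monitoring views M_SERVICE_MEMORY and M_SERVICE_THREADS are available in your revision:

```sql
-- Which service uses the most memory (same idea as the Landscape tab)
SELECT TOP 3 HOST, PORT, SERVICE_NAME,
       ROUND(TOTAL_MEMORY_USED_SIZE / 1024 / 1024) AS "Used MB"
FROM M_SERVICE_MEMORY
ORDER BY TOTAL_MEMORY_USED_SIZE DESC;

-- Which threads are currently active (same idea as Performance -> Threads)
SELECT HOST, PORT, THREAD_TYPE, THREAD_METHOD, THREAD_DETAIL
FROM M_SERVICE_THREADS
WHERE IS_ACTIVE = 'TRUE';
```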
    4. The trace files can be seen in the Diagnosis Files tab.
The SQL trace file can be opened by double-clicking on it.
For the performance trace file, you need to use the HDBAdmin tool to open it. Please refer to the wiki page "HDBAdmin on Linux" for how to execute the tool.


In the case of the SQL trace file (richard0128.py), we can download it locally, open it with Notepad++, and then find the SQL tracing info.

After you have prepared your environment and executed the HDBAdmin tool, you can load the .tpt file and read it.

Then we can filter the result with known info (e.g. Service is indexserver, Method is SqlExecuteStatement, User is SYSTEM) and analyze the SQL statement further.

Even if you could do this yourself, we still recommend sending the trace files to SAP unless you are very sure about how to analyze, optimize and configure in further steps.
    5. Before the statement analysis is finished, we can try to unload currently unused huge tables from memory, to free memory for the process execution.
The huge tables can be found in “System Information” -> “Schema Size (MB) for loaded tables” and “Size of tables on disk”.

After double-clicking on them, we can sort the size column as the following figure shows. To judge whether a table is in use, go to the Performance -> Threads tab and check if the table is used by any thread.
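Unloading can also be done directly in SQL. A sketch for finding candidates and unloading one of them; the schema and table names below are placeholders:

```sql
-- Largest currently loaded column-store tables
SELECT TOP 20 SCHEMA_NAME, TABLE_NAME,
       ROUND(MEMORY_SIZE_IN_TOTAL / 1024 / 1024) AS "MB Used"
FROM M_CS_TABLES
WHERE LOADED <> 'NO'
ORDER BY MEMORY_SIZE_IN_TOTAL DESC;

-- Unload a table that is confirmed to be unused (placeholder names)
UNLOAD "MY_SCHEMA"."MY_BIG_TABLE";
```

Note that the table will be reloaded on its next access, so unloading only frees memory temporarily; it does not resolve the underlying consumption.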

  3. If the revision of the HANA DB is not the latest version, we strongly recommend upgrading the HANA DB to the latest version. If the current licensed memory cannot meet the workload requirements, you may need to increase it.

Keywords

allocation limit, Configure Check Settings, OOM, out of memory, threshold, Diagnosis Files, performance, HDBAdmin, licensed memory,

1977268 - How to handle HANA Alert 40: 'Total memory usage of column-store tables'

Symptom

You experience HANA Alerts 'Total memory usage of column-store tables' (Alert ID = 40).

Environment

This KBA has been created based on SAP HANA 1.0 revision 81

Cause

This alert determines what percentage of the effective allocation limit is being consumed by the column-store tables as a whole (that is, the cumulative size of all table columns and internal structures).
The default settings for the alert are:
Severity    Low    Medium    High
Threshold   20 %   25 %      30 %
Usually this alert indicates issues with high memory usage.
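The ratio the alert checks can be approximated manually. A sketch, assuming the column-store total from M_CS_TABLES is compared against the indexserver's effective allocation limit from M_SERVICE_MEMORY:

```sql
SELECT ROUND(
         (SELECT SUM(MEMORY_SIZE_IN_TOTAL) FROM M_CS_TABLES WHERE LOADED <> 'NO') /
         (SELECT SUM(EFFECTIVE_ALLOCATION_LIMIT) FROM M_SERVICE_MEMORY
           WHERE SERVICE_NAME = 'indexserver') * 100, 1)
       AS "CS tables % of allocation limit"
FROM DUMMY
```

A result above 20 / 25 / 30 corresponds to the Low / Medium / High severities listed above.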

Resolution

Procedure

  1. Check memory consumption of column-store tables
  2. Monitor data volume in regular intervals

Path:

  1. SAP HANA Studio - SAP HANA Administration Console - Alerts
  2. SAP HANA Studio - SAP HANA Administration Console - SQL Console
  3. SAP HANA Studio - SAP HANA Administration Console - System Information
  4. SAP HANA Studio - SAP HANA Administration Console - Configuration

How to:

This alert is implemented to control the memory usage of resident table data in the column store, together with the Alert 45 – Memory usage of main storage of Column Store tables. The basic analysis steps are:
  1. Check memory consumption of column-store tables
    In order to understand the current and historic SAP HANA memory consumption of column-store tables, we should focus on the following questions:
    • What is the main memory, delta memory and total memory used by Column Store tables?
    • Which Column Store tables have the largest memory footprint?
    • Which Column Store tables have been least recently loaded into memory?
    There are several circumstances after checking the memory consumption:
    • Sum of main memory plus delta memory should be about the total memory displayed. If this is not the case, create a ticket to SAP Support to investigate the reason (Application Component: HAN-DB).
    • If delta memory << main memory, then proceed according to KBA 1977269 - How to handle HANA alert 45: Memory usage of main memory of a column-store table.
    Consider partitioning, data volume management and hardware extension based on a new sizing.
    • If delta memory >> main memory, then analyze the reason for the large delta according to KBA 1977314 - How to handle HANA Alert 29: Size of delta storage of a column-store table
    You can use the following tools and approaches to check values for memory of column-store tables:
    a)   Initial analysis from SQL commands
    For initial analysis use the following statements published in SAP note 1969700 - SQL Statement collection for SAP HANA
    - HANA_Tables_LargestTables
    - HANA_Tables_TopGrowingTables_Size_History
    If you require further assistance please contact SAP Support. Check with your Enterprise Support Advisor or Technical Quality Manager or create a ticket to SAP Support in component HAN-DB.
    b)   Memory information in SAP HANA studio
    Access System Information Tab in the SAP HANA Studio and check the content of Used Memory by Tables and Schema Size of Loaded Tables.
    The Used Memory by Tables shows total memory consumption of all column and row tables, while the Schema Size of Loaded Tables displays the aggregated memory consumption of loaded tables in MB for different database schemas. The aggregation comprises both Column Store and Row Store tables. Order by the Schema size column and find the largest consumers.
    c)   Memory information from SQL commands
    To get a high-level overview of the amount of memory used for Column Store tables, you can execute the following SQL statement:
    SELECT   ROUND(SUM(MEMORY_SIZE_IN_TOTAL)/1024/1024) AS "CS Memory (MB)", ROUND(SUM(MEMORY_SIZE_IN_MAIN)/1024/1024) AS "CS Memory In Main (MB)",
    ROUND(SUM(MEMORY_SIZE_IN_DELTA)/1024/1024) AS "CS Memory In Delta(MB)"
    FROM M_CS_TABLES
    WHERE LOADED <> 'NO'
    To get a breakdown by host, service, and schema, you can execute the following statement:
    SELECT S.HOST AS "Host",
    SERVICE_NAME AS "Service",
    SCHEMA_NAME AS "Schema",
    ROUND(SUM(MEMORY_SIZE_IN_TOTAL)/1024/1024) AS "Schema CS Memory (MB)"
    FROM M_CS_TABLES AS T JOIN M_SERVICES AS S ON T.HOST = S.HOST AND T.PORT = S.PORT
    WHERE LOADED <> 'NO'
    GROUP BY S.HOST, SERVICE_NAME, SCHEMA_NAME
    ORDER BY "Schema CS Memory (MB)" DESC
    You can use the following technique to examine the amount of memory consumed by a specific table. This also shows which of its columns are loaded, and the compression ratio that was accomplished. For example, list all tables for schema “SYSTEM”:
    SELECT TABLE_NAME as "Table", round(MEMORY_SIZE_IN_TOTAL/1024/1024) as "MB Used" from M_CS_TABLES where SCHEMA_NAME = 'SYSTEM' order by "MB Used" desc
    Or drill down into columns of a single table, for instance the table "LineItem", to view the actual size of the data, the “delta changes” and the compression ratio for each of its columns.
    select COLUMN_NAME as "Column", LOADED, round(UNCOMPRESSED_SIZE/1024/1024) as "Uncompressed MB", round(MEMORY_SIZE_IN_MAIN/1024/1024) as "Main MB", round(MEMORY_SIZE_IN_DELTA/1024/1024) as "Delta MB", round(MEMORY_SIZE_IN_TOTAL/1024/1024) as "Total Used MB", round(COMPRESSION_RATIO_IN_PERCENTAGE/100, 2) as "Compr. Ratio" from M_CS_COLUMNS where TABLE_NAME = 'LineItem'

  2. Monitor data volume in regular intervals
    After having resolved the current issue, monitor data volume of your largest tables in regular intervals based on the SAP Early Watch Alert report.

For details about further steps please refer to the referenced Knowledge Base Articles.
If you require further assistance please contact SAP Support. Check with your Enterprise Support Advisor or Technical Quality Manager or create an SAP incident on component SV-BO-DB-HAN. If you assume a product issue, then use component HAN-DB.

Keywords

HANA Alert
Large Table
Partitioning
Operation Recommendations
#OpsRec-HANA

1977101 - How to handle HANA Alert 12: 'Memory usage of name server'

Symptom

You receive HANA Alert ID 12 - 'Memory usage of name server'

Environment

SAP HANA Database

Cause

  1. The number of tables loaded into memory of SAP HANA database is well above 1 million.
  2. HANA issue with cleaning up shared memory.

Resolution

PROCEDURE

  1. Check Alert Details.
  2. Increase value of parameter size in file nameserver.ini - topology.
  3. Test and validate the new configuration by checking memory consumption of nameserver and by checking for re-occurrences of alert 'Memory usage of name server'.
  4. Further recommendations if countermeasures do not help - Clean up shared memory.

PATH

  • SAP HANA Studio - SAP HANA Administration Console - Alerts
  • SAP HANA Studio - SAP HANA Administration Console - Configuration
  • SAP HANA Studio - SAP HANA Administration Console - Landscape - Services
  • SAP HANA Studio - SQL Console

HOW-TO

1. Check Alert

Access Alert Tab in HANA Studio and check 'Current Alerts':
a.) What is the priority of the alert?
b.) Is an issue occurring on one host only or on multiple hosts?
Check all alerts and put a filter on Alert ID 12 and select the last 3 weeks:
c.) Is it a one-time or temporary issue or are there frequent occurrences based on historic alert data?

2. Increase value of parameter 'size' in nameserver.ini - topology

Go to the Configuration tab in HANA Studio and navigate to nameserver.ini - topology - size. Enter an increased value. Usually it helps to initially double the value from its default of 256 MB (268435456 bytes) to 512 MB (536870912 bytes). The parameter must not be larger than 2 GB.
If the alert occurs for multiple hosts then changes can be implemented on system level. If only a single host is affected the parameter can be changed on host level. Please note that the parameter is only changeable offline, so after the change you must restart the SAP HANA Database for the new parameter value to take effect.
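The current value can also be inspected, and the change applied, via SQL. A sketch using M_INIFILE_CONTENTS and ALTER SYSTEM ALTER CONFIGURATION; as noted above, the new value only takes effect after a restart:

```sql
-- Check the currently configured value (and the layer it is set on)
SELECT FILE_NAME, LAYER_NAME, SECTION, KEY, VALUE
FROM M_INIFILE_CONTENTS
WHERE FILE_NAME = 'nameserver.ini' AND SECTION = 'topology' AND KEY = 'size';

-- Double the default 256 MB to 512 MB on system level
ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM')
  SET ('topology', 'size') = '536870912';
```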

3. Test and Validate new Configuration

  1. Check for re-occurrences of alert 'Memory usage of name server' (Alert ID = 12). Go to the Alerts tab in HANA Studio and put a filter on Check ID = 12.
  2. If no alert occurs you can manually check trends by checking memory consumption of nameserver in the Services Tab in HANA Studio - Landscape.
  3. For a more detailed check you can display current data of the monitoring view M_SERVICE_MEMORY in schema SYS via the SQL editor:

    SELECT 'NAMESERVER_SHARED_MEMORY', HOST, TO_CHAR(ROUND(MAP(SHARED_MEMORY_ALLOCATED_SIZE, 0, 0, SHARED_MEMORY_USED_SIZE / SHARED_MEMORY_ALLOCATED_SIZE * 100))) AS VALUE FROM M_SERVICE_MEMORY WHERE SERVICE_NAME = 'nameserver'

4. Further recommendations

If the alert continues to occur even with the increased parameter value, it is recommended to create an SAP incident with application component HAN-DB for further analysis in the system. The root cause may be either in the usage of the system or in a memory leak. A service connection to HANA Studio should be configured and opened as per SAP note 1592925.
In some cases the following actions have helped:
  1. Clean-up shared memory with SQL statement: ALTER SYSTEM CLEAR SQL PLAN CACHE.
    Note: This statement removes all SQL plans that are not currently being executed from the SAP HANA database SQL plan cache. It also removes all plans with a reference count of '0' from the plan cache and resets the statistics of the remaining plans. Lastly, the command also resets the contents of the M_SQL_PLAN_CACHE_OVERVIEW monitoring view. This has helped when automatic removal of hidden temp tables related to the SQL plan cache was not performed by the database system, so that they still existed and used shared memory. The statement removes all of these unused temp tables.
  2. Restart the SAP HANA Database.

Keywords

nameserver.ini; topology; shared memory; large number of tables loaded to memory; over one million tables; size; M_SERVICE_MEMORY; M_SQL_PLAN_CACHE_OVERVIEW; SQL Plan Cache; temp tables; temporary tables;

1840954 - Alerts related to HANA memory consumption

Symptom

Statisticserver alerts are triggered on HANA side:
HANA alert 1: Host physical memory usage
HANA alert 43: Memory usage of services
HANA alert 44: Licensed memory usage

Environment

HANA

Resolution

1. Operating System vs. Database
Before proceeding with analysing the different HANA-specific memory areas, it has to be clarified that the database itself, and not processes running outside the SAP software, is indeed responsible. This involves calculating how much resident memory is used by the HANA database. On the HANA appliance, resident memory used outside HANA (OS, 3rd party processes) is typically quite small, not exceeding 2 GB. HANA Studio and DBACockpit provide 2 different views on the resident memory:
HANA Studio:

The following SQL statements are behind this output:
'Database Resident':
SELECT SUM(PHYSICAL_MEMORY_SIZE) FROM M_SERVICE_MEMORY
'Total Resident':
SELECT T1.HOST, T1.USED_PHYSICAL_MEMORY + T2.SHARED_MEMORY_ALLOCATED_SIZE
FROM M_HOST_RESOURCE_UTILIZATION AS T1
JOIN (SELECT M_SERVICE_MEMORY.HOST, SUM(M_SERVICE_MEMORY.SHARED_MEMORY_ALLOCATED_SIZE) AS SHARED_MEMORY_ALLOCATED_SIZE
         FROM SYS.M_SERVICE_MEMORY 
         GROUP BY M_SERVICE_MEMORY.HOST) AS T2
ON T2.HOST = T1.HOST;
Since the difference between 'Total Resident' and 'Database Resident' is well below 2 GB, there is no indication that processes outside the database contribute significantly to memory consumption.

DBACockpit:

This is effectively the output of the following statement:
SELECT HOST,
ROUND(USED_PHYSICAL_MEMORY/1024/1024/1024, 2) AS "Resident GB", 

ROUND((USED_PHYSICAL_MEMORY + FREE_PHYSICAL_MEMORY)/1024/1024/1024, 2) AS "Physical Memory GB"
FROM PUBLIC.M_HOST_RESOURCE_UTILIZATION
DBACockpit does not currently consider shared memory as part of the resident memory, hence the difference from what HANA Studio reports.
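The missing piece in the DBACockpit view is the shared memory, which can be summed per host from M_SERVICE_MEMORY; adding it to the DBACockpit 'Resident' value should roughly match the HANA Studio 'Total Resident'. A sketch:

```sql
SELECT HOST,
       ROUND(SUM(SHARED_MEMORY_ALLOCATED_SIZE) / 1024 / 1024 / 1024, 2) AS "Shared GB"
FROM M_SERVICE_MEMORY
GROUP BY HOST
```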
If processes outside the database are not contributing significantly to memory usage, 2 cases have to be distinguished in general:
  1. An 'out of memory' error has already occurred.
  2. An 'out of memory' error was not yet triggered, but further investigation on the current memory usage of the database might be required.
For case (1), the foundation for further analysis is the oom trace file which follows the naming convention <processname>_<hostname>.<number>.rtedump.<number>.oom.trc (for example indexserver_hdbnode1.39503.rtedump.72290.oom.trc).
In case (2), such a trace file is not yet available and an rte dump has to be generated which has a structure similar to the oom trace. SAP note 1813020 contains the necessary instructions regarding how to get the rte dump. For case (2), further analysis can also be done 'top-down', making use of the information provided by the different administrative frontends (HANA Studio / DBACockpit) and monitoring views.
2. Top down approach
2.1 Which HANA process on which host is using most of the memory?
SQL Statement:
SELECT TOP 3 HOST, PORT, SERVICE_NAME, TOTAL_MEMORY_USED_SIZE

FROM M_SERVICE_MEMORY

ORDER BY TOTAL_MEMORY_USED_SIZE DESC

DBACockpit:

HANA Studio:
2.1.1 On 'Used Memory'
HANA SP5
The values in the 'Used Memory' column are calculated in the following way:
SELECT ( (CODE_SIZE + SHARED_MEMORY_ALLOCATED_SIZE + HEAP_MEMORY_USED_SIZE ) / 1024 / 1024) AS "USED MEMORY"
FROM M_SERVICE_MEMORY
It is important to note that even though the column is called 'Used Memory', the overall allocated shared memory is used for the calculation and not the value of SHARED_MEMORY_USED_SIZE. The reason for this is that a shared memory segment (see point 4) can only be attached to the address space of the process completely, not just the used part.
Due to changes in the calculation of CODE_SIZE, there might be cases where the used memory increased after a revision upgrade. This can be seen in the following example:
SELECT SERVICE_NAME, CODE_SIZE

FROM M_SERVICE_MEMORY
rev49:
SERVICE_NAME CODE_SIZE
nameserver 627.855.360
preprocessor 687.079.424
indexserver 733.585.408
statisticsserver 727.781.376
xsengine 725.540.864

rev54 and later:
SERVICE_NAME CODE_SIZE
nameserver 4.498.731.008
preprocessor 4.526.575.616
indexserver 5.256.380.416
statisticsserver 4.956.565.504
xsengine 5.230.637.056
Before HANA revision 54, the code size was incorrectly reported. This problem is explained in SAP note 1826139. The majority of the CODE_SIZE is however shared between the processes, so the assumption that in total 20 GB of memory is consumed by CODE_SIZE is not correct. Starting with HANA SP 06, an instance-wide CODE_SIZE will be reported. Additionally, the license check will then consider this new instance-wide CODE_SIZE (which is expected to be around 5 GB).
HANA SP6
Due to changes in the calculation of the code size (see above) the calculation for 'Used Memory' was changed. It can now be retrieved with the following statement:
SELECT INSTANCE_TOTAL_MEMORY_USED_SIZE
FROM sys.m_host_resource_utilization
Starting with SP6, INSTANCE_TOTAL_MEMORY_USED_SIZE is exposed via the view M_HOST_RESOURCE_UTILIZATION.
2.2 Is shared or heap memory using the largest share?
So far, the sum of used heap and shared memory has been determined. Now, this total has to be split:
SQL Statement:
SELECT TOP 3 HOST, PORT, SERVICE_NAME, HEAP_MEMORY_USED_SIZE, SHARED_MEMORY_USED_SIZE, TOTAL_MEMORY_USED_SIZE
FROM M_SERVICE_MEMORY

ORDER BY SHARED_MEMORY_USED_SIZE DESC
HANA Studio:


If it is shared memory, proceed as outlined in section 'Shared Memory Usage', otherwise, go to section 'Heap memory usage'.
3. Trace file approach
The following sections of those trace files are usually relevant:
[IPMM_MEMORY]
The first part of this section lists the local (heap) memory the processes that make up the HANA database are currently using:
[0] PID=34884, SId=42309944, compactors active, alive, process name: hdbnameserver
AB=2220466176b (2.06gb), UA=0b, U=2015851859b (1.87gb), FSL=0b, PAL=487738426982b (454.24gb), TPAL=2220466176b (2.06gb), W=100

[1] PID=35049, SId=42310545, compactors active, alive, process name: hdbpreprocessor
AB=365772800b (348.82mb), UA=0b, U=362430594b (345.64mb), FSL=0b, PAL=487738426982b (454.24gb), TPAL=365772800b (348.82mb), W=100

[3] PID=35094, SId=42310846, compactors active, alive, process name: hdbstatisticsse
AB=17623138051b (16.41gb), UA=0b, U=14624613181b (13.62gb), FSL=268435456b (256mb), PAL=27096579276b (25.23gb), TPAL=17757355779b (16.53gb), W=100

[4] PID=35114, SId=42310947, compactors active, alive, process name: hdbxsengine
AB=2270855168b (2.11gb), UA=0b, U=2136436039b (1.98gb), FSL=0b, PAL=487738426982b (454.24gb), TPAL=2270855168b (2.11gb), W=100

[5] PID=33976, SId=171197412, compactors active, alive, process name: hdbindexserver
AB=240495694077b (223.97gb), UA=0b, U=260528715346b (242.63gb), FSL=0b, PAL=487738426982b (454.24gb), TPAL=270062153728b (251.51gb), W=100
For each process, the allocation limit (PAL) and the amounts of memory currently allocated (AB) and used (U) are displayed. An AB value that is significantly larger than the U value does not necessarily indicate a problem: if necessary, allocated but unused memory will be released. Of particular interest are lines where AB, U and PAL have approximately the same value; that process has then reached its allocation limit. Further analysis then has to be based on the output of M_HEAP_MEMORY in section [PROCESS_INFO].
The second part of section [IPMM_MEMORY] contains information regarding the shared memory:
GLOBAL_MAX_ALLOCATION_LIMIT=487738426983b (454.24gb), cached sum of allocation limits=487604209255b (454.11gb)
#checks for cached allocation limit sum=838859, #terminated processes since last reset=7
#processes=5, sum of temp process allocation limits=292676603651b (272.57gb), cached sum=487604209255b (454.11gb), weight sum=500, cached weight sum=500
SHARED_MEMORY=194927605604b (181.54gb), temp shared memory allocation limit=194927605604b (181.54gb)
IPMM reservation=0b, emergency reservation=134217728b (128mb)
112 early exits since creation of IPMM, provide-memory-counter=50279
Provide memory 50267 failed, only 109174531b were free.
Users respected in shm calculation: [302]
Shared Memory is used for various purposes. For practical purposes, it is sufficient to assume that mainly row store tables occupy shared memory. In case of the 'early exits since' output in the oom trace, the error was triggered by the inability to further increase the TPAL (temporary allocation limit) of the failing process. The allocation limit of a process is not set to the maximum right at process start, but increased (and decreased) over time. In this specific case, 'sum of temp process allocation limits' + SHARED_MEMORY exceeds the GLOBAL_MAX_ALLOCATION_LIMIT which consequently causes the error.
[MEMORY_OOM]
Before any detailed analysis is done, the information from section [MEMORY_OOM] has to be reviewed as well. It might be the case that the request was so large that it was impossible to fulfill anyway:
[MEMORY_OOM]  Information about current out of memory situation: 
OUT OF MEMORY occurred.
Failed to allocate 2405843009873693952 byte.
A known issue regarding such large requests is solved with revision 50. In case this occurs in newer revisions, please contact SAP Support.
The following checks can then be done based on the information available so far:
4. Shared Memory Usage
Contrary to heap memory, which is allocated using 'malloc', shared memory is provided using the 'shmget' system call. The results of shared memory creation can then be viewed on OS level using the ipcs command. The following approach can be used to display the shared memory of one particular HANA process on operating system side:
  1. Get the process pid: ps -ef | grep <HANA process>
  2. ipcs -p | grep <pid> then displays all segments that were created by this particular process:
ipcs -p | grep 4221
86999065   hanadm     4221       4221
87064602   hanadm     4221       4221
87130139   hanadm     4221       4221
87195676   hanadm     4221       4221
87261213   hanadm     4221       4221
87359519   hanadm     4221       4221
88309819   hanadm     4221       4221
88375356   hanadm     4221       4221
88440894   hanadm     4221       4221
  3. The size of a particular segment can then be further examined using the command ipcs -m -i <id>
ipcs -m -i 86999065
Shared memory Segment shmid=86999065
uid=60000       gid=100 cuid=60000      cgid=100
mode=01600      access_perms=0600
bytes=8929752   lpid=4221       cpid=4221       nattch=1
att_time=Tue May 14 14:09:13 2013
det_time=Tue May 14 14:09:13 2013
change_time=Tue May 14 14:09:13 2013
The sum of all those shared memory segments is then equivalent to the output of the statement:
SELECT SHARED_MEMORY_ALLOCATED_SIZE
FROM M_SERVICE_MEMORY
WHERE PROCESS_ID = '4221'
4.1 Memory usage of the row store
  1. The row store is organized in 64 MB segments. Deleting large numbers of rows can lead to sparse segments and unnecessary use of memory. SAP note 1813245 outlines problems and strategies to solve this problem.
  2. Indirectly, high memory usage of the row store can cause problems when parameter client_distribution_mode is set to 'off' in distributed environments. When default value 'statement' is used, the statement is sent to the node where the table is located. With the setting 'off', the statement might then also be directed to the master node. Since all row store tables are usually located on the master node and consequently, the row store uses a lot of memory, materializing a large amount of data on the master node (from a table that is actually located on another node) can then simply be too much. Therefore, in case of problems with resolving internal and external hostnames, parameter client_distribution_mode should not be turned off, but instead the recommendations from SAP note 1780950 are to be followed.
  3. Too many large tables were created as row store tables. The largest tables currently contained in the row store can be retrieved by the following SQL Statement:
SELECT TOP 50 *
FROM M_RS_TABLES
ORDER BY (ALLOCATED_FIXED_PART_SIZE +
ALLOCATED_VARIABLE_PART_SIZE) DESC
It is however important to keep in mind that in many scenarios, switching a table from row store to column store and vice versa must not be done. This applies to Suite on HANA / BW systems, which are delivered with particular settings for different table types. In any case, it makes sense to check whether the system currently adheres to the recommended configuration. SAP note 1659383 contains a list of tables that are supposed to be in the row store, and a check based on this list is part of report RSDU_TABLE_CONSISTENCY. This report also contains a repair option for when a deviation is recognized.
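To put the individual tables into perspective, the total row store size per host can be summed as well. A sketch based on the same M_RS_TABLES columns used above:

```sql
SELECT HOST, PORT,
       ROUND(SUM(ALLOCATED_FIXED_PART_SIZE + ALLOCATED_VARIABLE_PART_SIZE)
             / 1024 / 1024 / 1024, 2) AS "Row store GB"
FROM M_RS_TABLES
GROUP BY HOST, PORT
```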
5. Heap memory usage
If it can be ruled out that the rowstore is responsible for high memory consumption, the heap memory allocation of the individual processes has to be analyzed next. In most cases, the process of interest is the indexserver, since it usually allocates the greatest share of the available memory. The following possibilities exist to do this:
  1. If an oom error has already occurred or a rte dump was explicitly triggered, the content of the system view M_HEAP_MEMORY (section[PROCESS_INFO]) that was written into the trace file can be used.
  2. If an ad-hoc analysis of the heap memory consumption should be done, the view M_HEAP_MEMORY can be queried directly, for example using HANA Studio.
In case of (1), the output can be copied directly into a csv file and analyzed further, for example using Excel. Each line represents an allocator, which is a facility for allocating heap memory in HANA. The following statement can be used on M_HEAP_MEMORY:
SELECT TOP 15 MS.HOST, MS.SERVICE_NAME,MH.CATEGORY, MH.INCLUSIVE_SIZE_IN_USE, MH.EXCLUSIVE_SIZE_IN_USE
FROM M_HEAP_MEMORY MH, M_SERVICES MS
WHERE MH.PORT = MS.PORT AND
           MH.HOST = MS.HOST AND
           MS.SERVICE_NAME = 'indexserver'
ORDER BY 4 DESC 
Of interest are those pools that are on the top of the list and have an inclusive size close to the exclusive size. Depending on the pool names, the following known issues exist:
Pool/parallel/ihm
There are cases where this allocator used 90% of the available memory. If such a problem is observed, the system should be upgraded to revision 54 when available.
Pool/itab
This allocator is needed for the intermediate results of join operations and for the translation tables used in joins. High usage could indicate a suboptimal model.
Pool/FemsCompression/CompositeFemsCompressionPool/FemsCompression/CompositeFemsCompression
A known issue is associated with high memory usage that is solved with revision 53. FEMS compression is used in BW systems to keep the size of the dataset that is transferred to the application server as small as possible.
Pool/ValueArray
This is the allocator which keeps the complete resultset in uncompressed form. A known issue has been solved with revision 55 which addresses memory used by this allocator not released after oom situations, thus steadily increasing over time. The details are documented in SAP note 1852425.
Pool/JoinEvaluator/TranslationTable
Translation tables are created for caching join column entries of join engine queries. The number of translation tables that are kept in memory is determined by the following parameter in indexserver.ini:
[joins]

translator_cache_size = n
The default value is 2000. In order to verify how many translation tables currently exist in the system, the following statement can be used:
select * from sys.M_DEV_JOIN_TRANSLATION_TABLES_
order by TRANSLATION_TABLE_MEMORY_SIZE desc
This statement sums up the memory usage of those tables:
select host, port, transtable_memory from sys.m_dev_joinengine_
In case of excessive memory usage for this allocator, the parameter can be set to 500 as a starting point. Lowering the value can have an impact on query performance, though in live systems where this was done, no negative side effects have been observed so far.
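The parameter change itself can be sketched with ALTER SYSTEM ALTER CONFIGURATION; the value 500 is the starting point suggested above, and depending on the revision a restart may still be required for the change to take full effect:

```sql
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
  SET ('joins', 'translator_cache_size') = '500'
  WITH RECONFIGURE;
```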
Pool/malloc/libhdbcsapi.so
A memory leak was introduced with revision 54 that can lead to a huge growth of this allocator when BW queries containing lots of OR terms instead of IN lists are executed. A possible workaround from application side is the implementation of SAP note 1786777. From HANA side, the problem is solved starting with revision 60 (HANA SP6). See also SAP note 1880274.
6. Memory Usage of the Statisticsserver
Even though the statisticsserver can also run into OOM errors, it should not consume a large share of the available memory. The statisticsserver is supposed to have a much smaller allocation limit than the indexserver. The following two configuration parameters are relevant:
statisticsserver.ini
  -> memorymanager
         allocationlimit
         minallocationlimit
The allocation limit of the statisticsserver is either a percentage of the available RAM or the value of minallocationlimit, whichever is bigger. In case of OOM errors indicating that the statisticsserver has reached its process allocation limit, it can make sense to increase the allocationlimit to 10% or 15% as a workaround for the time being. However, further analysis regarding the heap memory usage has to be conducted (see above). In the OOM trace file, such a situation would look like this:
[3] PID=23840, SId=320543005, compactors active, alive, process name: hdbstatisticsse

    AB=6755876864b (6.29gb), U=7135415505b (6.64gb), FSL=0b, PAL=6772669235b (6.30gb), TPAL=6772669235b (6.30gb), W=100
AB, PAL and TPAL show the same value. Also in this case, further investigation of the involved pools is necessary, either using the information contained in the trace files or using the following SQL statement:

SELECT TOP 15 MS.HOST, MS.SERVICE_NAME,MH.CATEGORY, MH.INCLUSIVE_SIZE_IN_USE, MH.EXCLUSIVE_SIZE_IN_USE
FROM M_HEAP_MEMORY MH, M_SERVICES MS
WHERE MH.PORT = MS.PORT AND
         MH.HOST = MS.HOST AND
         MS.SERVICE_NAME = 'statisticsserver'
ORDER BY 4 DESC
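The workaround of raising the statisticsserver allocation limit described above could be applied as follows (a sketch; the percentage value is illustrative, and root cause analysis of the heap usage is still required):

```sql
-- Raise the statisticsserver allocation limit to 10 % of available RAM
-- as a temporary workaround for process allocation limit OOM errors.
ALTER SYSTEM ALTER CONFIGURATION ('statisticsserver.ini', 'SYSTEM')
  SET ('memorymanager', 'allocationlimit') = '10' WITH RECONFIGURE;
```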

1898317 - How to handle HANA Alert 1: ‘Host physical memory usage’

Symptom

When checking the Alerts tab in HANA Studio and Solution Manager, there is an alert with description "<host name> runs out of physical memory! Installed memory: <memory value>, instance used <memory value>, instance used physical <memory value>”.

Environment

SAP HANA Database

Reproducing the Issue

For HANA Studio, you can find the alert by going to Administration Console -> Alert -> Show: all alerts, with the filter “Installed” (also see attached screenshots).


For Solution Manager, you can find the alert by going to transaction code DBACOCKPIT -> choose HANA system -> expand Current Status -> Alerts (also see attached screenshots).


Cause

Normally there are 2 reasons for this alert:
1. The threshold value of "Check host free physical memory" in "Configure Check Settings" is too low. The default values are Low: 100, Medium: 150, High: 200.


2. The other possible reason for this host free memory alert is an out of memory situation on the HANA system. Either a memory leak (older versions do not have the automatic memory release mechanism) or a certain process consuming too much memory might lead to an OOM issue.

Resolution

1. The alert triggered due to an improper threshold value setting can be resolved by resetting it to the default or a higher threshold value.


2. The alert triggered due to an OOM issue can be resolved as follows:
  • First you need to check the version of HANA DB and upgrade it to the latest version if you are using an old HANA Revision.
  • Second, if the alert still occurs, you can try to unload unused huge tables from memory and find out which process(es) are consuming so much memory.
    The huge tables can be found in “System Information” -> “Schema Size (MB) for loaded tables” and “Size of tables on disk”.

  • For a more detailed root cause analysis of the memory usage, please see the information in SAP Knowledge Base Article 1840954 - Alerts related to HANA memory consumption.
3. Restart the HANA database.
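The huge tables mentioned in step 2 can also be identified and unloaded with SQL. A hedged sketch, using standard HANA monitoring views (schema and table names in angle brackets are placeholders):

```sql
-- Identify the largest loaded column store tables (sizes in bytes in the view)
SELECT TOP 10 SCHEMA_NAME, TABLE_NAME,
       ROUND(MEMORY_SIZE_IN_TOTAL / 1024 / 1024) AS MEMORY_SIZE_MB
FROM M_CS_TABLES
ORDER BY MEMORY_SIZE_IN_TOTAL DESC;

-- Unload a rarely used large table from memory (it is reloaded on next access)
UNLOAD "<schema_name>"."<table_name>";
```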

Keywords

 threshold; Check host free physical memory; OOM; out of memory; memory utilization; Schema Size (MB) for loaded tables; Size of tables on disk;

2057046 - FAQ: SAP HANA Delta Merges

Symptom

You want to understand merges in SAP HANA environments.

Environment

SAP HANA

Cause

1. What are delta merges in SAP HANA environments?
2. Where can I find more detailed information related to delta merges?
3. Which indications exist for problems in the delta merge area?
4. Is the delta merge an online operation?
5. How is a merge triggered?
6. How can the history of temporal tables be merged manually?
7. How can a manual delta merge be triggered for individual partitions?
8. How can delta merges and the delta storage be monitored?
9. How can peaks of high load due to concurrent or parallelized delta merges be prevented?
10. How can the load, priority and decision functions be interpreted?
11. Can merge related parameters be adjusted?
12. Why can large delta storages be critical?
13. What should be done in case of large delta storages?
14. How can I trace the merge activities in order to analyze problems with merges?
15. Why is more memory and disk space required during the delta merge operation?
16. What can be done to limit the memory requirements during delta merges?
17. How can I check for which tables auto merges are disabled?
18. How can I check for which tables persistent merges are disabled?
19. In which situations are critical merges executed?
20. Why do I see auto merges on tables although auto merges are disabled on table level?
21. What is the meaning of the different TYPE values in table M_DELTA_MERGE_STATISTICS?
22. What do the shortcuts mean which are used in the decision and cost functions?
23. In which scenarios can it be useful to reduce or disable auto merges?
24. Is the default auto merge decision function already optimal?
25. Is the default critical merge decision function already optimal?
26. What needs to be done in case of merge related errors?
27. Is a restart required when changing delta merge related parameters?
28. Is the delta storage empty after a merge?
29. What are reasons why auto merges aren't executed as expected?
30. How can I temporarily disable merges for a critical table?
31. Does M_DELTA_MERGE_STATISTICS also contain merge information from the secondary site of a system replication scenario?
32. For which services is the mergedog required?

Resolution

1. What are delta merges in SAP HANA environments?

Changes (INSERT, UPDATE, DELETE) to SAP HANA column store tables are not directly updated in the main storage. Instead the change information is written to a dedicated delta storage:

From time to time the main storage needs to be updated with the content of the delta storage. This operation is called delta merge. The rough execution steps of a delta merge are illustrated in the following picture:

You can see that during a delta merge two separate main and delta storages exist.

2. Where can I find more detailed information related to delta merges?

More detailed information about the delta merge operation can be found in the SAP HANA Administration Guide.
The SAP HANA Troubleshooting and Performance Analysis Guide provides troubleshooting details for problems in the delta merge area.

3. Which indications exist for problems in the delta merge area?

The following SAP HANA alerts indicate problems in the delta merge area:
Alert Name  Description
10 Delta merge (mergedog) configuration Determines whether or not the 'active' parameter in the 'mergedog' section of system configuration file(s) is 'yes'. mergedog is the system process that periodically checks column tables to determine whether or not a delta merge operation needs to be executed.
19 Record count of delta storage of column-store tables Determines the number of records in the delta storage of column-store tables.
29 Size of delta storage of column-store tables Determines the size of the delta storage of column tables.
88 Auto merge for column-store tables Determines if the delta merge of a table was executed successfully or not.
SQL: "HANA_Configuration_MiniChecks" (SAP Note 1969700, 1999993) returns a potentially critical issue (C = 'X') for one of the following individual checks:
Check ID Details
538 Delta merges > 900 s (last day)
540 Failing delta merges (info messages, last day)
541 Failing delta merges (error messages, last day)
542 Auto merge tables with delta storage > 5 GB
543 Auto merge tables with many delta records
544 Non-auto merge tables with delta storage > 5 GB
545 Non-auto merge tables with many delta records
546 Delta size of small auto merge tables (GB)
547 Non BW tables with disabled auto merge
548 Tables with disabled persistent merge

4. Is the delta merge an online operation?

Most steps of the delta merge are performed online, so concurrent read and write accesses to the table aren't blocked. A lock blocking change operations is only acquired during the following, typically short, periods of time:
  • Beginning of merge: Move of uncommitted rows from the source delta storage (Delta1) to the target delta storage (Delta2); see SAP Note 2098801 for more information
  • End of merge: Switch from source to target storages
5. How is a merge triggered?
Merges can be triggered in the following ways - automatically and manually, forced and conditioned. The following picture provides a rough overview:
Additionally there can be implicit merges - independent of mergedog and manual triggers - in the following contexts:
  • Recovery
  • Log replay
Automatic merges are performed by the mergedog which is configured with the following parameters:
Parameter Value
indexserver.ini -> [mergedog] -> active
true: Activates mergedog (default)
false: Deactivates mergedog
indexserver.ini -> [mergedog] -> check_interval
<frequency_ms>: Defines the interval (in ms) of mergedog invocations (default: 60000, i.e. 1 minute)
Further details for the merge types can be found in the table below:
Merge Type Classification  Usage Scenarios  Details
Auto merge automatic, conditioned Normal system operation Automatic merges are performed by the mergedog process when certain conditions are fulfilled.
Detailed automatic merge conditions are maintained using the following parameter:
indexserver.ini -> [mergedog] -> auto_merge_decision_func
Per default the auto merge option is active for each table. If required it can be activated and deactivated on table level using the following SQL commands:
ALTER TABLE "<table_name>" ENABLE AUTOMERGE
ALTER TABLE "<table_name>" DISABLE AUTOMERGE
Critical merge automatic, conditioned During times of critical SAP HANA memory allocation Critical merges are performed by the mergedog process in order to avoid system instabilities.
The critical merge conditions are maintained using the following parameter:
indexserver.ini -> [mergedog] -> critical_merge_decision_func
Hard merge manual, unconditioned Exceptional situations (e.g. problems with auto merge or specific tables) Hard merges are unconditioned merges controlled by the application or administrator. They are either executed immediately or - in case of high concurrent merge activity - at a later point in time.
Hard merges are triggered with the following SQL command:
MERGE DELTA OF "<table_name>" [FORCE REBUILD]
The FORCE REBUILD option doesn't indicate a forced merge, instead it makes sure that the delta log on disk is cleaned immediately (see SAP Note 2144274).
Alternatively you can use SAP HANA Studio for that purpose:


SAP HANA Studio -> <system> -> Catalog -> <schema> -> <table> -> "Perform Delta Merge..."
In exceptional cases you can trigger a hard merge of all tables using the following Python script delivered as part of the SAP HANA installation:
/usr/sap/<sid>/HDB<inst_id>/exe/python_support/mergeAllDeltaTables.py
Be aware that this action can be responsible for CPU and memory bottlenecks and there is normally no need to execute it, because reasonable merge operations are performed by the mergedog or by smart merges controlled by the application.
Forced merge manual, unconditioned, immediate Exceptional situations (e.g. problems with auto merge or specific tables) A forced merge is a variant of a hard merge which is generally executed immediately.
Forced merges are triggered with the following command:
MERGE DELTA OF "<table_name>" WITH PARAMETERS ('FORCED_MERGE' = 'ON')
Smart merge manual, conditioned Standard approach for merges controlled by application (e.g. in order to avoid interference of auto merges with data load activities) Smart merges are conditioned merges controlled by the application. This approach can be useful during specific application scenarios like BW data loads. It makes sure that auto merge operations don't interfere negatively with the business activities.
Smart merges can be controlled by setting the following parameter to 'true' (enabled, default) or 'false' (disabled):
indexserver.ini -> [mergedog] -> smart_merge_enabled
When a smart merge is triggered by the application, the following parameter controls if a merge is actually executed:
indexserver.ini -> [mergedog] -> smart_merge_decision_func
Smart merges are triggered with the following command:
MERGE DELTA OF "<table_name>" WITH PARAMETERS ('SMART_MERGE' = 'ON')
In SAP ABAP environments you can use the function module TREX_EXT_MERGE_DELTA_INDEX for triggering smart merges from application side.
Memory merge (SPS <= 08) manual, unconditioned, not persisted Exceptional situations (e.g. to avoid disk I/O bottlenecks) Memory merges are only available up to SPS 08. Afterwards they are no longer possible - even if the configuration may still indicate that they should be done.
A memory merge is a variant of a hard merge which is not persisted. The merge only happens in memory with the advantage of reduced disk I/O and the disadvantage of a longer delta log and longer recovery times.
Memory merges are triggered with the following command:
MERGE DELTA OF "<table_name>" WITH PARAMETERS ('MEMORY_MERGE' = 'ON')
Per default the persistent merge option is active for each table. If required it can be activated and deactivated on table level using the following SQL commands:
ALTER TABLE "<table_name>" ENABLE PERSISTENT MERGE

ALTER TABLE "<table_name>" DISABLE PERSISTENT MERGE

6. How can the history of temporal tables be merged manually?

The history part of temporal tables can be merged by adding the keyword HISTORY after MERGE:
MERGE HISTORY DELTA OF "<table_name>" ...

7. How can a manual delta merge be triggered for individual partitions?

An individual table partition can be merged by including the PART<part_id> clause in the MERGE command:
MERGE DELTA OF "<table_name>" PART <part_id> ...
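For example, a hard merge of a single partition of a hypothetical table (schema, table and partition number are placeholders):

```sql
-- Merge only the delta storage of partition 2 of the given table
MERGE DELTA OF "MYSCHEMA"."MYTABLE" PART 2;
```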
8. How can delta merges and the delta storage be monitored?
Delta merge activities are tracked in table M_DELTA_MERGE_STATISTICS and its history in HOST_DELTA_MERGE_STATISTICS.
See SAP Note 2088971 for details how to configure the amount of data available in monitoring view M_DELTA_MERGE_STATISTICS.
You can use SQL: "HANA_Tables_ColumnStore_Merges" (SAP Note 1969700) to retrieve information. It provides information like execution time, merge type, table name, merge duration, number of rows merged and merge errors.
Example output:

The current utilization of the delta storage can be analyzed using SQL: "HANA_Tables_ColumnStore_DeltaStorage" (SAP Note 1969700).
Example output:

Furthermore merge activities can be monitored in the following ways:
Tables SQL statement (SAP Note 1969700) Details
M_SERVICE_THREADS
M_SERVICE_THREAD_SAMPLES
HOST_SERVICE_THREAD_SAMPLES
SQL: "HANA_Threads_CurrentThreads"
SQL: "HANA_Threads_ThreadSamples_FilterAndAggregation"

SQL: "HANA_Threads_ThreadSamples_AggregationPerTimeSlice"
Merge related thread information can provide insight which tables are currently merged and what kind of detailed activity is executed. See SAP Note 2114710 for more information.
M_JOB_PROGRESS SQL: "HANA_Jobs_JobProgress" Starting with SAP HANA SPS 10 the view M_JOB_PROGRESS contains information for the current M_DELTA_MERGE_STATISTICS related activities (JOB_NAME = 'BW F Fact Table Compression', 'Delta Merge', 'Optimize Compression', 'Reclaim Delta').
9. How can peaks of high load due to concurrent or parallelized delta merges be prevented?
All merge operations apart from forced merges can only start if a so-called merge token is assigned. If no merge token is available, the merges have to wait. Typical symptoms are "Semaphore Wait" locks on TRexAPI::TableMergeRequest::TableMergeRequest as described in SAP Note 1999998.
The total number of available merge tokens is calculated via the following parameter:
indexserver.ini -> [mergedog] -> load_balancing_func
The value calculation includes information like the number of available CPUs, the CPU utilization and the heap memory situation.
When a merge is triggered, SAP HANA calculates the number of required merge tokens. If this calculation is not possible, the value defined by the following parameter is used (default: 2):
indexserver.ini -> [mergedog] -> token_per_table
Furthermore there are priority parameters that are used to calculate priorities of merge requests so that they can be processed in an optimized order:
indexserver.ini -> [mergedog] -> auto_merge_priority_func
indexserver.ini -> [mergedog] -> hard_merge_priority_func
indexserver.ini -> [mergedog] -> smart_merge_priority_func
10. How can the load, priority and decision functions be interpreted?
The load, priority and decision function parameters contain quite complex logical expressions with several three-character shortcuts, and their meaning is not trivial to understand. The SAP HANA Administration Guide describes the shortcuts. With this information it is easier to understand how the execution of merges depends on factors like CPU, heap memory or the delta memory size. A typical auto merge decision function (SPS <= 08) is illustrated below:
( ( THM >= 256000 and                     At least 256 GB of heap
    ( ( ( DMS > 100 or                    Delta memory size > 100 MB
          DCC > 100 or                    Delta cell count > 100 million (rows * columns)
          DLS > 1000                      Delta log size > 1 GB
        ) and
        DRC > MRC / 100                   Delta row count more than 1 % of main row count
      ) or
      ( DMR > 0.2 * MRC and               Deleted main rows > 20 % of main rows and
        DMR > 0.001                       Deleted main rows > 1000
      )
    )
  ) or
  ( THM < 256000 and                      Less than 256 GB of heap
    ( DMS > 50 or                         Delta memory size > 50 MB
      DCC > 8 or                          Delta cell count > 8 million (rows * columns)
      DLS > 100                           Delta log size > 100 MB
    )
  )
) and
( DUC < 0.1 or                            Uncommitted delta row count < 100000
  0.05 * DRC >= DUC                       Uncommitted delta row count <= 5 % of delta row count
) 
With SPS 09 a different default for the auto merge decision function is in place:
( ( ( DMS>PAL/2000 or                             Delta memory size > 0.05 % of process allocation limit
      DCC>100                                     Delta cell count > 100 million
    ) and 
      DRC > MRC/100                               Delta row count more than 1 % of main row count
  ) or
  ( DMR>0.2*MRC and                               Deleted main rows > 20 % of main rows and 
    DMR > 0.001                                   Deleted main rows > 1000
  )
) and 
( DUC<0.1 or                                      Uncommitted delta row count < 100000
  0.05*DRC>=DUC                                   Uncommitted delta row count <= 5 % of delta row count
)
It is possible to include further general conditions into these functions, e.g. to change the behavior during specific times or for specific tables.
Example: (decision function including time dependent rules)
DMS > 1000 or DMS > 42 and weekday(now())=6 and secondtime(now())>secondtime('01:00') and secondtime(now())<secondtime('02:00')
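Applying such a time-dependent function follows the same ALTER SYSTEM pattern shown later in this document for the auto merge decision function (a sketch; the thresholds are illustrative, and you should always start from the current default value of the parameter on your revision):

```sql
-- Set a time-dependent auto merge decision function: merge at 1 GB delta size,
-- or already at 42 MB during a Saturday 01:00-02:00 maintenance window.
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
  SET ('mergedog', 'auto_merge_decision_func') =
  'DMS > 1000 or DMS > 42 and weekday(now())=6 and secondtime(now())>secondtime(''01:00'') and secondtime(now())<secondtime(''02:00'')'
  WITH RECONFIGURE;
```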
11. Can merge related parameters be adjusted?
Due to the complexity of the merge operation changes to the parameters can lead to unforeseen effects like increased system load, higher memory consumption or performance degradations. Therefore it is recommended to adjust merge related parameters only in exceptional situations. If parameters are adjusted for a specific task (e.g. as part of a migration activity) it is important to undo the changes after the activity has finished.

12. Why can large delta storages be critical?

Large delta storages can be critical for the following reasons:
Problem Details
Increased memory consumption Large delta storages increase the memory requirements because:
  • They are compressed less efficiently than the main storages
  • Not only UPDATE and INSERT, but also DELETE operations result in a growing delta storage. Memory can only be reclaimed with a delta merge (see SAP Note 2088183)
Longer recovery times Large delta storages result in large delta logs which can increase the recovery times.
Performance degradations Database requests can take longer because the delta storage is not read-optimized.
Increased disk size Not only the memory size, but also the disk size interferes with delta merges. See SAP Note 2014987 which describes a situation where a large disk size can be reduced by performing a delta merge operation.
13. What should be done in case of large delta storages?
First you should check whether the large delta storage really has to be treated as a problem. For example, it is normal that large delta storages exist during data loads in BW, because BW often suppresses auto merges and uses smart merges at the end. See SAP Note 2043152 for more information.
If you suffer from large delta storages you can proceed according to SAP Note 1977314 in order to understand and resolve the root cause. SAP Note 2039810 describes a scenario where auto merges are no longer executed after a table was renamed.
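To get a first overview of which tables carry the largest delta storages, a direct query against M_CS_TABLES can be used (a sketch; column names as in the standard monitoring view, sizes reported in bytes):

```sql
-- Tables with the largest delta storages
SELECT TOP 10 SCHEMA_NAME, TABLE_NAME,
       RAW_RECORD_COUNT_IN_DELTA,
       ROUND(MEMORY_SIZE_IN_DELTA / 1024 / 1024) AS DELTA_SIZE_MB
FROM M_CS_TABLES
ORDER BY MEMORY_SIZE_IN_DELTA DESC;
```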

14. How can I trace the merge activities in order to analyze problems with merges?

To find out details about merge activities you can increase the level of the database trace for "mergedog" and "mergemonitor" to INFO. Results will be written to the standard service trace file. See SAP Note 2119087 for more details.

15. Why is more memory and disk space required during the delta merge operation?

A delta merge needs about twice the size of the table (partition) in memory and on disk to perform the merge. More precisely: size of source main + size of target main + size of source delta. Additionally a temporary working space is needed which is about the size of the largest column of the table (partition).
In case you have a table of 100 GB with 40 GB being the size of largest column, you would need about 100 GB * 2 + 40 GB = 240 GB while the merge is executed.
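The size of the largest column, needed for this estimate, can be looked up in the column-level monitoring view (a sketch; placeholders in angle brackets, sizes in bytes):

```sql
-- Largest column of a table, as input for the merge working space estimate
SELECT TOP 1 COLUMN_NAME,
       ROUND(MEMORY_SIZE_IN_TOTAL / 1024 / 1024) AS COLUMN_SIZE_MB
FROM M_CS_ALL_COLUMNS
WHERE SCHEMA_NAME = '<schema_name>' AND TABLE_NAME = '<table_name>'
ORDER BY MEMORY_SIZE_IN_TOTAL DESC;
```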

16. What can be done to limit the memory requirements during delta merges?

The memory requirements during a merge depend both on the overall table size and the delta storage size. The following approaches exist to reduce the memory requirements during delta merges:
Area Details
Partitioning Partition large tables so that smaller delta storages on partition level are used instead of a large table-wide delta storage.
Reduction of memory footprint Reduce the memory requirements of the table by actions like cleanup of basis tables, archiving, reduction of indexes, reduction of CONCAT attributes, data aging and extended storage. See SAP Note 1999997 ("Which options exist to reduce the risk of SAP HANA memory issues?") for more information
Reduction of changes Check from application side if you can reduce the amount of INSERT / UPDATE / DELETE operations.
Merge parameters Make sure that the merge related parameters are set to reasonable values (optimally default values) and avoid settings resulting in merges of large delta storages.
In individual cases you can adjust merge parameters so that merges happen before the delta storage has grown to critical sizes.
Manual merges In individual cases you can execute manual merges before the delta storage has grown to critical sizes.

17. How can I check for which tables auto merges are disabled?

You can use SQL: "HANA_Tables_ColumnStore_AutoMergeDisabled" (SAP Note 1969700) in order to check for tables with disabled auto merges.
Disabled auto merges on table level can be a consequence of table moves from row store to column store using SAP ABAP functionality. See SAP Note 2196615 for more information.

18. How can I check for which tables persistent merges are disabled?

You can use SQL: "HANA_Tables_ColumnStore_PersistentMergeDisabled" (SAP Note 1969700) in order to check for tables with disabled persistent merges.

19. In which situations are critical merges executed?

Critical merges are controlled by the critical_merge_decision_func parameter. A typical default value is:
UPT > 43200 and                           Indexserver uptime at least 43200 seconds (12 hours)
( ( MMS < 10000 and                       Main memory size < 10 GB
    DMS > 1000 and                        Delta memory size > 1 GB
    TMD > 86400                           Table merge delay (time since last merge, in seconds)
  ) or 
  TMD>604800                              Table merge delay (time since last merge, in seconds)
)
This means that critical merges are only executed when the indexserver has already been running for at least 12 hours and additionally at least one of the following conditions is fulfilled:
  • The table main size is smaller than 10 GB, the table delta size is greater than 1 GB and the table wasn't merged since at least one day
  • The table wasn't merged since around 7 days
 So we can see that - among others - an unconditioned merge of tables is performed when the last merge happened more than 7 days ago.

20. Why do I see auto merges on tables although auto merges are disabled on table level?

Up to SAP HANA Rev. 121 critical merges are registered with MOTIVATION = 'AUTO' in views like M_DELTA_MERGE_STATISTICS and HOST_DELTA_MERGE_STATISTICS. Therefore you can see a merge categorized as 'AUTO' when a critical merge was executed (e.g. because no successful merge happened within the last 7 days).
Starting with SAP HANA Rev. 122 critical merges are recorded with MOTIVATION = 'CRITICAL' in M_DELTA_MERGE_STATISTICS.

21. What is the meaning of the different TYPE values in table M_DELTA_MERGE_STATISTICS?

In fact, not all entries in M_DELTA_MERGE_STATISTICS reflect actual merge operations; it depends on the TYPE. The following TYPE values exist in M_DELTA_MERGE_STATISTICS:
Type Merge  Details
FACT No BW fact table compression (BW_F_FACT_TABLE_COMPRESSION)
HINT No Smart merge request triggered by application, smart_merge_decision_func is evaluated and - if merge is indicated - a subsequent MERGE step is performed
MERGE Yes Delta merge
RECLAIM No Garbage collection in delta storage
SPARSE No Compression optimization (see SAP Note 2112604)
Typically executed as subsequent step of an auto merge, smart merge or critical merge, can also be triggered manually
Not restricted to SPARSE compression, so 'COMPRESS' would be a clearer name for this type

22. What do the shortcuts mean which are used in the decision and cost functions?

The most important shortcuts used in decision and cost functions are:
Shortcut Unit Details
AHM MB Available heap memory
BASENAME   Table name (SAP HANA SPS >= 12)
CLA % CPU load average
CRCSOC million Changed row count since optimize compression
DCC million Delta cell count (number of rows in delta multiplied with number of table columns)
DLS MB Delta log size
DMR million Deleted main rows
DMS MB Delta memory size
DRC million Delta row count (uncommitted rows are not always taken into account)
DUC million Uncommitted delta row count
FULLNAME   Internal table name (SAP HANA SPS >= 12):
  • Table partitioned: _SYS_SPLIT/<table_name>~<part_id>
  • Table not partitioned: <table_name>
GAL MB Global allocation limit
IF(<arg1>, <arg2>, <arg3>)   Returns <arg2> if <arg1> is fulfilled (i.e. true or not 0), otherwise <arg3>
INSTR(<full_string>, <sub_string>)   Returns 0 if <sub_string> isn't contained in <full_string>, otherwise the position of the first occurrence of <sub_string> in <full_string> is returned
LCC   Logical CPU count
LOADED   TRUE if table is loaded, otherwise FALSE
MMS MB Main memory size
MMU million Main max UDIV
MRC million Main row count
NAME   Internal table name:
  • Table partitioned: _SYS_SPLIT/<table_name>~<part_id>
  • Table not partitioned: <table_name>
Due to the special naming convention for partitioned tables you have to use BASENAME (SAP HANA SPS >= 12) or the following approach if you want to refer to a partitioned table:
INSTR(NAME, '<table_name>') > 0
NMU million Main max UDIV (SPS >= 09)
NOW()   Current date and time (local time of the server time zone)
OCRC million (Last) optimize compression row count
PAL MB Process allocation limit
QDW s Queuing delay wait (wait time for assignment of token)
RHM MB Estimated required heap memory to complete table optimization
RP boolean TRUE if table is range partitioned, otherwise false
SCHEMA   Schema name
THM MB Total heap memory
TMD s Time since last merge of table
UPT s Uptime of indexserver

23. In which scenarios can it be useful to reduce or disable auto merges?

Disabling auto merges is mainly useful in scenarios of mass changes, e.g. data loads (many INSERTs) or archiving (many DELETEs). Several standard functionalities already work in this way:
  • In BW environments auto merges are explicitly disabled for certain table types in order to speed up operations like data loads. BW relies on smart merges triggered on application side.
  • Migration tools disable auto merges or set the auto_merge_decision_func to values that result in less auto merges.
The effect of reducing or disabling auto merges depends on several factors like the COMMIT frequency and the actual DML activities. Furthermore it is a trade-off between performance and memory consumption: the fewer merges are performed, the larger the memory requirement. Therefore a general rule of thumb is not available and you have to test the different options if you perform time-critical mass changes.

24. Is the default auto merge decision function already optimal?

The default auto merge decision function has changed a few times. Regardless of these changes it tends to merge small tables with large delta storages too rarely with SPS 09 and below. For example, an auto merge is typically not triggered for a table with 20 MB main storage and 90 MB delta storage, because the threshold for delta storage sizes is 100 MB or higher. As a consequence the memory allocation is higher than necessary and the performance can suffer.
In order to eliminate this issue, you can adjust the auto_merge_decision_func parameter by adding the following condition after the first bracket in its value:
(DRC * TMD > 3600 * ( MRC + 0.0001 )) or
This formula compares the delta row count with the main row count and additionally takes the time since the last merge into account, so that e.g. a small table with the same number of records in main and delta storage will typically be merged after 1 hour.
Example (Rev. 82):
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') SET ('mergedog','auto_merge_decision_func')=
'((DRC*TMD>3600*(MRC+0.0001)) or (THM>=256000 and (((DMS>100 or DCC>100 or DLS>1000) and DRC>MRC/100) or (DMR>0.2*MRC and DMR>0.001))) or 
(THM<256000 and (DMS>50 or DCC>8 or DLS>100))) and (DUC<0.1 or 0.05*DRC>=DUC)' WITH RECONFIGURE
Be aware that you should always take the default value of the auto merge decision function in the system as a basis and only add the additional term at the beginning. If you upgrade to a later SAP HANA revision, you should revisit this parameter setting and adjust it if the default has changed. With SPS 10 and higher the default parameter value has been adjusted so that small tables are also properly considered; thus, you have to remove this special configuration when you use SPS 10 or higher.

25. Is the default critical merge decision function already optimal?

Up to SAP HANA SPS 10 the default value of the critical_merge_decision_func parameter is:
UPT>43200 and ((MMS<10000 and DMS>1000 and TMD>86400) or TMD>604800)
When a table wasn't merged for more than one day and a high amount of changes are executed without COMMIT, so that the delta storage size exceeds 1 GB, the critical merge will kick in. When moving the uncommitted changes to the new delta storage, DML operations on the table are blocked (SAP Note 2098801). This can take significant time in case of a high amount of uncommitted changes. Therefore it is useful to adjust the critical_merge_decision_func parameter so that the critical merge is only executed when the amount of uncommitted changes is on a reasonably low level:
UPT>43200 and ((MMS<10000 and DMS>1000 and TMD>86400) or TMD>604800) and DUC<0.1
Starting with SAP HANA SPS 11 this change is incorporated per default.

26. What needs to be done in case of merge related errors?

Delta merges can fail with various errors, and the details are written to columns LAST_ERROR and ERROR_DESCRIPTION in monitoring view M_DELTA_MERGE_STATISTICS and its history HOST_DELTA_MERGE_STATISTICS. You can check for the top error messages using SQL: "HANA_Tables_ColumnStore_Merges" (SAP Note 1969700).

Be aware that error details aren't available in earlier revisions. For some time M_DELTA_MERGE_STATISTICS contained the ERROR_DESCRIPTION, but HOST_DELTA_MERGE_STATISTICS did not.
Typical errors and required actions are described below. The "Description" of an error code depends on the SAP HANA patch level and so you find more than one description for some of the error codes. The underlying meaning is nevertheless the same. The term "table optimization" in some of the error messages covers merges and compression optimizations.
1527 - exception during deltalog replay. See trace file for more information
Recommendation: This error typically indicates an inconsistency. Check the trace file for further details and proceed according to SAP Note 2116157 in order to analyze and resolve the corruption. If the error is reported for merges on secondary sites of system replication scenarios (SAP Note 1999880), the bug described in SAP Note 2348480 (Rev. 110 - 112.02, Rev. 120) can be responsible.

1999 - General error (no further information available)
Recommendation: Check the indexserver trace file for more details.

2009 - Memory allocation failed
Recommendation: This error is issued when no more memory is available. See SAP Note 1999997 and make sure that sufficient memory is available.

2450 - Error during merge of delta index occurred
Recommendation: Check the indexserver trace file for more details, e.g. OOM situations. In case of memory problems, see SAP Note 1999997 for more details. Don't execute memory merges on tables with LOB columns ("main memory merge not supported on tables containing disk lobs!").

2454 - general error during historisation of data
Recommendation: This error can happen when merges are executed on temporal tables. For SAP HANA Revisions <= 61 see SAP Note 1895207. In case the history table is a SAP HANA-optimized DSO, you can convert this DSO to an optimized standard DSO (see SAP Notes 1849497 and 1849498) without keeping the change log. If the problem happens on newer Revisions, check the trace files for content. Open a SAP incident if you are not able to resolve the issue.

2458 - aborted as cancel requested
Recommendation: The table delta merge was aborted because a cancel was manually requested by a kill session call. If it happens repeatedly: check why sessions are cancelled manually.

2461 - error during optimizeCompression occurred
Recommendation: Check the trace file for more details about the underlying error.

2465 - not enough merge tokens for delta merge
Recommendation: This error is issued if during recovery a merge isn't able to proceed due to a lack of merge tokens. This typically happens in system replication environments (SAP Note 1999880) with a configured logreplay replication mode. SAP Note 2351926 describes a bug that can result in these errors with SAP HANA Rev. 110 to 112.04 and 120 to 122.00. Also with newer Revisions this error is still possible, but typically harmless. If you nevertheless want to reduce it, you can increase the number of merge tokens as a workaround by adjusting indexserver.ini -> [mergedog] -> load_balancing_func and reconfiguring the secondary site as described in SAP Note 2351926. You can use SQL: "HANA_Tables_ColumnStore_Merges_TokenOwners" (SAP Note 1969700) to understand which activities occupy how many merge tokens. Normally this error isn't critical, because a merge that couldn't proceed due to insufficient tokens will be repeated at a later time.

2480 - same table optimization is ongoing
Recommendation: The table in question is already being merged. If it happens repeatedly: check why concurrent merge requests are issued against the same table.

2481 - same table optimization requested multiple times
Recommendation: There are already other smart merge requests for this table in the queue. If it happens repeatedly: check why concurrent smart merge requests are issued against the same table. The following application SAP Notes can help to reduce the number of unnecessary smart merge requests in specific scenarios:
  • SAP Note 2243673: Less smart merges for classic BW DSO tables with RFC calls
  • SAP Note 2245078: Reduction of smart merges in Bank Analyzer scenarios

2482 - table optimization was not indicated
Recommendation: The delta storage is empty or the evaluation of the smart merge cost function indicated that a merge is not necessary. Usually this error is harmless and can be ignored (SAP Note 2250715). The following application SAP Notes can help to reduce the number of unnecessary smart merge requests in specific scenarios:
  • SAP Note 2243673: Less smart merges for classic BW DSO tables with RFC calls
  • SAP Note 2245078: Reduction of smart merges in Bank Analyzer scenarios
In case of doubt you can check if the parameter smart_merge_decision_func is set properly (optimally: default value) and if the smart merge logic on application side works correctly.

2483 - table optimization was disabled
Recommendation: Smart merge is not active (parameter smart_merge_enabled = no). Check why smart merges are disabled globally and activate them if possible by unsetting the parameter indexserver.ini -> [mergedog] -> smart_merge_enabled.

2484 - not enough memory for table optimization
Recommendation: The memory required to optimize the table exceeds the heap limit. Analyze and resolve memory shortages or misconfigurations (see SAP Note 1999997 for more details).

2485 - table optimization was rolled back
Recommendation: Check the indexserver trace file for more details.

2486 - table optimization did not have an effect because all rows in delta were not optimizable
Recommendation: This situation can happen if all records in the delta storage have to be moved to the new delta storage because they are still uncommitted. If it happens repeatedly: check why merge operations are started although no row can be processed.

2487 - table optimization was not possible due to open metadata changes
Recommendation: Check the indexserver trace file for more details. Perform a consistency check using CHECK_TABLE_CONSISTENCY in order to identify possible underlying corruptions (see SAP Note 1977584). Open a SAP incident on component HAN-DB if you require assistance from SAP.

2594 - General partitioning error
Recommendation: Check the indexserver trace file for more details. Among others, SAP HANA bugs in the area of paged attributes (SAP Note 1871386) can be responsible for this error.

2953 - BW F Fact Table Compression failed
Recommendation: Check the indexserver trace file for more details.

6900 - Internal error / Attribute engine failed
Recommendation: Check the indexserver trace file for more details.

6923 - Attribute load failed
Recommendation: Check the indexserver trace file for more details, e.g. OOM situations. In case of memory problems, see SAP Note 1999997 for more details.

6924 - Attribute save failed
Recommendation: Check the indexserver trace file for more details, e.g. OOM situations. In case of memory problems, see SAP Note 1999997 for more details.

6939 - AttributeEngine: commit() or rollback() was called without calling prepare() first
Recommendation: This error can be the consequence of a SAP HANA bug that can result in an inconsistency between existing columns and metadata information. Upgrade to SAP HANA SPS 09 or higher in order to fix this issue. Open a SAP incident on component HAN-DB if you require assistance from SAP.

6952 - AttributeEngine: not enough memory
Recommendation: This error is issued when no more memory is available. See SAP Note 1999997 and make sure that sufficient memory is available.

29020 - ltt::exception caught while operating on $STORAGEOBJECT$
Recommendation: This error is related to the persistence layer. Check the indexserver trace file for more details.

999999 - message not found
Recommendation: This error can be returned if the underlying table is corrupted or has reached the 2 billion record limit. Check the indexserver trace file for more details. Perform a consistency check using CHECK_TABLE_CONSISTENCY in order to identify possible underlying corruptions (see SAP Note 1977584). Open a SAP incident on component HAN-DB if you require assistance from SAP.
When an auto merge fails due to an error, the mergedog waits for some time (typically 1 hour) before trying the same auto merge again.

27. Is a restart required when changing delta merge related parameters?

Delta merge parameters can be changed online and immediately take effect. No restart is required.

28. Is the delta storage empty after a merge?

No, not necessarily:
  • Only committed data is moved from delta storage to main storage. Uncommitted records remain in delta storage. It is good practice (also considered by the auto merge decision function) to avoid merges if a high fraction of the records in delta storage is not yet committed.
  • Changes performed during the delta merge will also appear in the new delta storage.

29. What are reasons why auto merges aren't executed as expected?

If tables aren't automatically merged although the delta storage is filled significantly, you can check the following potential reasons:
Reason: Inadequate auto merge decision function
Details: Make sure that the auto_merge_decision_func parameter is default or set to a reasonable value. Check if the conditions of the tables in question fulfill the auto_merge_decision_func.

Reason: Auto merges deactivated on table level
Details: Check via SQL: "HANA_Tables_ColumnStore_AutoMergeDisabled" (SAP Note 1969700) whether auto merges are disabled on table level and enable them if required.

Reason: Mergedog deactivated
Details: Make sure that indexserver.ini -> [mergedog] -> active isn't explicitly switched off ('false'), because this would disable the mergedog and no more auto merges would be performed.

Reason: Tables assigned to wrong service
Details: The mergedog only considers tables related to the services with own persistent tables like indexserver and statisticsserver. If tables are assigned to other services (e.g. scriptserver), no auto merge will happen.

30. How can I temporarily disable merges for a critical table?

In some situations (e.g. in case of a corrupted table) it can be useful to disable both auto and critical merge for a specific table on a temporary basis and enable it again once the problem is fixed. This can be achieved in the following way:
AUTO merge:
Deactivation:
ALTER TABLE "<table_name>" DISABLE AUTOMERGE
Activation:
ALTER TABLE "<table_name>" ENABLE AUTOMERGE

CRITICAL merge:
Deactivation:
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini','SYSTEM')
SET ('mergedog', 'critical_merge_decision_func') =
'INSTR(NAME, ''<table_name>'') = 0 AND <original_critical_merge_decision_func_value>'
WITH RECONFIGURE
Activation, if the DEFAULT value is used:
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini','SYSTEM')
UNSET ('mergedog', 'critical_merge_decision_func') WITH RECONFIGURE
Activation, if a system-specific value is used:
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini','SYSTEM')
SET ('mergedog', 'critical_merge_decision_func') =
'<original_critical_merge_decision_func_value>' WITH RECONFIGURE
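The effect of the prepended INSTR term can be sketched as follows (Python, not HANA code; the table name MY_CRITICAL_TABLE is a hypothetical placeholder for <table_name>):

```python
# Illustrative sketch: prepending "INSTR(NAME, '<table_name>') = 0 AND ..."
# short-circuits the critical merge decision for one excluded table while
# leaving the original decision function intact for all other tables.

def guarded_critical_merge(name: str, original_decision: bool,
                           excluded: str = "MY_CRITICAL_TABLE") -> bool:
    # INSTR(NAME, excluded) = 0 in SQL means the excluded table name
    # does not occur in NAME; modelled here with a substring check.
    return (excluded not in name) and original_decision

print(guarded_critical_merge("MY_CRITICAL_TABLE", True))  # False: excluded
print(guarded_critical_merge("SOME_OTHER_TABLE", True))   # True: unaffected
```

This is why the original critical_merge_decision_func value must be preserved after the AND: the guard only suppresses merges for the one named table, and reactivation simply restores the unguarded function (or the default).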

31. Does M_DELTA_MERGE_STATISTICS also contain merge information from the secondary site of a system replication scenario?

Starting with SAP HANA Rev. 111 the view M_DELTA_MERGE_STATISTICS can also contain merge information from the secondary site of a system replication scenario (SAP Note 1999880). Certain conditions, like an identical SAP HANA patch level, need to be fulfilled for this feature.
This new behavior can result in some confusion, e.g.:
  • Merge errors 2465 ("not enough merge tokens for delta merge") reported: See "What needs to be done in case of merge related errors?" for more information.
  • Long merge durations due to the fact that a merge on the secondary site needs to synchronize with the recovery queue and so can't be committed immediately. This can generally be considered harmless.

32. For which services is the mergedog required?

The mergedog is only activated for services with column store tables:
  • indexserver
  • (standalone) statistics server
For other services the mergedog is intentionally disabled. SAP Note 2240059 describes a statistics server bug that can result in false positive alerts if the mergedog is (correctly) disabled for the dpserver service and suggests activating the mergedog for the dpserver as a temporary workaround.

Keywords

SAP HANA delta merge M_DELTA_MERGE_STATISTICS optimzable