Symptom
You are interested in details about garbage collection activities within SAP HANA.
Environment
SAP HANA
Cause
1. What is garbage collection in SAP HANA environments?
2. Why is it important to pay attention for proper garbage collection?
3. Which indications exist for SAP HANA garbage collection problems?
4. What is multi-version concurrency control?
5. How can garbage collection be monitored?
6. What are typical reasons and solutions for problems with garbage collection?
7. When is garbage collection automatically triggered?
8. How can garbage collection be triggered manually?
9. Which threads are responsible for garbage collection?
10. How can the mvcc_anti_ager behavior and trace file entries be controlled?
11. Can garbage collection be considered as defragmentation?
12. What can be the reason for alerts 73 and 74 for services different from the indexserver?
2. Why is it important to pay attention for proper garbage collection?
3. Which indications exist for SAP HANA garbage collection problems?
4. What is multi-version concurrency control?
5. How can garbage collection be monitored?
6. What are typical reasons and solutions for problems with garbage collection?
7. When is garbage collection automatically triggered?
8. How can garbage collection be triggered manually?
9. Which threads are responsible for garbage collection?
10. How can the mvcc_anti_ager behavior and trace file entries be controlled?
11. Can garbage collection be considered as defragmentation?
12. What can be the reason for alerts 73 and 74 for services different from the indexserver?
Resolution
1. What is garbage collection in SAP HANA environments?
The term "garbage collection" in SAP HANA environments is rather unprecise, because it can be related to several disjunct activities. In this SAP Note we will use the following naming conventions when discussing different kinds of garbage collection:Area | Details |
Row store version consolidation | Garbage collection in the row store refers to the cleanup of no longer required consistent read versions of data. In order to avoid misunderstandings, it is better to name this activity "version consolidation". |
Column store version consolidation |
Garbage collection in column store refers to the cleanup of no longer
required consistent read versions of data. In order to avoid
misunderstandings, it is better to name this activity "version
consolidation". In column store both current and old versions of records
are identified via unique document ID vectors (UDIV). If the number of
UDIV values is significantly larger than the number of (current) records
in a table, it can indicate the necessity of version consolidation. Merges and reclaims of delta storage (SAP Note 2057046) and compression optimizations (SAP Note 2112604) happen when significant changes are executed against tables and implicitly take care for a cleanup of no longer required versions. Therefore version consolidation is typically no issue in column store. |
Memory garbage collection | Allocated memory areas are cleaned in order to make more memory available for other purposes. |
Persistence garbage collection | No longer required persistence areas (e.g. undo data) are cleaned so that the disk space can be re-used. Be aware that this is no actual defragmentation, so the allocated disk size won't reduce (see "Can garbage collection be considered as defragmentation?" below). |
LOB garbage collection | Garbage collection related to hybrid LOB columns (see SAP Note 1994962) |
liveCache garbage collection | Garbage collection related the the SAP HANA integrated liveCache |
Calculation engine garbage collection | Model garbage collection of the calculation engine |
2. Why is it important to pay attention for proper garbage collection?
Problems with garbage collection can result in various critical problems like:- Increased memory requirements
- Increased disk space utilization
- Performance degradations up to system standstills
3. Which indications exist for SAP HANA garbage collection problems?
Garbage collection problems are often linked to long running transactions. The following indications can directly or indirectly point to problems in the garbage collection area.The following trace file entries indicate potential issues in the area of garbage collection:
mvcc_anti_ager.cc(01291) : There are too many un-collected versions. ('number of versions > 1000000' or 'maximum number of versions per record > 100000') The cursor possibly block the garbage collection of HANA database.
mvcc_anti_ager.cc(01291) : There are too many un-collected versions on table "<schema>"."<table>" ('number of versions for one of the partitions > 1000000' or 'maximum number of versions per record > 100000') The transaction blocks the garbage collection of HANA database.
mvcc_anti_ager.cc(01199) : long running uncommitted write transaction detected.
mvcc_anti_ager.cc(01082) : The Connection is disconnected forcefully because it is blocking garbage collection for too long period. Statement.cc(03190) : session control command is performed by ..., user=SYSTEM, query=ALTER SYSTEM DISCONNECT SESSION '<conn_id>'
mvcc_anti_ager.cc(00834) : long running cursor detected. The open cursor possibly blocks the garbage collection of HANA database. Please close a cursor in application or kill the connection by "ALTER SYSTEM DISCONNECT SESSION '<conn_id>' "The following SAP HANA alerts indicate potential problems linked to the garbage collection area:
Alert | Name | SAP Note | Description |
42 | Long-idling cursors | 1900261 | Identifies long-idling cursors. |
47 | Long-running serializable transactions | 1977293 | Identifies long-running serializable transactions. |
48 | Long-running uncommitted write transactions | 1977276 | Identifies long-running uncommitted write transactions. |
73 | Overflow of rowstore version space | Determines the overflow ratio of the rowstore version space. | |
74 | Overflow of metadata version space | Determines the overflow ratio of the metadata version space. | |
75 | Rowstore version space skew | Determines whether the rowstore version chain is too long. |
Check ID | Details |
810 | MVCC versions in row store |
811 | Max. MVCC versions in row store (last day) |
812 | Max. versions per record in row store |
815 | Age of transaction blocking row store MVCC (s) |
830 | Active commit ID range |
832 | Max. active commit ID range (last day) |
840 | Active update transaction ID range |
841 | Oldest active update transaction (s) |
842 | Max. active update trans. ID range (last day) |
846 | Table MVCC snapshot range |
850 | Persistence garbage collection history count |
852 | Undo and cleanup files |
854 | Undo and cleanup file size (GB) |
856 | Max. undo size of current transaction (MB) |
857 | Max. redo size of current transaction (MB) |
1115 | Longest running current SQL statement (s) |
- Pool/BitVector
- Pool/PersistenceManager/UndoDirectory
- Pool/RowEngine/Version
- DataContainer::VarSizeEntryUserDataHandler::getPageWithFreeSpaceFromFreeList
- ptime::AntiAger::garbage_collector
- ptime::filter_slot_backend
- ptime::FlushVersion::collect
- ptime::LinkVersion::removeDependency
- ptime::LinkVersion::removeSlotVersionDependency
- ptime::LockTable::collectGarbage
- ptime::LockTable::getVersion
- ptime::LockTable::hash
- ptime::Transaction::get_snapshot_trans_token
- ptime::Transaction::lock_slot_checking
- ptime::Transaction::validate_latest_version
- ptime::TransVersion::collect
- ptime::TransVersion::collectVersion
- ptime::VersionCollector::collect
- ptime::VersionCollector::Partition::collect
- ptime::VersionCollectorJobNode::run
- ptime::VersionMgr::getPageDeleteVersion
- ptime::VersionMgr::getSlotVersion
- ptime::VersionMgr::getVersion
- ptime::VersionTs::compare
- TRexAPI::TableUpdate::writeUdivs
- TrexStore::UdivListManager::insertUdivsWriter
- UnifiedTable::TableContainerImpl::insertMVCCDoWork
4. What is multi-version concurrency control?
Multi-version concurrency control (MVCC) is an implementation of isolation levels based on multiple versions of data. The most frequently used isolation level in SAP HANA is "read committed". This means that every database request sees the committed data of the point in time when the request (not the transaction) started. As a consequence different current and historic versions of the same data may need to be preserved in order to support the points in time when different SQL statements were started.5. How can garbage collection be monitored?
The following possibilities exist to check for garbage collection related details:Area | SQL statement (SAP Note 1969700) / monitoring views | Details | ||||||||||||
General |
SQL: "HANA_GarbageCollection_Overview" |
Overview about various garbage collection aspects including row store, column store and persistence |
||||||||||||
General |
SQL: "HANA_GarbageCollection_Blockers_History" HOST_LONG_IDLE_CURSORS HOST_LONG_RUNNING_STATEMENTS HOST_LONG_SERIALIZABLE_TRANSACTION HOST_UNCOMMITTED_WRITE_TRANSACTION |
Information about historic activities potentially responsible for a blocked garbage collection:
|
||||||||||||
Row store version consolidation |
M_MVCC_TABLES M_MVCC_OVERVIEW (>= SPS 08) M_RS_TABLE_VERSION_STATISTICS SQL: "HANA_GarbageCollection_RowStore_Versions" |
Current information about versions in row store Checked also by SQL: "HANA_Configuration_MiniChecks" (SAP Note 1999993):
|
||||||||||||
Row store version consolidation |
M_LOAD_HISTORY_SERVICE (>= SPS 09) SQL: "HANA_LoadHistory_Services" |
Historic information about versions in row store also shown in the SAP HANA Studio load graph section, also evaluated via SQL: "HANA_Configuration_MiniChecks" (SAP Note 1999993):
|
||||||||||||
Column store version consolidation |
M_CS_MVCC (>= SPS 09) SQL: "HANA_GarbageCollection_ColumnStore" |
This view provides memory allocation information related to column store version consolidation. | ||||||||||||
Persistence garbage collection |
M_GARBAGE_COLLECTION_STATISTICS M_UNDO_CLEANUP_FILES SQL: "HANA_GarbageCollection_Persistence" SQL: "HANA_Transactions_UndoCleanupFiles" |
These views contain information about persistence related garbage collection and still existing undo and cleanup files for which garbage collection is still required. It is also checked by SQL: "HANA_Configuration_MiniChecks" (SAP Note 1999993):
|
6. What are typical reasons and solutions for problems with garbage collection?
Typical reasons and solutions for problems in the area of garbage collection described in the table below. Additional information can be found in the SAP HANA Troubleshooting and Performance Analysis Guide ("Transactional Problems") and in SAP Note 1858357.Reason | Details | |||||||||||||||||||||||||||||||||||||||||||||||||||
Known bugs |
Already known bugs are described in the following SAP Notes:
|
|||||||||||||||||||||||||||||||||||||||||||||||||||
Long running update transactions, statements, idle cursors and serializable transactions |
You can use SQL: "HANA_GarbageCollection_Blockers_History" (SAP Note 1969700) in order to check for long running operations being potentially critical for the garbage collection. If you identify a particularly long running operation, you should check if the long runtime is expected. If it is not expected or if it causes significant trouble, you can terminate it, either on client side (e.g. by terminating the related SAP work process) or on SAP HANA side. SAP Note 2092196 describes how active threads and connections can be terminated. |
|||||||||||||||||||||||||||||||||||||||||||||||||||
Hanging threads |
You can use SQL: "HANA_Threads_CurrentThreads" (SAP Note 1969700) in order to check for hanging threads. If you identify a long running thread, you should check if the long runtime is expected. If it is not expected or if it causes significant trouble, you can terminate it, either on client side (e.g. by terminating the related SAP work process) or on SAP HANA side. SAP Note 2092196 describes how active threads and connections can be terminated. |
|||||||||||||||||||||||||||||||||||||||||||||||||||
Internal inconsistency |
In rare cases it can happen that the garbage collection is permanently blocked, but SQL: "HANA_GarbageCollection_Overview" doesn't report any responsible, old transaction. This can happen in case of orphan internal transaction tokens or consistent view entries. You may find these orphan objects by looking at the SAP HANA internal monitoring views M_TRANSTOKEN_DIRECTORY_ and M_CONSISTENT_VIEW_STATISTICS_. In this scenario you can resolve the problem only by restarting SAP HANA. One critical bug caused by insufficient BLOB error handling is resolved with Rev. 102. |
|||||||||||||||||||||||||||||||||||||||||||||||||||
High frequency of DML operations on single row store records |
Avoid massive changes of single row store table records like the QIWKTAB issue described in SAP Note 2000002, because the row store version consolidation may no longer be able to keep up with the changes. Up to SPS 08 you should try to avoid more than 100,000 changes of individual records per hour. Starting with SPS 09 the row store version consolidation is optimized and so also higher throughputs are theoretically possible (although the application design would remain questionable). |
|||||||||||||||||||||||||||||||||||||||||||||||||||
High COMMIT frequency |
A high COMMIT frequency like the getPageWithFreeSpaceFromFreeList issue described in SAP Note 1999998 can result in situations where the persistence garbage collection no longer keeps up with the COMMIT rate. Avoid massive short and concurrent DML operations and use packages in case of mass DML operations rather than single record operations with COMMITs. |
|||||||||||||||||||||||||||||||||||||||||||||||||||
Blocking of garbage collection on global level |
Row store and column store version consolidation is typically blocked on a granular table level, there are some scenarios where garbage collection is blocked globally on all tables:
Persistence garbage collection is generally blocked globally. This means that it will not start at all, even if only a single table's version consolidation is still blocked. |
|||||||||||||||||||||||||||||||||||||||||||||||||||
Calculation engine garbage collection: Frequent CREATE and DROP operations for scenarios |
If the calculation engine garbage collection causes trouble (e.g. due to ceManager_GCLock and calcEngine_CalcEngineManager_ScenarioMapRWLock waits as described in SAP Note 1999998), you should check if you can reduce the number of CREATE and DROP operations for calculation scenarios from an application perspective. Starting with SAP HANA SPS 09 the garbage collection behavior is improved and so the amount of problems should reduce. |
7. When is garbage collection automatically triggered?
The following table describes when garbage collection is automatically triggered:Area | Details | ||||||||||||||||||||
Row store version consolidation |
Garbage collection is triggered after a COMMIT and when the time limit defined with the following parameter is reached:
|
||||||||||||||||||||
Column store version consolidation | Versions are automatically consolidated during merges (see SAP Note 2057046) and compression optimizations (see SAP Note 2112604). | ||||||||||||||||||||
Memory garbage collection |
Memory is reclaimed in order to make sure that SAP HANA doesn't run out of memory. It is controlled by the following SAP HANA parameters:
|
||||||||||||||||||||
LOB garbage collection |
LOB garbage collection (LGC) is executed on a regular basis. The frequency is based on the following parameter setting:
|
||||||||||||||||||||
liveCache garbage collection |
Garbage collection in the embedded liveCache is triggered automatically based on the following parameters:
|
||||||||||||||||||||
Calculation engine garbage collection |
It is triggered when a calculation view / calculation scenario is dropped or at latest after 5 minutes. |
8. How can garbage collection be triggered manually?
Usually there is no requirement to trigger garbage collection manually, because SAP HANA will automatically take care for it whenever necessary and possible. In specific cases garbage collection can be triggered in the following ways:Area | SAP Note | Details |
Row store version consolidation |
ALTER SYSTEM RECLAIM VERSION SPACE |
|
Column store version consolidation | 2112604 |
Old (UDIV) versions within a table can only be cleaned with a compression optimization:UPDATE "<table_name>" WITH PARAMETERS ('OPTIMIZE_COMPRESSION' = 'FORCE') |
Memory garbage collection | 1999997 |
hdbcons 'mm gc -f' |
LOB garbage collection |
Hybrid LOB garbage collection can be triggered for a table using:ALTER TABLE "<table_name>" WITH PARAMETERS ('LOB_GARBAGE_COLLECTION' = '1') |
9. Which threads are responsible for garbage collection?
Garbage collection is performed by the following threads (see SAP Note 2114710 for details):Area | Threads | Details | ||||||
Row store version consolidation |
SQLExecutor MvccAntiAgerChecker MVCCGarbageCollector JobWorker (RSVersionCollectorJobNode/Partition<part>) |
Version consolidation during a COMMIT is executed by the SQLExecutor thread itself. The MvccAntiAgerChecker is responsible to check for potentially critical situations like long-running update transactions and report it in the trace files. The MVCCGarbageCollector performs periodic version consolidation that is not done by the SQLExecutor threads as part of COMMITs. The JobWorker threads of type RSVersionCollectorJob are triggered by the MVCCGarbageCollector in order to perform parallelized version consolidation. |
||||||
Column store version consolidation |
MergeAttributeThread MergedogMerger MergedogMonitor SaveMergedAttributeThread JobWorker (ChangeCompressionJob) |
As a side-effect of merges and compression optimizations versions in column store are consolidated by the mentioned threads. |
||||||
Memory garbage collection | MemoryCompactor | This thread releases no longer required memory, e.g. in cases of a memory shortage. | ||||||
Persistence garbage collection |
JobWorker (GCJob*) LobGarbageCollector |
These threads are responsible for cleaning up garbage in the persistence area. There can be a potentially high number of concurrently running GCJob* collectors. The maximum amount of these threads is controlled by the following parameter (see SAP Note 2100040):
|
||||||
liveCache garbage collection |
EarlyGarbageCollection |
These threads are responsible for liveCache garbage collection. The amount of these threads is controlled by the following parameter:
|
||||||
Calculation engine garbage collection |
CalcEngineGarbageCollector |
This thread is responsible for model garbage collection of the calculation engine. |
10. How can the mvcc_anti_ager behavior and trace file entries be controlled?
The trace file entries generated by the MvccAntiAgerChecker can be controlled with the following SAP HANA parameters:Parameter | Default | Unit | Warning |
indexserver.ini -> [transaction] -> idle_cursor_alert_timeout |
60 | minutes | long running cursor detected |
indexserver.ini -> [transaction] -> idle_cursor_lifetime |
720 | minutes |
The open cursor possibly blocks the garbage collection of HANA database. Please close a cursor in application or kill the connection by "ALTER SYSTEM DISCONNECT SESSION <session>". The long running serializable transaction possibly blocks the garbage collection of HANA database. The long running uncommitted transaction possibly incurs performance degradation of HANA database. The long running external transaction possibly incurs performance degradation of HANA database. Please commit/rollback a transaction in application or kill the connection by "ALTER SYSTEM DISCONNECT SESSION <session>". |
indexserver.ini -> [transaction] -> max_versions_per_record_alert_threshold |
100000 | number of row store versions | There are too many un-collected versions |
indexserver.ini -> [transaction] -> max_versions_per_table_alert_threshold |
100000 | number of row store versions | There are too many un-collected versions |
indexserver.ini -> [transaction] -> number_versions_alert_threshold |
1000000 | number of row store versions | There are too many un-collected versions |
indexserver.ini -> [transaction] -> uncommitted_write_transaction_alert_timeout |
60 | minutes | long running uncommitted write transaction detected |
11. Can garbage collection be considered as defragmentation?
Garbage collection removes no longer required information and so it can more be compared with data management and archiving than with defragmentation. This means that garbage collection doesn't necessarily reduce the amount of allocated space. This is particularly valid for the persistence garbage collection. If persistence garbage collection has problems, the allocated space on disk grows, but when garbage collection is properly executed, it won't reduce. See SAP Note 2000003 ("Where can fragmentation occur in SAP HANA environments?") and consider using ALTER SYSTEM RECLAIM DATAVOLUME if you want to release fragmented space on disk level.12. What can be the reason for alerts 73 and 74 for services different from the indexserver?
The alerts 73 ("Overflow of rowstore version space") and 74 ("Overflow of metadata version space") are issued if a significant portion of the configured version space is occupied. Normally this indicates problems with garbage collection. If the alert is reported for services different from the indexserver although their number of row store versions is at a small level, it may be a consequence of a small version space configuration. The default version space is configured with the following parameter:Parameter | Default | Details |
<service>.ini -> [row_engine] -> lock_table_array_size |
indexserver: 1000003 xsengine, scriptserver: 101 (SPS <= 10) / 10007 (SPS >= 11) other services: 101 |
This parameter defines the size of the lock table / version space. It should be set to a prime number. Large values increase the memory footprint, smaller ones impose the risk of overflows and alerts. If you receive the alert for services different from the indexserver, you can consider to increase the related lock table array size. The increase from 101 to 10007 for the xsengine and scriptserver starting with SPS 11 will only have a minimal impact on the memory footprint. |
Keywords
START_MVCC_TIMESTAMP MIN_MVCC_SNAPSHOT_TIMESTAMP
Header Data
Released On | 26.01.2016 09:43:31 |
Release Status | Released to Customer |
Component | HAN-DB SAP HANA Database |
Priority | Normal |
Category | How To |
very good document
ReplyDelete