Tuesday, February 2, 2016

2114710 - FAQ: SAP HANA Threads and Thread Samples

Symptom
You are interested in information about SAP HANA threads and thread samples.


Environment
SAP HANA


Cause
1. What are SAP HANA threads and thread samples?
2. Where do I find information about SAP HANA threads?
3. For what purpose do I need to evaluate the SAP HANA thread activities?
4. What kind of information is available for the SAP HANA threads?
5. How can I interpret the thread state?
6. What are the main thread types, methods and details?
7. How can I influence the collection and historization of thread samples?
8. What are typical thread constellations?
9. Are there case studies showing the possibilities of thread samples?


Resolution

1. What are SAP HANA threads and thread samples?

SAP HANA work, such as the execution of SQL statements or background tasks, is performed by threads. Each SAP HANA process, such as the indexserver or xsengine, consists of a set of threads.
Thread samples are snapshots of thread activities that are taken at regular intervals and historized.

2. Where do I find information about SAP HANA threads?

Information about current SAP HANA threads can be found at the following locations:
  • Monitoring view M_SERVICE_THREADS
  • SAP HANA Studio -> Administration -> Performance -> Threads
  • DBACOCKPIT -> Performance -> Threads
  • SQL: "HANA_Threads_CurrentThreads" (SAP Note 1969700)
Information about historic thread activities is available at:
  • Monitoring view M_SERVICE_THREAD_SAMPLES
    • Retention is usually at least 2 hours
    • For more details see SAP Note 2088971
  • History HOST_SERVICE_THREAD_SAMPLES
    • Only available if embedded statistics server is used
    • Default retention time may be quite small (1 day)
    • Retention should be increased to 42 days as described in SAP Note 2147247 ("What is a good retention time for statistics server histories?")
  • SQL: "HANA_Threads_ThreadSamples_FilterAndAggregation" (SAP Note 1969700)
  • SQL: "HANA_Threads_ThreadSamples_AggregationPerTimeSlice" (SAP Note 1969700)

3. For what purpose do I need to evaluate the SAP HANA thread activities?

By checking SAP HANA thread information you can get a current or historic snapshot of thread activities. This can help you to answer questions related to the SAP HANA workload, e.g.:
  • Activities responsible for CPU consumption (SAP Note 2100040)
  • Activities responsible for lock situations (SAP Note 1999998)
  • Activities responsible for bad performance (SAP Note 2000000)
  • Expensive SQL statements (SAP Note 2000002)

4. What kind of information is available for the SAP HANA threads?

The most important information available in the thread-related monitoring views is listed below ("generally" means that the information is available in M_SERVICE_THREADS, M_SERVICE_THREAD_SAMPLES and HOST_SERVICE_THREAD_SAMPLES):
  • Host name / Port / Service name (HOST, PORT, SERVICE_NAME)
    Available: generally
    Basic information to understand which service on which host owns the thread
  • Timestamp (TIMESTAMP)
    Available: M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES
    Sample timestamp
  • Connection ID (CONNECTION_ID)
    Available: generally
    Client connection identifier (-1 if the thread has no assigned client connection)
  • Thread ID (THREAD_ID)
    Available: generally
    Thread identifier
  • Statement hash (STATEMENT_HASH)
    Available: M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES
    Statement hash of the active SQL statement
  • Active flag (IS_ACTIVE)
    Available: M_SERVICE_THREADS
    TRUE if the thread is currently active, otherwise FALSE. The history views only contain active threads, so the flag is not required for M_SERVICE_THREAD_SAMPLES and HOST_SERVICE_THREAD_SAMPLES.
  • Thread state (THREAD_STATE)
    Available: generally
    Activity state of the thread (e.g. running or waiting for a resource), for more details see "How can I interpret the thread state?" below.
  • Thread type (THREAD_TYPE)
    Available: generally
    Thread type (e.g. SqlExecutor or JobWorker), for more details see "What are the main thread types, methods and details?" below.
  • Thread method (THREAD_METHOD)
    Available: generally (but not consistently populated in the thread sample views)
    Thread method (e.g. ExecutePrepared or indexing), for more details see "What are the main thread types, methods and details?" below. For some threads (e.g. SqlExecutor threads) it is only written to the thread sample views if the following parameter is set to true:
    global.ini -> [resource_tracking] -> service_thread_sampling_monitor_thread_detail_enabled = true
  • Thread detail (THREAD_DETAIL)
    Available: generally (but not populated per default in the thread sample views)
    Thread detail (e.g. SQL statement or column store plan step), for more details see "What are the main thread types, methods and details?" below. Only written to the thread sample views if the following parameter is set to true:
    global.ini -> [resource_tracking] -> service_thread_sampling_monitor_thread_detail_enabled = true
  • Caller thread / Called thread (CALLER, CALLING)
    Available: M_SERVICE_THREADS, M_SERVICE_THREAD_SAMPLES (>= SPS 09)
    Thread dependencies (CALLER: ID of the thread that called the current thread; CALLING: ID of the thread that is called by the current thread)
  • Database user name (USER_NAME)
    Available: generally
    Name of the database user executing the thread
  • Application user name (APPLICATION_USER_NAME)
    Available: generally
    Name of the application user (e.g. SAP end user in ABAP environments)
  • Application name (APPLICATION_NAME)
    Available: M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES
    Name of the client application (e.g. "ABAP:<sid>" in case of SAP ABAP stacks)
  • Application source (APPLICATION_SOURCE)
    Available: M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES
    Application details (e.g. report name in case of SAP ABAP stacks)
  • CPU time (CPU_TIME_SELF, CPU_TIME_CUMULATIVE)
    Available: M_SERVICE_THREADS
    CPU time in microseconds consumed by the thread itself (CPU_TIME_SELF) or including the associated children (CPU_TIME_CUMULATIVE). Only populated if the following SAP HANA parameter is set:
    global.ini -> [resource_tracking] -> cpu_time_measurement_mode = fast (SPS 08 and below)
    global.ini -> [resource_tracking] -> cpu_time_measurement_mode = on (SPS 09 and above)
  • Transaction ID / Update transaction ID (TRANSACTION_ID, UPDATE_TRANSACTION_ID)
    Available: generally
    Identifier of the related transaction (-1 if no transaction is associated with the thread) and of the update transaction (-1 if no uncommitted modification is associated with the thread)
  • Lock wait details (LOCK_WAIT_COMPONENT, LOCK_WAIT_NAME, LOCK_OWNER_THREAD_ID)
    Available: generally
    Component and name of the lock wait and ID of the thread responsible for the lock wait
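As an illustration of these columns, the historic activity for a single statement hash could be retrieved as follows; a minimal sketch, with '<statement_hash>' as a placeholder:

  -- Sketch: recent thread samples for one statement hash
  SELECT TIMESTAMP, HOST, PORT, THREAD_ID, THREAD_TYPE,
         THREAD_METHOD, THREAD_STATE, LOCK_WAIT_NAME, LOCK_OWNER_THREAD_ID
  FROM M_SERVICE_THREAD_SAMPLES
  WHERE STATEMENT_HASH = '<statement_hash>'  -- placeholder
  ORDER BY TIMESTAMP DESC;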

5. How can I interpret the thread state?

Thread states are useful for judging whether threads are actually working or have to wait, so they are one aspect of a general performance analysis (SAP Note 2000000). The list below shows the thread state, its area, details and the related SAP Note:
  • Barrier Wait (Lock): Barrier wait (SAP Note 1999998)
  • ConditionalVariable Wait (Lock): Mainly record and table lock waits (SAP Note 1999998)
  • ExclusiveLock Enter (Lock): Exclusive read / write lock waits (SAP Note 1999998)
  • Inactive (Idle): Idle thread
  • IO Wait (I/O): Disk I/O wait (e.g. during column load or hybrid LOB access) (SAP Note 1999930)
  • Job Exec Waiting (Idle): Waiting for a JobWorker thread executing the actual work
  • Mutex Wait (Lock): Mutex lock waits (SAP Note 1999998)
  • Network Poll (Network): Network I/O between SAP HANA nodes or services (SAP Note 2222200)
  • Network Read (Network): Network I/O read between SAP HANA nodes or services (SAP Note 2222200)
  • Network Write (Network): Network I/O write between SAP HANA nodes or services (SAP Note 2222200)
  • Running (SQL): Request execution, typically CPU consumption (SAP Note 2000002)
  • Semaphore Wait (Lock): Semaphore lock waits (SAP Note 1999998)
  • SharedLock Enter (Lock): Shared read / write lock waits (SAP Note 1999998)
  • Sleeping (Lock): Waiting for a lock (SAP Note 1999998)
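Aggregating the thread samples by state yields a quick wait profile for a time frame; a minimal sketch ('<begin_time>' and '<end_time>' are placeholders):

  -- Sketch: distribution of thread states in a time frame
  SELECT THREAD_STATE, COUNT(*) AS SAMPLES
  FROM M_SERVICE_THREAD_SAMPLES
  WHERE TIMESTAMP BETWEEN '<begin_time>' AND '<end_time>'  -- placeholders
  GROUP BY THREAD_STATE
  ORDER BY SAMPLES DESC;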

6. What are the main thread types, methods and details?

Below you can find some important thread types, methods and details:
Thread Type Thread Method  Thread Detail  SAP Note Tasks
Assign assign open data+log volume <id> / local tables 2185955 Indexserver -> nameserver communication in order to open disk volumes
May be blocked if the nameserver has issues, e.g. due to an orphan nameserver process as described in SAP Note 2185955
  post_assign open data+log volume <id> / preload tables 2127458 Reloading tables at the end of startup and volume assignment
AsyncSender send working for thread <waiting_thread>   Asynchronous thread / thread communication (sending thread)
AsyncWaiter waiting working for thread <sending_thread>   Asynchronous thread / thread communication (waiting thread)
BWFlattenScenarioHost BWFlattenScenarioHost <host>:<port> partGroup 0   Conversion of BW infocubes from classic to in-memory (BW_CONVERT_CLASSIC_TO_IMO_CUBE)
Cache Request Handler        
CatalogWatchdog        
DiskInfo remotediskinfo     Request of information for remote physical disk
DsoUpdate ActivationQueuePackage <schema>:<table>en: [<id>, <id>) / {Lhs table parts: [<schema>:<table> (<id>)] (not prunable: size 1 set 0), rhs table parts: [<schema>:<table> (<id>)]} 2000002 DSO activation via DSO_ACTIVATE_PERSISTED
  DSOPartActivatorLocal <schema>:<table>en RequestIds <id> / {Lhs table parts: [<schema>:<table> (<id>)] (not prunable: size 1 set 0), rhs table parts: [<schema>:<table> (<id>)]} 2000002 DSO activation via DSO_ACTIVATE_PERSISTED
EmergencyJobForDeferredResource     2127458
Framework used during critical situations (e.g. for unloading columns during memory shortages)
Generic     1999880 A running 'Generic' thread without further method or detail information and with DataAccess::AsyncLogBufferHandlerThread::run in its call stack is responsible for shipping log buffers in asynchronous system replication environments.
Generic Search for trace files to compress   2119087 Check if trace files (e.g. alert traces) exist that should be compressed
Peaks can be caused by I/O issues when accessing the trace files (see SAP Note 1999930); examples of I/O related call stacks:
__open_nocancel+0x7
__opendir+0x1
System::UX::opendir
FileAccess::DirectoryEntry::findFirst
FileAccess::DirectoryEntry::DirectoryEntry
Diagnose::TraceSegmentCompressorThread::run

__getdents+0x6
__readdir_r+0x105
System::UX::readdir_r
FileAccess::DirectoryEntry::findFirst
FileAccess::DirectoryEntry::DirectoryEntry
Diagnose::TraceSegmentCompressorThread::run
getLastDocids byValue $trexexternalkey$ from=<id> to=<id>    
IndexingQueue indexing <schema>:<table>en.<column> 2160391 Indexing of fulltext indexes
JobexWatchdog       Watchdog for job execution
JobWorker aggregate worker <schema>:<table>en(<pct>%)    
  AttributeIndexJob commitOptimize worker (table oid: <id>, attribute <column>) writing <num> rows    
  Backup::BackupExe_Job   1642148 Threads actually performing the backup, often triggered by _IDX_BackupExecutor slaves
  BackupMgr_ExecuteDataSaveJob   1642148 Data backup threads typically triggered by BackupMgr_SaveDataJob
  BackupMgr_SaveDataJob   1642148 Main data backup thread
  calculateQueryResultDocs2 resultdocs   Parallelized query result processing in column store
  CeCalcAttrMaterializeJob      
  changeCompression
ChangeCompressionJob
ChangeCompressionJob: table=<schema>:_SYS_SPLIT_<table>~<id> (<id>),  columnId=<id>: changeAttributeDefinition
table=<schema>:<table> (<id>),  columnId=<id>: getEstimatedTotalMainMemSize
2112604 Compression optimization (adjustment of column compression)
  changeRowOrder PrepareRenumberJob: table=<schema>:_SYS_SPLIT_<table>~<id> (<id>),  columnId=<id>: prepareRenumber 2112604 Compression optimization
  CHECK_TABLE_CONSISTENCY/local checking row table <schema>.<table>
checking column table <schema>.<table>
1977584 Execution of consistency check CHECK_TABLE_CONSISTENCY
  comm      
  CompressFemsParallel   1999997 FEMS compression
See SAP Note 1999997 -> "What can I do if a certain heap allocator is unusually large?" -> "Pool/FemsCompression/CompositeFemsCompression" for more information.
  createJobs RadixSort    
  create_ReadValueId_Jobs GetValueIds    
  createResultDocs     Creation of results of parallel OLAP engine activities
  DDLDataJob <schema>:_SYS_SPLIT_<table_name>~<id> (<id>)    
  DefaultColumnDict      
  detachResult merger <id>    
  DictScanJob      
  estimateCompression EstimateCompressionJob: table=<schema>:_SYS_SPLIT_<table>~<id> (<id>),  columnId=<id>: chooseCompression   Compression optimization (identification of optimal compression)
  EXPORT thread Table-wise EXPORT thread for <schema>:<table>   Table export
  FemsCompressionPhase2Job Phase <id> Job <id> Num of parts: <id>    
  GCJob@<identifier> GCJob[<identifier>-EID, CleanupFilePassedToGC, TID= <sid>, transIndex= <id>, CleanupStrategy_MVCC, pagecnt=<pg>, size=<size> 2169283 Persistence garbage collection
  generic BuildIndexJobNode    
    commitOptimize worker (table oid: <id>, attribute <column>) writing 1800 rows   Can be linked to mass changes (e.g. UPSERT)
    ConcatenateAttributesJob    
    RowInsertJobNode    
    EmailJob: working on snapshot_ID: <date> 1999998 E-mail processing, may be blocked by MailSenderCallback semaphore waits, see SAP Note 1999998 for more details.
  getMostFrequentValueInfos GetMostFrequentValueInfosJob: table=<schema>:_SYS_SPLIT_<table>~<id>(<id>),  column=<id> 2112604 Compression optimization
  getNextPQItem UpdateMostFrequentValueCountJob    
  GroupBy-group     Row store GROUP BY / DISTINCT processing
  _handleQuery      
  HashDict WorkerJob HashDict worker <num>% done    
  HashDict MergerJob HashDict merger <id>    
  HTTP POST <connector>
PUT <connector>
  HTTP request processing (xsengine)
  IndexConsistencyCheckData   1977584 Consistency check based on CHECK_TABLE_CONSISTENCY
  IndirectScanBvOutJob<range>
IndirectScanVecOutJob<range>
  2000002 Range scan of column with INDIRECT compression
  initIteratorMultiBtQuery initIteratorMultiBtQuery
  JobConcatAttrCalculator JobConcatAttrCalculator<schema>:<table>en <rows> rows 1986747 Creation of a CONCAT attribute, e.g. during index creation
  JobReadValueIdsParallel      
  makeHeap MakeHeapJob    
  OrderBy-fetch     Row store ORDER BY processing
  execute ParallelForJob
ParallelFor Job
AE/getValues BlockJob on table <schema>:<table> (<id>) (parallelFor job <id>/<id>)
AE/getValues SliceJob on table <schema>:_SYS_SPLIT_<table> (<id>) (parallelFor job <id>/<id>)
   
  calc
PlanExecutor calc
plan<id>@<host>:<port>/pop<id> (<action>) 2000002 Column store plan execution
  PlanExecutor comm      
  postProcessScanVecResult SetRangeVecOutJob 2112604 Compression optimization
  processParallel      
  readIndexChunked readIndexChunked    
  RSVersionCollectorJobNode/Partition<part> remaining versions to collect: <versions> versions (<tx> tx, <flushes> flushes) 2169283 These threads are triggered by the MVCCGarbageCollector garbage collector thread in order to perform parallelized row store version consolidation.
  runJobs ItabMat-insert    
  scanParallel      
  scanWithoutIndex IndScanVecOutJob    
    IndScanBvOutJob    
  searchBetween searchBetween    
  searchDocumentsIterateDocidsParallel JobParallelMgetSearch docs=<val1>(VEC) vid=<val2>(BV) 2000002 Table scan
  ServerJob Unnamed-SimpleJob 2222218 Execution of hdbcons command (caller: NetworkChannelCompletionThread)
  SetAttributeJob Source attribute: <column> dest attribute: <column>   Conversion of BW infocubes from classic to in-memory (BW_CONVERT_CLASSIC_TO_IMO_CUBE)
  SidMappingJob SidMappingJob: source scenario <schema>:<scenario> , mapping to new attribute <column> (<id> values)   Conversion of BW infocubes from classic to in-memory (BW_CONVERT_CLASSIC_TO_IMO_CUBE)
  Sorter RangeSortJob    
  sortBlocksByRowID SortByRowIdJob: table=<schema>:_SYS_SPLIT_<table>~<id> (<id>) 2112604 Compression optimization
  sortRestRanges SortRestRangeJob: table=<schema>:_SYS_SPLIT_/<table> (<id>),  columnId=<id> 2112604 Compression optimization
  sparseSearch SparseScanRangeVecJob 2000002 Scan of column with SPARSE compression
  spSearchValueCountsFastScan     Ad-hoc attribute engine scan for counting values (e.g. when RECORD_COUNT is selected from M_CS_COLUMNS / M_CS_ALL_COLUMNS)
  startLoop      
  system replication: async log buffer handler Unnamed-JobNode 1999880 Handler thread that picks log buffers from the temporary memory buffer and sends them over the network to the secondary site. If asynchronous replication is enabled, this thread is always running (possibly waiting for new log buffers to be created).
  system replication: mark shipped snapshot active Unnamed-JobNode 1999880 After a successful data shipping, the snapshot that has been used as source on the primary site is marked as the active snapshot (last shipped snapshot to the secondary site).
  system replication: prepare transfer data replica to secondary Unnamed-JobNode 1999880 Preparation of a new data shipping from the primary to the secondary site; the snapshot used as source for the data transfer is created on the primary site.
  system replication: check replication snapshots Unnamed-JobNode 1999880 Check whether old replication snapshots can be dropped on the primary site; snapshots are dropped if they are no longer needed or their retention time has elapsed.
  system replication: transfer missing log to secondary Unnamed-JobNode 1999880 Transfer of missing log to the secondary site, which is required to make the backup history complete.
  system replication: transfer replica to secondary Unnamed-JobNode 1999880 Execution of data shipping from the primary to the secondary site.
  WorkerJob worker <schema>:_SYS_SPLIT_<table>~<id>en (merging)   Merging of parallel OLAP engine data sets
  writeDataIntoDelta Creating concat for <schema>:_SYS_SPLIT_<index>$history$$delta_1$en 2057046 Writing of delta storage information
  Zipper      
LoadDataWriter     2222110 Responsible for writing the SAP HANA load history
May be impacted by I/O problems related to the trace file directory (SAP Note 1999930), example call stacks:
__open_nocancel+0x20
__GI__IO_file_open+0x30
_IO_new_file_fopen+0xc7
__fopen_internal+0x80
NameServer::LoadHistoryManager::save
NameServer::LoadDataWriter::run

__close_nocancel+0x7
_IO_new_file_close_it+0x40
_IO_new_fclose+0x180
NameServer::LoadHistoryManager::save
NameServer::LoadDataWriter::run
LoadField <schema>:<table>en AttributeId=<id>
<column>
2127458 Load of columns into column store
LoadPart load <schema>:_SYS_SPLIT_<table>~<id>en 2127458 Load of columns of partitioned table into column store
LobGarbageCollector     2169283 Lob persistence garbage collection
LocalWatchDog ping <host>:<port>   Nameserver ping check of indexserver
LogBackupThread        
Main        
MemoryCompactor     2169283 Memory garbage collection 
MergeAttributeThread prepareDeltaMerge IndexName: <schema>:<table>en Attribute: <id>/<column> 2057046 Preparation of delta merges
MergedogMerger merging <x> of <y> table(s): <table> 2057046 Execution of delta merges
MergedogMonitor joining waiting for merge threads to complete 2057046 Delta merge monitoring (waiting for merge threads)
  checking <x> of <y> table(s): <schema>:_SYS_SPLIT_<table>~<id> 2057046 Delta merge monitoring (checking)
MvccAntiAgerChecker     2169283 Row store version consolidation and garbage collection check
MVCCGarbageCollector     2169283 Row store version consolidation and garbage collection
PartDistributorThread PartDistributorThread   2044468 Access and modification of records into appropriate table partition
PeriodicSavepoint doSavepoint enterCriticalPhase 2100009 Preparations for entering the critical savepoint phase (part of CRITICAL_PHASE_WAIT_TIME in M_SAVEPOINTS, change operations like DML, DDL or merges can be blocked with ConsistentChangeLock, see SAP Note 1999998)
    enterCriticalPhase(waitForLock)
2100009 Preparations for entering the critical savepoint phase (part of CRITICAL_PHASE_WAIT_TIME in M_SAVEPOINTS, change operations like DML, DDL or merges can be blocked with ConsistentChangeLock, see SAP Note 1999998)
    enterPostCriticalPhase 2100009 Savepoint processing after critical phase
    exitCriticalPhase 2100009 End of critical savepoint phase
    finishSavepoint 2100009 Final savepoint steps
    flushPagesinNonCriticalPhase 2100009 Savepoints writing dirty pages to disk
    preCriticalPhase 2100009 Savepoint activity before starting critical phase
    prepareSavepoint 2100009 Savepoint preparation
    processCriticalPhase 2100009 Critical savepoint phase (CRITICAL_PHASE_DURATION in M_SAVEPOINTS, change operations like DML, DDL or merges can be blocked with ConsistentChangeLock, see SAP Note 1999998)
PhraseIndexBuilder update_phrases <schema>:<table>en.<column> 2160391 Maintenance of fulltext indexes
Queue indexing <schema>:<table>en.<column>  2160391 Indexing of fulltext indexes
RemoteService        
Request BackupManager      
  bwFlattenScenarioPart     Conversion of BW infocubes from classic to in-memory (BW_CONVERT_CLASSIC_TO_IMO_CUBE)
  cachemgr      
  CheckRemoteUniqueConstraint   2044468 Check of uniqueness in other partitions / hosts, should be avoided if possible (e.g. by avoiding partitioning or by performing partitioning based on unique / primary key columns)
Avoid partitioning of SID tables in BW, because they typically have two different unique indexes. For data management of SID tables see SAP Note 1331403.
  csddl/TruncateTableData      
  remotediskinfo     Collection of information for local physical disk by nameserver (NameServer::doRemoteDiskInfo) in order to support a remote request
  distMetdataRequest     Distributed metadata request
  DistRowQueryExecute      
  DistTableStatistics      
  dso/activatePers   2000002 DSO activation via DSO_ACTIVATE_PERSISTED
  executeTableUpdateContainer      
  fs      
  getColumnStat     Determination of column statistics
  __globalTransControl     Control of global transactions involving different services
  _IDX_BackupExecutor   1642148 Backup executor threads, typically triggered by BackupMgr_ExecuteDataSaveJob slaves
  __nsWatchdog     Triggered by nameserver watchdog (LocalWatchDog)
  LOBREMOTE <schema>:<table>en/getContainerIdsFromTable 2220627 LOB container access
  memoryutilization      
  ngdb_console runtimedump dump 1813020 Creation of a runtime dump using MANAGEMENT_CONSOLE_PROC / hdbcons
  OptimizeCompressionData   2112604 Compression optimization (triggering JobWorker processes doing the actual work)
  planexecution     X2 plan execution
  prep/indexing TEXT/<schema>:<table>en.<column>:<id> 2160391 Indexing of fulltext indexes by preprocessor
  Queue Master      
  Queue Pull     Preprocessor request to indexserver for job workers that can be used for preprocessor activities like 'indexing'
  reclaim_data_volume     Execution of 'ALTER SYSTEM RECLAIM DATAVOLUME' command
  RemotePersistenceNoSession      
  search <schema>:<table>/<conditions>    
  __sessionRequest      
  splitIndex <schema>:<table>   Repartitioning of tables; triggers threads like type 'Split' with method 'split/merge attribute'
  stat
core/stat
     
SaveMergedAttributeThread prepareDeltaMergeSave IndexName: <schema>:<table>en Attribute: <id>/<column> 2057046 Preparation of persisting delta merges to disk
Savepoint       Savepoint execution
SelfWatchDog     1999998 Watchdog thread that regularly checks for the existence of the nameserver topology lock file
An increased amount of SelfWatchDog activities can indicate issues accessing the lock file. See SAP Note 1999998 for more details.
service thread sampler interval 1 sec.   2114710 Collection of M_SERVICE_THREAD_SAMPLES information (1 second sampling interval)
SignalQueue        
SmartMerger waiting <schema>:_SYS_SPLIT_<table>~1en 2057046 Execution of smart merges
Split split/merge attribute <schema>:<table> <column>   Processing of a column during table repartitioning, triggered by threads with method 'splitIndex' (see above)
SqlExecutor Authentication     Password check and update of access information like "last login time" in SAP HANA tables
  BatchExecute (batch size:<size>) <sql_statement> 2000002 Execution of a batch SQL statement (e.g. bulk INSERT inserting multiple rows)
  CloseStatement   2000002 Closing of a SQL statement
  CollectVersion     Row store version collection
  CommitTrans   2000000 Execution of a COMMIT
  CursorFetchItab   2000002  
  ExecQidItab   2000002  
  ExecutePrepared <sql_statement> 2000002 Execution of a prepared SQL statement (using bind variables)
  ExecuteStatement <sql_statement> 2000002 Execution of a SQL statement
  FetchCursor <sql_statement> 2000002 Execution of a SQL statement
  LobPutPiece   2220627  
  PrepareStatement <sql_statement> 2000002 Preparation of a SQL statement
SystemReplicationIniFileWatchDog     1999880 Responsible for detecting differences in configuration parameter settings between primary and secondary site
TableReload  reloading table <schema>:<table>en 2127458 Reload of table columns into memory by triggering LoadField threads
WorkerThread (StatisticsServer)   running ID (<id>, deletion: no) for <date> 1917938 Activity of embedded statistics server
XS       Execution of XS engine requests

7. How can I influence the collection and historization of thread samples?

In general it is not necessary to manually adjust the collection and historization of thread samples. In specific scenarios you can adjust the following settings based on your needs:
  • global.ini -> [resource_tracking] -> cpu_time_measurement_mode
    Controls the collection of CPU time information (CPU_TIME_SELF, CPU_TIME_CUMULATIVE):
    off: No collection of CPU information (default)
    on: Collection of CPU information
  • global.ini -> [resource_tracking] -> service_thread_sampling_monitor_enabled
    Controls the collection of thread samples in M_SERVICE_THREAD_SAMPLES:
    true: Thread samples are collected (default)
    false: No thread samples are collected
  • global.ini -> [resource_tracking] -> service_thread_sampling_monitor_max_sample_lifetime
    global.ini -> [resource_tracking] -> service_thread_sampling_monitor_max_samples
    Control the availability of data in M_SERVICE_THREAD_SAMPLES. See SAP Note 2088971 for more information.
  • global.ini -> [resource_tracking] -> service_thread_sampling_monitor_sample_interval
    Controls the thread sampling interval for M_SERVICE_THREAD_SAMPLES in seconds. The value must be between 1 (default) and 100.
  • global.ini -> [resource_tracking] -> service_thread_sampling_monitor_thread_detail_enabled
    Controls the recording of THREAD_DETAIL information in M_SERVICE_THREAD_SAMPLES:
    false: No collection of thread details (default)
    true: Collection of thread details
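As an example, the recording of thread details could be enabled online as follows; a sketch assuming the 'SYSTEM' configuration layer (other layers are possible):

  -- Sketch: enable THREAD_DETAIL recording ('SYSTEM' layer assumed)
  ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
    SET ('resource_tracking', 'service_thread_sampling_monitor_thread_detail_enabled') = 'true'
    WITH RECONFIGURE;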

8. What are typical thread constellations?

There are often caller / calling dependencies between a set of threads. Below you can find typical examples.
Example 1: "Complex" column store SQL statement execution
A SQL statement is in the first place executed by a SqlExecutor thread, but this thread may delegate work to JobWorker threads, and JobWorkers may in turn call additional JobWorkers. In this example we can see a hierarchy of four levels: the main SqlExecutor thread 151759 calls the JobWorker thread 111596, this thread calls JobWorker thread 92120, and this thread calls the four JobWorker threads 37304, 111416, 111437 and 111609. Only the four last threads are active at the sample time ("Running") while the other threads are in state "Job Exec Waiting".
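Such hierarchies can be made visible via the CALLER / CALLING columns described in question 4; a minimal sketch listing the dependencies of the currently active threads:

  -- Sketch: caller / calling relations of currently active threads
  SELECT HOST, PORT, THREAD_ID, THREAD_TYPE, THREAD_STATE, CALLER, CALLING
  FROM M_SERVICE_THREADS
  WHERE IS_ACTIVE = 'TRUE'
  ORDER BY HOST, PORT, THREAD_ID;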

Example 2: Delta merge
Delta merges are performed by various merge-related threads supervised by the MergedogMonitor. In the following case the MergeAttributeThreads are active ("Running"), processing columns MANDANT and $trexexternalkey$ of table BALDAT, while the other involved threads have to wait for them to finish. When they are finished, the SaveMergedAttributeThread will take over and finish the merge.
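The involved threads can be watched while a merge is running; a minimal sketch filtering on merge-related thread types (the LIKE pattern is a simplification):

  -- Sketch: currently active merge-related threads
  SELECT HOST, PORT, THREAD_ID, THREAD_TYPE, THREAD_METHOD, THREAD_DETAIL
  FROM M_SERVICE_THREADS
  WHERE THREAD_TYPE LIKE '%Merge%'  -- simplification, matches Merge* thread types
    AND IS_ACTIVE = 'TRUE';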

Example 3: Nameserver ping
The nameserver regularly executes ping operations to the indexserver ("Index Server" -> "Ping Time" in the Load Graph). This happens with the following intra-node process communication. The first line is the nameserver thread that sent the "ping" request to the indexserver thread in the second line.

Example 4: Column load
When a query requires data from a column that isn't loaded into memory yet, it has to wait for the related column load ("AttributeStore Resource Load"). The following table shows an example where queries have to wait until column DATA of table SWWCNTP0 is loaded.
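Whether the columns of a table are already loaded can be cross-checked in monitoring view M_CS_COLUMNS; a sketch with '<schema>' as a placeholder:

  -- Sketch: load state of the columns of one table
  -- (LOADED is e.g. 'TRUE', 'FALSE' or 'PARTIALLY')
  SELECT COLUMN_NAME, LOADED
  FROM M_CS_COLUMNS
  WHERE SCHEMA_NAME = '<schema>'  -- placeholder
    AND TABLE_NAME = 'SWWCNTP0';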

9. Are there case studies showing the possibilities of thread samples?

Analyzing thread samples is an integral part of SAP HANA performance analysis. The following real-life cases show typical evaluations.
Case study 1: Long-running INSERT due to critical savepoint phases
A simple INSERT of one record had a high average execution time of 10 ms without any apparent issue visible in the SQL cache. We analyzed the thread samples for the related statement hash in terms of THREAD_STATE on a granular time-slice basis of 1 second, and the result showed time frames of around 8 seconds where multiple concurrent INSERTs waited for "SharedLock Enter" (see below). Based on SAP Note 1999998 this can be caused by blocking situations due to the critical savepoint phase. A check based on M_SAVEPOINTS indeed showed repeated critical savepoint phases of 8 seconds and more. The customer double-checked the I/O performance (SAP Note 1999930) and was able to improve it with technical changes. Afterwards the INSERT times improved significantly.
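The M_SAVEPOINTS cross-check can be scripted; a minimal sketch listing savepoints with a long critical phase, assuming microsecond units for the duration columns (the threshold is an example value):

  -- Sketch: savepoints with a critical phase longer than ~1 second
  SELECT START_TIME, CRITICAL_PHASE_WAIT_TIME, CRITICAL_PHASE_DURATION
  FROM M_SAVEPOINTS
  WHERE CRITICAL_PHASE_DURATION > 1000000  -- assumption: microseconds
  ORDER BY START_TIME DESC;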

Case study 2: High system CPU consumption due to garbage collectors
Spikes of system CPU consumption were observed, e.g. between 10:03 and 10:04. SQL: "HANA_Threads_ThreadSamples_AggregationPerTimeSlice" (AGGREGATE_BY = 'THREAD_METHOD', TIME_SLICE_S = 10) reported a high number of concurrent garbage collectors ('GCJob') as seen below, so we can conclude that they are responsible for the increased CPU consumption. If these kinds of CPU spikes are critical for the system, the parallelism of the garbage collectors could be reduced using the max_gc_parallelity parameter (see SAP Note 2100040).
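A reduced variant of this evaluation, grouping the statistics server history into one-minute slices (coarser than the 10-second slices used above), could look like this:

  -- Sketch: thread methods per one-minute time slice
  SELECT TO_VARCHAR(TIMESTAMP, 'YYYY/MM/DD HH24:MI') AS TIME_SLICE,
         THREAD_METHOD, COUNT(*) AS SAMPLES
  FROM _SYS_STATISTICS.HOST_SERVICE_THREAD_SAMPLES
  WHERE TIMESTAMP BETWEEN '<begin_time>' AND '<end_time>'  -- placeholders
  GROUP BY TO_VARCHAR(TIMESTAMP, 'YYYY/MM/DD HH24:MI'), THREAD_METHOD
  ORDER BY TIME_SLICE, SAMPLES DESC;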

Case study 3: High user CPU consumption caused by specific application users
The system suffered from massive CPU peaks during certain time frames. Using SQL: "HANA_Threads_ThreadSamples_AggregationPerTimeSlice" (AGGREGATE_BY = 'APP_USER', TIME_SLICE_S = 60) we were able to identify the application users responsible for the high load (see below). In the next step the activity of these users was analyzed by checking related batch jobs and asking the users about their online operations, and technical optimizations were implemented.

Case study 4: Top SQL statements responsible for load in a specific time frame
A popular evaluation of thread samples is the determination of the SQL statements responsible for the highest SAP HANA load (see SAP Note 2000002). This can be done via SQL: "HANA_Threads_ThreadSamples_FilterAndAggregation" (AGGREGATE_BY = 'STATEMENT_HASH'). In the example below we can see top SQL statements like statement hash 696d5f902df56c3228a80bd25ef71ee3 that can subsequently be analyzed and tuned as described in SAP Note 2000002. Thread samples without a specific statement hash are reported as "no SQL" and the thread method or thread type is added in brackets. "no SQL (SqlExecutor)" can be linked to thread methods like "ExecQidItab" or "CommitTrans" which don't have a SQL statement assigned.
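Because each sample roughly represents one second of one busy thread (with the default sampling interval of 1 second), counting samples per statement hash approximates the load share; a minimal sketch:

  -- Sketch: top statement hashes by number of thread samples
  SELECT STATEMENT_HASH, COUNT(*) AS SAMPLES
  FROM M_SERVICE_THREAD_SAMPLES
  GROUP BY STATEMENT_HASH
  ORDER BY SAMPLES DESC
  LIMIT 20;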

Case study 5: High system load due to table scans
During times of very high system load a lot of threads with THREAD_METHOD 'searchDocumentsIterateDocidsParallel' were observed in M_SERVICE_THREADS (see below). This method indicates table scans and usually points to a missing index. Further analysis revealed that one central, frequently executed SQL statement wasn't supported by an index. After having created an appropriate index, the maximum CPU consumption went down from 100 % to 10 %.
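The statements behind such scan threads can be identified by filtering the samples on the thread method; a sketch:

  -- Sketch: statement hashes linked to parallel table scan threads
  SELECT STATEMENT_HASH, COUNT(*) AS SAMPLES
  FROM M_SERVICE_THREAD_SAMPLES
  WHERE THREAD_METHOD = 'searchDocumentsIterateDocidsParallel'
  GROUP BY STATEMENT_HASH
  ORDER BY SAMPLES DESC;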





Header Data

Released On 01.02.2016 11:27:23
Release Status Released to Customer
Component HAN-DB SAP HANA Database
Priority Normal
Category How To
