Tuesday, February 2, 2016

2114710 - FAQ: SAP HANA Threads and Thread Samples

Symptom
You are interested in information about SAP HANA threads and thread samples.


Environment
SAP HANA


Cause
1. What are SAP HANA threads and thread samples?
2. Where do I find information about SAP HANA threads?
3. For what purpose do I need to evaluate the SAP HANA thread activities?
4. What kind of information is available for the SAP HANA threads?
5. How can I interpret the thread state?
6. What are the main thread types, methods and details?
7. How can I influence the collection and historization of thread samples?
8. What are typical thread constellations?
9. Are there case studies showing the possibilities of thread samples?


Resolution

1. What are SAP HANA threads and thread samples?

SAP HANA work, such as the execution of SQL statements or background tasks, is performed by threads. Each SAP HANA process, such as the indexserver or xsengine, consists of a set of threads.
Thread samples are snapshots of thread activities that are taken at regular intervals and historized.

2. Where do I find information about SAP HANA threads?

Information about current SAP HANA threads can be found at the following locations:
  • Monitoring view M_SERVICE_THREADS
  • SAP HANA Studio -> Administration -> Performance -> Threads
  • DBACOCKPIT -> Performance -> Threads
  • SQL: "HANA_Threads_CurrentThreads" (SAP Note 1969700)
Information about historic thread activities is available at:
  • Monitoring view M_SERVICE_THREAD_SAMPLES
    • Retention is usually at least 2 hours
    • For more details see SAP Note 2088971
  • History HOST_SERVICE_THREAD_SAMPLES
    • Only available if embedded statistics server is used
    • Default retention time may be quite small (1 day)
    • Retention should be increased to 42 days as described in SAP Note 2147247 ("What is a good retention time for statistics server histories?")
  • SQL: "HANA_Threads_ThreadSamples_FilterAndAggregation" (SAP Note 1969700)
  • SQL: "HANA_Threads_ThreadSamples_AggregationPerTimeSlice" (SAP Note 1969700)

3. For what purpose do I need to evaluate the SAP HANA thread activities?

By checking SAP HANA thread information you can get a current or historic snapshot of thread activities. This can help you to answer questions related to the SAP HANA workload, e.g.:
  • Activities responsible for CPU consumption (SAP Note 2100040)
  • Activities responsible for lock situations (SAP Note 1999998)
  • Activities responsible for bad performance (SAP Note 2000000)
  • Expensive SQL statements (SAP Note 2000002)

4. What kind of information is available for the SAP HANA threads?

The most important information available in the thread-related monitoring views is listed below ("generally" means that the information is available in M_SERVICE_THREADS, M_SERVICE_THREAD_SAMPLES and HOST_SERVICE_THREAD_SAMPLES):
  • Host name / Port / Service name (HOST, PORT, SERVICE_NAME)
    Available: generally
    Basic information to understand which service on which host owns the thread
  • Timestamp (TIMESTAMP)
    Available: M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES
    Sample timestamp
  • Connection ID (CONNECTION_ID)
    Available: generally
    Client connection identifier (-1 if the thread has no assigned client connection)
  • Thread ID (THREAD_ID)
    Available: generally
    Thread identifier
  • Statement hash (STATEMENT_HASH)
    Available: M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES
    Statement hash of the active SQL statement
  • Active flag (IS_ACTIVE)
    Available: M_SERVICE_THREADS
    TRUE if the thread is currently active, otherwise FALSE. The history views only contain active threads, so the flag is not required for M_SERVICE_THREAD_SAMPLES and HOST_SERVICE_THREAD_SAMPLES.
  • Thread state (THREAD_STATE)
    Available: generally
    Activity state of the thread (e.g. running or waiting for a resource), for more details see "How can I interpret the thread state?" below.
  • Thread type (THREAD_TYPE)
    Available: generally
    Thread type (e.g. SqlExecutor or JobWorker), for more details see "What are the main thread types, methods and details?" below.
  • Thread method (THREAD_METHOD)
    Available: generally (but not consistently populated in the thread sample views)
    Thread method (e.g. ExecutePrepared or indexing), for more details see "What are the main thread types, methods and details?" below. For some threads (e.g. SqlExecutor threads) it is only written to the thread sample views if the following parameter is set to true:
    global.ini -> [resource_tracking] -> service_thread_sampling_monitor_thread_detail_enabled = true
  • Thread detail (THREAD_DETAIL)
    Available: generally (but not populated per default in the thread sample views)
    Thread detail (e.g. SQL statement or column store plan step), for more details see "What are the main thread types, methods and details?" below. Only written to the thread sample views if the following parameter is set to true:
    global.ini -> [resource_tracking] -> service_thread_sampling_monitor_thread_detail_enabled = true
  • Caller thread / Called thread (CALLER, CALLING)
    Available: M_SERVICE_THREADS, M_SERVICE_THREAD_SAMPLES (>= SPS 09)
    Thread dependencies (CALLER: ID of the thread that called the current thread; CALLING: ID of the thread that is called by the current thread)
  • Database user name (USER_NAME)
    Available: generally
    Name of the database user executing the thread
  • Application user name (APPLICATION_USER_NAME)
    Available: generally
    Name of the application user (e.g. SAP end user in ABAP environments)
  • Application name (APPLICATION_NAME)
    Available: M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES
    Name of the client application (e.g. "ABAP:<sid>" in case of SAP ABAP stacks)
  • Application source (APPLICATION_SOURCE)
    Available: M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES
    Application details (e.g. report name in case of SAP ABAP stacks)
  • CPU time (CPU_TIME_SELF, CPU_TIME_CUMULATIVE)
    Available: M_SERVICE_THREADS
    CPU time in microseconds consumed by the thread itself (CPU_TIME_SELF) or including the associated children (CPU_TIME_CUMULATIVE). Only populated if the following SAP HANA parameter is set:
    global.ini -> [resource_tracking] -> cpu_time_measurement_mode = fast (SPS 08 and below)
    global.ini -> [resource_tracking] -> cpu_time_measurement_mode = on (SPS 09 and above)
  • Transaction ID / Update transaction ID (TRANSACTION_ID, UPDATE_TRANSACTION_ID)
    Available: generally
    Identifier of the related transaction (-1 if no transaction is associated with the thread) and of the update transaction (-1 if no uncommitted modification is associated with the thread)
  • Lock wait details (LOCK_WAIT_COMPONENT, LOCK_WAIT_NAME, LOCK_OWNER_THREAD_ID)
    Available: generally
    Component and name of the lock wait and ID of the thread responsible for the lock wait
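As an illustration of these columns, the historic activity for a single statement hash could be retrieved as follows; a minimal sketch, with '<statement_hash>' as a placeholder:

  -- Sketch: recent thread samples for one statement hash
  SELECT TIMESTAMP, HOST, PORT, THREAD_ID, THREAD_TYPE,
         THREAD_METHOD, THREAD_STATE, LOCK_WAIT_NAME, LOCK_OWNER_THREAD_ID
  FROM M_SERVICE_THREAD_SAMPLES
  WHERE STATEMENT_HASH = '<statement_hash>'  -- placeholder
  ORDER BY TIMESTAMP DESC;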

5. How can I interpret the thread state?

Thread states are useful for judging whether threads are actually working or have to wait, so they are one aspect of a general performance analysis (SAP Note 2000000). The list below shows the thread state, its area, details and the related SAP Note:
  • Barrier Wait (Lock): Barrier wait (SAP Note 1999998)
  • ConditionalVariable Wait (Lock): Mainly record and table lock waits (SAP Note 1999998)
  • ExclusiveLock Enter (Lock): Exclusive read / write lock waits (SAP Note 1999998)
  • Inactive (Idle): Idle thread
  • IO Wait (I/O): Disk I/O wait (e.g. during column load or hybrid LOB access) (SAP Note 1999930)
  • Job Exec Waiting (Idle): Waiting for a JobWorker thread executing the actual work
  • Mutex Wait (Lock): Mutex lock waits (SAP Note 1999998)
  • Network Poll (Network): Network I/O between SAP HANA nodes or services (SAP Note 2222200)
  • Network Read (Network): Network I/O read between SAP HANA nodes or services (SAP Note 2222200)
  • Network Write (Network): Network I/O write between SAP HANA nodes or services (SAP Note 2222200)
  • Running (SQL): Request execution, typically CPU consumption (SAP Note 2000002)
  • Semaphore Wait (Lock): Semaphore lock waits (SAP Note 1999998)
  • SharedLock Enter (Lock): Shared read / write lock waits (SAP Note 1999998)
  • Sleeping (Lock): Waiting for a lock (SAP Note 1999998)
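Aggregating the thread samples by state yields a quick wait profile for a time frame; a minimal sketch ('<begin_time>' and '<end_time>' are placeholders):

  -- Sketch: distribution of thread states in a time frame
  SELECT THREAD_STATE, COUNT(*) AS SAMPLES
  FROM M_SERVICE_THREAD_SAMPLES
  WHERE TIMESTAMP BETWEEN '<begin_time>' AND '<end_time>'  -- placeholders
  GROUP BY THREAD_STATE
  ORDER BY SAMPLES DESC;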

6. What are the main thread types, methods and details?

Below you can find some important thread types, methods and details:
Thread Type Thread Method  Thread Detail  SAP Note Tasks
Assign assign open data+log volume <id> / local tables 2185955 Indexserver -> nameserver communication in order to open disk volumes
May be blocked if the nameserver has issues, e.g. due to an orphan nameserver process as described in SAP Note 2185955
  post_assign open data+log volume <id> / preload tables 2127458 Reloading tables at the end of startup and volume assignment
AsyncSender send working for thread <waiting_thread>   Asynchronous thread / thread communication (sending thread)
AsyncWaiter waiting working for thread <sending_thread>   Asynchronous thread / thread communication (waiting thread)
BWFlattenScenarioHost BWFlattenScenarioHost <host>:<port> partGroup 0   Conversion of BW infocubes from classic to in-memory (BW_CONVERT_CLASSIC_TO_IMO_CUBE)
Cache Request Handler        
CatalogWatchdog        
DiskInfo remotediskinfo     Request of information for remote physical disk
DsoUpdate ActivationQueuePackage <schema>:<table>en: [<id>, <id>) / {Lhs table parts: [<schema>:<table> (<id>)] (not prunable: size 1 set 0), rhs table parts: [<schema>:<table> (<id>)]} 2000002 DSO activation via DSO_ACTIVATE_PERSISTED
  DSOPartActivatorLocal <schema>:<table>en RequestIds <id> / {Lhs table parts: [<schema>:<table> (<id>)] (not prunable: size 1 set 0), rhs table parts: [<schema>:<table> (<id>)]} 2000002 DSO activation via DSO_ACTIVATE_PERSISTED
EmergencyJobForDeferredResource     2127458
Framework used during critical situations (e.g. for unloading columns during memory shortages)
Generic     1999880 A running 'Generic' thread without further method or detail information and with DataAccess::AsyncLogBufferHandlerThread::run in its call stack is responsible for shipping log buffers in asynchronous system replication environments.
Generic Search for trace files to compress   2119087 Check if trace files (e.g. alert traces) exist that should be compressed
Peaks can be caused by I/O issues when accessing the trace files (see SAP Note 1999930); examples of I/O related call stacks:
__open_nocancel+0x7
__opendir+0x1
System::UX::opendir
FileAccess::DirectoryEntry::findFirst
FileAccess::DirectoryEntry::DirectoryEntry
Diagnose::TraceSegmentCompressorThread::run

__getdents+0x6
__readdir_r+0x105
System::UX::readdir_r
FileAccess::DirectoryEntry::findFirst
FileAccess::DirectoryEntry::DirectoryEntry
Diagnose::TraceSegmentCompressorThread::run
getLastDocids byValue $trexexternalkey$ from=<id> to=<id>    
IndexingQueue indexing <schema>:<table>en.<column> 2160391 Indexing of fulltext indexes
JobexWatchdog       Watchdog for job execution
JobWorker aggregate worker <schema>:<table>en(<pct>%)    
  AttributeIndexJob commitOptimize worker (table oid: <id>, attribute <column>) writing <num> rows    
  Backup::BackupExe_Job   1642148 Threads actually performing the backup, often triggered by _IDX_BackupExecutor slaves
  BackupMgr_ExecuteDataSaveJob   1642148 Data backup threads typically triggered by BackupMgr_SaveDataJob
  BackupMgr_SaveDataJob   1642148 Main data backup thread
  calculateQueryResultDocs2 resultdocs   Parallelized query result processing in column store
  CeCalcAttrMaterializeJob      
  changeCompression
ChangeCompressionJob
ChangeCompressionJob: table=<schema>:_SYS_SPLIT_<table>~<id> (<id>),  columnId=<id>: changeAttributeDefinition
table=<schema>:<table> (<id>),  columnId=<id>: getEstimatedTotalMainMemSize
2112604 Compression optimization (adjustment of column compression)
  changeRowOrder PrepareRenumberJob: table=<schema>:_SYS_SPLIT_<table>~<id> (<id>),  columnId=<id>: prepareRenumber 2112604 Compression optimization
  CHECK_TABLE_CONSISTENCY/local checking row table <schema>.<table>
checking column table <schema>.<table>
1977584 Execution of consistency check CHECK_TABLE_CONSISTENCY
  comm      
  CompressFemsParallel   1999997 FEMS compression
See SAP Note 1999997 -> "What can I do if a certain heap allocator is unusually large?" -> "Pool/FemsCompression/CompositeFemsCompression" for more information.
  createJobs RadixSort    
  create_ReadValueId_Jobs GetValueIds    
  createResultDocs     Creation of results of parallel OLAP engine activities
  DDLDataJob <schema>:_SYS_SPLIT_<table_name>~<id> (<id>)    
  DefaultColumnDict      
  detachResult merger <id>    
  DictScanJob      
  estimateCompression EstimateCompressionJob: table=<schema>:_SYS_SPLIT_<table>~<id> (<id>),  columnId=<id>: chooseCompression   Compression optimization (identification of optimal compression)
  EXPORT thread Table-wise EXPORT thread for <schema>:<table>   Table export
  FemsCompressionPhase2Job Phase <id> Job <id> Num of parts: <id>    
  GCJob@<identifier> GCJob[<identifier>-EID, CleanupFilePassedToGC, TID= <sid>, transIndex= <id>, CleanupStrategy_MVCC, pagecnt=<pg>, size=<size> 2169283 Persistence garbage collection
  generic BuildIndexJobNode    
    commitOptimize worker (table oid: <id>, attribute <column>) writing 1800 rows   Can be linked to mass changes (e.g. UPSERT)
    ConcatenateAttributesJob    
    RowInsertJobNode    
    EmailJob: working on snapshot_ID: <date> 1999998 E-mail processing, may be blocked by MailSenderCallback semaphore waits, see SAP Note 1999998 for more details.
  getMostFrequentValueInfos GetMostFrequentValueInfosJob: table=<schema>:_SYS_SPLIT_<table>~<id>(<id>),  column=<id> 2112604 Compression optimization
  getNextPQItem UpdateMostFrequentValueCountJob    
  GroupBy-group     Row store GROUP BY / DISTINCT processing
  _handleQuery      
  HashDict WorkerJob HashDict worker <num>% done    
  HashDict MergerJob HashDict merger <id>    
  HTTP POST <connector>
PUT <connector>
  HTTP request processing (xsengine)
  IndexConsistencyCheckData   1977584 Consistency check based on CHECK_TABLE_CONSISTENCY
  IndirectScanBvOutJob<range>
IndirectScanVecOutJob<range>
  2000002 Range scan of column with INDIRECT compression
  initIteratorMultiBtQuery initIteratorMultiBtQuery
  JobConcatAttrCalculator JobConcatAttrCalculator<schema>:<table>en <rows> rows 1986747 Creation of a CONCAT attribute, e.g. during index creation
  JobReadValueIdsParallel      
  makeHeap MakeHeapJob    
  OrderBy-fetch     Row store ORDER BY processing
  execute ParallelForJob
ParallelFor Job
AE/getValues BlockJob on table <schema>:<table> (<id>) (parallelFor job <id>/<id>)
AE/getValues SliceJob on table <schema>:_SYS_SPLIT_<table> (<id>) (parallelFor job <id>/<id>)
   
  calc
PlanExecutor calc
plan<id>@<host>:<port>/pop<id> (<action>) 2000002 Column store plan execution
  PlanExecutor comm      
  postProcessScanVecResult SetRangeVecOutJob 2112604 Compression optimization
  processParallel      
  readIndexChunked readIndexChunked    
  RSVersionCollectorJobNode/Partition<part> remaining versions to collect: <versions> versions (<tx> tx, <flushes> flushes) 2169283 These threads are triggered by the MVCCGarbageCollector garbage collector thread in order to perform parallelized row store version consolidation.
  runJobs ItabMat-insert    
  scanParallel      
  scanWithoutIndex IndScanVecOutJob    
    IndScanBvOutJob    
  searchBetween searchBetween    
  searchDocumentsIterateDocidsParallel JobParallelMgetSearch docs=<val1>(VEC) vid=<val2>(BV) 2000002 Table scan
  ServerJob Unnamed-SimpleJob 2222218 Execution of hdbcons command (caller: NetworkChannelCompletionThread)
  SetAttributeJob Source attribute: <column> dest attribute: <column>   Conversion of BW infocubes from classic to in-memory (BW_CONVERT_CLASSIC_TO_IMO_CUBE)
  SidMappingJob SidMappingJob: source scenario <schema>:<scenario> , mapping to new attribute <column> (<id> values)   Conversion of BW infocubes from classic to in-memory (BW_CONVERT_CLASSIC_TO_IMO_CUBE)
  Sorter RangeSortJob    
  sortBlocksByRowID SortByRowIdJob: table=<schema>:_SYS_SPLIT_<table>~<id> (<id>) 2112604 Compression optimization
  sortRestRanges SortRestRangeJob: table=<schema>:_SYS_SPLIT_/<table> (<id>),  columnId=<id> 2112604 Compression optimization
  sparseSearch SparseScanRangeVecJob 2000002 Scan of column with SPARSE compression
  spSearchValueCountsFastScan     Ad-hoc attribute engine scan for counting values (e.g. when RECORD_COUNT is selected from M_CS_COLUMNS / M_CS_ALL_COLUMNS)
  startLoop      
  system replication: async log buffer handler Unnamed-JobNode 1999880 Handler thread that picks log buffers from the temporary memory buffer and sends them over the network to the secondary site. If asynchronous replication is enabled, this thread is always running (possibly waiting for new log buffers to be created).
  system replication: mark shipped snapshot active Unnamed-JobNode 1999880 After a successful data shipping, the snapshot that has been used as source on the primary site is marked as the active snapshot (last shipped snapshot to the secondary site).
  system replication: prepare transfer data replica to secondary Unnamed-JobNode 1999880 Preparation of a new data shipping from the primary to the secondary site; the snapshot used as source for the data transfer is created on the primary site.
  system replication: check replication snapshots Unnamed-JobNode 1999880 Check whether old replication snapshots can be dropped on the primary site; snapshots are dropped if they are no longer needed or their retention time has elapsed.
  system replication: transfer missing log to secondary Unnamed-JobNode 1999880 Transfer of missing log to the secondary site, which is required to make the backup history complete.
  system replication: transfer replica to secondary Unnamed-JobNode 1999880 Execution of data shipping from the primary to the secondary site.
  WorkerJob worker <schema>:_SYS_SPLIT_<table>~<id>en (merging)   Merging of parallel OLAP engine data sets
  writeDataIntoDelta Creating concat for <schema>:_SYS_SPLIT_<index>$history$$delta_1$en 2057046 Writing of delta storage information
  Zipper      
LoadDataWriter     2222110 Responsible for writing the SAP HANA load history
May be impacted by I/O problems related to the trace file directory (SAP Note 1999930), example call stacks:
__open_nocancel+0x20
__GI__IO_file_open+0x30
_IO_new_file_fopen+0xc7
__fopen_internal+0x80
NameServer::LoadHistoryManager::save
NameServer::LoadDataWriter::run

__close_nocancel+0x7
_IO_new_file_close_it+0x40
_IO_new_fclose+0x180
NameServer::LoadHistoryManager::save
NameServer::LoadDataWriter::run
LoadField <schema>:<table>en AttributeId=<id>
<column>
2127458 Load of columns into column store
LoadPart load <schema>:_SYS_SPLIT_<table>~<id>en 2127458 Load of columns of partitioned table into column store
LobGarbageCollector     2169283 Lob persistence garbage collection
LocalWatchDog ping <host>:<port>   Nameserver ping check of indexserver
LogBackupThread        
Main        
MemoryCompactor     2169283 Memory garbage collection 
MergeAttributeThread prepareDeltaMerge IndexName: <schema>:<table>en Attribute: <id>/<column> 2057046 Preparation of delta merges
MergedogMerger merging <x> of <y> table(s): <table> 2057046 Execution of delta merges
MergedogMonitor joining waiting for merge threads to complete 2057046 Delta merge monitoring (waiting for merge threads)
  checking <x> of <y> table(s): <schema>:_SYS_SPLIT_<table>~<id> 2057046 Delta merge monitoring (checking)
MvccAntiAgerChecker     2169283 Row store version consolidation and garbage collection check
MVCCGarbageCollector     2169283 Row store version consolidation and garbage collection
PartDistributorThread PartDistributorThread   2044468 Access and modification of records into appropriate table partition
PeriodicSavepoint doSavepoint enterCriticalPhase 2100009 Preparations for entering the critical savepoint phase (part of CRITICAL_PHASE_WAIT_TIME in M_SAVEPOINTS, change operations like DML, DDL or merges can be blocked with ConsistentChangeLock, see SAP Note 1999998)
    enterCriticalPhase(waitForLock)
2100009 Preparations for entering the critical savepoint phase (part of CRITICAL_PHASE_WAIT_TIME in M_SAVEPOINTS, change operations like DML, DDL or merges can be blocked with ConsistentChangeLock, see SAP Note 1999998)
    enterPostCriticalPhase 2100009 Savepoint processing after critical phase
    exitCriticalPhase 2100009 End of critical savepoint phase
    finishSavepoint 2100009 Final savepoint steps
    flushPagesinNonCriticalPhase 2100009 Savepoints writing dirty pages to disk
    preCriticalPhase 2100009 Savepoint activity before starting critical phase
    prepareSavepoint 2100009 Savepoint preparation
    processCriticalPhase 2100009 Critical savepoint phase (CRITICAL_PHASE_DURATION in M_SAVEPOINTS, change operations like DML, DDL or merges can be blocked with ConsistentChangeLock, see SAP Note 1999998)
PhraseIndexBuilder update_phrases <schema>:<table>en.<column> 2160391 Maintenance of fulltext indexes
Queue indexing <schema>:<table>en.<column>  2160391 Indexing of fulltext indexes
RemoteService        
Request BackupManager      
  bwFlattenScenarioPart     Conversion of BW infocubes from classic to in-memory (BW_CONVERT_CLASSIC_TO_IMO_CUBE)
  cachemgr      
  CheckRemoteUniqueConstraint   2044468 Check of uniqueness in other partitions / hosts, should be avoided if possible (e.g. by avoiding partitioning or by performing partitioning based on unique / primary key columns)
Avoid partitioning of SID tables in BW, because they typically have two different unique indexes. For data management of SID tables see SAP Note 1331403.
  csddl/TruncateTableData      
  remotediskinfo     Collection of information for local physical disk by nameserver (NameServer::doRemoteDiskInfo) in order to support a remote request
  distMetdataRequest     Distributed metadata request
  DistRowQueryExecute      
  DistTableStatistics      
  dso/activatePers   2000002 DSO activation via DSO_ACTIVATE_PERSISTED
  executeTableUpdateContainer      
  fs      
  getColumnStat     Determination of column statistics
  __globalTransControl     Control of global transactions involving different services
  _IDX_BackupExecutor   1642148 Backup executor threads, typically triggered by BackupMgr_ExecuteDataSaveJob slaves
  __nsWatchdog     Triggered by nameserver watchdog (LocalWatchDog)
  LOBREMOTE <schema>:<table>en/getContainerIdsFromTable 2220627 LOB container access
  memoryutilization      
  ngdb_console runtimedump dump 1813020 Creation of a runtime dump using MANAGEMENT_CONSOLE_PROC / hdbcons
  OptimizeCompressionData   2112604 Compression optimization (triggering JobWorker processes doing the actual work)
  planexecution     X2 plan execution
  prep/indexing TEXT/<schema>:<table>en.<column>:<id> 2160391 Indexing of fulltext indexes by preprocessor
  Queue Master      
  Queue Pull     Preprocessor request to indexserver for job workers that can be used for preprocessor activities like 'indexing'
  reclaim_data_volume     Execution of 'ALTER SYSTEM RECLAIM DATAVOLUME' command
  RemotePersistenceNoSession      
  search <schema>:<table>/<conditions>    
  __sessionRequest      
  splitIndex <schema>:<table>   Repartitioning of tables; triggers threads like type 'Split' with method 'split/merge attribute'
  stat
core/stat
     
SaveMergedAttributeThread prepareDeltaMergeSave IndexName: <schema>:<table>en Attribute: <id>/<column> 2057046 Preparation of persisting delta merges to disk
Savepoint       Savepoint execution
SelfWatchDog     1999998 Watchdog thread that regularly checks for the existence of the nameserver topology lock file
An increased amount of SelfWatchDog activities can indicate issues accessing the lock file. See SAP Note 1999998 for more details.
service thread sampler interval 1 sec.   2114710 Collection of M_SERVICE_THREAD_SAMPLES information (1 second sampling interval)
SignalQueue        
SmartMerger waiting <schema>:_SYS_SPLIT_<table>~1en 2057046 Execution of smart merges
Split split/merge attribute <schema>:<table> <column>   Processing of a column during table repartitioning, triggered by threads with method 'splitIndex' (see above)
SqlExecutor Authentication     Password check and update of access information like "last login time" in SAP HANA tables
  BatchExecute (batch size:<size>) <sql_statement> 2000002 Execution of a batch SQL statement (e.g. bulk INSERT inserting multiple rows)
  CloseStatement   2000002 Closing of a SQL statement
  CollectVersion     Row store version collection
  CommitTrans   2000000 Execution of a COMMIT
  CursorFetchItab   2000002  
  ExecQidItab   2000002  
  ExecutePrepared <sql_statement> 2000002 Execution of a prepared SQL statement (using bind variables)
  ExecuteStatement <sql_statement> 2000002 Execution of a SQL statement
  FetchCursor <sql_statement> 2000002 Execution of a SQL statement
  LobPutPiece   2220627  
  PrepareStatement <sql_statement> 2000002 Preparation of a SQL statement
SystemReplicationIniFileWatchDog     1999880 Responsible for detecting differences in configuration parameter settings between primary and secondary site
TableReload  reloading table <schema>:<table>en 2127458 Reload of table columns into memory by triggering LoadField threads
WorkerThread (StatisticsServer)   running ID (<id>, deletion: no) for <date> 1917938 Activity of embedded statistics server
XS       Execution of XS engine requests

7. How can I influence the collection and historization of thread samples?

In general it is not necessary to manually adjust the collection and historization of thread samples. In specific scenarios you can adjust the following settings based on your needs:
  • global.ini -> [resource_tracking] -> cpu_time_measurement_mode
    Controls the collection of CPU time information (CPU_TIME_SELF, CPU_TIME_CUMULATIVE):
    off: No collection of CPU information (default)
    on: Collection of CPU information
  • global.ini -> [resource_tracking] -> service_thread_sampling_monitor_enabled
    Controls the collection of thread samples in M_SERVICE_THREAD_SAMPLES:
    true: Thread samples are collected (default)
    false: No thread samples are collected
  • global.ini -> [resource_tracking] -> service_thread_sampling_monitor_max_sample_lifetime
    global.ini -> [resource_tracking] -> service_thread_sampling_monitor_max_samples
    Control the availability of data in M_SERVICE_THREAD_SAMPLES. See SAP Note 2088971 for more information.
  • global.ini -> [resource_tracking] -> service_thread_sampling_monitor_sample_interval
    Controls the thread sampling interval for M_SERVICE_THREAD_SAMPLES in seconds. The value must be between 1 (default) and 100.
  • global.ini -> [resource_tracking] -> service_thread_sampling_monitor_thread_detail_enabled
    Controls the recording of THREAD_DETAIL information in M_SERVICE_THREAD_SAMPLES:
    false: No collection of thread details (default)
    true: Collection of thread details
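As an example, the recording of thread details could be enabled online as follows; a sketch assuming the 'SYSTEM' configuration layer (other layers are possible):

  -- Sketch: enable THREAD_DETAIL recording ('SYSTEM' layer assumed)
  ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
    SET ('resource_tracking', 'service_thread_sampling_monitor_thread_detail_enabled') = 'true'
    WITH RECONFIGURE;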

8. What are typical thread constellations?

There are often caller / calling dependencies between a set of threads. Below you can find typical examples.
Example 1: "Complex" column store SQL statement execution
A SQL statement is in the first place executed by a SqlExecutor thread, but this thread may delegate work to JobWorker threads, and JobWorkers may in turn call additional JobWorkers. In this example we can see a hierarchy of four levels: the main SqlExecutor thread 151759 calls the JobWorker thread 111596, this thread calls JobWorker thread 92120, and this thread calls the four JobWorker threads 37304, 111416, 111437 and 111609. Only the four last threads are active at the sample time ("Running") while the other threads are in state "Job Exec Waiting".
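Such hierarchies can be made visible via the CALLER / CALLING columns described in question 4; a minimal sketch listing the dependencies of the currently active threads:

  -- Sketch: caller / calling relations of currently active threads
  SELECT HOST, PORT, THREAD_ID, THREAD_TYPE, THREAD_STATE, CALLER, CALLING
  FROM M_SERVICE_THREADS
  WHERE IS_ACTIVE = 'TRUE'
  ORDER BY HOST, PORT, THREAD_ID;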

Example 2: Delta merge
Delta merges are performed by various merge-related threads supervised by the MergedogMonitor. In the following case the MergeAttributeThreads are active ("Running"), processing columns MANDANT and $trexexternalkey$ of table BALDAT, while the other involved threads have to wait for them to finish. When they are finished, the SaveMergedAttributeThread will take over and finish the merge.
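The involved threads can be watched while a merge is running; a minimal sketch filtering on merge-related thread types (the LIKE pattern is a simplification):

  -- Sketch: currently active merge-related threads
  SELECT HOST, PORT, THREAD_ID, THREAD_TYPE, THREAD_METHOD, THREAD_DETAIL
  FROM M_SERVICE_THREADS
  WHERE THREAD_TYPE LIKE '%Merge%'  -- simplification, matches Merge* thread types
    AND IS_ACTIVE = 'TRUE';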

Example 3: Nameserver ping
The nameserver regularly executes ping operations to the indexserver ("Index Server" -> "Ping Time" in the Load Graph). This happens with the following intra-node process communication. The first line is the nameserver thread that sent the "ping" request to the indexserver thread in the second line.

Example 4: Column load
When a query requires data from a column that isn't loaded into memory yet, it has to wait for the related column load ("AttributeStore Resource Load"). The following table shows an example where queries have to wait until column DATA of table SWWCNTP0 is loaded.
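Whether the columns of a table are already loaded can be cross-checked in monitoring view M_CS_COLUMNS; a sketch with '<schema>' as a placeholder:

  -- Sketch: load state of the columns of one table
  -- (LOADED is e.g. 'TRUE', 'FALSE' or 'PARTIALLY')
  SELECT COLUMN_NAME, LOADED
  FROM M_CS_COLUMNS
  WHERE SCHEMA_NAME = '<schema>'  -- placeholder
    AND TABLE_NAME = 'SWWCNTP0';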

9. Are there case studies showing the possibilities of thread samples?

Analyzing thread samples is an integral part of SAP HANA performance analysis. The following real-life cases show typical evaluations.
Case study 1: Long-running INSERT due to critical savepoint phases
A simple INSERT of one record had a high average execution time of 10 ms without any apparent issue visible in the SQL cache. We analyzed the thread samples for the related statement hash in terms of THREAD_STATE on a granular time-slice basis of 1 second, and the result showed time frames of around 8 seconds where multiple concurrent INSERTs waited for "SharedLock Enter" (see below). Based on SAP Note 1999998 this can be caused by blocking situations due to the critical savepoint phase. A check based on M_SAVEPOINTS indeed showed repeated critical savepoint phases of 8 seconds and more. The customer double-checked the I/O performance (SAP Note 1999930) and was able to improve it with technical changes. Afterwards the INSERT times improved significantly.
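The M_SAVEPOINTS cross-check can be scripted; a minimal sketch listing savepoints with a long critical phase, assuming microsecond units for the duration columns (the threshold is an example value):

  -- Sketch: savepoints with a critical phase longer than ~1 second
  SELECT START_TIME, CRITICAL_PHASE_WAIT_TIME, CRITICAL_PHASE_DURATION
  FROM M_SAVEPOINTS
  WHERE CRITICAL_PHASE_DURATION > 1000000  -- assumption: microseconds
  ORDER BY START_TIME DESC;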

Case study 2: High system CPU consumption due to garbage collectors
Spikes of system CPU consumption were observed, e.g. between 10:03 and 10:04. SQL: "HANA_Threads_ThreadSamples_AggregationPerTimeSlice" (AGGREGATE_BY = 'THREAD_METHOD', TIME_SLICE_S = 10) reported a high number of concurrent garbage collectors ('GCJob') as seen below, so we can conclude that they are responsible for the increased CPU consumption. If these kinds of CPU spikes are critical for the system, the parallelism of the garbage collectors could be reduced using the max_gc_parallelity parameter (see SAP Note 2100040).
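A reduced variant of this evaluation, grouping the statistics server history into one-minute slices (coarser than the 10-second slices used above), could look like this:

  -- Sketch: thread methods per one-minute time slice
  SELECT TO_VARCHAR(TIMESTAMP, 'YYYY/MM/DD HH24:MI') AS TIME_SLICE,
         THREAD_METHOD, COUNT(*) AS SAMPLES
  FROM _SYS_STATISTICS.HOST_SERVICE_THREAD_SAMPLES
  WHERE TIMESTAMP BETWEEN '<begin_time>' AND '<end_time>'  -- placeholders
  GROUP BY TO_VARCHAR(TIMESTAMP, 'YYYY/MM/DD HH24:MI'), THREAD_METHOD
  ORDER BY TIME_SLICE, SAMPLES DESC;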

Case study 3: High user CPU consumption caused by specific application users
The system suffered from massive CPU peaks during certain time frames. Using SQL: "HANA_Threads_ThreadSamples_AggregationPerTimeSlice" (AGGREGATE_BY = 'APP_USER', TIME_SLICE_S = 60) we were able to identify the application users responsible for the high load (see below). In the next step the activity of these users was analyzed by checking related batch jobs and asking the users about their online operations, and technical optimizations were implemented.

Case study 4: Top SQL statements responsible for load in a specific time frame
A popular evaluation of thread samples is the determination of the SQL statements responsible for the highest SAP HANA load (see SAP Note 2000002). This can be done via SQL: "HANA_Threads_ThreadSamples_FilterAndAggregation" (AGGREGATE_BY = 'STATEMENT_HASH'). In the example below we can see top SQL statements like statement hash 696d5f902df56c3228a80bd25ef71ee3 that can subsequently be analyzed and tuned as described in SAP Note 2000002. Thread samples without a specific statement hash are reported as "no SQL" and the thread method or thread type is added in brackets. "no SQL (SqlExecutor)" can be linked to thread methods like "ExecQidItab" or "CommitTrans" which don't have a SQL statement assigned.
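Because each sample roughly represents one second of one busy thread (with the default sampling interval of 1 second), counting samples per statement hash approximates the load share; a minimal sketch:

  -- Sketch: top statement hashes by number of thread samples
  SELECT STATEMENT_HASH, COUNT(*) AS SAMPLES
  FROM M_SERVICE_THREAD_SAMPLES
  GROUP BY STATEMENT_HASH
  ORDER BY SAMPLES DESC
  LIMIT 20;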

Case study 5: High system load due to table scans
During times of very high system load a lot of threads with THREAD_METHOD 'searchDocumentsIterateDocidsParallel' were observed in M_SERVICE_THREADS (see below). This method indicates table scans and usually points to a missing index. Further analysis revealed that one central, frequently executed SQL statement wasn't supported by an index. After having created an appropriate index, the maximum CPU consumption went down from 100 % to 10 %.
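The statements behind such scan threads can be identified by filtering the samples on the thread method; a sketch:

  -- Sketch: statement hashes linked to parallel table scan threads
  SELECT STATEMENT_HASH, COUNT(*) AS SAMPLES
  FROM M_SERVICE_THREAD_SAMPLES
  WHERE THREAD_METHOD = 'searchDocumentsIterateDocidsParallel'
  GROUP BY STATEMENT_HASH
  ORDER BY SAMPLES DESC;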





Header Data

Released On 01.02.2016 11:27:23
Release Status Released to Customer
Component HAN-DB SAP HANA Database
Priority Normal
Category How To
