Symptom
You are interested in details about indexes in SAP HANA environments.
Environment
SAP HANA
Cause
1. What are SAP HANA indexes in the context of this SAP Note?
2. What is the purpose of indexes in SAP HANA environments?
3. Which special aspects and limitations have to be considered with indexes?
4. Where can I find information about existing indexes?
5. What kind of indexes exist in SAP HANA environments?
6. Which general recommendations exist for individually created indexes?
7. Are there tools available which automatically suggest useful indexes?
8. Which DDL operations can be performed on indexes?
9. How are indexes stored in column store?
10. Are indexes persisted to disk?
11. Why do I need single column indexes on column store tables although the column dictionary is already sorted?
12. Where can I see if an indexes is used by a certain SQL statement?
13. Are indexes dedicated storage objects?
14. What happens to indexes when I move a table between row and column store?
15. Are accesses to the underlying table locked when an index is created?
2. What is the purpose of indexes in SAP HANA environments?
3. Which special aspects and limitations have to be considered with indexes?
4. Where can I find information about existing indexes?
5. What kind of indexes exist in SAP HANA environments?
6. Which general recommendations exist for individually created indexes?
7. Are there tools available which automatically suggest useful indexes?
8. Which DDL operations can be performed on indexes?
9. How are indexes stored in column store?
10. Are indexes persisted to disk?
11. Why do I need single column indexes on column store tables although the column dictionary is already sorted?
12. Where can I see if an indexes is used by a certain SQL statement?
13. Are indexes dedicated storage objects?
14. What happens to indexes when I move a table between row and column store?
15. Are accesses to the underlying table locked when an index is created?
Resolution
1. What are SAP HANA indexes in the context of this SAP Note?
SAP HANA indexes in the context of this SAP Note are data structures on tables, that can provide efficient table data access and / or guarantee uniqueness. This index concept is very common for relational databases.In general the term "index" in SAP HANA can also mean "table" or "data". For example, the main SAP HANA process "indexserver" obviously works on data in the first place and not on indexes.
2. What is the purpose of indexes in SAP HANA environments?
The main reasons for creating indexes in SAP HANA environments are:- Efficient table data access (see SAP Note 2000002)
- Efficient support of unique and primary key constraints
3. Which special aspects and limitations have to be considered with indexes?
For an efficient index design it is good to know the following aspects and limitations of indexes:Scenario | Details |
No support of MIN and MAX | Although the index (and the underlying dictionary) provides sorted data, it is not possible to retrieve the minimum or the maximum value directly. Instead the whole data has to be scanned to find the minimum / maximum. Main reason for this technical restriction is the fact that two separate data areas exist (main and delta storage) and so no globally sorted data set is available. |
No support of ORDER BY | Although the index (and the underlying dictionary) provides sorted data, it is not possible to support an ORDER BY with an index. Instead always an explicit sorting needs to be done. Main reason for this technical restriction is the fact that two separate data areas exist (main and delta storage) and so no globally sorted data set is available. |
Risk of performance regressions with INVERTED HASH indexes | Hash collisions or range conditions on index columns can result in performance regressions when using INVERTED HASH indexes. See SAP Note 2109355 for more details. |
Increased memory requirements of FULLTEXT indexes | Due to the fuzzy content of FULLTEXT indexes the memory requirements can be significantly higher than for normal indexes. |
Indexes on PREFIXED and SPARSE columns may not be used | For technical reasons indexes on columns with PREFIXED or SPARSE compression may not be used for efficient data access. See SAP Note 2000002 for more information. |
Implicit indexes when creating primary key | When a unique or primary index is created on multiple columns of a column store table, an (unnamed) implicit single column index is created on all individual columns. If for example a primary key is created on columns MANDT, BELNR and POSNR, implicit single column indexes are created on column MANDT, on column BELNR and on column POSNR. |
4. Where can I find information about existing indexes?
Information about indexes is available in the following SAP HANA tables and monitoring views:Table name | Details |
FULLTEXT_INDEXES | Fulltext indexes |
GEOCODE_INDEXES | Geocode indexes |
INDEXES | All existing indexes |
INDEX_COLUMNS | Columns of all existing indexes |
M_CS_INDEXES | Column store indexes |
M_FUZZY_SEARCH_INDEXES | Fuzzy search indexes |
M_RS_INDEXES | Row store indexes |
SQL statement | Details |
SQL: "HANA_Indexes_Columns" | Index columns |
SQL: "HANA_Indexes_ColumnStore_IndexesOnSparseAndPrefixedColumns" | Shows single column indexes on columns with SPARSE or PREFIXED compression type (which may not provide performance benefit, see SAP Note 2000002 for more information) |
SQL: "HANA_Indexes_ColumnStore_RedundantIndexes" | Shows redundant single column indexes (which are already implicitly created on columns of primary key or unique indexes) |
SQL: "HANA_Indexes_HashCollisions" | Hash collisions of INVERTED HASH indexes (SAP Note 2109355) |
SQL: "HANA_Indexes_LargestIndexes" | Overview of largest indexes |
SQL: "HANA_RowStore_TotalIndexSize" | Calculation of total row store index size and comparison with Pool/RowEngine/CpbTree heap allocator in order to detect memory leak |
5. What kind of indexes exist in SAP HANA environments?
The following indexes are available in SAP HANA environments:Store | Index type | SAP Note | Details | Creation command |
Row store | BTREE [UNIQUE] | B*tree index on row store table |
CREATE [UNIQUE] BTREE INDEX ... |
|
Row store | CPBTREE [UNIQUE] | 2112604 | B*tree index with compressed prefix on row store table |
CREATE [UNIQUE] CPBTREE INDEX ... |
Column store | FULLTEXT | Fulltext index |
CREATE FULLTEXT INDEX ... |
|
Column store | INVERTED HASH [UNIQUE] | 2109355 |
INVERTED HASH index, more memory efficient alternative to INVERTED VALUE indexes Maps column dictionary value IDs to row IDs, no B*tree structure |
CREATE [UNIQUE] INVERTED HASH INDEX... |
Column store | INVERTED VALUE [UNIQUE] |
INVERTED VALUE index, standard column store index that maps value IDs of dictionary to row IDs of column Maps column dictionary value IDs to row IDs, no B*tree structure |
CREATE [UNIQUE] INVERTED VALUE INDEX ... |
6. Which general recommendations exist for individually created indexes?
The following general recommendations should be considered when creating indexes individually:Recommendation | Details |
As few indexes as possible | Every index imposes overhead in terms of space and performance, so you should create as few indexes as possible. |
As small indexes as possible | Specify as few columns as possible in an index, so that the space overhead is minimized. |
Prefer single column indexes in column store |
Single column indexes in column store have much less space overhead,
because they are implemented as rather small additional column data
structure. Therefore you should use single column indexes whenever
possible. Due to the in-memory approach it is typically fine to define an index only on the most selective column in SAP HANA environments, while on other relational databases often only a multi-column index provides optimal performance. |
7. Are there tools available which automatically suggest useful indexes?
While the individual design of secondary indexes is typically an outcome of SQL optimization there are already the following general approaches available to determine useful secondary indexes:Area | Details |
SAP Suite on HANA | See SAP Note 1794297 that provides some reports for identifying useful indexes. |
SAP Bank Analyzer | See SAP Note 2015986 for general suggestions on index design in Bank Analyzer environments. |
SAP FI-CA | See SAP Note 2160560 for useful single column indexes in FI-CA environments. These indexes are created per default as of SAP ERP 6.0 EhP7 SP09. |
SAP IS-U | See SAP Note 2226904 for useful single column indexes in IS-U environments. |
General | $DIR_INSTANCE/exe/python_support/indexAdvisor.py is a Python script that identifies useful indexes. |
8. Which DDL operations can be performed on indexes?
Important DDL operations on indexes are:Operation | Command |
Create |
CREATE ... INDEX "<index_name>" ON "<table_name>" ... |
Drop |
DROP INDEX "<index_name>" |
Rebuild |
ALTER INDEX "<index_name>" REBUILDOnly relevant for row store, as column store indexes are automatically rebuilt during delta merge operations. |
Rename |
RENAME INDEX "<index_name>" TO "<new_index_name>" |
9. How are indexes stored in column store?
Single-column indexes in column store are rather light-weight data structures on top of the column structure, so called inverted indexes.Multi-column indexes in column store are stored as internal columns, so called CONCAT attributes. See SAP Note 1986747 for more information how multi-column indexes are stored as internal columns. On a CONCAT attribute columns also an inverted index is created, just like for columns being used by a single-column index.
10. Are indexes persisted to disk?
The following overview shows which kind of indexes are persisted to disk:Store | Index type | SPS | Detail | ||||||
Column store | Multi column indexes | <= 06 | Only maintained in memory, recreated during column load | ||||||
>= 07 |
Column specific inverted index structures recreated during load The following parameter controls if CONCAT attributes are persisted to disk:
|
||||||||
Column store | Single column indexes | Only maintained in memory, recreated during column load | |||||||
Row store | Only maintained in memory, recreated during startup |
11. Why do I need single column indexes on column store tables although the column dictionary is already sorted?
The column dictionary contains the existing column values in a sorted way, but it doesn't contain the information, in which rows of the table a certain value exists. This mapping from the dictionary value ID to the related table row IDs is only available via an index ("inverted index"). Without index, the whole column has to be scanned for a specific value.The following picture illustrates the direct mapping of dictionary value IDs to table row IDs via an inverted index (right hand side):
12. Where can I see if an indexes is used by a certain SQL statement?
This information is available via PlanViz (see SAP Note 2073964).Example: (inverted index on column X used for access)
13. Are indexes dedicated storage objects?
On other databases administrators are used to consider indexes as dedicated storage objects like segments (e.g. DBA_SEGMENTS entries with SEGMENT_TYPE = 'INDEX' on Oracle). This doesn't apply for SAP HANA. As already seen above, indexes are extensions to column structures (inverted indexes) or internal columns (e.g. CONCAT attributes, TREX external key, see SAP Note 1986747), and the allocated space is in the first place purely linked to the underlying table. Only with specific analysis tools like SQL: "HANA_Tables_LargestTables" or SQL: "HANA_Indexes_LargestIndexes" (SAP Note 1969700) it is possible to understand better how much space is allocated by index structure.14. What happens to indexes when a table is moved between row store and column store?
Existing indexes are kept unchanged when a table is moved from row store to column store or vice versa. Only the internal representation is changed (row store: CPBTREE, column store: INVERTED). Taking into account that an optimal index design is different for row store (more indexes on more columns) and column store (less indexes, typically on a single column), you can adjust the index design individually when you move large tables between the stores. Otherwise the memory utilization or performance may not be optimal.15. Are accesses to the underlying table locked when an index is created?
When an index is created, the underlying table is locked against changes with a transactional object lock. See SAP Note 1999998 ("How can transactional lock waits be analyzed and optimized?") for more information and possible optimizations like setting a reduced transaction lock wait timeout:SET TRANSACTION LOCK WAIT TIMEOUT <timeout_in_ms>In addition the following CREATE INDEX specific command options can be used:
Option | Validity | Details |
ONLINE | Rev. >= 110 row store |
The creation of the index no longer requires a permanent table lock. |
NOWAIT | Rev. >= 110 |
If no exclusive table lock can be acquired immediately (e.g. because there are still open changes on the underlying table), the CREATE INDEX will fail with the following error: SQL error 146: Resource busy and NOWAIT specified: (lock table failed) |
Header Data
Released On | 13.01.2016 11:39:14 |
Release Status | Released to Customer |
Component | HAN-DB SAP HANA Database |
Priority | Normal |
Category | How To |
It's so nice article thank you for sharing a valuable content
ReplyDeleteSAP HANA Online Access
Hello There,
ReplyDeleteAwesome article. Thanks for making that available. I've been using your help to build my own POC and will publish the steps in another blog soon.
I have always worked on graphical Hana VDM modelling and SQL scripting. Now I now the entire development focus is shifting to CDs views. I have situation where the graphical views already created needs to be wrapped in a ABAP CDS view. Is it possible to use calculation views inside a CDS definition? I am not getting exactly what syntax should I use to call the calculation view. Would it be same as calling a schema table? Or do we need to wrap the calculation view in external views then use that external views in the CDS views.
Can you please let me know your opinion or suggest me the directions which I need to explore?
Appreciate your effort for making such useful blogs and helping the community.
Regards,
Sam
Hello There,
ReplyDeleteThank You so much for this blog. It helped me lot. I am a Technical Recruiter by profession and first time working on this technology was bit tough for me, this article really helped me a lot to understand the details to get started with.
I heard a buzzword "Optimistic Latch-Free Index Traversal" in one blog post on Introduction to SAP HANA by Dr. Vishal Sikka. I googled about it in context of SAP HANA Architecture but didn't find anything concrete. Could someone please explain this whole concept thoroughly, in context of SAP HANA?
The key distinctions between HANA and the previous generation of SAP systems, are that it is a column-oriented, in-memory database that combines OLAP and OLTP operations into a single system SAP HANA Training . Thus, in general, SAP HANA is an OLTAP system.
By the way do you have any YouTube videos, would love to watch it. I would like to connect you on LinkedIn, great to have experts like you in my connection (In case, if you don’t have any issues).
Please keep providing such valuable information.
Best Regards,
Kevin