Saturday, February 6, 2016

2079372 - SAP HANA - Statement Routing

Symptom
Performance issues due to high inter-node communication when executing SQL queries on a distributed SAP HANA system.


Other Terms
SAP HANA, Performance, Scale-Out, Statement Routing, Prepared Statements, Inter-node Communication


Reason and Prerequisites
Client applications communicate with a SAP HANA system via a client library (SQLDBC, JDBC, ODBC, DBSL, ODBO ...) for SQL or MDX access from different platforms and types of clients. In distributed systems, the application has a logical connection to the SAP HANA system; the client library may use multiple connections to different hosts or change the underlying connection over time.
The SQL client libraries will connect to the first available host specified in the connect string, which can contain single hosts or a list of hosts. All hosts that could become active master, because they are one of the configured master candidates, should be listed there to allow initial connect to any of them in case of a host auto-failover. As result the client libraries receive a list of all hosts. During operation the statements may be sent to any of these hosts. The client connection code uses a "round-robin" approach to reconnect and ensures that these clients can reach the SAP HANA database. In the failover case a list of hosts is parsed until the first accessible host is found.
To ensure transactional consistency in distributed setups, SAP HANA supports distributed transactions, which are transactions that involve data that is stored on different index servers. In this context the client library supports load balancing by connection selection and minimizes communication overhead by statement routing based on information about the location of data. Depending on the setting of ini parameter client_distribution_mode in the indexserver.ini, connection selection and/or statement routing are executed (the default setting is statement for statement routing).


Solution
Statement routing helps to optimize performance by reducing the amount of data that must be sent over the internal network zone. In principle, clients can send requests to any index server in a distributed system. This, however, could lead to a performance decrease, if, for example, a query involving data located on server1 is sent to server2. Server2 would have to forward the query execution to server1 and forward the response from server1 to the client. This overhead can be avoided if the client directly sends the query to server1. With the usage of prepared statements this can be achieved.
A prepared statement is a (typically parameterized) SQL statement that is sent to the server where it is precompiled. The prepared statement is returned to the client. For execution the client sends the prepared statement (usually filled with the actual parameters) to the server where (most of) the data is located by opening a new connection.
This two-step protocol (prepare and execute) is used for statement routing in a distributed system, where the “prepare” request may return information about the preferred locations for executing the statement. Based on this information, the client library sends requests for executing the prepared statement directly to one of the indicated index servers. The response for an “execute” request may also contain location information, if the situation has changed since the “prepare” request was processed. Statement routing is done by the client library and is transparent to the application. With statement routing, different statements of the same transaction may be executed on different servers.
How to enable statement routing (please check the attached .pdf document for details):
a) Server parameter (please refer to SAP note 2036111):
The client distribution mode can be controlled via the following INI parameter:
indexserver.ini ( [distribution] client_distribution_mode = [ off | connection | statement | all ] )
off all distribution features are disabled
connection client side load balancing is enabled
statement statement routing is enabled
all all distribution features are enabled
When enabled, you can check the distribution mode at M_INIFILE_CONTENTS and M_CONNECTIONS for each connection.
For optimal connection performance, the connection property DISTRIBUTION should be the same level as the server client_distribution_mode.
As of SAP HANA SPS 11, in order to enable the "all" distribution mode, the connection property DISTRIBUTION must be set to "ALL" (see Configuration combination between client and server section below).

b) Client parameter (client SAP HANA SPS <= 10):
The client distribution mode can be changed or disabled via the following connection property:
Connection property ( DISTRIBUTION = [ OFF | CONNECTION | STATEMENT_ONLY | STATEMENT ] )
OFF The topology is not used. The interface connects exactly to the specified hosts.
CONNECTION A different node is used for each connection.
STATEMENT_ONLY The different nodes are used on a statement level.
STATEMENT Combination of CONNECTION and STATEMENT_ONLY. This is the default mode for connections when nothing is specified.

c) Client parameter (client SAP HANA SPS >= 11):
The client distribution mode can be changed or disabled via the following connection property:
Connection property ( DISTRIBUTION = [ OFF | CONNECTION | STATEMENT | ALL ] )
OFF The topology is not used. The interface connects exactly to the specified hosts.
CONNECTION A different node is used for each connection.
STATEMENT The different nodes are used on a statement level. This is the default mode for connections when nothing is specified.
ALL Combination of CONNECTION and STATEMENT.

d) Configuration combination between client and server (SAP HANA SPS <= 10):
Client
Server
Enabled Feature at Connections
*
off
<disabled>
CONNECTION
connection
<connection distribution enabled>
CONNECTION
statement
<connection distribution enabled>
CONNECTION
all
<connection distribution enabled>
OFF
connection
<disabled>
OFF
statement
<disabled>
OFF
all
<disabled>
STATEMENT
connection
<connection distribution enabled>
STATEMENT
statement
<statement routing enabled>
STATEMENT
all
<both enabled>
STATEMENT_ONLY
connection
<connection distribution enabled>
STATEMENT_ONLY
statement
<statement routing enabled>
STATEMENT_ONLY
all
<statement routing enabled>

e) Configuration combination between client and server (SAP HANA SPS >= 11):
Client
Server
Enabled Feature at Connections
*
off
<disabled>
OFF
*
<disabled>
CONNECTION
connection
<connection distribution enabled>
CONNECTION
statement
<disabled>
CONNECTION
all
<connection distribution enabled>
STATEMENT
connection
<disabled>
STATEMENT
statement
<statement routing enabled>
STATEMENT
all
<statement routing enabled>
ALL
connection
<connection distribution enabled>
ALL
statement
<statement routing enabled>
ALL
all
<both enabled>

Scenarios:
a) ABAP on HANA:
- Open SQL:
Open SQL statements are converted into database-specific SQL in the Open SQL interface of the database interface, and are then sent to the database system, where they are executed.
Most ABAP code uses Open SQL for DB queries. By default, Open SQL statements are internally converted into prepared statements.
- Native SQL:
Native SQL statements are transferred unchanged from the Native SQL interface to a database system, and executed there.
The full SQL language scope of the relevant database can be used and the addressed database tables do not have to be declared in the ABAP Dictionary.
Native SQL is always hard-coded and input is passed to placeholders as in prepared statements.
- ADBC - ABAP Database Connectivity:
ADBC is an object orianted API for the Native SQL interface of the AS ABAP that allows to send database specific SQL commands to a database system and process the result.
The class CL_SQL_PREPARED_STATEMENT makes it possible to execute prepared SQL statements with different parameters passed to it.

b) SAP HANA Native:
- SAP HANA Extended Application Services (XS) applications:
When implementing SAP HANA XS applications you use for the server-side procedural logic JavaScript, i.e. the XS JavaScript API which supports prepared statements (see attachment - js_sample_code_snippet.txt).
Another option to query the data from the database is to use OData. SAP HANA allows generating OData REST services from any existing SAP HANA Table or View. And such an OData service call is then transformed/translated into a prepared statement on HANA DB level.
- HANA Stored Procedures / SQLScript Views:
When programming stored procedures or scripted calculation views to call other HANA artefacts no explicit prepare statement is necessary as this will be done by the indexserver.
However as for all artifacts the application calling the procedure/HANA model needs to send prepared statements. That means that the CALL statement can be prepared in that way which would make it appropriate for the calling interfaces (Load Services, chron jobs, python test scripts, etc.).

c) BI Solutions:
- Business Intelligence Consumer Services (BICS, e.g. used by SAP Design Studio, A-Office):
The BICS component always uses prepared statements against SAP HANA, as it is also a (priority high) security requirement. It is not possible to switch this off.
- BusinessObjects Web Intelligence:
SAP Note 1963887 - BusinessObjects Web Intelligence sends unprepared statements by default to HANA which prevents usage of HANA statement routing in HANA scale out landscapes.
- BusinessObjects Analysis, edition for Microsoft Office:
SAP Note 2141024 - SBOP ANALYSIS MS OFFICE 1.4 (AAO) SP11 and lower versions might send unprepared statements to HANA which prevents usage of HANA statement routing in HANA scale out landscapes.

d) Data Provisioning Solutions:
- SAP Data Services:
SAP Data Services uses for the communication with SAP HANA the ODBC interface and in most cases prepared statements are executed.



Header Data

Released On 22.11.2015 19:56:38
Release Status Released for Customer
Component HAN-DB SAP HANA Database
Other Components
HAN-DB-CLI SAP HANA Clients (JDBC, ODBC)
Priority Recommendations / Additional Info
Category FAQ

No comments:

Post a Comment