The month of October is the perfect time to reflect on your business strategy. What has been working? What hasn’t? What needs to change going into the holiday season?
As many customers start evaluating SAP HANA for their businesses, one of the key things everyone wants to understand is how migrating data from their legacy systems actually works. Also on their must-know list is how to process operational data that sits in various systems in their IT landscape and transfer it to the SAP HANA platform. Best-of-breed methods are already in place to replicate data, in real time or in batches, from any source system to the SAP HANA database.
Data provisioning is the process of developing, preparing, and allowing a network to offer data to its user. Data has to be loaded to SAP HANA before it arrives at the user end by way of a front-end tool. Data provisioning includes transporting data from various SAP and non-SAP systems into SAP HANA.
Notice how there seem to be external tools for almost anything now? There are. In this case, SAP-certified external tools are available to help you integrate or replicate data into the HANA system. These external tools, however, require additional licensing and a completely independent infrastructure. Alongside them are built-in data provisioning methods that do not require separate infrastructure but may involve additional licensing costs. These components are used to acquire data from SAP and non-SAP sources – in real time or in batches.
SAP Landscape Transformation (SLT): Simple to install and configure, SLT works with both SAP and non-SAP source systems on any SAP-supported database. The main advantage of SLT is that it uses the ABAP stack and hence can read cluster and pool tables from older systems such as ECC. SLT is mostly used when the source system is SAP ECC, SAP CRM, and so on, and it supports two scenarios: N:1 (several source systems into one HANA target) and 1:N (one SAP source system into several HANA targets, maximum 4).
SLT can be scaled to handle very large transaction volumes. It is integrated with the HANA system and runs on the SAP NetWeaver platform.
It uses database triggers to identify changes and records information about them in logging tables. SLT supports complex transformations and filtering of data before it is loaded into the HANA database. It also supports scheduled data replication into SAP BW, which minimizes the size of overnight data uploads and enables delta updates for BW DataSources that have no delta mechanism.
Logging table – The source system tracks database changes by using database triggers. It records information about changes in logging tables.
Read modules – These can be located either in the SLT replication server (in case of non-SAP) or in the SAP source system. They transfer the data to the SLT replication server.
The SAP LT replication server is connected to the HANA system through a DB connection for fast data replication. Once DD02L (list), DD02T (short descriptions), and DD08L (definitions) tables are replicated automatically, the SAP HANA system knows which tables are available in the source system.
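The trigger and logging-table idea described above can be illustrated with a small, self-contained sketch. It is not SLT code: it uses Python's built-in `sqlite3` module as a stand-in source database, a plain dictionary as a stand-in for the HANA target table, and hypothetical names (`customer`, `customer_log`, `build_source`, `replicate`) chosen only for this example.

```python
import sqlite3


def build_source():
    """Create an in-memory 'source system' with one application table,
    a logging table, and triggers that record every change."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        -- Logging table: stores only the key and the kind of change.
        CREATE TABLE customer_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            id  INTEGER,
            op  TEXT
        );
        CREATE TRIGGER customer_ins AFTER INSERT ON customer
        BEGIN INSERT INTO customer_log (id, op) VALUES (NEW.id, 'I'); END;
        CREATE TRIGGER customer_upd AFTER UPDATE ON customer
        BEGIN INSERT INTO customer_log (id, op) VALUES (NEW.id, 'U'); END;
        CREATE TRIGGER customer_del AFTER DELETE ON customer
        BEGIN INSERT INTO customer_log (id, op) VALUES (OLD.id, 'D'); END;
    """)
    return con


def replicate(source, target):
    """Play the role of a read module: apply the logged changes to the
    target dictionary, then clear the logging table.
    Returns the number of log entries processed."""
    rows = source.execute(
        "SELECT seq, id, op FROM customer_log ORDER BY seq"
    ).fetchall()
    for _seq, key, op in rows:
        if op == "D":
            target.pop(key, None)
        else:
            row = source.execute(
                "SELECT name FROM customer WHERE id = ?", (key,)
            ).fetchone()
            if row is not None:  # the row may have been deleted afterwards
                target[key] = row[0]
    source.execute("DELETE FROM customer_log")
    return len(rows)
```

Because the logging table holds only keys and operation codes, the reader fetches the current row from the application table at transfer time. That is a design choice of this sketch, made to keep the log small.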
You can configure SLT in the following ways:
- SAP LT replication server installed as a separate system – Here, the data loading performance is high and the source system is not disturbed; however, it involves additional cost for setup and maintenance.
- SAP LT replication server installed within the source system itself.
Performing configuration and monitoring: LTR or LTRC
LTR – A replication configuration can be created via web interface or via transaction LTR.
LTRC – Configuration is also possible from the ABAP interface within the replication cockpit, using transaction LTRC.
LTRS – This transaction is for advanced replication settings (customization).
|1. Load…||Loads the current data of a table from the source system (initial load only)|
|2. Replicate…||Replicates a table; this includes the load of the current data and the replication of all subsequent changes in the source system (initial load as well as delta updates)|
|3. Suspend…||Suspends data replication but keeps delta recording active|
|4. Resume…||Resumes a previously suspended data replication|
|5. Stop…||Stops replication and deletes the logging table and trigger; to replicate the table again, the entire process has to be started from the beginning|
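One way to keep the five actions apart is to think of each table as having a small replication status. The sketch below is a toy model under that assumption, not an SAP API: the class `TableReplication` and its fields are hypothetical names that only mirror the descriptions in the table above.

```python
from dataclasses import dataclass


@dataclass
class TableReplication:
    """Toy status model for one replicated table (illustrative names only)."""
    loaded: bool = False           # current data has been copied to the target
    delta_recording: bool = False  # trigger and logging table are active
    replicating: bool = False      # recorded changes are being transferred

    def load(self):
        # Initial load only: copy current data, no change tracking.
        self.loaded = True

    def replicate(self):
        # Initial load plus ongoing delta replication.
        self.loaded = True
        self.delta_recording = True
        self.replicating = True

    def suspend(self):
        # Pause the transfer but keep recording changes in the source.
        if not self.replicating:
            raise RuntimeError("no active replication to suspend")
        self.replicating = False

    def resume(self):
        # Only a suspended replication (still recording deltas) can resume.
        if not self.delta_recording or self.replicating:
            raise RuntimeError("no suspended replication to resume")
        self.replicating = True

    def stop(self):
        # Trigger and logging table are removed, so changes are no longer
        # recorded; replication has to be set up again from the beginning.
        self.replicating = False
        self.delta_recording = False
```

The point of the model is the difference between Suspend and Stop: after `suspend()` the `delta_recording` flag stays on, so `resume()` can pick up the recorded changes, whereas after `stop()` it is off and `resume()` raises an error.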
SAP Data Services: This is a certified ETL tool from SAP that is used to perform batch loading into SAP HANA.
- It is an engine to load all data into SAP HANA. Note that the version should be 4.0 or greater to allow this loading.
- When to use this tool – If you require proper data cleaning, do not require real-time updating, and want to perform a large number of transformations to the data, then consider using SAP Data Services.
- Data Services handles complex integration of any data, cleansing, validation, and enrichment. It is capable of full load and delta.
- Data Services requires additional software/hardware components.
Traditionally, Data Services performs the data transformations in the Data Services engine and then loads the results to the data target, as an ETL process. With SAP HANA, however, Data Services stages the source data and then pushes the transformation processing down to SAP HANA. In that mode, it works as ELT.
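The difference between the two modes can be sketched in a few lines. This is not Data Services code: Python's built-in `sqlite3` stands in for the target database, and the sample rows, table names (`stage_sales`, `sales`), and cleansing rules are assumptions made up for the illustration.

```python
import sqlite3

# Hypothetical raw source rows: untrimmed names, amounts as text.
RAW = [("  alice ", "10.5"), ("BOB", "4"), ("carol", "n/a")]


def etl(rows):
    """ETL: transform in the integration engine, then load finished rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    cleaned = []
    for name, amount in rows:
        try:
            cleaned.append((name.strip().upper(), float(amount)))
        except ValueError:
            continue  # drop rows whose amount is not numeric
    con.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    return con.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()


def elt(rows):
    """ELT: stage the raw rows in the target, then let the database
    run the transformation."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE stage_sales (customer TEXT, amount TEXT)")
    con.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    con.executemany("INSERT INTO stage_sales VALUES (?, ?)", rows)
    # The transformation is pushed down as one SQL statement.
    # The GLOB filter is a crude 'starts with a digit' check for this demo.
    con.execute("""
        INSERT INTO sales
        SELECT UPPER(TRIM(customer)), CAST(amount AS REAL)
        FROM stage_sales
        WHERE amount GLOB '[0-9]*'
    """)
    return con.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()
```

Both functions produce the same cleaned table. What differs is where the work happens: in `etl` the loop runs in the client process, while in `elt` the database engine does the trimming, casting, and filtering on the staged data.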
Establish Connectivity – To connect to the source system or the target system, one of the following objects has to be created:
- Formats (used for files: Hadoop, Excel, CSV, text)
- Data store (used for anything other than files, for example databases (Oracle, SQL, or any other database), SAP applications (CRM, ECC), SAP BW, and non-SAP applications (Siebel, Oracle apps))
Data Services Objects:
|Projects||Like a folder, a highest-level and single-use object that allows the grouping of jobs|
|Job||Executable, schedulable objects that contain either workflows, data flows, or both|
|Work flows||Optional sub-jobs; they manage data flows and the operations that support them|
|Data flows||They involve moving data from one or more sources to one or more target tables or files|
|Transforms||Optional objects in a data flow that allow data to be transformed as it moves; some transforms include case, map operation, merge, query, row generation, SQL, and validation|
|Scripts||Optional code to fine tune logic in flow|
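The containment relationship between these objects can be pictured with a few small classes. This is a conceptual sketch only, not the Data Services object model or API: every class, field, and method name below is a hypothetical stand-in for the objects listed in the table.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class DataFlow:
    """Moves rows from a source to a target, applying optional transforms."""
    name: str
    source: List[Row]
    transforms: List[Callable[[Row], Row]] = field(default_factory=list)
    target: List[Row] = field(default_factory=list)

    def run(self):
        for row in self.source:
            out = dict(row)  # copy so the source rows stay untouched
            for transform in self.transforms:
                out = transform(out)
            self.target.append(out)


@dataclass
class WorkFlow:
    """Optional grouping that runs its data flows in order."""
    name: str
    data_flows: List[DataFlow]

    def run(self):
        for data_flow in self.data_flows:
            data_flow.run()


@dataclass
class Job:
    """Executable unit containing work flows, data flows, or both."""
    name: str
    steps: list

    def run(self):
        for step in self.steps:
            step.run()


@dataclass
class Project:
    """Folder-like container that only groups jobs."""
    name: str
    jobs: List[Job] = field(default_factory=list)
```

A usage example, again with made-up names: a project holds one job, the job holds one work flow, and the work flow runs a single data flow with one transform.

```python
flow = DataFlow(
    "load_customers",
    source=[{"name": "acme"}],
    transforms=[lambda r: {**r, "name": str(r["name"]).upper()}],
)
job = Job("nightly_load", steps=[WorkFlow("wf_customers", [flow])])
project = Project("demo_project", [job])
job.run()
```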
Data Services is used in different scenarios:
- Loading data from an Excel file to the HANA database.
- Using a template table as the target in a data flow; when the target table is unknown or not available, the system creates it in the target with the same structure as the source and then loads the data
- Loading data from SAP application (Except BW) to HANA DB (Target)
- Loading data from multiple tables to HANA DB
- Loading data by using ABAP Dataflow
- Importing metadata from SAP application to HANA database using data services
- Loading data from Oracle database to HANA DB (Target)
- Loading data from SAP BW to HANA DB (Target); BW data is available in the form of InfoProviders, not tables, so a process chain is required in this case.
The common steps, using SAP BusinessObjects Data Services (BODS), are:
- Create Data Store between Source and BODS
- Import the metadata (Structures) to BODS
- Create Data Store between BODS and HANA
- Import the metadata into the HANA system
- Re-Import the same metadata from HANA to BODS
- Create Project
- Create Job (Batch/Real time)
- Create Work Flow
- Create Data Flow
- Execute the job
- Check the data preview in HANA
Direct Extractor Connection (DXC)
DXC is only useful if you have SAP BW and SAP ECC in your landscape. It is used when you want to make use of the DataSources available in SAP BW for SAP ECC systems. ERP comes with a BW component known as Embedded BW. Without BW, DXC is not possible. Using DXC, you can redirect the data loads of DataSources from the embedded SAP BW system (for SAP ECC extractors) into SAP HANA as a table. Once you have data in SAP HANA, you can start building information models on it. DXC requires SAP BW 7.0 or higher; if this version is not available, you need to install a separate BW server – a process referred to as the Side Car Approach.
- The SAP HANA Direct Extractor Connection (DXC) redirects data from the embedded SAP BW system (for SAP ECC extractors) to a HANA table using an HTTP connection.
- DXC can reuse the same extractors with new delta queues to provide data to SAP HANA. Banking services extractors are an exception to this.
- DXC can handle batch loads from the SAP Business Suite and is capable of full load and delta.
- It provides a special In-Memory Data Store Object (IMDSO) for use in HANA standalone.
In the ERP source system:
- RSA5 – Identify the corresponding Business Content DataSource to be replicated from the source system and activate it.
- RSA6 – The activated version of the DataSource is available here.
In the BW embedded in ERP:
- RSDS – DataSource repository, which holds the replicated DataSource
- RSA1 – The activated DataSource
Whenever a DataSource is activated in SAP BW, it generates the following tables in the configured schema in the HANA database. Only one schema is configured for DXC in HANA.
- /BIC/A00 – IMDSO Active Table
- /BIC/A40 – IMDSO Activation Queue
- /BIC/A70 – Record Mode Handling Table
- /BIC/A80 – Request and Packet ID information Table
- /BIC/AA0 – Request Timestamp Table
- RSODSO_IMOLOG – IMDSO related Table
SAP Replication Server (SRS): High-Performance data movement
Sybase Replication Server was renamed SAP Replication Server (SRS) after SAP acquired Sybase. It works with both SAP and non-SAP source systems and supports many databases. It provides bidirectional, real-time, log-based replication from and to heterogeneous databases and hence has a low impact on the source system.
Some key features of SRS
- Zero operational downtime
- 1:M and N:1 replication possible
- Data distribution and migration – It can be used to migrate from an older version of the database platform to a newer one
- SRS uses a publish-and-subscribe model for data replication between primary and replicate databases
- Replication Server security also incorporates login names, passwords, and permissions
- It relies on a log-based replication technique known as Change Data Capture (CDC)
- High performance and transactional integrity
- Replication Agent for SAP HANA (RAH) comes with capabilities such as real time data distribution and real time reporting
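The log-based, publish-and-subscribe approach in the list above can be illustrated with a short sketch. It does not use any SRS interface: `PrimaryDatabase` and `ReplicationServer` are hypothetical classes, and plain dictionaries stand in for the replicate databases.

```python
class PrimaryDatabase:
    """Stand-in primary database that appends every change to a log."""

    def __init__(self):
        self.rows = {}
        self.log = []  # append-only transaction log

    def write(self, key, value):
        self.rows[key] = value
        self.log.append(("upsert", key, value))

    def delete(self, key):
        self.rows.pop(key, None)
        self.log.append(("delete", key, None))


class ReplicationServer:
    """Reads the primary's log from its last position and publishes each
    change to all subscribed replicas (change data capture)."""

    def __init__(self, primary):
        self.primary = primary
        self.position = 0      # index of the next unread log entry
        self.subscribers = []  # replica dictionaries

    def subscribe(self, replica):
        self.subscribers.append(replica)

    def poll(self):
        """Distribute unread log entries; return how many were processed."""
        entries = self.primary.log[self.position:]
        for op, key, value in entries:
            for replica in self.subscribers:
                if op == "upsert":
                    replica[key] = value
                else:
                    replica.pop(key, None)
        self.position += len(entries)
        return len(entries)
```

The replication side never queries the primary's tables. It only reads the log, which is why this style of replication puts little load on the source system. In this sketch, a replica that subscribes late receives only the changes logged after the current position.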
Limitations of SRS
- SRS faces issues while reading and using pool and cluster tables; however, new SAP systems such as HANA do not use these tables at all. This makes this limitation less of an issue
- It does not support filtering or transformation of data
If you are reading this far, odds are you needed help on moving data into HANA. Deciding on the right tools is always desperately challenging. It’s a strategic shift, which always takes time. However, IT MUST BE DONE.
Start slowly but begin on your decision process now. Is the tool in line with your ongoing project’s deeper objective or not? Your findings may uncover some ways in which you could get some other project challenges straightened out. Good luck!
Author: Chetan Patil
Chetan is a Senior Consultant – SAP – at Knack Systems with over 7.0 years of experience in SAP implementation, support, and upgrade. He is currently working for a leading consumer goods customer as an “ABAP-er.” Chetan has extensive knowledge in SAP BW, HANA modeling, and SAP BO Data Services. He has expertise in ABAP for providing solutions for different modules in SAP ECC, data archiving, performance tuning, production support, problem analysis and resolution, and unit and integration testing. He has worked extensively in different business scenarios in translating business requirements into technical design and document-supported activities.