Channel: SCN : Blog List - All Communities

Cross Database Comparison – CDC, extended usage with enhanced Data Model and Function Module


Overview:

During a data migration, a common question is how to ensure that the data in the source and target systems is consistent after the migration. Cross Database Comparison (CDC), one of the Data Consistency Management tools, can compare data sources that have a complex structure or hierarchy. The data sources being compared can reside in different systems and be of different types (ABAP, non-ABAP, XML and OData). The CDC result shows the percentage of matched records as well as the percentage and details of failed records. CDC can therefore also serve as a tool to validate the correctness of a migration program.

 

Cross Database Comparison - Solution Manager - SCN Wiki

 

For simple cases, where the source and target have a similar data structure and there is no complex data conversion logic, the CDC Data Model can be built with the source type “Remote Database (Using ADBC)” or “RFC to Generic Extractor”. The standard documentation already covers these simple cases, so they are not repeated here. This post shows an extended usage of CDC based on an enhanced Data Model and an ABAP Function Module.

 

 

Case Study:

Source Table – (ZLEGY_ADDRESS)

pic1.png

 

Target Tables – (BUT0ID, BUT021_FS, ADRC, ADRCITYPRT, ADR2)

(NOTE: Only the definitions of BUT0ID and BUT021_FS are shown below)

pic2.png

pic3.png

 

According to the business requirement, the key fields (CLIENTID, ADDRESSTYPE) of the legacy table ZLEGY_ADDRESS are mapped to the key fields IDNUMBER of CRM table BUT0ID and ADR_KIND of CRM table BUT021_FS, respectively. A CDC Data Model can therefore be constructed as below:

 

pic4.png

Besides, there are certain business rules, some of which lead to problems:

  1. Target side (right) - Only a specific “TYPE” of BUT0ID and specific “R3_USER” values of ADR2 are relevant. This can be handled by an Object Filter, as shown in the diagram.
  2. Target side (right) - Not all records in the source table have a PHONE value, so there is no corresponding row in target table ADR2. <Problem> Because the CDC data model links tables with an inner join on the key fields by default, a missing row in ADR2 causes the whole linked row to be filtered out.
  3. Source side (left) - CLIENTID of ZLEGY_ADDRESS is INT4 (integer), but the mapped field IDNUMBER of BUT0ID is CHAR(60). <Problem> The two sides sort differently, so a sort-order error can occur during the CDC comparison and abort it midway: CDC stops when it detects that the keys of the current row are lower than the keys of the previous row on either the source or the target side.
  4. Source side (left) - The same CLIENTID may have multiple addresses, distinguished by the key ADDRESSTYPE: A, B and C. Under the same CLIENTID, these ADDRESSTYPE values are converted in the target table with the logic below:

 

Under same CLIENTID                          | Source - ADDRESSTYPE | Target - ADR_KIND
---------------------------------------------+----------------------+------------------
Has ADDRESSTYPE = ‘C’                        | ‘C’                  | ‘DEFAULT’
                                             | ‘B’                  | ‘SHIP_TO’
                                             | ‘A’                  | ‘HOME’
Has ADDRESSTYPE = ‘B’, NO ADDRESSTYPE = ‘C’  | ‘B’                  | ‘DEFAULT’
                                             | ‘A’                  | ‘HOME’
NO ADDRESSTYPE = ‘C’, NO ADDRESSTYPE = ‘B’   | ANY                  | ‘DEFAULT’

 

Obviously, this ADDRESSTYPE logic is somewhat complicated and cannot be handled by a CDC Conversion ID.
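To make the rule table concrete, here is a minimal sketch of the conversion, written in Python rather than ABAP so that it can be run without an SAP system. The function name `target_adr_kind`, the sample rows and the choice to pass unknown values through unchanged are assumptions made for illustration only.

```python
def target_adr_kind(address_type, client_types):
    """Map one source ADDRESSTYPE to the target ADR_KIND.

    client_types is the set of all ADDRESSTYPE values that exist
    for the same CLIENTID in the source table.
    """
    if "C" in client_types:
        rules = {"C": "DEFAULT", "B": "SHIP_TO", "A": "HOME"}
    elif "B" in client_types:
        rules = {"B": "DEFAULT", "A": "HOME"}
    else:
        # Neither 'C' nor 'B' exists: any address becomes the default one.
        return "DEFAULT"
    # Assumption: values not covered by the rules are passed through as-is.
    return rules.get(address_type, address_type)


# Hypothetical sample rows: (CLIENTID, ADDRESSTYPE)
source_rows = [
    ("1111111111", "C"), ("1111111111", "B"),
    ("1111111112", "B"), ("1111111112", "A"),
    ("1111111113", "A"),
]

# Collect the ADDRESSTYPE values that exist per CLIENTID.
types_by_client = {}
for client_id, address_type in source_rows:
    types_by_client.setdefault(client_id, set()).add(address_type)

converted = [
    (client_id, target_adr_kind(address_type, types_by_client[client_id]))
    for client_id, address_type in source_rows
]
```

The point of the sketch is that the result for one row depends on the other rows of the same CLIENTID, which is why a per-field conversion is not enough.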

 

Problem 2) Solution:

It can be resolved by changing the default inner join to an outer join. To do this, Source Type 2 of the Data Model (right side) has to be changed to “RFC to Generated Extractor”, which allows an ABAP Function Module to be generated. After the Function Module has been generated successfully, use the ABAP editor in SE80 to apply the change. Nothing needs to be changed except the SQL SELECT statements (three similar SELECT statements in total). The modified SELECT is shown below for reference:

Modified SQL with Outer Join

SELECT BUT021_FS~ADR_KIND, ADRCITYPRT~CITY_CODE, ADRC~STREET,
       ADRC~POST_CODE1, ADR2~TEL_NUMBER, ADRCITYPRT~POST_CODE, BUT0id~IDNUMBER
  INTO TABLE @lt_source_data
  FROM BUT0id AS BUT0id
  JOIN BUT021_FS AS BUT021_FS ON BUT021_FS~PARTNER = BUT0id~PARTNER
  JOIN ADRC AS ADRC ON ADRC~ADDRNUMBER = BUT021_FS~ADDRNUMBER
  JOIN ADRCITYPRT AS ADRCITYPRT ON ADRCITYPRT~CITY_CODE = ADRC~CITY_CODE
                               AND ADRCITYPRT~CITY_PART = ADRC~CITY2
  LEFT JOIN ADR2 AS ADR2 ON ADR2~ADDRNUMBER = ADRC~ADDRNUMBER AND ( ADR2~R3_USER <> '3' ) "was: JOIN ADR2
  FOR ALL ENTRIES IN @lt_key
  WHERE BUT0id~IDNUMBER = @lt_key-CLIENTID_C AND BUT021_FS~ADR_KIND = @lt_key-ADDRESSTYPE
    AND ( BUT0id~TYPE = 'CRM001' )
    AND ( BUT021_FS~ADR_KIND = 'DEFAULT' OR BUT021_FS~ADR_KIND = 'SHIP_TO' OR BUT021_FS~ADR_KIND = 'HOME' ).

The ONLY change is from “JOIN ADR2” to “LEFT JOIN ADR2” (equivalent to LEFT OUTER JOIN). Apply the same LEFT JOIN to all SELECT statements in the Function Module.
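The effect of that one-word change can be reproduced outside SAP. The following is a minimal sketch using Python's built-in sqlite3 module; the two tables are heavily simplified stand-ins for ADRC and ADR2, and the sample rows are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Simplified stand-ins for ADRC and ADR2 (assumed columns only)
    CREATE TABLE adrc (addrnumber TEXT PRIMARY KEY, street TEXT);
    CREATE TABLE adr2 (addrnumber TEXT, tel_number TEXT);
    INSERT INTO adrc VALUES ('0001', 'Main Street 1'), ('0002', 'Side Road 2');
    -- Only address 0001 has a phone number
    INSERT INTO adr2 VALUES ('0001', '555-0100');
""")

# Inner join: the address without a phone row disappears from the result.
inner_rows = con.execute(
    "SELECT adrc.addrnumber, adr2.tel_number FROM adrc "
    "JOIN adr2 ON adr2.addrnumber = adrc.addrnumber "
    "ORDER BY adrc.addrnumber"
).fetchall()

# Left outer join: the address is kept, with an empty phone value.
left_rows = con.execute(
    "SELECT adrc.addrnumber, adr2.tel_number FROM adrc "
    "LEFT JOIN adr2 ON adr2.addrnumber = adrc.addrnumber "
    "ORDER BY adrc.addrnumber"
).fetchall()

print(inner_rows)
print(left_rows)
```

In SQLite the missing phone number comes back as NULL (Python `None`); in the ABAP Function Module the corresponding field of the result table simply stays initial.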

 

 

 

Problem 3) Solution:

Whenever the sort-order issue is caused by a data type mismatch, it is almost impossible to resolve with a CDC Conversion ID. The ultimate resolution is to add another CLIENTID field of type CHAR (e.g. CLIENTID_C) to the table and make sure leading and trailing spaces are trimmed. In that way, the source and target tables have the same sort order.
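Why the data type matters for sorting can be shown with a few numbers. This is a small Python sketch with invented client IDs; the helper `first_order_violation` only imitates the kind of "current key is lower than previous key" check described earlier and is not CDC's actual implementation.

```python
ids = [9, 10, 100, 2]

# Integer order, as delivered by a source sorted on the INT4 field
as_int = sorted(ids)

# Character order, as delivered by a target sorted on a CHAR field
as_char = sorted(str(i) for i in ids)


def first_order_violation(keys):
    """Return the index of the first key that is lower than its
    predecessor, or None if the keys are in ascending order."""
    for i in range(1, len(keys)):
        if keys[i] < keys[i - 1]:
            return i
    return None


# Keys read in integer order but compared as character strings:
# '10' follows '9' although '10' < '9' in character comparison.
mixed = [str(i) for i in as_int]
violation = first_order_violation(mixed)

# With a trimmed CHAR copy (the CLIENTID_C idea), both sides sort by the
# same character rules and no violation occurs.
no_violation = first_order_violation(as_char)
```

Here `as_int` is `[2, 9, 10, 100]` while `as_char` is `['10', '100', '2', '9']`, so the same keys arrive in a different sequence on each side unless both are sorted as characters.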

See the enhanced Table definition:

pic5.png

 

Problem 4) Solution:

The ADDRESSTYPE logic described above requires a proper algorithm. Source Type 1 of the Data Model (left side) has to be changed to “RFC to Generated Extractor” to generate the ABAP function module. With ABAP code alone, however, the conversion logic would take many lines of code. To keep the ABAP logic as simple as possible, an enhanced Data Model can help.

Create two table views on the source table ZLEGY_ADDRESS. Each view only needs three fields from ZLEGY_ADDRESS: CLIENTID, ADDRESSTYPE and CLIENTID_C. One view (e.g. ZLEGY_ADDR_C) has the “Selection Condition” ADDRESSTYPE = ‘C’, while the other (e.g. ZLEGY_ADDR_B) has the selection condition ADDRESSTYPE = ‘B’.

 

pic6.png

pic7.png

The main purpose is to form an SQL result set in which ZLEGY_ADDRESS, ZLEGY_ADDR_C and ZLEGY_ADDR_B are joined together, so that each row of ZLEGY_ADDRESS has the ADDRESSTYPE = ‘C’ and ADDRESSTYPE = ‘B’ entries of the same CLIENTID listed on its right side. If one of those ADDRESSTYPE values does not exist for the CLIENTID, the corresponding column is left empty, i.e. the table and views are linked by SQL OUTER JOIN.
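The join of the table with its two filtered views can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table is reduced to the two key fields, the rows are illustrative, and `COALESCE` is used to show missing matches as empty strings (an assumption made here to mimic initial values in ABAP).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE zlegy_address (clientid_c TEXT, addresstype TEXT);
    INSERT INTO zlegy_address VALUES
        ('1111111111', 'C'), ('1111111111', 'B'),
        ('1111111112', 'B'), ('1111111112', 'A'),
        ('1111111113', 'A');

    -- One view per ADDRESSTYPE that drives the conversion rules
    CREATE VIEW zlegy_addr_c AS
        SELECT clientid_c, addresstype FROM zlegy_address
        WHERE addresstype = 'C';
    CREATE VIEW zlegy_addr_b AS
        SELECT clientid_c, addresstype FROM zlegy_address
        WHERE addresstype = 'B';
""")

# Every source row is kept; the view columns are empty when the
# CLIENTID has no 'C' or no 'B' address.
rows = con.execute("""
    SELECT a.clientid_c,
           a.addresstype,
           COALESCE(c.addresstype, ''),
           COALESCE(b.addresstype, '')
    FROM zlegy_address AS a
    LEFT OUTER JOIN zlegy_addr_c AS c ON c.clientid_c = a.clientid_c
    LEFT OUTER JOIN zlegy_addr_b AS b ON b.clientid_c = a.clientid_c
    ORDER BY a.clientid_c, a.addresstype DESC
""").fetchall()

for row in rows:
    print(row)
```

Each row now carries enough information to decide its target ADR_KIND without looking at any other row.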

 

Imaginary Joined Tables with data (shown only key fields).

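CLIENTID_C | ADDRESSTYPE | ADDRESSTYPE in ZLEGY_ADDR_C | ADDRESSTYPE in ZLEGY_ADDR_B
-----------+-------------+-----------------------------+----------------------------
1111111111 | C           | C                           | B
1111111111 | B           | C                           | B
1111111112 | B           |                             | B
1111111112 | A           |                             | B
1111111113 | A           |                             |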

 

With the solutions to Problems 3) and 4), the enhanced Data Model should look similar to this:

pic8.png

 

Part of Function Module code:

Portion of Enhanced Function Module - ZC_ADDR_SRC

FUNCTION ZC_ADDR_SRC.
*"----------------------------------------------------------------------
  TYPES:
*   Row layout returned to CDC; ADDRESSTYPE is widened to CHAR(10)
*   so that it can hold the converted values (DEFAULT, SHIP_TO, HOME)
    BEGIN OF ts_source_data,
      ADDRESSTYPE(10) TYPE C,
      CITY            TYPE ZLEGY_ADDRESS-CITY,
      ADDRESS         TYPE ZLEGY_ADDRESS-ADDRESS,
      ZIP             TYPE ZLEGY_ADDRESS-ZIP,
      PHONE           TYPE ZLEGY_ADDRESS-PHONE,
      TOWN            TYPE ZLEGY_ADDRESS-TOWN,
      CLIENTID_C      TYPE ZLEGY_ADDRESS-CLIENTID_C,
    END OF ts_source_data,

*   Row layout of the raw SELECT result, including the ADDRESSTYPE
*   columns of the views ZLEGY_ADDR_C and ZLEGY_ADDR_B
    BEGIN OF ts_source_data1,
      ADDRESSTYPE   TYPE ZLEGY_ADDRESS-ADDRESSTYPE,
      CITY          TYPE ZLEGY_ADDRESS-CITY,
      ADDRESS       TYPE ZLEGY_ADDRESS-ADDRESS,
      ZIP           TYPE ZLEGY_ADDRESS-ZIP,
      PHONE         TYPE ZLEGY_ADDRESS-PHONE,
      TOWN          TYPE ZLEGY_ADDRESS-TOWN,
      CLIENTID_C    TYPE ZLEGY_ADDRESS-CLIENTID_C,
      ADDRESSTYPE_C TYPE ZLEGY_ADDRESS-ADDRESSTYPE,
      ADDRESSTYPE_B TYPE ZLEGY_ADDRESS-ADDRESSTYPE,
    END OF ts_source_data1,
:
  DATA:
    lt_source_data          TYPE STANDARD TABLE OF ts_source_data,
    lt_source_data1         TYPE STANDARD TABLE OF ts_source_data1,
    lt_temp_source_data     TYPE STANDARD TABLE OF ts_source_data, "#EC NEEDED
    lv_remaining_block_size TYPE i,                                "#EC NEEDED

    lt_temp_source_data1    TYPE STANDARD TABLE OF ts_source_data1,
    ls_source_data          TYPE ts_source_data,
    ls_source_data1         TYPE ts_source_data1,
:
:

*** Part 3: Source Data Extraction ***
  IF iv_block_size < 0.
* Count expected number of rows only
    SELECT COUNT(*)
      INTO @ev_total
      FROM ZLEGY_ADDRESS AS ZLEGY_ADDRESS
      LEFT OUTER JOIN ZLEGY_ADDR_C AS ZLEGY_ADDR_C ON ZLEGY_ADDR_C~ADDRESSTYPE = 'C'
                                                  AND ZLEGY_ADDR_C~CLIENTID_C = ZLEGY_ADDRESS~CLIENTID_C
      LEFT OUTER JOIN ZLEGY_ADDR_B AS ZLEGY_ADDR_B ON ZLEGY_ADDR_B~ADDRESSTYPE = 'B'
                                                  AND ZLEGY_ADDR_B~CLIENTID_C = ZLEGY_ADDRESS~CLIENTID_C.
    RETURN.
  ENDIF.
:
:
:

  " Read the raw rows together with the 'C' and 'B' markers of the same CLIENTID
  SELECT ZLEGY_ADDRESS~ADDRESSTYPE, ZLEGY_ADDRESS~CITY,
         ZLEGY_ADDRESS~ADDRESS, ZLEGY_ADDRESS~ZIP, ZLEGY_ADDRESS~PHONE,
         ZLEGY_ADDRESS~TOWN, ZLEGY_ADDRESS~CLIENTID_C, ZLEGY_ADDR_C~ADDRESSTYPE,
         ZLEGY_ADDR_B~ADDRESSTYPE
    INTO TABLE @lt_source_data1
    FROM ZLEGY_ADDRESS AS ZLEGY_ADDRESS
    LEFT OUTER JOIN ZLEGY_ADDR_C AS ZLEGY_ADDR_C ON ZLEGY_ADDR_C~ADDRESSTYPE = 'C'
                                                AND ZLEGY_ADDR_C~CLIENTID_C = ZLEGY_ADDRESS~CLIENTID_C
    LEFT OUTER JOIN ZLEGY_ADDR_B AS ZLEGY_ADDR_B ON ZLEGY_ADDR_B~ADDRESSTYPE = 'B'
                                                AND ZLEGY_ADDR_B~CLIENTID_C = ZLEGY_ADDRESS~CLIENTID_C
    FOR ALL ENTRIES IN @lt_key
    WHERE ZLEGY_ADDRESS~CLIENTID_C = @lt_key-CLIENTID_C AND
          ZLEGY_ADDRESS~ADDRESSTYPE = @lt_key-ADDRESSTYPE.

  SORT lt_source_data1 BY CLIENTID_C ADDRESSTYPE.

  CLEAR:
    ev_act_key,
    ev_block.
  ev_lines = sy-dbcnt.

  " Convert each raw row into the CDC result layout
  LOOP AT lt_source_data1 INTO ls_source_data1.
    MOVE-CORRESPONDING ls_source_data1 TO ls_source_data.
    IF ls_source_data1-ADDRESSTYPE_C = 'C'.
      " CLIENTID has a 'C' address
      CASE ls_source_data1-ADDRESSTYPE.
        WHEN 'C'.
          ls_source_data-ADDRESSTYPE = 'DEFAULT'.
        WHEN 'B'.
          ls_source_data-ADDRESSTYPE = 'SHIP_TO'.
        WHEN 'A'.
          ls_source_data-ADDRESSTYPE = 'HOME'.
      ENDCASE.
    ELSEIF ls_source_data1-ADDRESSTYPE_C = '' AND ls_source_data1-ADDRESSTYPE_B = 'B'.
      " CLIENTID has a 'B' address but no 'C' address
      CASE ls_source_data1-ADDRESSTYPE.
        WHEN 'B'.
          ls_source_data-ADDRESSTYPE = 'DEFAULT'.
        WHEN 'A'.
          ls_source_data-ADDRESSTYPE = 'HOME'.
      ENDCASE.
    ELSEIF ls_source_data1-ADDRESSTYPE_C = '' AND ls_source_data1-ADDRESSTYPE_B = ''.
      " CLIENTID has neither a 'C' nor a 'B' address
      ls_source_data-ADDRESSTYPE = 'DEFAULT'.
    ENDIF.
    APPEND ls_source_data TO lt_source_data.
  ENDLOOP.

  " Sort again: the converted ADDRESSTYPE values order differently from A/B/C
  SORT lt_source_data BY CLIENTID_C ADDRESSTYPE.
:
:

 

Key changes to the function module:

  • Change ADDRESSTYPE in ts_source_data to CHAR(10)
  • Create a separate type ts_source_data1 that includes ADDRESSTYPE_C and ADDRESSTYPE_B from the views
  • Use LEFT OUTER JOIN to join the table and the views
  • Collect the whole result set of the SELECT into table "lt_source_data1" instead of the CDC default "lt_source_data"
  • Convert the rows of "lt_source_data1" and append them to the CDC default table "lt_source_data"

 

For details of the function module, please refer to the attachment "ZC_ADDR_SRC.txt".

