Automate Data Validation between Oracle ERP Cloud and Data warehouse

Updated: Apr 25, 2020

Validating data once an ETL/ELT process is completed will be an inevitable process to maintain the data integrity. This becomes harder when we do not have direct access to the database to compare.

If no access, how people can validate?

They follow a three-step process:

1. Extract data from source (Oracle ERP Cloud):

As we have no access to the ERP cloud database, people extract the data in the form of CSV files by creating BIP data models and reports for each dimension/fact or let us say a test case in terms of testing.

2. Extract data from target Data Warehouse:

We are aware that current market has a handful of data warehouses, which are on premise as well as cloud-based. Extract the data into CSV files for their respective test cases.

3. Compare the data:

Now validate the source and target CSV files w.r.t test cases for any data discrepancies.

Sounds simple!

Really! Think of doing this for some 30+ test cases.

Exhausted! Don’t we have an alternative?

We do, by using the BI Publisher Web Services, Python and Jenkins we can make the above three-step process automated and generate test reports.

How? Let us get into it

For this we used below BI Publisher web services

Refer to BI Publisher Web Services for all other services and methods.

In addition, we need an input CSV/Excel file with below inputs for all test cases and a parameter file with source and target connection details to the program.

I have used Python as I felt it is simpler compared to Java and Jenkins to generate test report and make utilize the Test Results Analyzer.

Below are the hierarchy of steps in our program

1. Login to Oracle Cloud ERP

2. Create Data Model with the SOURCE_QUERY input

3. Create Report with the data model created in step 2

4. Run the report and extract the data to CSV file.

5. Connect to target Data Warehouse.

6. Run the TARGET_QUERY input.

7. Extract the results into a CSV file.

8. Validate the results using Pytest module.

9. Run the python script with junitxml and html to generate XML and HTML test reports.

10. Create a project in Jenkins and run the python script from Jenkins, use Test Analyzer plugin to analyze your test runs.

Login to Oracle Cloud ERP

Create BIP Data Model

Create BIP Report

Run Report and extract the response to CSV file

Connect to target data warehouse and extract the data.

Let us do it for SNOWFLAKE, which is a cloud based data warehouse

Compare the CSV files for differences

Use Pytest to make a test suite and generate test reports

Create a Jenkins project and run the python script using junitxml and html arguments

HTML Test report generated for the build in Jenkins

Result analyzer with all test cases and their status

Line, Pie and Bar representation of the test case status in Jenkins

Thanks for reading! Check out my other posts here and share, if you liked it.

99 views0 comments