Wednesday, December 21, 2016

Data warehouse basics: What is Incremental Data Load ?

So we are back again to brush the basics. I have always felt that for a BI/DW developer to become successful he/she should have a very strong base about the basics. Here is one basic concept we hear often, Incremental load.

So what exactly is Incremental data load ?

Incremental data load is a type of data load scenario in a Data warehouse environment where the data that is updated/new are inserted to the target data warehouse with each scheduled run.

Imagine a sales person table as below which will be our source table in this case.


On the first day of loading to an EDW table, all the above three records would be moved.  The target table would be same as above. 

Now a day after they have added two new employees and also made a change to first name of a previous employee. Refer the screenshot below, 


Here with incremental update the two new records are considered and written to the target table. Also checks the source if the already existing records in EDW has any changes in source. There is the first record where FST_NAME was changed from Steve to Steven, hence this record is also considered and the change is updated in the target table. 

The advantages of incremental loads are the following,

  1. Reduces a major chunk I/O operations between source and target. 
  2. Faster than a full base load.





No comments:

Post a Comment