Sunday, January 1, 2017

Dimensional Modelling - Loan Dimension in Banking Data warehouse - Part 1

Loan Dimension in a Banking Enterprise Data warehouse is very crucial. It is undoubtedly one of the if not biggest data warehouse dimension in a banking DW architecture. It is also of one of the most confirmed dimensions in an banking data warehouse. Loan dimension has a lot of textual attributes and also an almost equal number of measurable attributes. It is also joined to many other entities of a banking EDW. A few examples of the key entities are the following,
  • Branch 
  • Primary Customer
  • Teller
  • Loan Product
  • Time/Dates ( Maturity, Commencement, Settlement, Disbursement, Dormancy )
As we discussed before there are many textual attributes are unique to a loan dimension and few examples are listed below,

Account status,
Product Name
Settlement, Account open date, Loan Approved date, First disbursement date, Last Transaction date.

Considering the size of a loan dimension it can be split into multiple tables to improve performance and maintainability. A good way to start modelling a dimensional model is to create bus matrix as suggested by Ralph Kimball.

The bus matrix for a Loan dimension in an ideal scenario would look as below,

Bus Matrix for Loan Dimension.

The loan dimension being one of the largest entities will take up a number of columns. It is a good practice to split the dimension as follows in case it has a large number of table,

The first partition of the dimension table should have all the most commonly used attributes, like Status, Balance etc. The second partition should have all the measurable attributes like Approved amount, applied amount, and the corresponding date keys. These measures can also act as attributes in some cases. The second table can also be used as a fact in some cases. The third or the next partitions should have all the junk values related to the dimension. Advised to put all the rarely used dimension columns here. These tables can be made to one single dimension in BI tool for example with the help of Logical table sources in OBIEE.

Why we are advising this strategy is because otherwise the maintenance of this table would be down the line a headache. Imagine a table with 500 columns. A single insert statement would be of large size. Hence taking more time to execute and load into target schema.

Naming conventions, You can go ahead with standard naming conventions followed by Oracle or have one of your own. An Oracle naming standard would be,

W_LOAN_D

If you have multiple sources, Then you can have a source abbreviation also in the same name.