Showing posts with label ODI 12c. Show all posts
Showing posts with label ODI 12c. Show all posts

Wednesday, March 1, 2017

Importance of creating separate schema for Work in ODI

We had come across an article a long while back on the significance of using a separate schema for work instead of Target schema in ODIExperts blog. We later learned from this as we could clearly see the advantage of doing the same. 

We use OBIEE as our front end tool and we could see that all the work tables of ODI for each target table would be present in the target schema. The more tables in the target schema, the more time it takes for OBIEE to process the import metadata step. Imagine you have 10 tables in your DWH target schema, if you don't define a separate work schema for each table then for one table, W_STUDENT_D you will have the following work tables also, 

C$_W_STUDENT_D, 
I$_W_STUDENT_D,
E$_W_STUDENT_D

This not only makes OBIEE import slow also it takes up table space on the target schema making database management hard. If maintained separately then you can purge all tables in the work schema at once. The worry of accidentally truncating any of your target tables are also eliminated this way. 

Another important step as they discussed is to not use the 'Stage area different from target option', which is done at the interface level. The same can be done at the physical schema in topology level, 

ODI Work Schema.
 

Using ODI Procedures to load data from Source to Target

We came across a scenario where we used ODI procedure to load data from source to target. The scenario was that we had to implement a very complex logic using sub queries in an ODI Interface. given the short timeline, we went ahead with the approach of loading the same using query in an ODI Procedure. We will discuss the same below,

An ODI Procedure has two parts, Command on Source and Command on Target,



  • ODI Command on Source is where you write the source SQL query that you use to fetch data from source. 
  • ODI Command on Target is where you write the insert script to load data into the target. 
  • Load data from one source to another source (eg: Oracle to Teradata, MSSQL to Oracle, etc)
  • The source side query can be a plain query. The number of columns in select though should match with the number columns in the insert script and their datatypes also should match. 
 eg: SELECT ROLL_NO,STUD_NAME FROM STUDENT
  • The insert script should contain bind variables to select values from the source query, the insert script for the above source query would be, 
  INSERT INTO W_STUDENTS_D VALUES (:ROLL_NO,:STUD_NAME)
  • Extremely inefficient compared to interfaces.
  • Never preferred over interfaces. Interfaces are always faster as ODI follows all standards while generating queries. Preferred for loading of tables with less volumes of data.'
  • Row by row processing happens and hence it takes longer time for data load. 
Typically the errors you get are either SQL syntax errors, Data type mismatch, column size errors, column order mismatch errors. 

Thursday, February 23, 2017

All about Oracle Data Integrator(ODI) Variables

ODI variable is an ODI object that can be used to store values of string, date or number datatypes. Variables can be both static and dynamic, they can be updated at run-time using a query. These variables can be passed to other ODI objects like interfaces or procedures as filters or as column values.
  • Can store String/Date/Number values.
  • Values are updated during run-time.
  • Values can be passed to other ODI objects like Procedures and Interfaces.
  • Can either be Static/Dynamic in nature. 
  • Dynamic ODI variables can be refreshed with a SQL query executed against a source database.
  • Variables are commonly used for filter conditions, case conditions etc in interfaces, procedures and packages.
The scope of an ODI variable is two, 

Global: The variable can be called across projects in ODI repository.

Project: The scope of the ODI variable is limited to the particular project. 

There are three attributes to an ODI variable, 
  1. Name
  2. Datatype
  3. Action
Action determines how values are to be stored in an ODI variable, The action types are 

Persistent: Values are stored for a particular session
Last Value: Stores the last value. 
Historize: Stores current the value and the others in history. Very useful if your variable values change frequently over time. Helps in debugging. 

An ODI variable can be called using the below syntax,

#PROJECT_NAME.Variable_Name

ODI used scenario examples, 
  1. Imagine you are passing a date value dynamically from database to fetch records for ETL load. Instead of using a sub query filter, you can fetch the date value to a variable and use it in all your procedures and interfaces. 
  2. Another scenario is storing JDBC url. You might be using JDBC urls at different places in an ODI instance. If hardcoded at each place, it would be hard to find and map the change wherever the same JDBC url has been used. Instead you can save the JDBC url as an ODI variable and then use the variable name wherever needed in topology. 
Oracle has a pretty neat document on variables. 


Wednesday, November 30, 2016

ODI Performance Tuning - Array fetch and Batch update size parameters

There are two parameters in Oracle Data Integrator topology that can be tweaked to improve performance of loading. These two parameters are, 

  • Array Fetch size 
  • Batch Update size
The Array Fetch size parameter can be tweaked for the source data server. This defines the number of records read from the source at a time and this data is stored in the agent. 

The Batch Update size parameter may be tweaked for the target data server. This defines the number of records fetched in one batch to be written to be server stored in the agent. 

When you increase number of records per batch, the time taken to read and write the total records will be less. This would mean you have lesser batches in total, hence improving performance. Since more data is stored in RAM, a good RAM size would be a necessity. 

Oracle recommends the following, 
  • For good networks with little latency values less than 30 can be kept. However with a poor network the parameters have to be set as greater than 100. 
  • The typical values as per oracle is between 50 and 500. 
Recommendation from Oracle ODI Performance tuning documentation. 

Here is a very interesting read on how this parameter if optimally set improves performance.