When I first heard about the new load plan feature in ODI 220.127.116.11, I thought it could replace our custom-developed scheduling mechanism.
While the load plan feature makes it easier to run scenarios in parallel, it quickly became clear that dependencies cannot be defined at a fine enough level, and as a result the available parallelism cannot be fully exploited.
In the example below we are loading two fact tables. Fact 1 has dependencies on Dimension 1 and Dimension 2, whereas Fact 2 just has a dependency on Dimension 2.
Fact 1 needs to wait for both Dimension 1 and Dimension 2 to finish before it can be loaded. Fact 2 only needs to wait for Dimension 2 to finish. So the most efficient load plan would load Dimension 1 and Dimension 2 in parallel. Once Dimension 2 has finished loading, processing of Fact 2 kicks in, regardless of whether Dimension 1 is still running. Once both Dimension 1 and Dimension 2 have finished loading, the load for Fact 1 kicks off as well.
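To make the ideal ordering concrete, here is a minimal sketch of a dependency-driven scheduler in Python. It is not ODI code and not our actual solution; the step names, the `DEPENDENCIES` map and the `run_with_dependencies` function are all made up for illustration, and `run_step` stands in for whatever actually launches a scenario.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

# Hypothetical dependency map: each step lists the steps it must wait for.
DEPENDENCIES = {
    "DIM_1": [],
    "DIM_2": [],
    "FACT_1": ["DIM_1", "DIM_2"],
    "FACT_2": ["DIM_2"],
}

def run_with_dependencies(dependencies, run_step, max_workers=4):
    """Start each step as soon as all of its own dependencies have finished.

    Returns, for every step, the set of steps that had already finished
    at the moment the step was started.
    """
    finished = set()
    started_after = {}  # step name -> steps finished when it was started
    running = {}        # future -> step name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(finished) < len(dependencies):
            # Launch every step whose dependencies are all satisfied.
            for step, deps in dependencies.items():
                if step not in started_after and all(d in finished for d in deps):
                    started_after[step] = set(finished)
                    running[pool.submit(run_step, step)] = step
            if not running:
                raise ValueError("circular or unsatisfiable dependencies")
            # Block until at least one running step completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any error from the step
                finished.add(running.pop(future))
    return started_after
```

With this map, FACT_2 is submitted as soon as DIM_2 completes, even if DIM_1 is still running, while FACT_1 is held back until both dimensions are done.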
The problem is that a load plan in ODI can’t be set up like this.
In the ODI load plan, both fact tables need to wait for both dimensions, which are loaded in parallel, to finish loading. Only then can the fact tables themselves be loaded in parallel.
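A small calculation shows what this costs. The sketch below compares finish times when all facts wait for all dimensions with finish times when each fact waits only for its own dimensions. The durations are invented numbers in minutes, not measurements, and the function names are my own.

```python
# Hypothetical durations in minutes; not measurements from a real system.
DURATION = {"DIM_1": 30, "DIM_2": 10, "FACT_1": 20, "FACT_2": 20}
DIMENSIONS = ["DIM_1", "DIM_2"]
DEPENDS_ON = {"FACT_1": ["DIM_1", "DIM_2"], "FACT_2": ["DIM_2"]}

def finish_times_all_wait(duration, dimensions, facts):
    """Every fact starts only after the slowest dimension has finished."""
    all_dims_done = max(duration[d] for d in dimensions)
    finish = {d: duration[d] for d in dimensions}
    for fact in facts:
        finish[fact] = all_dims_done + duration[fact]
    return finish

def finish_times_per_step(duration, dimensions, depends_on):
    """Every fact starts as soon as its own dimensions have finished."""
    finish = {d: duration[d] for d in dimensions}
    for fact, deps in depends_on.items():
        finish[fact] = max(finish[d] for d in deps) + duration[fact]
    return finish

all_wait = finish_times_all_wait(DURATION, DIMENSIONS, list(DEPENDS_ON))
per_step = finish_times_per_step(DURATION, DIMENSIONS, DEPENDS_ON)
print(all_wait["FACT_2"], per_step["FACT_2"])  # 50 30
```

With these assumed numbers, Fact 2 finishes 20 minutes earlier under per-step dependencies, while Fact 1 finishes at the same time in both cases because it really does need both dimensions.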
This inefficiency in parallelism is typically no problem when you are loading in nightly batch windows and dealing with just a hundred or so scenarios. However, with more scenarios or tighter load windows the unnecessary waits add up. Another feature that I am missing is a way to limit the number of scenarios that execute at the same time within a parallel step. As an example, you may have ten scenarios in a parallel step but only want five of them to execute at any one time, e.g. because of resource constraints on the database server. Once one of the running scenarios has finished, one of the remaining five kicks in, and so on.
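The throttling behaviour I have in mind can be sketched with a bounded worker pool. Again, this is plain Python for illustration rather than anything ODI provides; `run_limited` and `run_scenario` are hypothetical names, and `run_scenario` stands in for the call that executes one scenario.

```python
from concurrent.futures import ThreadPoolExecutor

def run_limited(scenarios, run_scenario, max_parallel=5):
    """Run all scenarios, never more than max_parallel at the same time."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() queues every scenario up front; a queued scenario starts
        # as soon as one of the max_parallel workers becomes free.
        return list(pool.map(run_scenario, scenarios))
```

Called with ten scenarios and `max_parallel=5`, the first five start immediately and each of the remaining five starts when a running one completes.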
We have a custom-developed solution that has all of the above features and a lot more, and that integrates seamlessly with ODI. Get in touch to find out more.