Under the hood of the Sunopsis Memory Engine. Part 1.
Get our e-books Discover the Oracle Data Integrator 11g Repository Data Model and Oracle Data Integrator Snippets and Recipes
Oracle Data Integrator not only uses hsqldb for its demo environment. It is also used as the in memory database for the so called Sunopsis Memory Engine.
“It is possible to run HSQLDB in a way that the database is not persistent and exists entirely in random access memory. As no information is written to disk, this mode should be used only for internal processing of application data, in applets or certain special applications. This mode is specified by the mem: protocol”. (hsqldb documentation)
ODI uses version 1.7.3 of hsqldb. You can download this version together with the documentation from here.
When you use the Sunopsis Memory Engine in an ODI interface or procedure an hsqldb in-memory database is created inside the JVM of the agent. The memory that is allocated to the hsqldb is taken from the JVM. As a result the maximum memory size that can be allocated to the in memory database is determined by the heap size parameter in the odiparams.bat file.
The JDBC Url syntax that is used in ODI is actually the old syntax as can be seen from the figure below.
The correct syntax for the above JDBC Url should be jdbc:hsqldb:mem:<name_of_db>. As you can see you can specify the name of the database. This allows you create multiple hsqldb databases inside the same JVM.
“Version 1.7.2 introduces a uniform method of distinguishing between different types of connection, alongside new capabilities to provide access to multiple databases. The common driver identifier is jdbc:hsqldb: followed by a protocol identifier (mem: file: res: hsql: http: hsqls: https:) then followed by host and port identifiers in the case of servers, then followed by database identifier” (hsqldb docs).
When you create an hsqldb in-memory database it persists over the lifetime of the JVM, until the agent is stopped, or if you use the local agent until you close down Designer. Alternatively you can issue a Shutdown command against the hsqldb database. Any of the above will destroy objects that were created and will release the allocated memory. It is important to understand that memory is not released upon execution of an interface or when an ODI session ends. Tables or other objetcs that were created in an ODI package persist across the lifetime of this package. This is quite different to the lifecycle of say a Jython/Java BeanShell variable. You need to make sure that you clean up after yourself by e.g. dropping any objects you create.
As far as I know you can’t connect from an hsqldb mem database inside a JVM to one in another JVM.
“In-process (standalone) mode supports connections from the client in the same Java Virtual Machine, while server modes support connections over the network from several different clients” (hsqldb docs).
This means that you can’t use a client such as Squirrel SQL to connect to tables that you create in an interface. What you can do however, to debug is to manually create datastores in a Sunopsis Memory Engine model and use the local agent to execute interfaces that create and populate this table. You can then use the View Data functionality to view the content of the datastore. This is also documented in note 424597.1.
When does it actually make sense to use the hsqldb in mem database? I can see two scenarios where this feature may be useful.
First, it can be used for simple low volume transformations (less than 100K records; exact number really depends on the parameters of your own environment such as JVM heap size, concurrency etc.). This can be particularly useful for temporary interfaces if you don’t want to physically materialize the data on disk. Remember, however, that it is not the purpose of the ODI agent that runs the in mem database to do a lot of heavy lifting. This would not make any sense in an ELT tool where all the heavy duty stuff is done by the target RDBMS.
You need to be careful not to run out of memory when you use the in mem database for this purpose. I will show you in the second part of this series what best practices you can use to achieve this. I am not sure how efficient hsqldb is in executing complex multi-table joins with subqueries. Currently it also does not support analytic functions, which in my opinion is a big disadvantage.
Secondly, the Sunopsis Memory Engine can be used to persist variables across sessions and scenarios or to populate ODI variables from Jython or the Java BeanShell. This is handy as all your logic is executed inside the JVM of the agent and you don’t have to switch back and forth between different technologies. As a result you will keep overhead to a minimum. We will have a look at how this can be achieved in the third part of this series.
ODI Training. Learn ODI from the experts.
You may also be interested in our ODI training courses. Value-priced. Customised. Onsite-Offsite. Online. Get all the details from our ODI training site