In this example we will read XML data from a table with an XMLTYPE column in the Oracle database and convert the XML to tables in the same Oracle database.
We will use Flexter to extract the data to and from the same database in this example.
We need to give Flexter table and column name that contains the XML data.
JDBC input sources require us to include the input path with a jdbc: protocol.
[flexter_banner]

$ xml2er -T <[[Catalog.]Schema.]Table]> -C <XML Column> -U <user> -P <password> [OPTIONS] jdbc:<database url>
# or using a sql query
$ xml2er -T "<SQL QUERY>" -C <XML Column> -U <user> -P <password> [OPTIONS] jdbc:<database url>

Example:

$ xml2er -T tab -C xml_col -U rob -P rob_pass [OPTIONS] jdbc:oracle:thin:@//d.org.io:<port-no>
# or using a sql query
$ xml2er -T "select xml_col from tab" -C xml_col -U rob -P rob_pass [OPTIONS] jdbc:oracle:thin:@//d.org.io:<port-no>

This command will create one connection and against the source database and retrieve the data sequentially.
To read in parallel with multiple threads (available on a single machine as well), a sequential ID column name and its range must be provided. The options which we are using are given in the below table

-T, --table TABLE               Table or SQL which brings XML data
-C, --column COLUMN             Column which brings XML data
-I, --part-in-col PART_COL      Partition or primary key column
-<MIN>, --part-in-min MIN       Lower bound to retrieve rows
+<MAX>, --part-in-max MAX       Upper bound to retrieve rows

We have a table “pnr_xml_msg” which has xml content in the column “source_data”. The Primary Key is seq_nbr. Let’s take a look at the table and the xml content that will parse with Flexter.
We can see that the table contains the “source_data” column of XMLTYPE. This column contains the xml data.
[flexter_button]
 

We then try to generate a logical schema based on the xml content in the column

# running the command with -s option
$ xml2er -s \
      -T "select source_data,seq_nbr from pnr_xml_msg where seq_nbr between 495301 and 495321" \
      -C source_data \
      -I seq_nbr -495301 +495321 \
      -U <source username> \
      -P <source password> \
        jdbc:oracle:thin:@//d.org.io:<port-no>

If the flexter command works fine, we can proceed with generating the logical schema for the xml content in the table

# generating the logical schema for the xml content
$ xml2er -g3 \
          -T "select source_data,seq_nbr from pnr_xml_msg where seq_nbr between 495301 and 495321" \
          -C source_data \
          -I seq_nbr -495301 +495321 \
          -U <source username> \
          -P <source password> \
            jdbc:oracle:thin:@//d.org.io:<port-no> \
# schema
         origin:  22
        logical:  7
            job:  34
# statistics
        startup:  3699 ms
          parse:  4633 ms
          stats:  6065 ms
            map:  1677 ms
  unique xpaths:  227

Here, we are using -U and -P parameter as the input username and password to the jdbc URI

-U, --iuser USER                Input user
-P, --ipassword PASSWORD        Input password
-F, --input-format FORMAT       Input format. (jdbc,parquet,orc,json,csv, tsv)

Now as we have generated the logical schema we can extract the data to the output jdbc connection. We can use the following options to generate the output as desired

-o, --out OUTPUT                Output location path
-u, --ouser USER                Output user
-p, --opassword PASSWORD        Output password
-B, --batchsize BATCHSIZE       Batch size to write into databases
                                  default: 1000
-f, --format FORMAT             Output format. (jdbc, parquet, orc, json, csv, tsv)
-z, --compression COMPRESSION   Parquet, csv, tsv compression mode
                                  (none, snappy, gzip, lzo, lz4, bzip2, xz)
-S, --savemode SAVEMODE         Save Mode when table, directory or file exists
                                  ex: [e]rror, [a]ppend, [o]verwrite, [i]gnore
                                  default: append

Initially we use the the primary key to partition the data we are extracting using the -I option, We can also parse and extract the entire XML content if we skip the -I option

$xml2er -l7 \
           -T "select source_data,seq_nbr from pnr_xml_msg where seq_nbr between 495301 and 495321" \
           -C source_data \
           -U <source username> \
           -P <source password> \
             jdbc:oracle:thin:@//d.org.io:<port-no> \
           -o jdbc:oracle:thin:@//d.org.io:<port-no> \
           -u <target username> \
           -p <target password>
# output
           path:  jdbc:oracle:thin:@//d.org.io:<port-no>
           user:  <target username>
       password:  ***
         format:  jdbc
# schema
         origin:  22
        logical:  7
            job:  35
...
16:37:10.426 INFO  table EMD_Info: creating
16:37:11.337 INFO  table EMD_Info: writing
16:37:14.849 INFO  table Ticketing: creating
16:37:15.144 INFO  table Ticketing: writing
...
# statistics
        startup:  4097 ms
           load:  9211 ms
          parse:  1693 ms
          write:  54730 ms
          stats:  1526 ms
            map:  4 ms
  unique xpaths:  227

In the above output we can see that the Flexter is writing the tables in the target db as shown in the output. The XML content parsed can be seen in the form of tables in the given jdbc path.
We can see that the parsed tables are created in the target db as per the jdbc path.

We can run basic queries to see if the data is parsed correctly

sql> describe tax;
sql> describe ticketing;
sql> describe airtraveler;
sql> select * from tax limit 5;
FK_PTC_FAREBREAKDOWN   AMOUNT 	  CURRENCYCODE   	TAXCODE
0000000000000000082	0.91	    EUR	       EX
0000000000000000012	6.5	    EUR	       HB
0000000000000000006	21.06	    CHF	       UP
0000000000000000012	16.04	    GBP	       UB
...
sql> select * from <table-name> where <condition>
...

[faq_button]