Converting Financial Information eXchange XML (FIXML) to MS SQL Server
FIX is a messaging standard used for trades in the financial services industry. The FIX Trading Community is the independent body behind the FIX messaging standard. FIX messages are encoded in XML. The FIXML schema files (XSD) together with the specification are available for download from the FIX website.
In this blog post we will use the Flexter XML converter to parse a FIXML message with an embedded FpML message to MS SQL Server.
Flexter, our ETL tool for XML, can generate a target schema from an XML file or a combination of XML and XML schema (XSD) files. In our example we process data based on the FIXML schema. Both the XML files and the XSD are available and we use the information from both files to generate the target schema in the SQL Server database.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
# We first test that the XML file is well formatted by simulating the execution the skip switch (-s) $ xml2er -s cme_fix_trdcaprpt.xml # Next we extract the statistics from cme_fix_trdcaprpt.xml. Statistics are used to generate the target schema. We use the xml2er command line tool without the skip (-s) switch. $ xml2er cme_fix_trdcaprpt.xml … # The result of this operation is an ID (origin: 5). We will use this ID in subsequent steps origin: 68 job: 176 # Some useful execution statistics startup: 4994 ms parse: 760 ms stats: 6401 m Map: 2 ms |
Now that we have gathered statistics from our XML sample we can generate the logical target schema with the xsd2er command line tool using the -k switch (-k is shortcut for –use-stats), where we will use the origin id generated in the previous step.
1 |
-k, --use-stats <ID[,ID2..]> Use the stats to generate the new schema |
Let’s go through the steps
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
# Template $ xsd2er -s -k<XML Schema ID> -g<Optimization Level> INPUTPATH # We first simulate generating the target schema with -s skip switch # Flexter can read through the xsd files stored in zip format $ xsd2er -s -k68 -g3 fixmlschema_FIX.5.0SP2_EP228.zip # everything worked. Now running the command for real without skip $ xsd2er -k68 -g3 fixmlschema_FIX.5.0SP2_EP228.zip … # schema origin: 69 logical: 45 job: 177 # statistics startup: 514 ms stats: 65 ms parse: 870 ms build: 319 ms write: 49 ms map: 314 ms |
Now we use the Logical Schema ID (Logical:45 and origin: 68) to convert the XML data.
1 2 |
# First simulating the conversion process $ xml2er -s -l45 cme_fix_trdcaprpt.xml |
When the command is ready, removing –skip or -s, allows us to process the data. We can directly send the parsed output to our SQL Server target database, which we will create in the below steps.
[flexter_button]
To install and configure SQL Server on Linux, we will proceed as below and run the following commands in a terminal to install the mssql-server package.
1 2 3 4 5 6 7 8 9 10 11 12 |
# Importing the public repository GPG keys: wget -qO- https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add - # Registering the Microsoft SQL Server repository (for Ubuntu) sudo add-apt-repository "$(wget -qO- https://packages.microsoft.com/config/ubuntu/16.04/mssql-server-2017.list)" # Run the following commands to install SQL Server: sudo apt-get update sudo apt-get install -y mssql-server # Finally we run mssql-conf setup and follow the prompts to set the password sudo /opt/mssql/bin/mssql-conf setup # To verify that the service is running: systemctl status mssql-server … |
To create a database, you need to connect with a tool that can run Transact-SQL statements on the SQL Server. The following steps install the SQL Server command-line tools: sqlcmd and bcp.
Use the following steps to install the mssql-tools on Ubuntu.
1 2 3 4 5 6 7 8 9 10 11 12 |
# Update the sources list and run the installation command with the unixODBC developer package. sudo apt-get update sudo apt-get install mssql-tools unixodbc-dev sudo apt-get install mssql-tools # Add /opt/mssql-tools/bin/ to your PATH environment variable in a bash shell.To make sqlcmd/bcp accessible from the bash shell for login sessions, modify your PATH in the ~/.bash_profile file with the following command: sudo apt-get update sudo apt-get install -y mssql-server # Finally we run mssql-conf setup and follow the prompts to set the password echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile #To make sqlcmd/bcp accessible from the bash shell for interactive/non-login sessions,modify the PATH in the ~/.bashrc file with the following command: echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc source ~/.bashrc |
The following steps use sqlcmd to locally connect to your new SQL Server instance. Run sqlcmd with parameters for your SQL Server name (-S), the user name (-U), and the password (-P). In this tutorial, you are connecting locally, so the server name is localhost. The user name is SA and the password is the one you provided for the SA account during setup.
1 2 3 |
# First simulating the conversion process sqlcmd -S localhost -U flexter -P '<YourPassword>' # If successful, you should get to a sqlcmd command prompt: 1>. |
Guide to install the SQL Server for other distributions is given here. Also, we can use SQl Server docker version as given here.
Next, we will use sqlcmd to create a new database where we will export the data extracted using Flexter.
From the sqlcmd command prompt, paste the following Transact-SQL command to create a test database:
1 2 3 4 5 6 7 8 |
# Creating a target database for flexter output 1>CREATE DATABASE FlexterDB 2>SELECT Name from sys.Databases 3>GO dbo FlexterDB guest … |
Now as we can extract the data to the output SQL Server connection. We can use the following options to generate the output as desired
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
-o, --out OUTPUT Output location path -u, --ouser USER Output user -p, --opassword PASSWORD Output password -B, --batchsize BATCHSIZE Batch size to write into databases default: 1000 -f, --format FORMAT Output format. (jdbc, parquet, orc, json, csv, tsv) -z, --compression COMPRESSION Parquet, csv, tsv compression mode (none, snappy, gzip, lzo, lz4, bzip2, xz) -S, --savemode SAVEMODE Save Mode when table, directory or file exists ex: [e]rror, [a]ppend, [o]verwrite, [i]gnore default: append |
Initially we use the the primary key to partition the data we are extracting using the -I option, We can also parse and extract the entire XML content if we skip the -I option. Here we will use the logical schema id 45, created in the previous step.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
$xml2er -l45 \ cme_fix_trdcaprpt.xml -o jdbc:jtds:sqlserver://localhost:5000/<Database-Name>\ -u <target username> \ -p <target password> # output path: jdbc:jtds:sqlserver://localhost:5000/<Database-Name>\ user: <target username> password: *** format: jdbc 16:43:26.254 INFO writing tables 16:43:28.619 INFO table FIXML: writing 16:43:29.669 INFO table Batch: writing 16:43:30.285 INFO table party: writing 16:43:30.571 INFO table swapStream: writing 16:43:31.191 INFO table RptSide: writing 16:43:31.410 INFO table Pty: writing 16:43:31.641 INFO table Sub: writing 16:43:31.844 INFO table RegTrdID: writing 16:43:32.040 INFO calculating statistics 16:43:33.640 INFO writing statistics 16:43:33.682 INFO table du_stat: writing # schema origin: 70 logical: 46 job: 179 # statistics startup: 5197 ms load: 7833 ms parse: 1754 ms write: 51619 ms stats: 1723 ms map: 3 ms |
As you can see Flexter is loading the FIXML output directly to the target database.
We can run basic queries to see if the data has been parsed correctly
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
# using the flexterdb which we created before to store the data 1>USE FlexterDB 2>GO # Displaying all the table names 3>SELECT TABLE_NAME FROM <DATABASE_NAME>.INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE = 'BASE TABLE' 4>GO fixml Batch party swapstream rptside pty sub Regtrdid # Describing the schema of a table 5>SELECT * FROM INFORMATION_SCHEMA.Columns WHERE TABLE_NAME= 'party' 6>GO FK_Batch id partyId # Listing distinct values of a column in a table 5>SELECT DISTINCT PARTYID FROM PARTY 6>GO A-50925 CME … |
We can run similar SQL Queries with the data to do basic data analysis with the data extracted from flexter. The text version of the tables are attached and available for download. Also, the ER Diagram for the tables is available for download.
[faq_button]