Converting Workday XML to Snowflake

Overview
In this post we will show you how you can easily convert Workday XML or any other XML to a Snowflake database. We will use Sonra’s Paranoid to mask the XML data and Flexter to convert Workday XML automatically without any manual coding.
Table of Contents
What is Flexter?
Flexter converts any XML/JSON to a readable format in seconds. Any type. Any size. Any volume. Any target
Projects that can take weeks or months, or never get finished, can be completed in a day or two with Flexter. Flexter requires no coding skills and is a totally automated way to un-silo industry-standard XML data and convert it into a readable database.
Flexter is unique in data management and data warehousing software in that it completely automates the process of data conversion.
Specialist skills are often required and need to be hired at costly rates.
That slows down the flow of information that data analysts need and, higher up the chain, it impacts important decision-making.
Flexter changes all that, freeing your time and budget up to focus on providing real value to the business and getting your data rapidly into a readable and analysable database format.
Without the need to write a single line of code.
What is Paranoid?
Paranoid is a powerful open source command line tool that masks and obfuscates XML and JSON files. It can target a specific xpath or mask entire files without any coding. It is best used in combination with Flexter.
What is Workday?
Workday is a cloud human resources management system. It combines human capital management, enterprise resource management, and financial management applications into one system. It is designed for medium to large companies thanks to its features:
- human resource management (HRM)
- talent management
- payroll management
- time tracking
- data analysis
- And more
What is Snowflake?
Snowflake is a cloud data platform for data warehousing, data lakes, data engineering, data science, data application development, and secure sharing and consumption of real-time / shared data. It is a fully managed solution provided as a SaaS, with support for ANSI SQL. It lets you focus on querying data, while reducing your time needed to invest into administration or DBA activities.
Masking Workday XML
We will mask Workday XML data with one quick command. If needed we can specify which xpath we can mask, but for this use case we will mask the entire XML file.
1 |
paranoid -i workday -o workday-out |
Converting Workday XML with Flexter
We will convert XML/JSON to Snowflake in a few simple steps.
Step 1 – Authenticate
Step 2 – Create a Data Flow from source file
Step 3 – Generate schema (target data model) / optional
Step 4 – Define your target connection, e.g. Snowflake
Step 5 – Process your XML/JSON data
Let’s go through these steps one by one and process our data.
Step 1 – Authenticate
To get an access_token you need to make a call to /oauth/token with Authorization header and 2 form parameters:
- username=YOUR_EMAIL
- password=YOUR_PASSWORD
1 2 3 4 5 |
faascli login --username “workday” --password “workday” Mar 21 2022 17:52:03 You have successfully logged into FaaS |
Step 2 – Create a data flow
The Data Flow creates the logical target schema and the mappings from your source XML or JSON elements to the table columns in your target schema.
For XML you can create a data flow from
- A source schema (XSD)
- A sample of source data (a sample of XML documents)
- A combination of source schema (XSD) and source data (a sample of XML documents).
For JSON you can create a data flow from a sample of source data only (a sample of JSON documents). We currently don’t support JSON schema.
1 |
faascli data_flow create_upload --name workday --source_data /home/user/paranoid/workday-out/payment_types_response.zip --data_type xml |
Output
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
{ "create_date": "2022-03-22T12:14:41.968+0000", "data_source_connection": { "create_date": "2022-03-22T12:14:41.576+0000", "name": "workday11647951281238", "path": "file-41d02f22-ec68-42e8-8a32-fe049cd09ea9", "size": 20055, "source_type": "uploaded_file", "type": "xml" }, "name": "workday", "status": "INITIALIZED" } |
Step 4 – Create target connection
Before you create a Conversion you need to create a Target Connection. A Target Connection is a reusable location for your target schema.
1 |
faascli target_connection create snowflake --name workday --host sidano243.eu-central-1.snowflakecomputing.com --username workday --password "workday" --database Workday --warehouse WORKDAY --schema WORKDAY --role SYSADMIN |
Output
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
{ "create_date": "2022-03-22T17:00:27.069+0000", "details": { "database": "WORKDAY", "role": "SYSADMIN", "schema": "WORKDAY", "user": "WORKDAY", "warehouse": “WORKDAY" }, "name": "workday", "path": "sidano243.eu-central-1.snowflakecomputing.com", "target_type": "snowflake" } |
Step 5 – Process your XML/JSON data
Once you have created a Data Flow you can use it to convert XML or JSON documents in a Conversion.
Mandatory parameters
The name of the Data Flow and the name of the Source Connection for the Conversion Data are mandatory parameters.
Optional parameters
The name of a Target Connection is optional. If you don’t provide a Target Connection, Flexter will generate a download link for your data in the response.
If you select a Target Connection of type S3 or of type Download you can also optionally provide the type of file format, i.e. CSV, TSV, Parquet, ORC, Avro.
If you don’t specify a file format for Target Connection of type S3 or Download, Flexter will generate TSV files as the default type.
There are also some other optional parameters that you can specify when you create a Conversion.
- Prefix. When you specify a prefix all of the target table names or file names will be prefixed with the value you specify.
- Suffix. When you specify a suffix all of the target table names or file names will be suffixed with the value you specify.
- By default a Conversion appends data to your Target Schema. You can optionally overwrite the data in your Target Schema when executing a Conversion.
The tables, columns, and relationships are generated in your Target Schema the first time you execute a Conversion.
You can re-use a Data Flow with different Target Connections when you create a Conversion, e.g. you can use the Data Flow to first convert data to Oracle and then to SQL Server.
1 |
faascli conversion execute_upload --name workday --source_data /home/user/paranoid/workday-out/payment_types_response.zip --source_data_type=xml --data_flow workday --target_connection workday |
Output
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 |
{ "create_date": "2022-03-23T08:20:44.301+0000", "data_source_connection": { "create_date": "2022-03-23T08:20:14.389+0000", "name": "workday1648023614009", "path": "file-5b07f689-b249-49df-b120-9f2760d2160b", "size": 20055, "source_type": "uploaded_file", "type": "xml" }, "dataflow": "workday", "name": "workday", "status": "INITIALIZED", "target_connection": { "create_date": "2022-03-22T17:00:27.069+0000", "details": { "database": "WORKDAY", "role": "SYSADMIN", "schema": "WORKDAY", "user": "WORKDAY", "warehouse": "WORKDAY" }, "name": "workday", "path": "sidano243.eu-central-1.snowflakecomputing.com", "target_type": "snowflake" } } |
Conclusion
We have masked XML data and processed it with ease. We did in a matter of minutes what would normally take a few days.
Ralph Kimball the father of dimensional modelling and data warehousing already knew:
“Because of such inherent complexity, never plan on writing your own XML processing interface to parse XML documents.
The structure of an XML document is quite involved, and the construction of an XML parser is a project in itself—not to be attempted by the data warehouse team.”
Our enterprise edition can be installed on a single node or for very large volumes of XML on a cluster of servers.
You can also request a demo of Flexter or reach out to us directly.
Enjoyed this post? Have a look at the other posts on our blog.
Contact us for Snowflake professional services.
In this video, we use Flexter to automatically convert very complex FpML XML to Snowflake tables. Book a demo to see the power of Flexter in action!