Flexter fully automates the conversion of XML/JSON files to a relational format. No custom development is needed. This can save up to 80% of the overall conversion costs. You don’t need to hire external consultants with XML/JSON expertise. As one customer put it recently: “You did in one day what would have taken us a year.”
Flexter can process data in real time. As soon as an event happens in the real world it can be analysed instantly with the help of Flexter.
Flexter eliminates project risk. We have seen many XML/JSON conversion projects fail. The failure rate grows exponentially with the complexity of the XML/JSON and the volume of data that needs to be converted.
Flexter is big data ready. We have built scalability into Flexter from the ground up. Flexter can scale up and out across multiple servers. It can handle any volume of data and meet any SLA.
Flexter significantly shortens the duration of XML/JSON conversion projects. Developers can focus on data analytics tasks that add value to the business rather than having to wrangle with XML.
Flexter can meet any service level agreement. In tests it was an astonishing 800 times faster than competing solutions.
Flexter handles different versions of XML/JSON standards gracefully. You can compare versions of a standard and automatically generate upgrade scripts between them. No coding needed.
The Flexter platform consists of three pluggable modules:
Schema Analyser (xsd2er)
Mapping generator (CalcMap)
XML Processor (xml2er)
Step 1: The Schema Analyser is a dedicated module that loads, parses, and stores the XML/JSON schema information in Flexter's internal metadata DB. This step needs to run only once for each schema. You can supply either an XSD or a representative sample of XML/JSON files for this step.
Step 2: Now that we know the exact layout of the source XML/JSON, it is possible to generate the relational equivalent. Flexter's Mapping Generator module generates the output schema layout and the mapping to it. Various optimisations can be applied during this step to make the target schema more compact.
Step 3: The XML/JSON Processor module takes the information generated from the two previous steps, processes the XML/JSON, and writes the data to the relational target schema.
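The three steps can be pictured with a toy pipeline. The sketch below is not Flexter's implementation and does not call xsd2er, CalcMap or xml2er. The function names (`analyse_schema`, `generate_mapping`, `process`), the sample document, and the rule "repeating elements become tables, single elements become columns" are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document used only for this illustration.
SAMPLE = """<orders>
  <order id="1"><customer>Ann</customer>
    <item sku="A1" qty="2"/><item sku="B7" qty="1"/></order>
  <order id="2"><customer>Bob</customer>
    <item sku="A1" qty="5"/></order>
</orders>"""


def analyse_schema(root):
    """Step 1 (toy): record how often each child repeats under one parent."""
    max_occurs = defaultdict(int)

    def walk(elem, path):
        counts = defaultdict(int)
        for child in elem:
            counts[child.tag] += 1
            walk(child, f"{path}/{child.tag}")
        for tag, n in counts.items():
            key = f"{path}/{tag}"
            max_occurs[key] = max(max_occurs[key], n)

    walk(root, root.tag)
    return dict(max_occurs)


def generate_mapping(max_occurs):
    """Step 2 (toy): repeating paths become tables, all others columns."""
    return {path: ("table" if n > 1 else "column")
            for path, n in max_occurs.items()}


def process(root, mapping):
    """Step 3 (toy): write rows, linking each child row to its parent row."""
    tables = defaultdict(list)

    def walk(elem, path, parent_pk):
        for child in elem:
            child_path = f"{path}/{child.tag}"
            if mapping.get(child_path) != "table":
                continue
            row = {"pk": len(tables[child.tag]) + 1, "parent_pk": parent_pk}
            row.update(child.attrib)  # attributes become columns
            for leaf in child:        # single-valued children become columns
                if mapping.get(f"{child_path}/{leaf.tag}") == "column":
                    row[leaf.tag] = (leaf.text or "").strip()
            tables[child.tag].append(row)
            walk(child, child_path, row["pk"])

    walk(root, root.tag, None)
    return dict(tables)


root = ET.fromstring(SAMPLE)
mapping = generate_mapping(analyse_schema(root))
tables = process(root, mapping)
```

In this sketch `tables["order"]` holds one row per order and `tables["item"]` one row per item, with `parent_pk` pointing back at the owning order. A real target schema would also need data types, naming rules and namespace handling, none of which are modelled here.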
Yes. Flexter supports very large XML files greater than 1 GB.
The core strength of ETL tools is to transform structured data and work with relational databases. They often struggle with semi-structured data in XML/JSON files. While most ETL tools offer functionality to handle simple and flat XML/JSON files at low volumes, they have serious limitations:
- They don’t automate the conversion process. ETL developers still need to create data flows (potentially hundreds for complex XMLs) and data pipelines. A significant development effort indeed.
- ETL tools don’t scale beyond a single server for XML/JSON processing.
- ETL performance for JSON/XML is poor. We have seen ETL processes run for 22 hours to process a batch of just 50K XML files.
- Most ETL tools can’t handle XML files in batches. They process XML/JSON files individually, which has a significant impact on performance.
Here are two blog posts where we compare Flexter against two popular ETL tools.
Yes. Flexter supports real-time use cases through its streaming engine.
Yes, we offer version control. With Flexter you can easily identify what has changed between different versions of your XMLs/XSD, e.g. which elements have been added or removed.
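The idea of spotting added and removed elements can be illustrated by comparing the element paths of two document versions. This is a minimal sketch of the concept, not how Flexter performs the comparison; `element_paths` and the two sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the set of element paths that occur in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(elem, path):
        paths.add(path)
        for child in elem:
            walk(child, f"{path}/{child.tag}")

    walk(root, f"/{root.tag}")
    return paths


# Hypothetical samples of an old and a new version of the same message.
V1 = "<order><id>1</id><customer>Ann</customer><fax>123</fax></order>"
V2 = "<order><id>1</id><customer>Ann</customer><email>a@example.com</email></order>"

old_paths, new_paths = element_paths(V1), element_paths(V2)
added = sorted(new_paths - old_paths)      # paths only in the new version
removed = sorted(old_paths - new_paths)    # paths only in the old version
```

Here `added` contains `/order/email` and `removed` contains `/order/fax`. Comparing XSDs rather than samples would additionally need to account for type and cardinality changes.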
We support both individual XML/JSON files and batches of XML/JSON files in archives and compressed formats (zip, gzip etc.).
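To show what batch input from an archive looks like in general, here is a small sketch that reads every XML member of a zip file in one pass using Python's standard library. It is an illustration only; `iter_xml_from_zip` and the in-memory archive are assumptions, not part of Flexter.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def iter_xml_from_zip(zip_bytes):
    """Yield (member name, parsed root) for every .xml member of a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith(".xml"):
                yield name, ET.fromstring(archive.read(name))


# Build a small archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("a.xml", "<order id='1'/>")
    archive.writestr("b.xml", "<order id='2'/>")
    archive.writestr("readme.txt", "not xml")

batch = [(name, root.get("id")) for name, root in iter_xml_from_zip(buffer.getvalue())]
```

The non-XML member is skipped, and the two documents are handled as one batch rather than as two separate jobs.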
We can pull XML/JSON files from network drives, (S)FTP servers, HDFS, S3, XMLTYPE/CLOB/BLOB, BJSON in databases etc.
Converting GS1 XML to AWS S3
We support any relational databases, e.g. Oracle, MS SQL Server, DB2, PostgreSQL, MySQL, Redshift, Snowflake, BigQuery etc.
We support comma separated and tab separated files as output.
When we generate the target schema we also provide various optional optimisations, e.g. we can influence the level of denormalisation of the target schema and we may optionally eliminate redundant reference data and merge it into one and the same entity.
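One way to picture merging redundant reference data into a single entity is shown below. This is a sketch under assumptions, not Flexter's optimisation logic: `merge_reference_data`, the `ref_id` column, and the sample rows are hypothetical, and the example works on already-flattened rows rather than on a schema.

```python
def merge_reference_data(rows, ref_fields):
    """Move repeated reference values into one shared entity keyed by ref_id."""
    entity = {}  # reference value tuple -> surrogate key
    facts = []
    for row in rows:
        ref = tuple(row[field] for field in ref_fields)
        key = entity.setdefault(ref, len(entity) + 1)
        fact = {k: v for k, v in row.items() if k not in ref_fields}
        fact["ref_id"] = key
        facts.append(fact)
    ref_table = [dict(zip(ref_fields, ref), id=key) for ref, key in entity.items()]
    return facts, ref_table


# Hypothetical flattened rows with repeated country/currency values.
rows = [
    {"order": 1, "country": "IE", "currency": "EUR"},
    {"order": 2, "country": "IE", "currency": "EUR"},
    {"order": 3, "country": "US", "currency": "USD"},
]
facts, ref_table = merge_reference_data(rows, ["country", "currency"])
```

The three input rows share two distinct country/currency pairs, so `ref_table` has two rows and each fact row carries only a `ref_id` pointing at one of them.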
If you use only an XSD to generate the target schema, all of the possible XPaths will be translated into the target schema. As a result, the target schema is more verbose and complex. If the XML files you process conform to your standard, you should not get any warning messages.
However, we often see that XSD designers have been sloppy and do not properly define relationships, cardinality etc. in the XSD. For those scenarios it's best to use both the XSD and XML. For gaps and sloppy design in the XSD we override the schema with the stats from the XML sample.
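The principle of overriding a loose XSD with sample statistics can be sketched as follows. This is an illustration under assumptions rather than Flexter's algorithm: the declared cardinalities are given as a hand-written dictionary (`DECLARED_MAX`, where `None` stands for an unbounded `maxOccurs`) instead of being parsed from a real XSD, and paths are simplified to `parent/child`.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical declared cardinalities; None means "unbounded" in the XSD.
DECLARED_MAX = {"order/customer": None, "order/item": None, "order/note": 1}

# Hypothetical representative XML sample.
SAMPLES = [
    "<order><customer>Ann</customer><item/><item/></order>",
    "<order><customer>Bob</customer><item/></order>",
]


def observed_max(samples):
    """Largest number of times each child occurs under a single parent."""
    result = {}
    for text in samples:
        for elem in ET.fromstring(text).iter():
            for tag, n in Counter(child.tag for child in elem).items():
                key = f"{elem.tag}/{tag}"
                result[key] = max(result.get(key, 0), n)
    return result


def effective_max(declared, observed):
    """Keep bounded declarations; fall back to sample stats for unbounded ones."""
    return {
        path: observed.get(path, 0) if limit is None else limit
        for path, limit in declared.items()
    }


effective = effective_max(DECLARED_MAX, observed_max(SAMPLES))
```

In this sample `customer` never repeats, so it could be stored as a column on the order even though the declaration left it unbounded, while `item` does repeat and would still need its own table. The result is only as good as the sample: a path that repeats in production but not in the sample would be modelled too narrowly.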
Converting FHIR JSON to CSV with Flexter
Converting Google Analytics JSON to S3 on AWS
Converting XML to TSV on HDInsight
Converting GS1 XML to S3 on AWS
Converting TVAnytime XML to Impala and Parquet
Convert ESMA XML to Snowflake
Convert FpML XML to BigQuery
Convert XML to AWS ATHENA
[Video] Convert ACORD Insurance XML to Talend Data Preparation
Tips for parsing and loading complex NDC XML files in ODI
Converting and Analysing EU Tender data in XML with Flexter and Dataiku DSS
Converting GS1 XML to S3 with Free Flexter
Converting Clinical Trials XML to PostgreSQL
Converting MedlinePlus XML to QlikView
Convert WITSML XML to Excel
Convert FIXML XML to Excel
Convert FHIR XML to Excel
Convert ISO 20022 XML to Text (TSV/CSV) without code
Flexter can be used to trickle feed XML/JSON files in real time to an analytics engine.
Flexter can be used for data exchange scenarios that require translation of XMLs/JSONs to a relational format.
Flexter can be used to migrate large volumes of historic XML/JSON files to a database.
Flexter can be used to migrate an XML database to a relational database.
https://sonra.io/2018/04/05/converting-xml-tsv-hdinsight/
Feature | Free | FaaS (API) | Enterprise |
---|---|---|---|
Data | |||
Data volume restrictions | 1 MB zipped (~10 MB raw) | Unlimited (1 GB raw per API call) | Unlimited |
Overwrite | |||
Append | |||
Formats (input) | |||
XML | |||
JSON | |||
TSV, CSV, PSV etc. | |||
RDBMS | |||
Databases (output) | |||
Teradata | |||
Oracle | |||
MySQL | |||
PostgreSQL | |||
Snowflake | |||
Redshift | |||
MS SQL Server | |||
Azure SQL Data Warehouse | |||
Query engines (output) | |||
Hive | |||
Impala | |||
File formats (output) | |||
CSV, TSV, PSV etc. | |||
ORC | |||
Parquet | |||
Avro | |||
API | |||
API | |||
Webhooks | |||
Optimisations | |||
Elevate | |||
Reuse | |||
Naming | |||
Schema generation | |||
JSON sample | |||
XML sample | |||
XSD | |||
XSD + XML sample | |||
RDBMS | |||
Text | |||
Sources | |||
FTP, SFTP | |||
S3 | |||
Azure BLOB Storage | |||
HTTP(S) | |||
HDFS | |||
MapR-FS | |||
Network drive | |||
Local file system | |||
Upload | |||
Access to Metadata | |||
Metadata repository | |||
Data lineage | |||
ER diagram | |||
Platforms | |||
Hadoop | |||
Openshift | |||
Nomad | |||
Spark | |||
Stand-alone server | |||
Merge | |||
Merge of output files (small files problem) | |||
File format conversion | |||
Location | |||
Cloud | |||
On-premise | |||
Support | |||
Phone | |||
Price | Free | Contact Us | Contact Us |
Which data formats apart from XML also give you the heebie jeebies and need to be liberated? Please leave a comment below or reach out to us.