XML,

XSD to Database Schema Guide – Create SQL Tables Easily

Maciek
by Maciek

Maciek is the Co-founder of Sonra. He has a knack for turning messy semi-structured formats like XML, JSON, and XSD into readable data. With a brain wired for product and data architecture, Maciek is the magic ingredient to making sure your systems don’t just work—they shine.


Published on September 5, 2024
Updated on November 25, 2024

Both databases and XML have a concept of schemas. In databases the schema is defined by Data Definition Language in XML it is defined by XML Schema Definition.

Schemas whether used in XML or databases are needed for defining the structure, ensuring the validity, promoting interoperability, providing documentation, and maintaining the integrity and security of data across different systems and environments.

XML and databases serve different purposes when working with data. While XML is frequently used for data exchange, databases excel at transaction processing and analytics. Sometimes we need to convert from one format to another. As an example we use XML for data exchange between companies. We now want to report on this data. XML and XQuery are not a great fit for querying this data whereas databases and SQL are a perfect match. For this reason we need to convert the XSD schema to a database schema.

Creating a database from XSD converts the XSD to table structures represented as DDL in a database.

It involves the following steps:

  • Analysis: Read and analyse the XSD
  • Conversion: Convert XSD to SQL table structures and create table DDL
  • Documentation I: Create an ER diagram to view the XSD as database tables
  • Documentation II: Create a mapping document that maps the XPaths to database tables and columns. This is aka Source to Target Map. You map the XSD source elements to the target table columns in the database schema.

In this blog post we cover these steps in detail and show you step by step how to create a database schema from an XSD.

You can perform these steps manually but the whole process can also be automated using an XSD to database schema tool.

One such tool is Flexter. We show you how to use Flexter to create SQL tables from an XSD. You can then query those tables with SQL.

With Flexter you have three options to generate an SQL schema from XSD.

Option 1: You can create the database schema just from an XSD

Option 2: You can create the database tables from an XSD and a sample of XML files

Option 3: If you don’t have an XSD you can create the database from a sample of XML files only

Each of these options has advantages and disadvantages. We will go through them and walk you through the process one step at a time.

But first let’s take a step back and give an overview of schemas, the way they are used in XML and databases, and why we need to convert between them.

In a rush or hurry? Here are the key takeaways

  • Both databases and XML utilise the concept of schemas. In XML a schema is optional.
  • In databases a schema is defined by Data Definition Language (DDL). In XML it is defined by XML Schema Definition (XSD).
  • Sometimes we need to convert from an XSD to a database schema as these formats serve different purposes.
  • You can manually convert an XSD to database tables or use an XSD to SQL tables tool to automate the process. We have reviewed Flexter and XSD2DB to automate the schema generation from XSD only.
  • Sometimes, converting an XSD to a database isn’t possible because XSDs have features that provide significant flexibility, but these features don’t always have direct equivalents in database structures.
  • For those scenarios you can use a combination of XSD and XML samples to create the database schema. Flexter can generate SQL tables from both XSD and XML.
  • Using a combo of XSd and XML has the additional advantage that you only convert those parts of the XSD that are actually in use.

Overview of database and XML schema

What is a schema?

A schema is a blueprint or structure that defines the organisation, format, and constraints of data in a database or XML document.

Database schema (DDL) vs. XML schema (XSD)

In databases, a schema defines the structure of the data, specifying the tables, columns, data types, constraints and relationships between tables. Data that is loaded to tables needs to conform to this predefined format.

Similarly in XML, a schema (XSD) defines the structure of an XML document, specifying which elements and attributes are allowed, their data types, and their relationships. This ensures that XML documents conform to a predefined format.

In databases a schema is mandatory. In XML it is not.

In databases, we use Data Definition Language (DDL) to create schemas that define the structure of the data. Similarly, in the XML world, we use a language called XML Schema Definition (XSD) to define the structure and rules for XML documents.

XSD for XML schema

Let’s go through an example of an XSD. We took the CBAM XSD which is available online. CBAM stands for Carbon Border Adjustment Mechanism. It is a policy tool designed to address carbon leakage by placing a carbon price on imports of certain goods into the EU. It requires companies to submit data in XML based on the CBAM XML schema.

Let’s have a look at one of the elements inside the XSD. ImportedQuantityType is a complex type.

This complex type is made up of multiple elements that in turn refer to other simple and complex types.

The element with name SequenceNumber is defined by a simple type with name n..5

The data type for n..5 is an integer and it has a constraint or in XSD parlance a restriction of pattern \d{1,5}. This regular expression pattern means that the values must NOT be NULL and at a minimum have 1 and at a maximum have 5 digits.

The element with name ImportArea is based on a complex type ImportAreaType, which itself is based on a simple type an..5 of data type string. The constraint <xs:minLength value=”1″/> specifies that the minimum length of the string must be 1 character. This means that the string cannot be empty (NOT NULL); it must contain at least one character.

The constraint <xs:maxLength value=”5″/> specifies that the maximum length of the string is 5 characters. The string cannot have more than 5 characters.

Simple type vs. complex type in XSD

Simple types contain only text and no child elements or attributes. Simple types are primarily used for defining the data types of elements and attributes, such as strings, integers, dates, or custom data types like enumerations.

 

Complex types can contain child elements, attributes, or a mixture of text and child elements.

They are used when the XML element requires a more detailed structure, including nested elements or attributes.

Let’s have a look at how this complex type in the XSD can be represented as table DDL in a database schema

Data Definition Language (DDL) for databases

We converted the complex type ImportedQuantityType to an SQL table defined by DDL.

In the figure below I have highlighted various elements that make up the structure of the table and define the schema.

A data type defines the kind of data that can be stored in a particular column of a database table. It determines the type of values the column can hold, such as integers, floating-point numbers, characters, strings, dates, or binary data. In our example we have data types NUMBER (38,0), e.g. SEQUENCENUMBER and VARCHAR of different lengths, e.g. IMPORTAREA_IMPORTAREA.

A primary key is a column (or a combination of columns) in a database table that uniquely identifies each row in that table. The primary key constraint enforces uniqueness and ensures that no duplicate or NULL values are allowed in the primary key column(s). In our example DDL the primary key is the column PK_IMPORTEDQUANTITY.

A foreign key is a column (or a set of columns) in one table that refers to the primary key in another table. The foreign key constraint enforces a link between the two tables, ensuring referential integrity by making sure that the value in the foreign key column(s) matches an existing value in the primary key column of the referenced table. In our example DDL we have a foreign key constraint named FK_IMPORTEDGOOD. It links the table IMPORTEDGOOD to table IMPORTEDQUANTITY

The NOT NULL constraint ensures that a column cannot have a NULL value. When this constraint is applied to a column, every row in the table must contain a value for that column. PROCEDURE_REQUESTEDPROC is an example of a column with a NOT NULL constraint.

XSD to Entity-Relationship (ER) Diagram

An Entity-Relationship (ER) diagram is a visual representation of the structure of a database. It illustrates how entities (such as tables) in the database relate to one another.

The challenging part is mapping and converting the XSD to the database’s DDL. However, creating an ER diagram from the database schema is much easier, especially if you have a data modelling tool. With such a tool, you can easily reverse-engineer the DDL into a visually appealing ER diagram that serves several important purposes.

  • Communication Tool: ER diagrams are used to communicate the database design to stakeholders, including developers, business analysts, and project managers. They provide a clear and understandable representation of the database structure.
  • Problem Solving: ER diagrams can help in identifying and resolving issues in the database design by visualising the relationships and dependencies between different parts of the data model.
  • Training and Onboarding: ER diagrams are valuable for training new developers and database administrators, as they offer a straightforward way to understand the structure and relationships within the database.

We created the DDL from the CBAM XSD using Flexter. You can download the DDL from our website. We used DBeaver, a popular open source database tool that you can also use to reverse engineer

Step 1: Reverse engineer DDL

We collected the DDL from the database

Step 2: Create the ER diagram

To create the ER Diagram in DBeaver we just right click on a schema and select “View Diagram”

Step 3: Adapt layout

DBeaver also has an option to change the colours of tables and notations. You can also use the auto layout option to format the diagram..

XSD to Avro schema

With the rise of Big Data and Hadoop a wave of new data serialisation formats emerged. One such format is Avro. Avro is a row based data serialisation format. It is schema-based, meaning that data is always serialised together with its schema. This makes it self-describing with support for schema evolution.

Here are some of the use cases for Avro

  • Data Storage: Storing large volumes of data in a row based format.
  • Data Transmission: Streaming data between systems with the need for fast serialisation and deserialisation.

Avro is not a good fit for transaction processing (OLTP) or data analytics workloads (OLAP). A database schema is a much better fit for these workloads

Avro schema example

Why do we create a database schema from an XSD?

The short answer is that each format serves a different purpose. Different systems and applications often use different formats to store and exchange data. XML is a common format for data exchange, especially in web services and business-to-business communication. Databases, on the other hand, are often the primary storage mechanism for structured data for transaction processing and data analytics.

XML based on an XSD can be difficult to query directly, especially when dealing with complex or nested structures. By converting this data into a relational database schema, you can use SQL queries on tables to perform reporting and analysis more efficiently.

For the long answer refer to the section Why do we convert XML? in our ultimate guide on XML conversion.

Step by step tutorial: Converting XSD to tables in a database

In this step-by-step tutorial, we will guide you through the process of converting an XML Schema Definition (XSD) into a set of tables within a database. This conversion is essential for integrating XML-based data structures into relational databases.

Step 1: Read and analyse the XSD

  • You need to read through the information that comes with the XSD. For simple schemas, this is easy, but for big schemas based on data standards like FpML, the documentation can be several hundred pages long.

Here is an example of the CBAM XSD documentation (the zip file includes both the documentation and XSDs). CBAM is a simple standard of the European Union for reporting on Carbon Border Adjustment Mechanism.

Extract from the CBAM documentation of a simple XSD

  • You will need to perform a structural analysis looking at element and attribute definitions including their types, default values, and use.

A simpleType inside file stypes.xsd of CBAM.

  • Examine simple and complex types defined in the XSD to understand their constraints (e.g., length, pattern, enumeration).

A complex type inside file qreport.xsd of CBAM

  • Understand the XML Schema data types (e.g., string, integer, dateTime) that are used.

String, date and dateTime data types inside stypes.xsd of CBAM

  • Examine type derivations by checking the usage of extensions and restrictions to derive new types from existing ones.
  • Review where elements and attributes are defined globally (available throughout the schema) versus locally (within a specific context).
  • Analyse namespaces.
  • Analyse the use of <include> and <import> to modularize the schema. Verify that external schemas are correctly referenced and integrated. Very large and complex XML Schema Definitions often contain multiple dependent XSD files. When analysing the XSD you need read all of these XSD files to get the full picture

<include> inside qreport,xsd that includes stypes.xsd of CBAM

  • Analyse constraints
    • Cardinality Constraints: Verify the correctness of cardinality constraints (minOccurs, maxOccurs) on elements.
    • Identity Constraints: Review key, keyref, and unique constraints to ensure they accurately define uniqueness and referential integrity within the XML documents.

Constraints inside styles.xsd of CBAM

  • Complexity Analysis: Evaluate the complexity of the schema, particularly recursive definitions and deeply nested structures
  • If the schema is part of a versioning system, analyse the changes between versions to ensure backward compatibility.

Version 18.20 of XSDs of CBAM, which might be different from previous versions

Step 2: Convert XSD to database schema

In this step you use the output of the analysis in the first step to model the tables in your database schema.

It typically involves the following tasks

  • Map XSD Attributes to SQL Columns: Convert attributes of elements into columns of the corresponding SQL tables. Suggested reading: Find out everything you need to know about XML mapping with our comprehensive guide!
  • Define Data Types: Map XSD Data Types to SQL Data Types: Match XSD data types (e.g., xs:string, xs:integer, xs:date) to corresponding SQL data types (e.g., VARCHAR, INT, DATE).
  • Handling Restriction and Constraints: If the XSD defines restrictions (e.g., string length) or constraints, apply these as constraints in the SQL schema, e.g. if the XSD contains a nillbale constraint define this as a NOT NULL constraint in the databases table.
  • Establish Primary and Foreign Keys
  • Handle Complex Types and Nested Structures
  • Create SQL Table Definitions (DDL – Data Definition Language): Include constraints such as NOT NULL, UNIQUE, and CHECK as necessary to enforce the rules identified in the XSD

Here is the DDL for the SQL tables that Flexer made from the CBAM XSD. As you can see it added the mappings from XPath to table columns in the DDL COMMENT.

Pro Tip: Trim your XSD to just what you need.

You can create a leaner database schema that is easier to maintain, performs better, and is more in line with your actual business needs by restricting the XSD to database conversion to the parts you actually use.

 

You can do this by examining the XML documents that conform to the XSD to see which elements and attributes are present. We show you how you can automate this process in a separate section and also outline the downside of this approach.

Step 3: Documentation. XSD to ER diagram and mapping

In the last step you need to document the database schema you created from an XSD in an ER diagram. Using a data modelling tool you can reverse engineer the DDL of the tables and view the XSD as an ER diagram.

We took the CBAM XSD and two CBAM sample XML files: The Quarterly Report sample file for representatives and the CBAM Quarterly Report sample file for importers. We analysed all the files, used Flexter to create a database schema from both the XSD and the XMLs. As you can see we just ended up with SQL tables for data that was actually used in the XML sample files.

 

If not already done you also need to create a mapping document that maps the elements in the XSD to the columns in the SQL tables.

Here is a data lineage diagram mapping the XPaths in the XML to the columns in the SQL tables.

Creating SQL tables with an XSD to database schema tool

When it comes to turning your XML/XSD into a relational schema, you’ve got a few paths you can take:

  • Option 1 – Straight from the XSD: This method covers all possible XPaths and XML elements, which is great for completeness but can lead to a massive relational target schema that might be overkill.
  • Option 2 – Best of both worlds: If you can, using both the XSD and some XML samples is a smart move. The XML samples act like a reality check for the XSD, helping you ditch the unused parts and keep what’s actually needed.
  • Option 3 – No XSD? No problem: Sometimes, you don’t have an XSD handy. In those cases, you can create a schema based on a sample of XML files.

We’re going to dive into each of these methods, from their pros to their cons, so that you can figure out which one works best for you. Throughout, we’ll also be taking you through how a dedicated XSD to database schema tool like Flexter can do the heavy lifting for you.

Option 1: XSD to database schema straight from the XSD

When you decide to use an XSD to build a database schema, there are some pros and cons to think about:

The good

  • Comprehensive coverage: Making a database schema from an XSD means you’re all set to handle any XML document that fits the XSD. All the possible data points from the input are mapped to the output with their cardinality and datatypes known. Note this remains true as long as the XML Source Data fully conforms to the XSD definitions used.
  • Speedy setup: You don’t need to have a representative sample of XML docs. You can jump straight to making your target schema without them, which is a time-saver.
  • Smart organisation: XSD to database schema can spot shared types across your data and put them together in one table in your target schema. This neat trick makes everything cleaner and simpler to manage.

The not-so-good

  • Overkill alert: If your XSD is one of those big ones used in industry standards, it’s probably covering a zillion business processes. But here’s the catch – your company is likely only using a slice of those. So, you end up with a target schema packed with tables and columns you’ll never use. It’s like having a toolbox where you only use the hammer.
  • For highly complex XSDs designed with extensive flexibility, it may not be feasible to generate a database directly from the XSD. This is because the possible combinations of XPaths can reach into the hundreds of thousands or even millions. For those scenarios you need to combine XML and XSD to generate a database schema.
  • Memory munchers: Got XPaths in your XSD that go on and on, looping back on themselves? Preparing those for your target schema can be a real memory hog.

How Flexter can help you

And how do we tackle creating a database schema from just an XSD with Flexter?

It’s a one-command wonder. You throw in some mandatory and optional parameters, and Flexter does its magic, crafting the database schema and mappings to SQL tables all by itself.

Let’s look at how we can create the database model with Flexter’s Elevate optimization (switch -g1 in the command line prompt). We took the XSD from CBAM and processed it with Flexter.

Example of output:

While analysing the XSD of CBAM 18.20, Flexter came across an issue.

While the element of type an..4000 is used inside the QReport_18.20.xsd it has not been defined. Flexter does not abort processing but sets the type anyType for this element. It sets the target column to VARCHAR with length NULL which will be saved as a LOB data type, e.g. a CLOB in Oracle.

At the end of the XSD analysis process, Flexter prints out the ID of the generated database schema. This ID can be used for converting XML to the database schema in a separate step.

Flexter also generates an ER diagram from the CBAM XSD.

Example of XSD to ER diagram created by Flexter

Flexter also automatically generates a source to target map. The source to target Map provides a detailed overview of how each element and attribute in the source XML document is mapped to the corresponding columns and tables in the relational database.

Pro tip: XSD to Avro schema?

In this blog post we have covered how to turn an XSD to SQL tables in a database. You can also use Flexter to convert an XSD to an Avro schema. In fact, Flexter also supports XSD to other data lake formats such as Parquet, ORC, and Delta Tables.

Option 2: XSD to database schema using XSD and XML

You can also create a target schema by combining the information from the XSD with the intel from the XML documents. This combo method is pretty neat because you get to merge insights from both the data and the schema.

The good

  • Comprehensive and informed: You get a super smart database schema, thanks to info from both sources.
  • Easy to navigate: The Target Schema gets streamlined. You can spice up the XSD info with what you learn from the XML data. This means you only bother with the XPaths that are actually being used inside the XML.
  • Loopy XPath expert: It’s great at handling those loopy XPaths in XSDs (deep nesting) and Substitution Groups using some stats magic. Plus, you can use Flexter’s reuse optimisations.

The not-so-good

  • Heavier prep work: You need to do some homework on your XML/JSON data and analyse how it fits in with your XSD. If getting a good mix and sample of XML files is a headache, you might have to use all your XML data.
  • You might be missing important XPaths that do not exist in your set of XML sample files, which may require refactoring and schema evolution at a later stage.

How Flexter can help you

Let’s have a look at how Flexter handles this.

First, Flexter gathers information from a sample of XML documents. You can repeat this step as many times as needed, feeding new samples of XML documents to Flexter each time.

Example of output:

Next we augment the information from the XML documents with intel from the XSD, combine the two and create the database schema.

We pass the ID of the schema created in the previous step together with the CBAM XSD files. We also apply Flexter’s Elevate optimisation (-g1 switch).

Example of output:

The generated database schema can be used in a Conversion task to convert XML documents to SQL tables.

Example of database schema using CBAM XSD and CBAM XML sample files:

XSD & XML to ER diagram online

You can use the free version of Flexter to create table structures online from an XSD and a sample of XML. The enterprise version can generate SQL from an XSD only. Please refer to the Flexter data sheet for a full set of features.

Option 3: Creating a database schema from XML only

You can create a database schema from XML documents. You don’t necessarily need an XSD.

For JSON files, that’s pretty much your go-to since few JSON documents come with a JSON schema.

You need a representative sample of XML documents that have coverage of all possible XPaths. Or, if you’re feeling thorough, just use all the XML or JSON docs you’ve got.

Let’s break down the highs and lows of using XML samples inside Flexter:

The good

  • XSD-independent: You can get your XML sorted without XSD.
  • JSON schema creation: You can create a target schema for JSON files as well.
  • Just the essentials: The metadata is super streamlined – only includes elements that are actually in your data.
  • Mismatch fixer: It may offer an alternative solution for scenarios where the XSD and XML samples don’t match, or have different versions.
  • Recursive depth analysis: It’s ideal for recursive XPaths to determine the depth of the hierarchy

The not-so-good

  • Limited functionality: You can’t apply Flexter’s reuse optimisation.
  • Homework required: You’ve got to gather some intel on your XML/JSON first, which means figuring out a good sample or, worst-case, using all your data.

How Flexter can help you

Let’s get hands-on with Flexter and create a target database schema from XML documents only.

Point Flexter to your sample of XML documents and it will analyse the sample to derive data types, relationships, constraints etc. and create an optimised relational SQL schema from the sample.

Example of output:

Flexter can convert an XML to ER diagram online.

XSD2DB: Creating a SQL Server database schema from XSD

XSD2DB is an open source XSD to database tool and command line utility used to convert an XML Schema Definition (XSD) into a SQL Server database schema.

This utility reads an XSD file and interprets its structure, and generates corresponding SQL Server database schema objects such as tables, columns, blob fields, memo fields and relationships like PK,FK and other constraints.

It runs on Windows and only supports MS SQL Server to create tables from an XSD.

Install XSD to database converter XSD2DB

Download the XSD2DB utility from the site: XSD2DB Utility and unzip the folder to a directory.

Open the command line interface (CLI) and traverse to the path where the utility xsd2db.exe has been extracted.

To check the available options for all input parameters for this utility ,execute the help (-h) option on CLI.

Let’s dive into the steps for using this utility with some sample XSD files. We’ll use the Carbon Border Adjustment Mechanism (CBAM) QReport as an example. The file QReport_18.20.xsd contains the XML schema definition related to imports under the CBAM policy. QReport_18.20.xsd includes complex types and references another XSD file, stypes_18.2.xsd, which provides additional types and definitions necessary for the QReport.

Create database from XSD

Step 1: Download CBAM QReport: CBAM QReport. Unzip the folder and copy the files QReport_18.20.xsd and stypes_18.20.xsd to any desired location.

Step 2: On the command line interface traverse to the path where the XSD2DB.exe file is located. Execute the following command

 -force (-f) : Drop the database if it exists and recreates new database

-location (-l) : The Microsoft SQL Server Host name in this example

-name (-n) : The name of the database to be created eg : xsd2db

-schema (-s) : The absolute path to the root XSD schema definition file

eg: <<.\.>>\QReport_18.20.xsd

-type (-t) : The type of database to connect to [Jet | Sql | OleDb]

eg : Sql is used for Microsoft SQL server

Step 3: While executing the utility we experienced errors like “n..5 not declared”. In QReport_18.20.xsd is the root XSD and includes Stypes_18.20.xsd. XSD2DB is not able to figure out the relative path to the include file. The database schema generation failed.

Error

Include section

Step 4: To resolve this issue, we specified the absolute paths for references to other XSD files, such as the one in the include section for stypes_18.2.xsd. We modified the include section in QReport_18.20.xsd to address this problem.

Step 5: Another issue with this utility is that it fails and throws an exception if any element or attribute declarations are missing. For example, in the error message “an..4000 is not declared,” the type “an..4000” is referenced in QReport_18.20.xsd, but the definition for this element type is missing in the referenced file stypes_18.20.xsd

Step 6: To resolve this issue we manually added the type definitions in stypes_18.20.xsd.

XSD to ER diagram

In the next step we created an ER diagram from the SQL tables that XSD2DB generated.

Step 1: Connect to Microsoft SQL Server and open Object Explorer to check out the database created. You’ll see tables with primary keys that match the XML Schema Definition. The tool also sets up relationships between the tables using foreign keys. In total, 42 CBAM Qreport tables were created. The screenshot below shows some of these tables, along with their primary and foreign keys.

Step 2: Right-click on the database xsd2db in Object Explorer, then go to Tasks -> Generate Scripts. This will open a wizard where you can select the objects you want to include. Save the DDL file to your desired location. This script contains the CBAM Qreport DDL.CBAM XSD DDL.

Step 3: You can view the database’s physical model using DBeaver. Just download DBeaver, create a new database connection, and connect it to Microsoft SQL Server.

Step 4: Once successfully connected we can view the xsd2db database objects on DBeaver. Expand the schema and tables to view the objects.

Step 5: Select the tables in DBeaver, right-click, and choose to create a new ER diagram. Check out the CBAM XSD ER Diagram at the provided link. CBAM XSD ER

Limitations of XSD2DB

In addition to the challenges we faced with absolute paths and the inability to process the XSD when non-fatal errors occur, the most significant limitation is the utility’s difficulty in handling complex XSD files.

We tried to convert the FpML 5.12 XSDs to a database schema. XSD2DB failed this task. Using Flexter’s advanced feature of creating a database schema from both XSD and XML on the other hand succeeded.

As you can see the schema is rather complex

Here is a summary of some of the limitations of XSD2DB

  • XSD2DB only supports MS SQL Server to generate tables from an XSD.
  • This XSD to database schema tool does not provide any optimisations such as Flexter’s Elevate or Reuse optimisation. These optimisation algorithms make sure that your database schema is as neat and simple as possible. While XSD2DB created 42 tables, Flexter’s smart optimisations reduced this number to a handful.
  • It does not provide a feature to create a database schema from a combination of XML and XSD or just from XML.
  • It does not work with JSON.
  • It does not give you a mapping document that maps the elements in the XSD to the columns in the SQL tables.
  • When a large XSD schema is split into multiple files but belongs to the same namespace, those files are often referenced in the main XSD file. If an absolute path to these referenced XSD files is not provided, it can lead to issues during processing.
  • If an attribute definition is missing in the XSD, it will trigger an error and prevent the creation of database objects. The utility cannot assign default attribute types in such cases, causing the process to halt.

In conclusion, XSD2DB works for very simple scenarios but fails at creating a database schema from more complex XSDs.

Flexter data converter

Make it easy with Flexter

Getting from XMLs or XSDs to a database schema is just one piece of the whole XML conversion puzzle. You start by analysing the XML and any XSD you’ve got, then move on to mapping those XML and XSD elements to table columns before looking at the actual data conversion, testing, error logging, and documentation.

Doing all this by hand? Yeah, it’s a slog. But here’s a little secret: Flexter can be your shortcut.

Flexter can take your XSD, XML, or both, and do all the heavy lifting for you. It not only builds the relational database schema but also converts the data and even creates some handy docs like ER diagrams and data lineage maps showing where everything came from and where it’s headed. Talk about a time saver!

Why not give the free online version of our XML converter Flexter a try? See for yourself what XML conversion software can do for you. Alternatively, talk directly to one of our XML conversion experts to discuss your use case in detail.

Further reading

The Concept of Transformation of XML Documents into Quasi-Relational Model. Chapter in this book Beyond Databases, Architectures and Structures

XSD to Database schema optimisation algorithms

Re-use algorithms for converting XML and JSON to a relational format

Optimisation algorithms for converting XML and JSON to a relational format

Complete Oracle XSD Guide – Register XSD & Validate XML

Frequently Asked Questions

Can you always create a database schema from an XSD?

It is not always possible to create a database schema directly from an XSD. XSD offers a lot of flexibility in schema design, which can not always be translated into a relational database schema.

Here are some examples

<any> and <anyAttribute> elements

The XSD <any> and <anyAttribute> elements allow XML documents to include elements or attributes that are not explicitly defined in the schema. This level of flexibility is difficult to replicate in a database schema because relational databases rely on predefined columns and data types. Handling arbitrary elements or attributes may require dynamic schema modifications or generic storage (e.g., key-value pairs), which can be inefficient and lead to data inconsistency.

Substitution Groups

In an XSD with Substitution Groups, different elements that share the same base type can appear in the same place in the XML document, depending on the actual content. This is similar to polymorphism in object-oriented programming.

Relational databases are typically not designed to handle polymorphic structures easily. Representing a substitution group in a database might require the use of a single table with many nullable columns (to accommodate different possible elements), or multiple tables with complex joins and foreign key relationships. This can lead to a database schema that is difficult to manage and query.

What methods are available to convert XSD to a database schema?

For simple scenarios you can typically convert the XSD to a database schema without providing XML. For more complex XSD schemas we advise to take a combination of XSD and XML to create SQL tables in a database schema. This has two advantages:

  • It works even for scenarios where the XSD uses flexible design patterns such as <any> or <anyAttribute> elements or Substitution Groups.
  • Typically a large part of an XSD is not used. FpML is an industry data standard used in Finance and it covers dozens of business processes. An organisation typically only implements a small subset of these processes. Combining the information from the XSD with the information from the actual XML can significantly simplify the schema and make it much easier for downstream consumers to work with it.

What tools are available to convert an XSD to a database schema?

In this post we have shown you how Flexter generates a database schema from an XSD, an XSD + XML, and XML. It works for both simple and complex scenarios. It optimises and simplifies the database schema using Elevate and Reuse algorithms. You can use it to generate a schema for any SQL database, data lake formats such as Avro, Parquet, Delta Tables, Iceberg, and data warehouse platforms such as Redshift, Snowflake, BigQuery, Databricks. It even works for situations where you can’t use an XSD only.

For simple requirements you can use the open source tool XSD2DB to convert the XSD. It does not offer any of the advanced features of Flexter.

Maciek

About the author:

Maciek

Co-founder of Sonra

Maciek is the Co-founder of Sonra. He has a knack for turning messy semi-structured formats like XML, JSON, and XSD into readable data. With a brain wired for product and data architecture, Maciek is the magic ingredient to making sure your systems don’t just work—they shine.