Converting 3GPP Configuration Management XML (3G, UMTS, 5G) to Snowflake

Uli Bethke Flexter, XML

XML in the telecom industry

XML standards have been widely adopted in the telecom industry. The standards themselves and their specifications are developed by 3GPP. The 3rd Generation Partnership Project (3GPP) unites seven telecommunications standard development organizations (ARIB, ATIS, CCSA, ETSI, TSDSI, TTA, TTC). It provides its members with a stable environment to produce the specifications that define 3GPP technologies.

Both Configuration Management and performance/event data have been specified as XML in various standards. Configuration data is typically very complex and comes in small volumes. However, the files themselves are quite large and frequently exceed 1 GB. Performance data is less complex but comes in large volumes. The schemas (XSD) for telco data are typically quite large and complex and cover standards such as UMTS, 4G, 5G etc. Only a subset of the elements in these huge schemas is used by the individual XML files. The XML documents themselves will contain many recursive relationships and self-referencing elements, which adds more complexity.
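To get a feel for which subset of a schema a given file actually uses, it can help to stream through the document and count element names rather than load the whole file into memory. The following is a minimal sketch using Python's standard library; the sample document and its element names are made up for illustration and are not taken from a 3GPP file.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_element_names(source):
    """Stream through an XML document and count how often each element name occurs."""
    counts = Counter()
    # iterparse yields each element once its end tag has been read
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        # Drop the element's text, attributes and children once it is counted,
        # which keeps memory use far lower than holding the full tree
        elem.clear()
    return counts


# Tiny stand-in document; element names are made up for illustration
sample = io.BytesIO(
    b"<root><cell id='1'><param>a</param></cell>"
    b"<cell id='2'><param>b</param><param>c</param></cell></root>"
)
counts = count_element_names(sample)
print(counts.most_common())
```

For a real file you would pass the file path instead of the in-memory sample.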

Schemas change frequently when new versions are released and can be customised and modified by regions or individual telco vendors such as Ericsson or Huawei. For each release you will find hundreds of changes, which may require a significant amount of refactoring. Some of these XSD schemas have hundreds of complex types with thousands of XML elements.

Flexter, our data warehouse automation tool for XML, JSON, and industry data standards, is a perfect fit for these requirements. Flexter loves complexity. You have large volumes of data? Even better.

3GPP Configuration Management XML

The 3GPP standards and specifications come with XML file format definitions. These are documented extensively, e.g. the specification for UMTS configuration management.

Configuration Management (CM) provides the telco operator with the ability to assure correct and effective operation of the network as it evolves. CM actions have the objective to control and monitor the actual configuration on the Network Elements (NEs) and Network Resources (NRs), and they may be initiated by the operator or by functions in the Operations Systems (OSs) or NEs.

Let’s have a high-level look at the XSD for configuration management.

As you can see, the XSD is split into three parts: 1) Header 2) Data 3) Footer.
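The three-part layout can be made concrete with a small sketch. The element and attribute names below (bulkCmConfigDataFile, fileHeader, configData, fileFooter) follow our reading of the bulk CM file format and should be checked against the XSD you work with; the namespace URI and all attribute values are placeholders.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI; a real file declares its own
NS = "urn:example:configData"

sample = f"""<bulkCmConfigDataFile xmlns="{NS}">
  <fileHeader fileFormatVersion="x" vendorName="ExampleVendor"/>
  <configData dnPrefix="DC=example"/>
  <configData dnPrefix="DC=example2"/>
  <fileFooter dateTime="2020-01-01T00:00:00Z"/>
</bulkCmConfigDataFile>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(sample)
# Top-level children in document order: header, one or more data blocks, footer
sections = [local_name(child.tag) for child in root]
print(sections)
```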

The following XML namespaces are potentially used in configuration data XML files:

  • the default XML namespace is associated with the configuration data files base XML schema bulkCmConfigDataFile.xsd
  • the XML namespace prefix xn is defined for the XML namespace associated with the NRM specific XML schema genericNrm.xsd for the Generic Network Resources IRP NRM
  • the XML namespace prefix un is defined for the XML namespace associated with the NRM specific XML schema utranNrm.xsd for the UTRAN Network Resources IRP NRM
  • the XML namespace prefix gn is defined for the XML namespace associated with the NRM specific XML schema geranNrm.xsd for the GERAN Network Resources IRP NRM
  • XML namespace prefixes starting with vs, e.g. vsRH011, are reserved for the XML namespaces associated with the vendor-specific XML schemas
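To see which of these namespaces a particular file declares, the declarations can be collected while streaming through the document. This is a minimal sketch with Python's standard library; the namespace URIs in the sample are placeholders, since a real file carries the URIs defined by the relevant 3GPP and vendor schemas.

```python
import io
import xml.etree.ElementTree as ET


def declared_namespaces(source):
    """Return {prefix: uri} for every namespace declaration found in the document."""
    namespaces = {}
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        # The default namespace is reported with an empty prefix
        namespaces[prefix or "(default)"] = uri
    return namespaces


# Stand-in document; all URIs are placeholders
sample = io.BytesIO(
    b'<bulkCmConfigDataFile xmlns="urn:example:configData"'
    b' xmlns:xn="urn:example:genericNrm"'
    b' xmlns:un="urn:example:utranNrm"'
    b' xmlns:vsRH011="urn:example:vendorSpecific">'
    b"<configData><xn:SubNetwork id='1'/></configData>"
    b"</bulkCmConfigDataFile>"
)
namespaces = declared_namespaces(sample)
# Prefixes starting with "vs" are reserved for vendor-specific schemas
vendor_prefixes = sorted(p for p in namespaces if p.startswith("vs"))
print(namespaces)
print(vendor_prefixes)
```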

 

Masking 3GPP Configuration Management XML

For the purpose of this blog post we will use a single Configuration Management XML file. It comes in at ~500 MB uncompressed. You will see later in the relational ER diagram that it is quite complex.

We will start by masking this XML file with Paranoid. To mask data we have to provide a path to our file and a path to an output location. Paranoid creates the output folder automatically, so there is no need to create it first.

Optionally, Paranoid can mask only individual elements inside a document.
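To illustrate the general idea of masking, here is a minimal sketch in Python. It is not Paranoid's implementation or its command line; it only shows one possible approach, replacing text and attribute values with a short deterministic hash while leaving the document structure intact. The function names, the only_tags parameter and the sample element names are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def mask_value(value, length=8):
    """Replace a value with a short, deterministic hash so equal inputs stay equal."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:length]


def mask_xml(xml_text, only_tags=None):
    """Mask element text and attribute values; restrict to only_tags when given."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if only_tags is not None and elem.tag not in only_tags:
            continue
        if elem.text and elem.text.strip():
            elem.text = mask_value(elem.text.strip())
        # Copy the items first so the attribute dict is not changed while iterating
        for name, value in list(elem.attrib.items()):
            elem.set(name, mask_value(value))
    return ET.tostring(root, encoding="unicode")


# Made-up sample; element names are for illustration only
sample = "<cell id='42'><userLabel>Dublin-North</userLabel><power>20</power></cell>"
masked_all = mask_xml(sample)
masked_some = mask_xml(sample, only_tags={"userLabel"})
print(masked_all)
print(masked_some)
```

Because the hash is deterministic, the same input value always maps to the same masked value, so values that matched before masking still match afterwards.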

Let’s check the file after obfuscation.

After masking the 3GPP Telco Standard XML file, we can start converting it.

Converting 3GPP Configuration XML

Flexter exposes its functionality through a RESTful API. Converting XML/JSON to Snowflake can be done in a few simple steps. For more details please refer to the FaaS API documentation.

Step 1 - Authenticate

Step 2 - Define Source Connection (Upload or S3) for Source Data (JSON/XML)

Step 3 - Optionally define Source Connection (Upload or S3) for Source Schema (XSD)

Step 4 - Define your Target Connection, e.g. Snowflake, Redshift, SQL Server, Oracle etc.

Step 5 - Convert your XML/JSON from Source to Target Connection

Let’s go through these steps for 3GPP XML data and convert it to a relational format in Snowflake.

Step 1 - Authenticate

To get an access_token you need to make a call to /oauth/token with an Authorization header and three form parameters:

  • username=YOUR_EMAIL
  • password=YOUR_PASSWORD
  • grant_type=password

You will get your username and password from Sonra when you sign up for the service.
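The call described above could be assembled as follows. This is a sketch under assumptions: the host name is hypothetical, the text does not say which Authorization scheme is used (Basic is assumed here), and the response is assumed to be JSON with an access_token field. Only the /oauth/token path and the three form parameters come from the description above. The request is built but not sent.

```python
import urllib.parse
import urllib.request

# Hypothetical values: replace with the host and credentials you receive from Sonra
BASE_URL = "https://flexter.example.com"
AUTH_HEADER = "Basic <your-client-credentials>"  # scheme is an assumption


def build_token_request(username, password):
    """Build (but do not send) the POST request for /oauth/token."""
    form = urllib.parse.urlencode(
        {"username": username, "password": password, "grant_type": "password"}
    ).encode("ascii")
    return urllib.request.Request(
        BASE_URL + "/oauth/token",
        data=form,
        headers={"Authorization": AUTH_HEADER},
        method="POST",
    )


request = build_token_request("YOUR_EMAIL", "YOUR_PASSWORD")
print(request.full_url)
print(request.data.decode("ascii"))

# Sending it would look roughly like this (assumes a JSON response):
# import json
# with urllib.request.urlopen(request) as response:
#     access_token = json.load(response)["access_token"]
```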

Example of output

Step 2 - Define Source Connection (Upload) for Source Data (3GPP CM XML)

In a second step we upload the 3GPP Telco Standard XML source data.

Example of output

Step 4 - Define Target Connection (Snowflake)

As we don’t have a Source Schema (XSD), we skip the optional Step 3.

Instead we define our Target connection. In this example we convert our XML data to a relational format in Snowflake.

We give the Target Connection a name and supply various connection parameters to the Snowflake database.

Example of output

Step 5 - Convert XML data from Source Connection (Upload) to Target Connection (Snowflake)

In the next step we will convert our XML data. Data will be written directly to the Snowflake Target Connection.

Example of output

Example of ER Diagram

There are many hierarchical levels in the relational output. All the little dots you see in the ER diagram are tables. We can also see hundreds or even thousands of different data points. You can download the ER diagram from this location.

Let’s run some SQL queries against the output.

The query shows the obfuscated values for our Configuration Management data set.
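As an illustration of how such a query works against hierarchical XML that has been flattened into parent and child tables, here is a small self-contained sketch. It uses an in-memory SQLite database as a stand-in for Snowflake. The table names, column names and masked-looking values are all hypothetical and are not the actual Flexter output.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake; schema and data are hypothetical
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sub_network (pk INTEGER PRIMARY KEY, id TEXT);
    CREATE TABLE managed_element (
        pk INTEGER PRIMARY KEY,
        fk_sub_network INTEGER REFERENCES sub_network (pk),
        id TEXT,
        user_label TEXT
    );
    INSERT INTO sub_network VALUES (1, 'a1f3');
    INSERT INTO managed_element VALUES (10, 1, '9c2e', '5b7d0e11');
    INSERT INTO managed_element VALUES (11, 1, '44ab', 'e0c9a2f4');
""")

# Join each child row back to its parent through the foreign key
rows = conn.execute("""
    SELECT s.id AS sub_network_id, m.id AS element_id, m.user_label
    FROM managed_element AS m
    JOIN sub_network AS s ON s.pk = m.fk_sub_network
    ORDER BY m.pk
""").fetchall()
for row in rows:
    print(row)
```

The same pattern, a child table joined to its parent on a foreign key, extends to deeper levels of the hierarchy.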

Conclusion

We have shown you how easy it is to obfuscate and convert 3GPP Telco Standard XML with Paranoid and Flexter. If you are interested in Flexter you can try out the free version online.

Our enterprise edition can be installed on a single node or, for very large volumes of XML, on a cluster of servers.

If you have any questions please refer to the Flexter FAQ section. You can also request a demo of Flexter or reach out to us directly.

 

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands-on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books and chairs the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a not-for-profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.