Converting XML and JSON to a Data Lake

Uli Bethke September 30, 2021

Data lakes are a popular design pattern in data analytics. A data lake is used to store a copy of data coming from operational source systems such as relational databases. You can choose from dozens of tools to populate a data lake from relational and structured data sources. However, it gets tricky when you want ...

Converting SDMX XML to BigQuery

Uli Bethke June 22, 2021

You don’t have many options for processing XML on BigQuery. If your data is in Avro, JSON, Parquet, etc., you can load it easily into BigQuery. But what about XML? BigQuery does not provide any native support for XML documents. Converting XML to BigQuery is a manual, time-consuming and error-prone ...

Snowflake vs. Redshift – Support for Handling JSON

Uli Bethke June 22, 2021

ANSI SQL 2016 introduced support for querying JSON data directly from SQL. This is a common use case nowadays. JSON is everywhere: in web-based applications, IoT, NoSQL databases, and API responses. In this document we compare the features Amazon Redshift and Snowflake offer for handling JSON documents. Loading JSON data In this section, we will ...

Using Virtual Data Marts the right way

Uli Bethke May 26, 2021

Virtual data marts can be a useful design pattern but there are a few things you should know before you use them. Virtual data marts are logical views modeled dimensionally on top of an integration layer or a persisted staging area. Don’t confuse them with data virtualisation or data federation. Virtual data marts are built ...
