SQL on Hadoop, BigQuery, or Exadata. Please don’t call them MPP.

Uli Bethke May 10, 2018

I often hear people refer to SQL engines running against HDFS or object storage as MPP. Strictly speaking, this is incorrect. Let me first explain what an MPP database is and then explain why engines such as Presto should not be called MPP engines. MPP: In an MPP engine we evenly distribute data ...

You might be surprised to hear, but Hadoop is a poor choice for a data lake

Uli Bethke May 3, 2018

As there are many definitions of what constitutes a data lake, let’s first define what we actually mean by it. A data lake is similar to the staging area of a data warehouse, with a couple of core differences. Let’s look at the commonalities first. Just like the staging area of a data ...

A brief history of XML – From hype to useful data format

Vadim Mytarev October 18, 2016

Is XML really dead? When it first became popular about 20 years ago, XML was meant to be the one and only format to serialize, encapsulate, and exchange data: the serialization format to end all serialization formats, so to speak. This was a bold claim. Has it materialised? Over the last couple of years it ...

Big Data News: HUG Ireland’s 1st 2016 Big Data Event, Airbnb’s Predictive Model using NPS and Hive Optimization

Uli Bethke January 15, 2016

Hadoop User Group (HUG) Ireland packed the house with a great evening on Apache Mesos/Myriad and an overview of Airbnb’s Predictive Model. After a restful holiday season, the new year kicked off in style for Hadoop User Group (HUG) Ireland with its opening 2016 event at Synchronoss on January 11th. We heard from Mary Mangru, ...
