Self-Service Analytics in the Data Lake

Uli Bethke Big Data, Business Intelligence, Data Warehouse, Hadoop, MapR, Spark

Deriving Value from your data... We all know that decision makers and analysts need quick access to all of the structured and unstructured data in an enterprise. Typically this data is locked away in various source systems and can’t be queried from a single central location The data warehouse set out to fix this problem ages ago and it has ...

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a non for profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.

Window Functions (aka Analytic Functions) in Spark.

Uli Bethke analytic functions, Big Data, MapR, Spark, SparkSQL, SQL

As of Spark 1.4.0 we now have support for window functions (aka analytic functions) in SparkSQL. At Sonra we are heavy users of SparkSQL to handle data transformations for structured data. We also use it in combination with cached RDDs and Tableau for business intelligence and visual analytics. Spark SQL and Window Functions: The rationale I am a strong supporter ...

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a non for profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.

Hadoop User Group Ireland Meetup (24 June 2015): An Introduction to Spark

Uli Bethke Big Data, Hadoop, HUG Ireland, Spark, SparkSQL, SQL, Tableau, Uncategorized

Thanks again to everyone who attended the third Hadoop User Group Ireland meetup. Also thanks to Bank of Ireland Grand Canal Square for making the venue available. Participants in the event can send feedback to their Twitter and Facebook accounts: facebook.com/BOIGrandCanalSquare twitter.com/BOIGrandCanalSQ. Also thanks to Étienne from Idiro and Antonio from HP for their great presentations. We have all of ...

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a non for profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.

Multiple Spark Worker Instances on a single Node. Why more of less is more than less.

Uli Bethke Big Data, Hadoop, Spark

If you are running Spark in standalone mode on memory rich nodes it can be beneficial to have multiple worker instances on the same node as a very large heap size has two disadvantages: - Garbage collector pauses can hurt throughput of Spark jobs. - Heap size of >32 GB can't use CompressedOoops. So 35 GB is actually less than ...

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a non for profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.

In-memory analytics with Tableau, SparkSQL, and MapR

Uli Bethke Big Data, Hadoop, Hive, MapR, Spark, SparkSQL, Tableau

Last week Tableau released version 9.0 of their data visualisation tool. From a Big Data point of view the nicest new feature was support for querying cached (in-memory) SchemaRDDs (Data Frames as of Spark 1.3). In this tutorial I will show you how to connect to Spark 1.2.1 on the MapR 4.1 sandbox with Tableau 9.0. Pre-requisites: - Download the ...

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a non for profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.