Take the pain out of XML processing on Spark.

Maciek Kocon Big Data, Spark, XML

Note: We have written an updated version of this post that shows XML conversion on Spark to Parquet with code samples. Did you ever have to process XML files? Complex and large ones? Lots of them? No matter which processing framework or programming language you use, it is always a pain. It is never easy. It can be sure that it ...

Big Data News: Convergence with MapR and Faster Stateful Streaming Processes with Spark

Uli Bethke Big Data, DFS, HUG Ireland, MapR, Spark

MapR on impedance mismatch and how convergence is achieved for layered architecture, along with Databricks on using the new Spark API “mapWithState” for faster stateful Spark Streaming. As our big data world comes to the end of another week, the team at Sonra have once again been impressed by the week's highlights in big data. MapR has shared its insights ...

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands-on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a not-for-profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.

Big Data News: Apps Development with “JSON” and an Open Source “Spark” Library for Geospatial Analysis

Uli Bethke Business Intelligence, Data Discovery, MapR, Spark

Faster big data app development with the open source JSON interface “OJAI”, and how the Spark library “Magellan” will come to the rescue in geospatial analysis. As the week moves closer to an end, the team at Sonra have been impressed with the developments reviewed, which reflect positively on the direction our community is headed in. The increasing use of ...

Big Data News: HyperLogLog with Spark and Open Source GZinga Compression

Uli Bethke Big Data, Spark, Technology

Exploring the performance enhancements of HyperLogLog on Spark and adding splittable and seekable features to Gzip in a new open source project called GZinga. Life is never dull in big data, and as I left a great Spark Dublin meetup last night pondering the distributed performance enhancements of using DataFrames in Spark, I was once again struck by the continuous ...

Big Data News – MapR’s Deep Dive into Apache Drill and Gartner’s Big Data Predictions for 2020

Uli Bethke Big Data, Hadoop, Spark, SQL for Analysis

30% of all enterprises will use intermediaries for big data by 2017. Another week has passed and our big data community has been busy around the world. Some interesting movements in our industry have arisen, with Doug Laney on Forbes making three predictions on the advancement of big data to 2020, including the rise of third-party big data contractors helping ...
