Big Data News: HyperLogLog with Spark and Open Source GZinga Compression

Uli Bethke Big Data, Spark, Technology

Exploring the performance enhancements of HyperLogLog on Spark and adding splittable and seekable features to Gzip in a new open source project called GZinga. Life is never dull in big data, and as I left a great Spark Dublin meetup last night pondering the distributed performance enhancements of using DataFrames in Spark, I was once again struck by the continuous ...

Two Use Cases of Data Discovery Tools

Uli Bethke Big Data, Business Intelligence, Data Discovery, Hadoop

Data discovery tools are a relatively recent phenomenon, as can be seen from the fact that there is no separate Gartner Magic Quadrant for them. Data discovery tools allow users without programming skills to wrangle and transform raw data. There are two main use cases for this type of tool: (1) Self-service Business Intelligence. Self-service BI promises business users without ...

Big Data News - LinkedIn’s Ops team raises its Hadoop game with “Rewinder”

Uli Bethke Business Intelligence, Distributed Computing, Technology

LinkedIn’s new “Rewinder” tool outshines Apache Resource Manager and Job History Server on its Hadoop clusters… SIREn Solutions announces Kibi. Another week has passed in which the team at Sonra have been impressed with developments in our big data community. LinkedIn is a big data company, which also happens to do other stuff that pays the bills and has faced the ...

Big Data News - Predict Conference 2015… A Premier Big Data event!

Uli Bethke Big Data, Cloud, Data Science, Technology

From Predict Conference 2015 to Data Mining and Innovation at Twitter. There is no doubt that before I went to represent Sonra at Predict Conference 2015, which was organised by Creme Global, I was a little dubious about the efficacy of the conference along with its outcomes for both Sonra and HUG Ireland. Would it allow us to effectively deliver ...

Big Data News – MapR’s Deep Dive into Apache Drill and Gartner’s Big Data Predictions for 2020

Uli Bethke Big Data, Hadoop, Spark, SQL for Analysis

30% of all Enterprises will use intermediaries for big data by 2017. Another week has passed and our big data community has been busy around the world. Some interesting movements in our industry have arisen, with Doug Laney on Forbes making three predictions on the advancement of big data to 2020, including the rise of third-party big data contractors helping ...