Big Data News: HyperLogLog with Spark and Open Source GZinga Compression

Uli Bethke Big Data, Spark, Technology

Exploring the performance enhancements of HyperLogLog on Spark and adding splittable and seekable features to Gzip in a new open source project called GZinga.

Life is never dull in big data, and as I left a great Spark Dublin meetup last night, pondering the distributed performance enhancements of using DataFrames in Spark, I was once again struck by the continuous ...

Big Data News - Predict Conference 2015… A Premier Big Data event!

Uli Bethke Big Data, Cloud, Data Science, Technology

From Predict Conference 2015 to Data Mining and Innovation at Twitter.

There is no doubt that before I went to represent Sonra at Predict Conference 2015, which was organised by Creme Global, I was a little dubious about the value of the conference and its outcomes for both Sonra and HUG Ireland. Would it allow us to effectively deliver ...