Exploring the performance enhancements of HyperLogLog on Spark and adding splittable and seekable features to Gzip in a new open source project called GZinga Life is never dull in big data and as I left a great Spark Dublin meetup last night pondering the distributed performance enhancements of using dataframes in Spark, I was once again struck by the continuous ...
About the author
Uli has 18 years’ hands on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.
Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a non for profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.