Big Data News: App Development with JSON and an Open Source Spark Library for Geospatial Analysis

Uli Bethke Business Intelligence, Data Discovery, MapR, Spark

Faster Big Data App Development with the Open Source JSON API “OJAI”, and How the Spark Library “Magellan” Comes to the Rescue in Geospatial Analysis


As the week draws to a close, the team at Sonra has been impressed with the developments we reviewed, which reflect positively on the direction our community is heading. The increasing use of JSON in our community has received a boost from MapR’s development of OJAI for use with storage solutions like MapR-DB.

MapR aims to speed up application development with OJAI, the Open JSON Application Interface, which supports distributed, scalable processing of JSON documents on Hadoop-like clusters. OJAI (a Native American term for “moon”) is designed as a general-purpose API for processing JSON documents across Hadoop systems and frameworks. OJAI also has a table back end that allows CRUD operations to be performed through the API. Each document is stored as a row in a table, with a unique key that identifies the record; the key can be taken from a field of the document or assigned by an administrator. In short, it is a flexible API for working with JSON in today’s distributed computing environments.
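To make the document-per-row idea concrete, here is a minimal sketch in plain Python. It is not the OJAI API: the `DocumentTable` class, its method names and the `key_field` parameter are all hypothetical, and the "table" is just an in-memory dictionary. It only illustrates CRUD on JSON documents where the row key either comes from a field or is assigned on insert.

```python
import json
import uuid


class DocumentTable:
    """Hypothetical in-memory stand-in for a JSON document table (not the OJAI API)."""

    def __init__(self, key_field=None):
        # key_field: name of a document field to use as the row key.
        # If None, a key is generated for each new document.
        self.key_field = key_field
        self.rows = {}

    def create(self, doc):
        key = str(doc[self.key_field]) if self.key_field else uuid.uuid4().hex
        if key in self.rows:
            raise KeyError(f"duplicate key: {key}")
        # Store a serialized copy so later changes to `doc` do not leak in.
        self.rows[key] = json.dumps(doc)
        return key

    def read(self, key):
        return json.loads(self.rows[key])

    def update(self, key, changes):
        # Merge top-level fields from `changes` into the stored document.
        doc = self.read(key)
        doc.update(changes)
        self.rows[key] = json.dumps(doc)

    def delete(self, key):
        del self.rows[key]


# Usage: the row key is taken from the document's "id" field.
table = DocumentTable(key_field="id")
key = table.create({"id": "u1", "name": "Ada", "tags": ["spark"]})
table.update(key, {"name": "Ada L."})
doc = table.read(key)
table.delete(key)
```

A real document store would add persistence, distribution and nested-field updates; the sketch only shows the shape of the key-plus-document model described above.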

Another development that impressed us builds on Apache Spark. Magellan is an open source library, hosted on GitHub, designed for geospatial analysis on top of Spark 1.4 or above. It covers some critical functionality in geospatial analysis, such as transformations between coordinate systems (e.g. State Plane (NAD83) and standard GPS (WGS84)), by providing a transformation interface so that large datasets can be combined and output in a single, standard coordinate format. Magellan currently has bindings only for Scala and Python, with operations such as Intersection, Union and Distance. Examples of the geometry types it supports are Polygon, LineString and MultiLineString. Anybody doing geospatial analysis on Spark 1.4+ in Scala or Python should check out Magellan, as it is likely to prove very useful for this kind of work.
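For readers new to this kind of analysis, the following plain-Python sketch shows what two such operations compute on a single pair of inputs. It does not use Magellan or Spark: the function names `haversine_km` and `point_in_polygon` are our own, the Earth radius is the usual mean-radius approximation, and the coordinates are illustrative. A library like Magellan applies operations of this kind across distributed datasets.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two WGS84 lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon given as (x, y) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count crossings to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside


# Usage with illustrative values: approximate Dublin and Cork city centres.
dublin_cork_km = haversine_km(-6.2603, 53.3498, -8.4756, 51.8985)

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
inside = point_in_polygon(2, 2, square)
outside = point_in_polygon(5, 2, square)
```

The point-in-polygon test treats coordinates as planar, which is a reasonable simplification for small areas but not for polygons spanning large distances.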

Who knows what innovations next week will bring, letting us all dream of a better tomorrow, released one day at a time. Have a great weekend, all!

About Sonra

We are a Big Data company based in Ireland. We are experts in data lake implementations, clickstream analytics, real-time analytics, and data warehousing on Hadoop and Spark. We also run the Hadoop User Group (HUG) Ireland. We can help with your Big Data implementation. Get in touch today; we would love to hear from you!

About the author

Uli has 18 years’ hands-on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a not-for-profit global data management organization. He also co-founded the Irish Oracle Big Data User Group.