Analysing the Irish Jobs Market with Dataiku DSS and import.io

Uli Bethke Dataiku, import.io

Getting value from your data is not a straightforward process. One of the secrets is to have a platform in place that allows you to quickly prototype data applications. In this two-part series we perform some data discovery on the Irish jobs market. We will use the popular import.io service to extract a relevant dataset from the web, then use the Dataiku DSS import.io plugin to ingest the data and DSS functionality to generate some insights.

import.io is an online platform for easily extracting data from websites. With the Magic extractor it is as easy as 1, 2, 3 to generate a virtual API on top of the required data set and then iterate over it. The data set we are interested in can be found here. We pass this URL into the import.io Magic API, run the query, and get the data back in a nice table.

Next we generate the virtual API.

Querying the API returns a JSON object. In the next step we use the import.io plugin in DSS to ingest the data, pasting the URL of the virtual API into the API URL field.
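To give a feel for what working with such a JSON object looks like outside DSS, here is a minimal Python sketch. The payload, the `results` key and the field names are assumptions made for illustration; the post does not show the actual response of the virtual API.

```python
import json

# Illustrative payload only: the "results" key and the field names
# are assumptions, not the documented import.io response format.
sample_response = '{"results": [{"title": "Data Engineer", "location": "Dublin"}]}'


def rows_from_response(text, key="results"):
    """Parse a JSON document and return the list of rows stored under `key`."""
    payload = json.loads(text)
    # Fall back to an empty list if the key is missing.
    return payload.get(key, [])


for row in rows_from_response(sample_response):
    print(row["title"], "-", row["location"])
```

In DSS the plugin takes care of this step, so the sketch is only meant to show the shape of the data being ingested.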

In order to page/iterate over the API we need to create a DSS recipe that passes the irishjobs.ie source URLs into the virtual API. A recipe always requires an input dataset.
We use an Oracle database as the source to generate this dataset, using a CONNECT BY clause and a little trick.
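The post does not spell out the trick, but a common Oracle idiom for generating a sequence of rows is `SELECT LEVEL FROM dual CONNECT BY LEVEL <= n`. The Python sketch below builds the same kind of input dataset, one URL per result page. The base URL and the `page` parameter are hypothetical placeholders, since the real irishjobs.ie paging scheme is not given here.

```python
from urllib.parse import urlencode

# Hypothetical base URL and paging parameter, used for illustration only.
BASE_URL = "https://www.example.com/jobs"
PAGE_PARAM = "page"


def page_urls(base_url, last_page, param=PAGE_PARAM):
    """Return one URL per result page, numbered 1..last_page."""
    return [
        f"{base_url}?{urlencode({param: n})}"
        for n in range(1, last_page + 1)
    ]


for url in page_urls(BASE_URL, 3):
    print(url)
```

Each generated URL corresponds to one row of the input dataset that the recipe passes to the virtual API.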

In the next step we just run the recipe and explore the output dataset.
Using the Columns Quick View we get a nice overview of the distribution of the various columns of our dataset.

We can explore the data in more detail, e.g. which agency has the most jobs advertised.
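DSS produces this kind of breakdown through its UI. For readers who want to see the underlying idea, a rough Python equivalent is a simple count per agency. The rows and the `agency` column name below are made up for illustration and are not taken from the actual dataset.

```python
from collections import Counter

# Made-up rows for illustration; the real dataset and its column
# names are not shown in this post.
sample_rows = [
    {"agency": "Agency A"},
    {"agency": "Agency B"},
    {"agency": "Agency A"},
    {"agency": ""},  # rows without an agency are skipped
]


def jobs_per_agency(rows, column="agency"):
    """Count job ads per agency, most frequent first."""
    counts = Counter(row[column] for row in rows if row.get(column))
    return counts.most_common()


for agency, n in jobs_per_agency(sample_rows):
    print(agency, n)
```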


Or the range of salaries.

Or the location.


It took me longer to write up this post than it took to ingest and explore the data. In the next part we will look at how easy it is to transform our data and gain more insight into which skills are most in demand at the moment and which skills correlate with each other. Until then, don't stop unleashing the value of your data.

 

About the author

Uli Bethke LinkedIn Profile

Uli has 18 years’ hands-on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a not-for-profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.