Converting Asana JSON to CSV on S3

May 15, 2019

In this article we show how to use Flexter, Sonra’s data warehouse automation solution for complex XML and JSON documents. We will process JSON that we retrieved from Asana’s API and convert the data to TSV text files on AWS S3 object storage.
Flexter is a Spark application written in Scala. For this blog post we will use the managed cloud version of Flexter. Flexter SaaS is built on top of container technology, and you can call its endpoints via a RESTful API. It currently supports Snowflake, S3, and Redshift as sinks. We will add to this list over time and will also add API calls to transform JSON and XML on the fly. This will let you generate a relational data model from any API in seconds.

Asana

Asana is a popular web and mobile project management application built to help teams manage their work. In Asana, teams can create projects and tasks, assign those tasks to people, set due dates, and add plenty of other information.
Asana is used in organizations of many sizes and across many industries. It can be adapted to any organization’s specific workflows and processes.

S3

Amazon Simple Storage Service (S3) provides object storage through a web service interface.
Amazon S3 can store any type of object, which allows for uses such as storage for Internet applications, backup and recovery, disaster recovery, data archives, data lakes for analytics, and hybrid cloud storage. In this scenario we store the text files that Flexter generates on S3. From there you can process the data further or download it.

Exporting data with Asana API

We have created a sample project for this showcase. We have tasks and their subtasks, different tags, due dates etc.

The first step in exporting the data is to find the ID of the project. We do that by going to the Asana API page and selecting Projects in the dropdown.

In the next step we select what we want to GET.

Then we submit our request and get our Project ID in the response.

When we are done with this, we change our GET request to “GET /projects/:project/tasks”.

We select all the information we need.

We enter our Project ID.

Then we submit our request, which returns a JSON document.

To save our data, we can open it in the browser by clicking on “open raw response”.

Then we save the data as a JSON file.
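
The same export can also be scripted instead of clicked through in the API explorer. The sketch below is one possible way to do it in Python, not the method used for this post. It assumes Asana's public REST base URL (https://app.asana.com/api/1.0), a personal access token sent as a Bearer header, and the opt_fields query parameter for choosing fields. The project ID, token and field names are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Asana's public REST API base URL.
ASANA_API = "https://app.asana.com/api/1.0"


def build_tasks_request(project_id, token, fields):
    """Build (but do not send) the GET /projects/:project/tasks request."""
    query = urllib.parse.urlencode({"opt_fields": ",".join(fields)})
    url = f"{ASANA_API}/projects/{project_id}/tasks?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def save_tasks(request, out_path):
    """Send the request and save the raw JSON response to a file."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)
    return len(payload.get("data", []))


# Placeholder project ID, token and field list, for illustration only.
request = build_tasks_request(
    "1234567890", "YOUR_ASANA_TOKEN", ["name", "due_on", "completed"]
)
print(request.full_url)
```

Calling save_tasks(request, "asana.json") would then write the response to disk, ready to be uploaded to Flexter.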

We can now start processing. You can download the JSON file we used for this post here.

Processing data with Flexter

Flexter exposes its functionality through a RESTful API. Converting XML/JSON to S3 can be done in a few simple steps.
Step 1 – Authenticate
Step 2 – Create a File Source
Step 3 – Generate schema (target data model)
Step 4 – Define your sink, e.g. S3
Step 5 – Process your XML/JSON data
Now we can go through the steps and process our data.

Step 1 – Authenticate

To get an access_token, make a call to /oauth/token with an Authorization header and three form parameters:

username=YOUR_EMAIL
password=YOUR_PASSWORD
grant_type=password

curl --location --request POST "https://api.sonra.io/oauth/token" \
--header "Content-Type: application/x-www-form-urlencoded" \
--header "Authorization: Basic NmdORDZ0MnRwMldmazVzSk5BWWZxdVdRZXRhdWtoYWI6ZzlROFdRYm5Ic3BWUVdCYzVtZ1ZHQ0JYWjhRS1c1dUg=" \
--data "username=XXXXXXXXX&password=XXXXXXXXX&grant_type=password"

Example of output

{
"access_token": "eyJhbG........",
"token_type": "bearer",
"refresh_token": "..........",
"expires_in": 43199,
"scope": "read write",
"jti": "9f75f5ad-ba38-4baf-843a-849918427954"
}
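
For readers who prefer a script over curl, here is a minimal Python sketch of the same token call and of turning the response into a Bearer header for the later steps. It assumes the Basic value in the curl example is a client ID and secret encoded in the usual HTTP Basic form, which this post does not spell out. The function names and the client credentials are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.sonra.io/oauth/token"  # endpoint from the curl example


def build_token_request(client_id, client_secret, email, password):
    """Build (but do not send) the POST to /oauth/token."""
    # Assumption: the Basic header is base64("client_id:client_secret").
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    form = urllib.parse.urlencode(
        {"username": email, "password": password, "grant_type": "password"}
    ).encode()
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def bearer_header(token_response):
    """Turn the JSON token response into the header used by later calls."""
    token = json.loads(token_response)
    if token.get("token_type", "").lower() != "bearer":
        raise ValueError("unexpected token type")
    return {"Authorization": f"Bearer {token['access_token']}"}


# Shortened copy of the example output above.
sample = '{"access_token": "eyJhbG........", "token_type": "bearer", "expires_in": 43199}'
print(bearer_header(sample))
```

Since the response carries expires_in, a longer-running script would need to request a fresh token once that many seconds have passed.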

Step 2 – Create a source

A source is the location and type of your source documents. Sources are referenced in the next step when we create the target data model.
Currently, we support file upload and S3 as locations for sources.
Three different types of documents are supported:

XML
XSD
JSON

The type of source determines which algorithm will be used to process the data.

curl --location --request POST "https://api.sonra.io/source/create/file" \
--header "Authorization: Bearer <access_token>" \
--form "type=json" \
--form "name=asana" \
--form "file=@<file path>"

Example of output

{
"status": "ok",
"file": "asana",
"type": "json"
}
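
The --form flags in the curl call send a multipart/form-data body. As a rough illustration of what goes over the wire, the sketch below builds the same kind of request with only the Python standard library. The function name is hypothetical, and the per-file Content-Type of application/octet-stream is an assumption; the request is built but not sent.

```python
import uuid
import urllib.request

API_BASE = "https://api.sonra.io"  # base URL taken from the curl examples
SOURCE_TYPES = {"xml", "xsd", "json"}  # the three document types listed above


def build_source_request(access_token, name, source_type, filename, content):
    """Build (but do not send) a multipart POST to /source/create/file."""
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unsupported source type: {source_type}")
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields, mirroring --form "type=..." and --form "name=...".
    for field, value in (("type", source_type), ("name", name)):
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    # The uploaded document, mirroring --form "file=@<file path>".
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode()
        + content
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{API_BASE}/source/create/file",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_source_request(
    "ACCESS_TOKEN", "asana", "json", "asana.json", b'{"data": []}'
)
print(request.get_header("Content-type"))
```

Passing the request to urllib.request.urlopen would perform the upload. Rejecting unknown types before sending avoids a round trip for a typo in the type field.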

Step 3 – Generating target data model

In this step we create our Schema (target data model).
You can generate a Schema based on:

An XML Source entry (Source with type xml)
An XML Source entry plus an XSD Source entry (Sources with types xml and xsd respectively)
JSON Source files (Source with type json)

curl --location --request POST "https://api.sonra.io/schema/create" \
--header "Authorization: Bearer <access_token>" \
--header "Content-Type: application/json" \
--data "{
\"name\": \"asana_schema\",
\"json\": \"asana\"
}"

Example of output

{
"status": "ok",
"uuid": "asana_schema"
}
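
The curl call above has to escape every quote in the JSON body by hand. If you script the call, letting a JSON library serialise the body removes that source of errors. The sketch below shows this with a hypothetical helper that builds, but does not send, the same request; only the name and json keys from the example are used.

```python
import json
import urllib.request


def build_schema_request(access_token, schema_name, json_source):
    """Build (but do not send) the POST to /schema/create for a JSON source."""
    # json.dumps handles quoting, so no manual backslash escaping is needed.
    body = json.dumps({"name": schema_name, "json": json_source}).encode("utf-8")
    return urllib.request.Request(
        "https://api.sonra.io/schema/create",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


request = build_schema_request("ACCESS_TOKEN", "asana_schema", "asana")
print(request.data.decode("utf-8"))
```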

Example of ER Diagram

You can download the full ER Diagram for our Asana sample file from here.
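
To give a feel for what a relational target model for nested JSON looks like, here is a deliberately simple toy sketch. It is not Flexter's algorithm. It only shows the general idea of splitting nested objects into parent and child tables linked by generated keys, and writing a table out as TSV. The sample task is made up and is not taken from the Asana file used in this post.

```python
import csv
import io
from collections import defaultdict
from itertools import count


def flatten(record, table, tables, ids, parent=None):
    """Split one nested JSON object into rows of parent and child tables."""
    row_id = next(ids)
    row = {"id": row_id}
    if parent is not None:
        parent_table, parent_id = parent
        row[f"fk_{parent_table}"] = parent_id  # foreign key to the parent row
    for key, value in record.items():
        if isinstance(value, dict):
            flatten(value, key, tables, ids, parent=(table, row_id))
        elif isinstance(value, list):
            for item in value:
                child = item if isinstance(item, dict) else {"value": item}
                flatten(child, key, tables, ids, parent=(table, row_id))
        else:
            row[key] = value  # scalar values become columns
    tables[table].append(row)
    return tables


def to_tsv(rows):
    """Render a list of row dicts as tab-separated text with a header line."""
    columns = sorted({column for row in rows for column in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample in the general shape of a task list response.
sample = {
    "data": [
        {
            "name": "Write post",
            "due_on": "2019-05-15",
            "tags": [{"name": "blog"}, {"name": "json"}],
        }
    ]
}

tables = defaultdict(list)
ids = count(1)
for task in sample["data"]:
    flatten(task, "task", tables, ids)

print(to_tsv(tables["tags"]))
```

Each tag row carries an fk_task column pointing back at its task, which is the kind of relationship an ER diagram makes visible.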

Step 4 – Creating a sink

A sink is a connection to a target data store. In this step we create a connection to S3.
Use this endpoint to create an S3 Sink if you want to save the data processing output to an S3 bucket.
It requires two mandatory parameters, name and path, and accepts one optional parameter, roleArn:

The parameter name is a unique name for the Sink, which you will use in other calls as a reference to it.
The parameter path is the full path to the S3 location; it can take the form bucket/folder.
(Optional) The parameter roleArn identifies an AWS role with List/Read/Write access to the provided path in your AWS account, which you should create for the Flexter AWS account.

curl --location --request POST "https://api.sonra.io/sink/create/s3" \
--header "Authorization: Bearer <access_token>" \
--header "Content-Type: application/json" \
--data "{
\"name\": \"s3sink\",
\"path\": \"s3://iebucketencpt/output\",
\"roleArn\": \"arn:aws:iam::111111111111:role/SomeS3RoleForAssumeRole\"
}"

Example of output

{
"status": "ok",
"file": "s3sink",
"type": "S3_EXTERNAL"
}
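
When scripting the sink creation, it can help to check the parameters before calling the API. The following sketch builds the JSON body and leaves roleArn out when none is given. The helper is hypothetical, and both checks are assumptions based on the example values above (an s3:// path and an ARN containing ":role/"), not rules documented for the endpoint.

```python
import json
from urllib.parse import urlparse


def build_s3_sink_payload(name, path, role_arn=None):
    """Return the JSON body for /sink/create/s3 as a string."""
    parsed = urlparse(path)
    # Assumption: paths look like the example, s3://bucket/folder.
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket/... path, got {path!r}")
    payload = {"name": name, "path": path}
    if role_arn is not None:
        # Loose sanity check only; it does not validate the full ARN syntax.
        if not role_arn.startswith("arn:") or ":role/" not in role_arn:
            raise ValueError(f"does not look like a role ARN: {role_arn!r}")
        payload["roleArn"] = role_arn  # optional parameter
    return json.dumps(payload)


print(
    build_s3_sink_payload(
        "s3sink",
        "s3://iebucketencpt/output",
        "arn:aws:iam::111111111111:role/SomeS3RoleForAssumeRole",
    )
)
```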

Step 5 – Processing Asana JSON data

In this step we will process our data.
This endpoint is used to create a Data Processing entry. It requires two mandatory parameters and accepts one optional parameter:

The parameter schema is a reference to a Schema entry.
The parameter source is a reference to a Source entry (a JSON Source in our case).
(Optional) The parameter sink is a reference to a Sink. If you want the output of processing to be saved somewhere, specify a Sink reference; otherwise the output will be saved in the Sonra cloud for download.

curl --location --request POST "https://api.sonra.io/data/process" \
--header "Authorization: Bearer <access_token>" \
--header "Content-Type: application/json" \
--data "{
\"schema\": \"asana_schema\",
\"source\": \"asana\",
\"sink\": \"s3sink\"
}"

Example of output

{
"status": "OK",
"uuid": "data-2400eb32-0a80-496d-a203-7fad46eca4d3"
}
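
Note that the status value is "OK" here while the earlier example outputs use "ok", so a script should probably compare it case-insensitively. The sketch below builds the request body with the optional sink and reads the uuid from a response. Both helper names are hypothetical, and raising an error for any other status is an assumption about how failures are reported.

```python
import json


def build_process_payload(schema, source, sink=None):
    """Return the JSON body for /data/process; sink is optional."""
    payload = {"schema": schema, "source": source}
    if sink is not None:
        payload["sink"] = sink
    return json.dumps(payload)


def processing_uuid(response_text):
    """Extract the uuid from a response, accepting 'ok' or 'OK' as success."""
    body = json.loads(response_text)
    if str(body.get("status", "")).lower() != "ok":
        raise RuntimeError(f"processing request failed: {body}")
    return body["uuid"]


print(build_process_payload("asana_schema", "asana", "s3sink"))

# Copy of the example output above.
sample = '{"status": "OK", "uuid": "data-2400eb32-0a80-496d-a203-7fad46eca4d3"}'
print(processing_uuid(sample))
```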

Now we can check our data in our S3 storage. You can download the output data here.

Conclusion

We have processed JSON data with ease. Flexter auto-generated the target schema, the target tables, the mappings from JSON elements to target table attributes, and globally unique foreign key relationships. Last but not least we automatically processed the JSON data into CSV/TSV files on S3. We did in a matter of minutes what would normally take a few days.
Our enterprise edition can be installed on a single node or, for very large volumes of XML, on a cluster of servers.
If you have any questions, please refer to the Flexter FAQ section on our website. You can also request a demo of Flexter or reach out to us directly.
