In a previous post, I shared a glimpse of the Big Data movement in the Middle East. Recently, I have been playing with different data science and data visualization tools, so I plan to share illustrated guides and tutorials with my blog readers, with a special focus on open data in the MEA region.
Open data is still in its very early days in the MEA region, but I wanted to show its impact and how data can help us tell stories that affect our lives. Luckily enough, Oman’s government started an open data portal last month, and here is the press release:
“The Oman Open Data initiative launched by Information Technology Authority earlier this month to boost the government’s eTransformation plan.
In an effort to effectively apply the initiative’s objectives, the ITA has launched the ‘Big Data Idea’ to encourage the public to make use of this data and to exploit the potential to generate new businesses and stimulate growth.
The competition team urges the community of academics, students, researchers, businesses, and others to actively use open data that has already been published in the portal and to participate with their innovative ideas in the competition, through submitting their ideas via the Official eGovernment Services Portal (Omanuna) www.oman.om. Registration for the competition will close at the end of July.
The Oman Open Data portal (www.oman.om/opendata) currently offers approximately 55 datasets, a press release said.”
I explored the datasets until I found a story about crime in Oman. I used two datasets from the portal: “Total Population by Governorate” and “Sentences Passed Against Convicts by Type of Crimes & Governorate for 2011”.
I downloaded the data in Excel format, then used OpenRefine to clean, mash up, and format it. Next, I geocoded the data in Google Docs and used an open source GIS platform to create the shapefile and join the data to it. Finally, I used the AWESOME open source mapping tool “TileMill” from MapBox to design the map below. Well, it is fun to play with so many tools until you reach a good story to tell.
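For readers who prefer code to point-and-click tools, the "clean and join" step can be sketched in a few lines of Python. This is only a minimal illustration, not the workflow I actually used (that was OpenRefine plus a GIS platform). The field names, the governorate labels, and all the numbers are made-up placeholders, not values from the portal.

```python
# Minimal sketch: normalize governorate names, then join two datasets on them.
# All rows below are hypothetical placeholders, NOT data from the Oman portal.

def normalize(name: str) -> str:
    """Collapse extra whitespace and ignore case so spellings match across files."""
    return " ".join(name.split()).casefold()


def join_by_governorate(population_rows, sentence_rows):
    """Attach the population to each sentence row; report names with no match."""
    population = {normalize(r["governorate"]): r["population"] for r in population_rows}
    joined, unmatched = [], []
    for row in sentence_rows:
        key = normalize(row["governorate"])
        if key in population:
            joined.append({**row, "population": population[key]})
        else:
            unmatched.append(row["governorate"])
    return joined, unmatched


# Hypothetical input with the kind of messy spacing and casing spreadsheets often have.
population_rows = [
    {"governorate": "Governorate A", "population": 500_000},
    {"governorate": "governorate  b ", "population": 250_000},
]
sentence_rows = [
    {"governorate": " Governorate A", "sentences": 120},
    {"governorate": "Governorate B", "sentences": 45},
    {"governorate": "Governorate C", "sentences": 10},
]

joined, unmatched = join_by_governorate(population_rows, sentence_rows)
print(joined)
print("No population found for:", unmatched)
```

Keeping a list of unmatched names is worthwhile, because a silent mismatch in spelling would otherwise drop a governorate from the map.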
The figure below shows a thematic map of total crimes in Oman’s governorates per 1,000,000 people for 2011. When you hover over a governorate, a chart appears that compares crimes against persons in the selected governorate. It’s a simple interactive data visualization with a relatively small dataset, but it could be more interesting if, for example, we compared crimes across several years. However, this data is not yet available.
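The "per 1,000,000 people" normalization is simple arithmetic: divide the crime count by the population and scale by one million, so that large and small governorates can be compared fairly. Here is a small Python sketch of that calculation. The function name and the example numbers are mine and purely illustrative, not figures from the dataset.

```python
# Minimal sketch of the per-million normalization used for the thematic map.
# The example inputs are hypothetical, NOT values from the Oman portal.

def rate_per_million(crime_count: int, population: int) -> float:
    """Return the number of crimes per 1,000,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so whole-number inputs stay exact before the division.
    return crime_count * 1_000_000 / population


# A hypothetical governorate with 120 sentences and 500,000 inhabitants.
print(rate_per_million(120, 500_000))  # 240.0
```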
And now enjoy the dataviz! Make sure to click on the image in order to open the link for the interactive map:
I won’t walk through the whole process in this post; I am keeping that for the upcoming DEEPLY ILLUSTRATED series of tutorials.
What I am interested in showing in the upcoming posts is the whole process, from the moment you get the data until you tell a story with it.
So hey, I urge you to follow my blog by signing up in the right pane of this page! And please let me know if you’d like me to cover a particular tool or big data topic.
And here’s the good news! We are creating a Data Driven Journalism Community in the MEA region. It will be up and running very soon. This community will be a network for data journalists, big data enthusiasts, and data visualization gurus in the MEA region. Until then, go ahead and sign up for the community list: Data Driven Journalism MEA.
One more thing… Before publishing this post, I thought about playing with R, a growing open source data science tool. If you want to know more about R, you can check my previous blog: Revolution Analytics at the BBBT and watch this short video: What is R?.
Robbery accounts for the highest percentage of all crimes in Oman. And yes, it reaches 88% in some governorates. So I wanted to see which other crimes correlate with robbery. After doing the analysis in R, I concluded that governorates with higher robbery rates tend to have higher fraud rates (see figure below). P.S. Correlation doesn’t imply causation in our case. There is a lot of debate about this topic in the big data industry, so you can have a look at this Forbes article which discusses causation vs. correlation.
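If you are curious what "correlates with" means numerically, the usual measure is the Pearson correlation coefficient, which ranges from -1 (perfect negative relationship) to +1 (perfect positive relationship). I ran my analysis in R, so the Python sketch below is only an illustration of the idea, not my actual script. I am assuming Pearson as the measure here, and the two lists are invented values, not the real robbery and fraud figures.

```python
# Minimal sketch of a Pearson correlation between two crime rates.
# Written in Python for illustration; the post's analysis was done in R.
# The rate lists are hypothetical, NOT values from the Oman portal.
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# One hypothetical value per governorate for each crime type.
robbery_rate = [10.0, 20.0, 30.0, 40.0, 55.0]
fraud_rate = [1.0, 2.5, 2.0, 4.0, 5.5]

print(round(pearson(robbery_rate, fraud_rate), 3))
```

A value close to +1 would match the pattern described above, where governorates with more robbery also tend to have more fraud. With only a handful of governorates, though, a coefficient like this should be read with caution.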
The figure below also shows a scatterplot matrix that visualizes the relationships between all crime types in Oman. You can check all the positive and negative correlations… (Click on the image to see it full-size.)
Until next time,