Telling The Story of Falling Oil Prices Using Big Data [Tutorial]

Figure 1: Network diagram for the period before the fall of Oil prices

Big Data is indeed disrupting our industries, and to a data scientist the best way to prove that and show it to people is to play around with some data!

The fall of Oil prices is one of the most prominent topics in the world today, so it’s a matter of curiosity for any data enthusiast to see what Big Data can tell us about the Oil market. We analyzed hundreds of thousands of news articles mentioning Oil & Gas, in six-month windows before and after the fall of Oil prices. Yes, we played with Big Data, really!

We then constructed network diagrams that show the “communities” of conversation around “Oil & Gas”, identifying key influencers and the people with whom they are most closely connected. We also constructed maps that show connections among countries discussing Oil & Gas topics, along with the tone of these connections. This post shows only the person networks.

Figure 2: Network diagram for the period after the fall of Oil prices

So, after playing this big data game and constructing these awesome visualizations, what insights can we extract?

Before the prices decreased, there was a relatively understandable division of communities among the active players in the Oil & Gas game. Immediately clear is the blue cluster at the center, which contains President Barack Obama and connects several other communities made up of Western, Arab and African figures respectively. This highlights the dominant role of the US in the Oil & Gas market, which makes it a central hub for the connections among people. Following Obama is Vladimir Putin, who appears to be the second most influential figure, followed by John Kerry, then Angela Merkel and the ousted Ukrainian President. Meanwhile, the active figures in South Sudan remain centered around two people, Salva Kiir and Riek Machar. The conflict that arose between President Salva Kiir and his vice president Riek Machar is evident in the high connectivity the two had with global figures in that period, along with its effects on the Oil-dependent country. Moreover, after the prices fell, both names became part of the wider network rather than standalone major influencers.

On the other hand, Francois Hollande was not an influencer in the Oil & Gas discussion before the prices started declining, yet he gradually gained influence in the European circle right after the crisis. Similarly, Stephen Harper, the Canadian prime minister, became part of President Obama’s circle of connections right after the prices dropped, staying apart from David Cameron, whose cluster included the Labour opposition led by Ed Miliband.

Furthermore, the division of communities changed after the Oil & Gas prices dropped, and the map transformed into a global community containing almost all the influential figures. Obama remained the most influential figure, yet many local political figures became more influential as the issues they work on, which affect Oil, evolved. Putin remained among the most influential, especially through his dominance of and contribution to European discussions. This shows how global political issues, such as the Ukrainian and Syrian conflicts, overlap with and are linked to the Oil price.

In the Middle East, the Prime Minister of Iraq governed the most attractive oil hub, Iraq, for 4 years, so it is no surprise that his name appears among the influencers in that field. This may simply reflect the major role Iraq plays in the oil sector, along with the dominant role the prime minister played in attempts to regulate prices. His role did not change after the election of the new Iraqi prime minister, although the latter started appearing in the circle of influencers. From another perspective, Nigeria is among the top oil exporters and is about to undergo one of its most important elections, with the problems with extremists in the northeast and the falling Oil prices affecting the current president’s chances of winning. That is why the Nigerian community is not connected to the central network: it is concerned with internal issues. The death of King Abdullah in Saudi Arabia may also explain why the new Saudi King’s name appears after the prices declined.

Methodology: So, what do the figures in this post represent?

We used different data science tools for this project, including R and other data visualization and analysis tools. We also used data from GDELT, which monitors the world’s broadcast, print, and web news from nearly every corner of every country in over 100 languages.

Figure 1 visualizes the person co-occurrence network of Oil & Gas news coverage in the 6 months before the Oil prices dropped (November 2013 to June 2014). In this analysis, person names were extracted from online news, and names that appeared together in a news article were connected based on co-occurrence. Only names appearing in at least 180 different news articles and co-occurring with at least one other name in 180 or more articles were retained, yielding a network diagram of 91 names and 184 connections.

Figure 2 visualizes the person co-occurrence network of Oil & Gas news coverage in the 6 months after the Oil prices dropped (June 2014 to February 2015). In this analysis, person names were extracted from online news, and names that appeared together in a news article were connected based on co-occurrence. Only names appearing in at least 180 different news articles and co-occurring with at least one other name in 180 or more articles were retained, yielding a network diagram of 143 names and 238 connections.
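The filtering described for both figures can be sketched in a few lines. This is a minimal Python illustration rather than the R workflow used for the project; the function name `build_cooccurrence_network`, the `min_articles` parameter and the toy article lists are hypothetical. It also assumes that an edge is kept only when the pair itself co-occurs in at least `min_articles` articles, which the post does not state explicitly.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(articles, min_articles=180):
    """Build a person co-occurrence network.

    articles: iterable of lists of person names, one list per news article.
    Returns (nodes, edges), where edges maps a sorted name pair to the
    number of articles in which both names appear.
    """
    pair_counts = Counter()
    for names in articles:
        unique = sorted(set(names))  # count each name once per article
        pair_counts.update(combinations(unique, 2))

    # Assumption: keep a pair only if it co-occurs in >= min_articles articles.
    # Both names then automatically appear in >= min_articles articles.
    edges = {pair: n for pair, n in pair_counts.items() if n >= min_articles}
    nodes = sorted({name for pair in edges for name in pair})
    return nodes, edges


# Toy data with placeholder names and a threshold of 2 instead of 180.
articles = [["A", "B"], ["A", "B", "C"], ["A", "C"], ["D", "E"]]
nodes, edges = build_cooccurrence_network(articles, min_articles=2)
print(nodes)
print(edges)
```

In the toy run, the pairs B-C and D-E co-occur only once, so they fall below the threshold and D and E drop out of the network entirely.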

Each node (i.e., circle) represents a person name, sized by its total number of connections; the edges (connections) between nodes link names that co-occur frequently in news coverage, suggesting some degree of connection or similarity. An algorithm known as a “community finder” was applied to the final network diagram; it groups the names into clusters whose members co-occur more frequently with each other than with other names. Each cluster is given its own color. The specific color of each group is assigned at random by the software, but nodes with the same color belong to the same cluster.
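The post does not name the community finder that was used. Many such algorithms search for the grouping with the highest modularity, a score that rewards clusters with more internal edges than chance would predict, so the sketch below assumes that family. It computes node sizes (degrees) and the modularity of a candidate grouping on a hypothetical toy graph; the function names and the data are illustrative only, and the search over groupings is left out.

```python
from collections import defaultdict


def degrees(edges):
    """Number of connections per node; the figures size nodes by this value."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal_edges / m - (degree_sum / 2m)^2),
    where m is the total number of edges.
    """
    m = len(edges)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for a, b in edges:
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += 1
    for node, k in degrees(edges).items():
        degree_sum[community_of[node]] += k
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles (A, B, C) and (D, E, F) joined by the edge C-D.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("C", "D"), ("D", "E"), ("D", "F"), ("E", "F")]
split = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
merged = {name: 0 for name in split}

q_split = modularity(toy_edges, split)    # two clusters, one per triangle
q_merged = modularity(toy_edges, merged)  # everything in one cluster
print(degrees(toy_edges), q_split, q_merged)
```

Splitting the toy graph into its two triangles scores higher than lumping every node together, which is the kind of comparison a modularity-based community finder makes when it assigns cluster colors.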

To collect more insights, we then used R to identify the maximal clique(s) in Figure 2. A clique is a subset of a network in which every member is directly connected to every other member, so its members are more closely tied to one another than they are to the rest of the network; a maximal clique is one that cannot be enlarged by adding another node. The largest maximal clique has 7 members: Barack Obama, John Kerry, Vladimir Putin, David Cameron, Francois Hollande, Petro Poroshenko, and Angela Merkel. These are clearly the influencers around the “Ukrainian conflict”, which shows how this conflict might be affecting the Oil price discussions.
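For readers who want to see how a clique search works, here is a minimal Python sketch of the Bron-Kerbosch algorithm with pivoting, a standard way to enumerate maximal cliques. It is not the R code used for the analysis, and the toy graph and helper names are hypothetical.

```python
from collections import defaultdict


def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph.

    adj: dict mapping each node to the set of its neighbours (no self-loops).
    Uses Bron-Kerbosch with pivoting.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already explored nodes
        if not p and not x:
            yield r
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def adjacency(edges):
    """Turn a list of undirected edges into a neighbour-set dictionary."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


# Toy graph: A, B, C, D are all connected to each other; D, E, F form a triangle.
clique_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"),
                ("C", "D"), ("D", "E"), ("D", "F"), ("E", "F")]
cliques = list(maximal_cliques(adjacency(clique_edges)))
largest = max(cliques, key=len)
print(sorted(largest))
```

On the toy graph the search finds two maximal cliques, and the largest one is the fully connected group of four. Applied to the Figure 2 network, the same idea would surface the group of seven names listed above.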

In conclusion, data science will have a huge impact on understanding economic and political issues, such as the discourse surrounding falling Oil prices. Although I am not an economics or political science expert, I was able to see the big picture and draw conclusions by using data science techniques. In particular, the methodology’s ability to identify politicians associated with specific subtopics, like the “Ukrainian conflict” or Nigeria’s internal problems with the election and Oil prices, as well as its ability to tease out the geographic focus of individuals and to identify key influencers for each cluster, suggests that it offers a public relations kit in a box. Using the information in the visualizations above, it is possible to readily identify both the key influencers in each community (the larger nodes) and the reporters and analysts who focus most prominently on them.

In the future, Big Data can be used to automatically detect emerging leaders and communities, and also to offer a real-time picture of the conversation around a specific, important topic like the falling Oil prices.