As previously explained, I am working through the Data Management and Visualization course. This is week 2, and I had the opportunity to use Python to explore some data.
I put together a Jupyter (formerly IPython) notebook and uploaded it to my GitHub repository. To view the notebook, visit https://github.com/mikeasilva/democracy-and-economic-well-being/blob/master/Data Managment and Visualization/Week 2 Assignment.ipynb. It is an example of literate programming, so it mixes narrative content with machine-readable code. If you want to view the Python script sans narration, it is available too.
In this analysis, I would like to examine the relationship between the economic well-being of a society and its level of democratization. The data for this analysis comes from a subset of the GapMinder project data.
The level of democratization is measured using the Polity IV democracy score. It is a summary measure of a country’s democratic and free nature or lack thereof. It ranges from -10 (an autocracy) to 10 (full democracy).
I did get some useful feedback on needing to clarify what is meant by “economic well-being.” Economists frequently use GDP per capita as a measure of economic well-being. Loosely, GDP is a measure of how much stuff is produced by an economy, so per capita GDP is how much stuff everyone would have on average. The more stuff a person has, the better off they are assumed to be. I personally don’t like this measure, but I am using it since it is the only thing available.
There are 213 observations and 16 variables in the GapMinder data set. Of the 213 countries, 161 had a democracy score (52 missing) and 190 had per capita GDP figures (23 missing). I subset the data, keeping only the countries that had both a democracy score and a per capita GDP figure, which left me with 155 observations.
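The subsetting step described above can be sketched with pandas. This is a minimal illustration on a made-up five-row table, not the notebook's actual code; the column names `polityscore` and `incomeperperson` are assumptions about how the GapMinder subset labels these variables.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the GapMinder subset. The column names are assumptions.
data = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "polityscore": [10, np.nan, -7, 3, np.nan],
    "incomeperperson": [39000.0, 1200.0, np.nan, 560.0, np.nan],
})

# Count the missing values in each variable of interest.
missing_counts = data[["polityscore", "incomeperperson"]].isna().sum()

# Keep only the rows that have both a democracy score and a GDP figure.
subset = data.dropna(subset=["polityscore", "incomeperperson"])

print(missing_counts)
print(len(subset), "observations remain")
```

Passing `subset=` to `dropna` limits the check to the two variables in question, so a country is not dropped just because some unrelated column is empty.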
There are 155 countries in the data set I am analyzing. The breakout of these countries by their democracy score in 2009 is:
There are 32 countries that are full democracies (have a polity score of 10). This is roughly 21% of the data. Only 2 observations are autocracies. Most of the countries have a score greater than zero. I tabulated it, and 108 of the countries (69%) are “open” in one form or another.
One challenge I faced was that GDP per capita is a continuous variable, so it cannot be tabulated value by value the way the democracy score can. I broke the data up into quintiles. The following table summarizes the frequencies:
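A frequency table like the one described here, with counts and percentages side by side, might be built as follows. The scores are made-up values for illustration, and the variable names are hypothetical.

```python
import pandas as pd

# Made-up polity scores for eight hypothetical countries.
scores = pd.Series([10, 10, 8, 3, -2, -7, 10, 6], name="polityscore")

# Count how many countries fall on each score, ordered by score.
counts = scores.value_counts().sort_index()

# Express each count as a percentage of all observations.
percents = (counts / len(scores) * 100).round(1)

freq = pd.DataFrame({"count": counts, "percent": percents})
print(freq)

# Share of countries with a score above zero.
share_open = (scores > 0).mean() * 100
print(f"{share_open:.0f}% of countries score above zero")
```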
|GDP Per Capita Quintile||Count||Percent|
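119 countries (roughly 78%) are in the lowest quintile, which is a rather large proportion! True quintiles would each hold about a fifth of the countries, so these groups appear to behave like equal-width bins rather than quantiles. I will have to decide if there is a better way to break the data into groups.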
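Equal-width binning and quantile-based binning behave very differently on skewed data such as income. The sketch below uses made-up income values and assumes pandas: `pd.cut` with an integer `bins` splits the range into equal-width intervals, while `pd.qcut` places roughly equal numbers of observations in each group. Which of the two the notebook used is not stated here, so this is only a comparison of the options.

```python
import pandas as pd

# Made-up, heavily right-skewed incomes for ten hypothetical countries.
income = pd.Series(
    [300, 500, 800, 1200, 2000, 3500, 6000, 9000, 20000, 50000],
    dtype=float,
)
labels = ["Q1", "Q2", "Q3", "Q4", "Q5"]

# Equal-width bins: the range from min to max is cut into five equal intervals.
equal_width = pd.cut(income, bins=5, labels=labels)

# Quantile bins: cut points are chosen so each group holds about 20% of the rows.
quantile_based = pd.qcut(income, q=5, labels=labels)

print(equal_width.value_counts().sort_index())
print(quantile_based.value_counts().sort_index())
```

With these toy numbers, the equal-width version piles most countries into the lowest group, while the quantile version spreads them evenly.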
The final variable I examined was a summary measure of the level of democracy. Using all 21 Polity IV scores is a little unwieldy, so I aggregated the data into five categories identified by the Polity IV project authors. I think I will use this in my full analysis:
Most of the countries in the data set are either democracies (38%) or full democracies (21%). 47 of the 155 countries could be considered closed.
I like this measure better because it provides more delineation than the full polity score variable does. I do find it interesting that under this convention 20 of the 155 countries in the data set are autocracies as opposed to the 2 that I identified earlier. I find the democracy groups easier to believe.
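Collapsing the 21 polity scores into five groups can be sketched with `pd.cut`. The cut points and labels below are an assumption based on the commonly cited Polity IV convention (autocracy -10 to -6, closed anocracy -5 to 0, open anocracy 1 to 5, democracy 6 to 9, full democracy 10) and may not match the notebook exactly.

```python
import pandas as pd

# Made-up scores covering the edges of each category.
scores = pd.Series([-10, -6, -5, 0, 1, 5, 6, 9, 10])

# Assumed cut points. Intervals are right-closed, e.g. (-11, -6] covers -10..-6.
bins = [-11, -6, 0, 5, 9, 10]
labels = [
    "Autocracy",
    "Closed Anocracy",
    "Open Anocracy",
    "Democracy",
    "Full Democracy",
]

groups = pd.cut(scores, bins=bins, labels=labels)
print(groups.value_counts().sort_index())
```

Under this grouping, any score from -10 to -6 counts as an autocracy, which would be consistent with the category count being larger than the number of countries sitting at a single extreme score.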