Data Analysis and Tools Week 2

This week’s assignment focuses on running a chi-square test of independence.  My research explores the relationship between the level of openness of a society and the economic well-being of its citizens. My hypothesis is that countries with a more open society will have a higher level of economic well-being.  The Python code can be found in the Jupyter notebook for this week.

Openness

I have followed the definitions of the Polity IV study in classifying countries into 5 types based on their polity score.  The following summarizes their distribution:

figure_3

Type of Government Count
Full Democracy 32
Democracy 57
Open Anocracy 19
Closed Anocracy 27
Autocracy 20

Economic Well-Being

In the past weeks I have divided the economic well-being measure into quartiles.  Here’s the count of countries by income quartiles:

income-quartiles

Income Class Count
0% to 25% 123
25% to 50% 12
50% to 75% 12
75% to 100% 8

I did not want to compare a 4×5 matrix, so for simplicity’s sake I rolled the income classes up into a single measure: whether or not the country is in the top half of the income distribution.

Chi Square

The null hypothesis is that the two variables are independent. I ran a chi-square test on this data, which resulted in a p-value of 0.00000000142, so we reject the null hypothesis.  Whether a country is in the top half of the income distribution is associated with its type of government.
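A minimal sketch of this test, using only the Python standard library. It assumes the 2×5 table can be rebuilt by collapsing the income-quartile table from the Week 3 post into a top half and a bottom half. The notebook may construct the table differently, so the exact p-value may not match the figure quoted above. The helper names are illustrative, and the closed-form p-value only holds for an even number of degrees of freedom (4 here). In practice `scipy.stats.chi2_contingency` does the same job in one call.

```python
import math

# Countries by income class (rows) and type of government (columns),
# copied from the quartile table in the Week 3 post. Column order:
# Full Democracy, Democracy, Open Anocracy, Closed Anocracy, Autocracy.
income_by_government = {
    "0% to 25%": [9, 53, 19, 26, 16],
    "25% to 50%": [8, 2, 0, 0, 2],
    "50% to 75%": [9, 2, 0, 0, 1],
    "75% to 100%": [6, 0, 0, 1, 1],
}


def add_rows(first, second):
    """Add two rows of counts column by column."""
    return [a + b for a, b in zip(first, second)]


# Roll the four income classes up into top half versus bottom half.
top_half = add_rows(income_by_government["50% to 75%"],
                    income_by_government["75% to 100%"])
bottom_half = add_rows(income_by_government["0% to 25%"],
                       income_by_government["25% to 50%"])
observed = [top_half, bottom_half]


def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


def p_value_even_dof(statistic, degrees_of_freedom):
    """Upper-tail chi-square probability; valid only for even dof."""
    if degrees_of_freedom % 2 != 0:
        raise ValueError("this closed form needs an even number of dof")
    half = statistic / 2
    terms = (half ** k / math.factorial(k)
             for k in range(degrees_of_freedom // 2))
    return math.exp(-half) * sum(terms)


stat, dof = chi_square_statistic(observed)
p_value = p_value_even_dof(stat, dof)
print(f"chi-square = {stat:.2f}, dof = {dof}, p-value = {p_value:.3g}")
```

A p-value this far below 0.05 leads to rejecting independence, which is the conclusion reported above.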

Post Hoc Test

I tested the 10 pairwise combinations using the Bonferroni adjustment. Because there are 10 comparisons, the significance threshold is divided by 10, which moves it one decimal place from 0.05 to 0.005.

Only the comparisons of full democracy against the other types produced p-values small enough to reject the null hypothesis.  The most meaningful interpretation of these results is that the other four groups are homogeneous and the full democracy group is different. Once again, the Python code used in this analysis can be found in the Jupyter notebook for this week.
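The pairwise procedure can be sketched as follows. This is an illustration under stated assumptions, not the notebook’s code. The top-half and bottom-half counts are derived from the quartile table in the Week 3 post. The test is a plain Pearson chi-square with no continuity correction, whereas `scipy.stats.chi2_contingency` applies one by default for 2×2 tables, so individual p-values may differ from the notebook’s.

```python
import math
from itertools import combinations

# (top half, bottom half) counts per type of government.
# Assumption: derived from the quartile table in the Week 3 post.
counts = {
    "Full Democracy": (15, 17),
    "Democracy": (2, 55),
    "Open Anocracy": (0, 19),
    "Closed Anocracy": (1, 26),
    "Autocracy": (2, 18),
}


def chi_square_2x2(first, second):
    """Pearson chi-square test (no continuity correction) for two groups."""
    table = [[first[0], second[0]], [first[1], second[1]]]
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper tail is erfc(sqrt(statistic / 2)).
    return statistic, math.erfc(math.sqrt(statistic / 2))


pairs = list(combinations(counts, 2))
alpha = 0.05
bonferroni_alpha = alpha / len(pairs)  # 0.05 / 10 comparisons

results = {}
for first, second in pairs:
    statistic, p_value = chi_square_2x2(counts[first], counts[second])
    results[(first, second)] = p_value
    verdict = "reject" if p_value < bonferroni_alpha else "fail to reject"
    print(f"{first} vs {second}: p = {p_value:.4g} ({verdict} the null)")
```

Each p-value is compared against the adjusted threshold of 0.005 rather than 0.05, which is what keeps the overall chance of a false positive across all 10 comparisons near 5%.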

Democracy and Economic Well-Being Visualizations

This is the third in a series of posts chronicling my project for the Data Management and Visualization course. This week we learned about visualizations to summarize the data.

As usual I put together a Jupyter notebook which is hosted on my GitHub project repository. It is an example of literate programming so it mixes narrative content with machine readable code. If you want to view the Python script sans narration it is available too.

Project Description

In this analysis I would like to examine the relationship between the economic well-being of a society and the level of openness.   My hypothesis is that countries with a more open society will have a higher level of economic well-being.

The data for this analysis comes from a subset of the GapMinder project data.  I use the level of democratization as a measure of the openness of a country.  To measure economic well-being I use GDP per capita data.  The data management decisions were chronicled in my previous post and will not be covered here.

Univariate Analysis

Economic Well-Being

figure_1

The income per person is unimodal and right skewed. Values range from about $100 to $40,000.  The mean is $6,600 and the median is $2,200.  There is a natural floor as it is not possible to have a negative GDP per person.

Openness

figure_3

This data is categorical, so here’s the count of countries along the open-to-closed spectrum (most open on the left and most closed on the right).  Most of the countries are generally open.  32 are Full Democracies and 20 are Autocracies.  These groups are especially important in this analysis.

Bivariate Analysis

Average Economic Well-Being by Country’s Openness

figure_4

To look at the relationship I compare the average economic well-being by the level of a country’s openness.  We see that the full democracies have a higher average than the autocracies.  The U shape of this distribution is also noteworthy.

Median Economic Well-Being by Country’s Openness

We did observe a considerable range in the univariate analysis, so I made the comparison again using the median as the measure.

figure_5

For those not accustomed to box plots, the median is the line inside the box.  The median of the full democracy is higher than that of the autocracy.  One can still observe a U shape in the distribution.

I like the box plots because you can see the outliers that influence the means.

Summary

The data support the hypothesis that countries that are more open and democratic have a higher standard of living or economic well-being than those that are closed.  I would hasten to note that there is a U shape in the distribution which suggests that as a country moves from an autocracy towards a more open democracy, it might lower the economic well-being of the citizens.  As a summary I would like to present the average (mean and median) economic well-being by the level of openness.

Openness Count Mean Median
1 – Full Democracy 32 $19,290 $17,222
2 – Democracy 57 $3,425 $1,621
3 – Open Anocracy 19 $1,167 $669
4 – Closed Anocracy 27 $2,473 $591
5 – Autocracy 20 $6,114 $2,385

Data Management and Visualization Week 3

This is the third in a series of posts chronicling my project for the Data Management and Visualization course.  This week we learned about making data management decisions.

I put together a Jupyter notebook which is hosted on my GitHub project repository. It is an example of literate programming so it mixes narrative content with machine readable code. If you want to view the Python script sans narration it is available too.

Project Description

In this analysis I would like to examine the relationship between the economic well-being of a society and the level of democratization.  The data for this analysis comes from a subset of the GapMinder project data.

Data Management Decisions

Democracy Score

The level of democratization is measured using the Polity IV democracy score.  It is a summary measure of a country’s democratic and free nature or lack thereof.  It ranges from -10 (an autocracy) to 10 (full democracy).  This gives 21 possible values, which is a little overwhelming.  I decided to create five categories (Full Democracy, Democracy, Open Anocracy, Closed Anocracy and Autocracy).  These categories are from the Polity IV project authors.
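A minimal sketch of the recoding. The cut points are an assumption: these posts name the five categories but do not list the score ranges. The ranges used here do reproduce the group counts reported in these posts when applied to the score frequencies from the Week 2 post. The function name `polity_category` is illustrative.

```python
from collections import Counter


def polity_category(score):
    """Map a polity score (-10 to 10) to one of five categories.

    The cut points are assumed, not quoted from the posts.
    """
    if score == 10:
        return "Full Democracy"
    if score >= 6:
        return "Democracy"
    if score >= 1:
        return "Open Anocracy"
    if score >= -5:
        return "Closed Anocracy"
    return "Autocracy"


# Number of countries at each polity score, from the Week 2 post.
score_counts = {
    -10: 2, -9: 3, -8: 2, -7: 11, -6: 2, -5: 2, -4: 6,
    -3: 6, -2: 5, -1: 4, 0: 4, 1: 3, 2: 3, 3: 2,
    4: 4, 5: 7, 6: 10, 7: 13, 8: 19, 9: 15, 10: 32,
}

# Aggregate the 21 score values into the five categories.
category_counts = Counter()
for score, number_of_countries in score_counts.items():
    category_counts[polity_category(score)] += number_of_countries

for category, count in category_counts.items():
    print(f"{category}: {count}")
```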

Economic Well-Being

The other variable that needed data management is the per capita GDP variable.  This is a continuous variable that I am thinking about changing into a discrete variable.  I broke the data into per capita GDP quartiles and quintiles.

Subset of Full Data Set

There are 213 countries in the GapMinder data set.  161 out of 213 have democracy scores (52 missing).  190 out of 213 have per capita GDP (23 missing).  I selected the countries that had both per capita GDP and democracy scores.  This left me with 155 countries.
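The selection step can be sketched as below. The records and field names (`gdp_per_capita`, `polity_score`) are hypothetical stand-ins, not the real GapMinder columns or values. The check at the end uses only the counts reported above.

```python
# Hypothetical records: names and values are made up for illustration.
countries = [
    {"country": "A", "gdp_per_capita": 1200.0, "polity_score": 8},
    {"country": "B", "gdp_per_capita": None, "polity_score": -7},
    {"country": "C", "gdp_per_capita": 25000.0, "polity_score": None},
    {"country": "D", "gdp_per_capita": None, "polity_score": None},
    {"country": "E", "gdp_per_capita": 640.0, "polity_score": 0},
]

# Keep only the countries with both measures. Comparing against None
# matters here because a polity score of 0 is a valid value, not a gap.
complete = [
    row for row in countries
    if row["gdp_per_capita"] is not None and row["polity_score"] is not None
]
print([row["country"] for row in complete])

# Consistency check on the reported counts.
total, with_score, with_gdp, with_both = 213, 161, 190, 155
with_either = with_score + with_gdp - with_both
with_neither = total - with_either
print(f"{with_either} countries have at least one measure, "
      f"{with_neither} have neither")
```

The reported figures imply that 196 countries have at least one of the two measures and 17 have neither.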

Exploratory Analysis

This section summarizes some of the findings of my exploratory analysis.

Countries by Continent

I was interested in seeing if I should include a geographic variable in the analysis so I created a dataset that took the GapMinder countries and mapped them into their continents.

Full Democracy Democracy Open Anocracy Closed Anocracy Autocracy
Africa 1 16 8 20 4
Asia 3 10 6 5 13
Australia and Oceania 2 1 1 1 0
Europe 20 15 2 0 2
North America 4 8 1 1 1
South America 2 7 1 1 0

Europe has the highest concentration of democracies (about 90%), followed by North America (85%). Asia and Africa have the lowest concentrations (both roughly 35%). Asia has 13 autocracies, the most of any continent. South America and Australia and Oceania have none.

The Number of Countries by Income Quartile and Level of Democracy

Full Democracy Democracy Open Anocracy Closed Anocracy Autocracy
0% to 25% 9 53 19 26 16
25% to 50% 8 2 0 0 2
50% to 75% 9 2 0 0 1
75% to 100% 6 0 0 1 1

Most countries are in the lowest quartile. 6 of the 8 countries in the top income quartile are full democracies.

The Number of Countries by Income Quintile and Level of Democracy

Full Democracy Democracy Open Anocracy Closed Anocracy Autocracy
Lowest (0% to 20%) 7 53 19 25 15
Second (20% to 40%) 9 1 0 1 3
Middle (40% to 60%) 2 2 0 0 1
Fourth (60% to 80%) 9 1 0 0 0
Highest (80% to 100%) 5 0 0 1 1

Most countries are in the lowest GDP per capita quintile. 34% of the countries in the data set are both democracies and in the lowest per capita GDP quintile. Most of the countries in the highest quintile are full democracies.

Week 2 of Data Management and Visualization

As previously explained I am working through the Data Management and Visualization course.  This is week 2 and I had the opportunity to use Python to explore some data.

I put together a Jupyter (formerly IPython) notebook and have uploaded it all to my GitHub repository.  To view the notebook visit https://github.com/mikeasilva/democracy-and-economic-well-being/blob/master/Data Managment and Visualization/Week 2 Assignment.ipynb.  It is an example of literate programming so it mixes narrative content with machine readable code.  If you want to view the Python script sans narration it is available too.

Project Overview

In this analysis I would like to examine the relationship between the economic well-being of a society and the level of democratization.  The data for this analysis comes from a subset of the GapMinder project data.

The level of democratization is measured using the Polity IV democracy score.  It is a summary measure of a country’s democratic and free nature or lack thereof.  It ranges from -10 (an autocracy) to 10 (full democracy).

I did get some useful feedback on needing to clarify what is meant by “economic well-being.”  Economists frequently use GDP per capita as a measure of economic well-being.  Loosely, GDP is a measure of how much stuff is produced by an economy.  So per capita GDP would be how much stuff everyone would have on average.  The higher the amount of stuff a person has, the better off they are.  I personally don’t like this measure but I am using it since it is the only thing available.

Exploratory Analysis

There are 213 observations and 16 variables in the GapMinder data set.  161 out of 213 countries had a democracy score (52 missing).  190 out of 213 had per capita GDP figures (23 missing).  I subset this data selecting only those that had both a democracy score and per capita GDP figures which left me with 155 observations.

Democracy Score

There are 155 countries in the data set I am analyzing.  The breakout of these countries by their democracy score in 2009 is:

Democracy Score Count Percent
-10 2  1.3
-9 3  1.9
-8 2  1.3
-7 11  7.1
-6 2  1.3
-5 2  1.3
-4 6  3.9
-3 6  3.9
-2 5  3.2
-1 4  2.6
0 4  2.6
1 3  1.9
2 3  1.9
3 2  1.3
4 4  2.6
5 7  4.5
6 10  6.5
7 13  8.4
8 19  12.3
9 15  9.7
10 32  20.6

There are 32 countries that are full democracies (a polity score of 10). This is roughly 21% of the data. Only 2 countries are autocracies in the strictest sense (a score of -10).  Most countries have a score greater than zero.  I tabulated it and 108 of the countries (roughly 70%) are “open” in one form or another.

Economic Well-Being

One challenge that I faced was GDP per capita is a continuous variable.  I broke the data up into quintiles.  The following table summarizes the frequencies:

GDP Per Capita Quintile Count Percent
Lowest 119  76.8
Second 14  9.0
Middle 5  3.2
Fourth 10  6.5
Highest 7  4.5

119 countries (roughly 77%) are in the lowest quintile, which is a rather large proportion!  I will have to decide if there is a better way to break the data into groups.
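One possible reason the lowest group is so crowded, offered as an assumption rather than something these posts confirm, is that the groups were cut into equal-width income ranges instead of equal-count quantiles. With right-skewed data the two approaches behave very differently. The sketch below uses made-up incomes, not GapMinder values, to show the contrast.

```python
import bisect
import statistics

# Hypothetical, right-skewed incomes (made up for illustration only).
incomes = [100, 300, 500, 700, 900, 1200, 1600, 2200, 3000,
           4500, 7000, 12000, 20000, 30000, 40000]
n_bins = 5

# Equal-width bins: split the range from min to max into five equal spans.
low, high = min(incomes), max(incomes)
width = (high - low) / n_bins
equal_width_counts = [0] * n_bins
for value in incomes:
    # The maximum value would index one past the end, so clamp it.
    index = min(int((value - low) / width), n_bins - 1)
    equal_width_counts[index] += 1

# Quantile bins: cut points chosen so each bin holds a similar share.
cut_points = statistics.quantiles(incomes, n=n_bins)
quantile_counts = [0] * n_bins
for value in incomes:
    quantile_counts[bisect.bisect_left(cut_points, value)] += 1

print("equal-width bins:", equal_width_counts)
print("quantile bins:   ", quantile_counts)
```

With these made-up values the equal-width approach puts 11 of the 15 incomes in the lowest bin, while the quantile approach puts 3 in each.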

Democracy Groups

The final variable that I examined in this was a summary measure of the level of democracy.  Using the 21 Polity IV scores is a little unruly.  So I aggregated the data into five categories identified by the Polity IV project authors.  I think I will use this in my full analysis:

Democracy Group Count Percent
Full Democracy 32  20.6
Democracy 57  36.8
Open Anocracy 19  12.3
Closed Anocracy 27  17.4
Autocracy 20  12.9

Most of the countries in the data set are either democracies (37%) or full democracies (21%).  47 of the 155 countries could be considered closed.

I like this measure better because it provides more delineation than the full polity score variable does.  I do find it interesting that under this convention 20 of the 155 countries in the data set are autocracies as opposed to the 2 that I identified earlier.  I find the democracy groups easier to believe.

Democracy and Economic Well-Being

I have recently started the Data Management and Visualization course on Coursera.  As part of this course I will be conducting research and will be writing posts explaining my work.  This post is the first in the series of posts explaining my research project.

My Research Question & Hypothesis

For this research project I will be exploring the relationship between the level of openness of a society and the economic well-being of its citizens. My hypothesis is that countries with a more open society will have a higher level of economic well-being.

Data Used

The data I will be using is a subset of the Gapminder Project’s data. I measure the economic well-being by the 2010 per-capita GDP.  This data has been adjusted to replace purchasing power parity and is in constant 2000 US dollars.  I will measure the level of openness with the 2009 Democracy score (Polity).  This data comes from the Polity IV Project and is a summary measure of a country’s democratic and free nature. It ranges from -10 (an autocracy) to 10 (full democracy).

Literature Review

In Why Nations Fail [1], Daron Acemoglu and James Robinson explain that economic prosperity depends above all on the inclusiveness of economic and political institutions.  This is in contrast to classical economics, which explains the difference in economic well-being using factors like land, labor and capital.  I have previously read this book and hope that my research can confirm or deny their thesis.

I searched Google Scholar for other literature on this topic.  “Democracy and economic growth: A historical perspective” [2] by John Gerring and his co-authors explains that recent studies appear to show that democracy has no robust association with economic growth (economic growth is viewed as the engine that increases economic well-being).  They argue that democracy must be understood as a stock, rather than a level, measure.

Next Steps

I plan on doing the analysis in Python and will be posting my work to GitHub.

Footnotes

1. Acemoglu, Daron, and James A. Robinson. “Why Nations Fail: The Origins of Power, Prosperity, and Poverty.” New York (2012).

2. Gerring, John, et al. “Democracy and economic growth: A historical perspective.” World Politics 57.03 (2005): 323-364.

There Is No Such Thing As The U.S. Economy

While working at the Bureau of Labor Statistics in New York City, I recall overhearing a colleague comment that there is no such thing as the U.S. economy.  Instead, he said, there are a lot of little economies.  That really stuck with me and in this vein I am proud to share what I’ve been up to lately.

I have been thinking about the employment situation in the United States, especially after the Great Recession.  I wanted to get a feel for how many jobs have returned keeping in mind that each metro could have a different story.

gr-employment-index

I used R to create metro-level indexes of employment and then visualized them.  I found the visualization so intriguing that I created a Shiny app.  The app allows you to produce what I lovingly refer to as a moustache graph, as seen above.  You can view the difference between the post-Great Recession employment stories of Midland, TX and Dalton, GA.  When viewing the visualization, I am inclined to think that my colleague at the BLS was right.  There is no such thing as the U.S. economy.

Note: You can access the app at https://mikesilva.shinyapps.io/metro-employment-index/ and read how the index was derived at http://rpubs.com/mikesilva/metro-employment-index.