New York State Public Libraries Circulation Visualization

I have recently been exploring data on the public libraries of New York State for a side project (more on that in a later post, hopefully). I have also started a Data Visualization course on Coursera and have decided to feature some visualizations of this data set.

About the Data

The data used in this analysis come from the Annual Report for Public and Association Libraries produced for the New York State Education Department (NYSED). You can access the data at http://collectconnect.baker-taylor.com/ using “new york” as the username and “pals” as the password. Load the saved list named “All Libraries as of 15 March 2016” and select the “Total Circulation” data element.

Visualization Decisions

For this visualization I decided to use all data from 2000 to 2014 (the latest year available). I summed the library-level circulation data to get the aggregate circulation for New York State public libraries. I used colorblind-safe colors from the ColorBrewer palette. I adjusted the scale on the Y-axis to be in millions. I used R to generate the following visualization:

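[Figure: Data_Visualization_Assignment_1 (bar chart, "Book Circulation in NYS Public Libraries, 2000-2014"; x-axis: Year, y-axis: Book Circulation in millions)]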
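The fill and outline colors used for the bars (`#9ecae1` and `#3182bd`, see the Source Code section) match the two darker shades of ColorBrewer's 3-class "Blues" palette. As a small sketch, assuming the RColorBrewer package is installed (the post itself does not load it), the palette and its colorblind flag can be looked up like this:

```r
library(RColorBrewer)

# The 3-class sequential "Blues" palette from ColorBrewer.
blues <- brewer.pal(3, "Blues")
print(blues)

# brewer.pal.info records which palettes are flagged as colorblind friendly.
is_colorblind_safe <- brewer.pal.info["Blues", "colorblind"]
print(is_colorblind_safe)
```

Looking colors up by palette name rather than pasting hex codes makes it easier to swap in another colorblind-friendly palette later.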

What It Tells Us

Book circulation generally increased until 2010, when the decade-long trend reversed. There is an exceptionally precipitous drop from 2013 to 2014.
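To put a number on a drop like the one from 2013 to 2014, one option is to compute the year-over-year percentage change from the yearly totals. The following is a minimal sketch in base R. The data frame `toy_totals` and its values are invented for illustration and are not the actual New York State figures.

```r
# Hypothetical yearly totals (in millions); NOT the real NYS figures.
toy_totals <- data.frame(
  Year = 2011:2014,
  Circulation = c(100, 98, 96, 84)
)

# Year-over-year percentage change relative to the previous year.
# The first year has no predecessor, so its change is NA.
previous <- head(toy_totals$Circulation, -1)
toy_totals$PctChange <- c(NA, diff(toy_totals$Circulation) / previous * 100)

print(toy_totals)
```

The `book_circulation` data frame built in the Source Code section has the same `Year` and `Circulation` columns, so the same calculation should carry over to the real totals.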

This raises the question: why is circulation changing? Is it because of a change in the population? Is it due to a change in the number of libraries reporting (which might explain the 2013-2014 drop)? Is it due to a rise in digital media sources as a substitute for books? Is it due to a lack of public support for or investment in libraries? I plan to look at that last question in a future post.
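The question about the number of libraries reporting could be checked directly from the data by counting, for each year, how many libraries have a non-missing value. Below is a rough sketch in base R on made-up data. The `toy_wide` data frame, its `Library` column, and all of its values are hypothetical; only the `X<year>` column naming follows the source file.

```r
# Hypothetical wide-format data in the same shape as the source file:
# one row per library, one column per year, NA where nothing was reported.
toy_wide <- data.frame(
  Library = c("A", "B", "C"),
  X2012 = c(10, 20, 30),
  X2013 = c(11, 19, 31),
  X2014 = c(12, NA, NA)
)

# Pick out the year columns (named "X" followed by four digits).
year_cols <- grep("^X[0-9]{4}$", names(toy_wide), value = TRUE)

# Number of libraries with a non-missing value in each year.
reporting <- colSums(!is.na(toy_wide[year_cols]))

# Total circulation per year, ignoring missing values.
totals <- colSums(toy_wide[year_cols], na.rm = TRUE)

summary_by_year <- data.frame(
  Year = as.numeric(substr(year_cols, 2, 5)),
  LibrariesReporting = unname(reporting),
  Circulation = unname(totals)
)
print(summary_by_year)
```

If the count of reporting libraries falls sharply in the same year as the total, the drop may reflect missing reports rather than a real change in circulation.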

Source Code

library(dplyr)
library(tidyr)
library(ggplot2)
library(ggthemes)

# Read the library-level circulation data, reshape it to one row per
# library and year, and total the circulation by year (in millions).
book_circulation <- read.csv('https://goo.gl/fyybwi', na.strings = 'N/A', stringsAsFactors = FALSE) %>%
  gather(Year, measurement, X1991:X2014) %>%
  mutate(Year = as.numeric(substr(Year, 2, 5))) %>%                 # "X2000" -> 2000
  mutate(measurement = as.numeric(gsub(',', '', measurement))) %>%  # strip thousands separators
  filter(Year > 1999) %>%
  filter(!is.na(measurement)) %>%                                   # drop missing values
  group_by(Year) %>%
  summarise(Circulation = sum(measurement)) %>%
  mutate(Circulation = Circulation / 1000000)

ggplot(book_circulation, aes(Year, Circulation)) +
  geom_bar(stat = 'identity', fill = "#9ecae1", colour = "#3182bd") +
  ylab('Book Circulation (in millions)') +
  ggtitle('Book Circulation in NYS Public Libraries, 2000-2014') +
  theme_hc()