Patents by County

I recently had to update some county-level utility patent data. I decided to use R to scrape the U.S. Patent and Trademark Office’s website. I have the file on GitHub.

## Utility Patents by Year
## This script scrapes the PTO's website for their utility patents by year and
## creates an Excel file

## Variable and value column names: 'Year' and 'Utility Patents'
## The path of the output
file.path <- 'H:/Data Warehouse/US Patent and Trademark Office (USPTO)/Utility Patents By Year.xlsx'

## Scrape the web for utility patent grant figures
library(rvest) # Load the package
url <- ''
pto <- url %>%
  read_html() %>%
  html_nodes('table') %>%
  html_table()

pto <- pto[[1]]

## Remove the total column
pto <- pto[!names(pto) %in% c('Total') ]

## Reshape from wide to long
library(reshape2) # Load the package
pto <- melt(pto, id.vars = names(pto)[1:4],
            variable.name = 'Year', value.name = 'Utility Patents')

## Reorder the columns
pto <- pto[, c(names(pto)[1:4], 'Year', 'Utility Patents')]

## Save the PTO data
library(xlsx) # Load the package
write.xlsx(x=pto, file=file.path, sheetName='USPTO_Utility_Patents_By_Year', 
           row.names = FALSE)
# The quicker alternative
#write.csv(pto, 'H:/Data Warehouse/US Patent and Trademark Office (USPTO)/Utility Patents By Year.csv', row.names=FALSE)

Beginning to Assemble Metro Economic Data

Just a quick post. I have begun assembling MSA (Metropolitan Statistical Area) economic data. The data, as well as the scripts used to generate them, are available on GitHub.

My hope is to run some analyses once I have compiled the data. It is my belief that the notion of a national economy is flawed in the case of the U.S.; it is really an aggregation of many smaller economies, each with its own unique characteristics.

The data I have assembled include a measure of worker productivity: the BEA’s real GDP by metro (in chained 2005 dollars) divided by total employment, which gives you real output per worker. There is also a measure of high-tech jobs by metro using the latest Occupational Employment Statistics.
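The productivity measure is just a division; as a sketch in Python, with made-up figures for illustration (not actual BEA or OES numbers):

```python
# Real output per worker = real GDP / total employment.
# All figures below are illustrative placeholders, not actual BEA/OES data.
real_gdp = {'Austin, TX': 86_000, 'Omaha, NE': 48_000}       # millions of chained 2005 dollars
employment = {'Austin, TX': 800_000, 'Omaha, NE': 480_000}   # total employment

productivity = {msa: real_gdp[msa] * 1_000_000 / employment[msa]
                for msa in real_gdp}
```

Each resulting value is real output per worker in chained 2005 dollars.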


Bureau of Economic Analysis API

Following in the footsteps of my previous posts on the FRED API and the BLS and Census Bureau APIs, I want to pass on another data source.

The Bureau of Economic Analysis (or BEA) has an API. They are the people who come up with the GDP estimates (among other things). You can register for an API key and read the documentation (their user guide is quite good). You can get data in JSON and XML formats.
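As a rough sketch, a request URL can be assembled like this in Python. The parameter names follow my reading of the BEA user guide, and the dataset, table, and year values are only illustrative:

```python
from urllib.parse import urlencode

# Assemble a BEA API request URL. The parameter names come from the BEA
# user guide; the dataset/table/year values here are only illustrative.
base = 'https://apps.bea.gov/api/data/'
params = {
    'UserID': 'YOUR-API-KEY',   # from registration
    'method': 'GetData',
    'datasetname': 'NIPA',      # National Income and Product Accounts
    'TableName': 'T10101',
    'Frequency': 'Q',
    'Year': '2013',
    'ResultFormat': 'JSON',     # or 'XML'
}
url = base + '?' + urlencode(params)
```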

Once I become more proficient with Python, I plan to write a wrapper for this API. I will announce here when that project is complete.


More Economic Data APIs

I previously posted about the FRED API, which got me thinking that there are other APIs people might not be aware of. I would like to pass them on.

The U.S. Census Bureau has an API that gives you access to economic indicators as well as demographic data (decennial census and American Community Survey data). I just requested my API key. Sunlight Labs has programmed a Python wrapper, but it looks to be a bit outdated.
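A Census request is just a URL with a variable list, a geography, and your key. A sketch (the ACS endpoint and the variable code are my best reading of the Census API documentation, so treat them as assumptions):

```python
from urllib.parse import urlencode

# Build a Census Bureau API request for ACS 5-year data.
# The endpoint and the B01003_001E (total population) variable code are
# assumptions based on the Census API documentation.
base = 'https://api.census.gov/data/2012/acs5'
params = {
    'get': 'NAME,B01003_001E',  # geography name and total population
    'for': 'state:*',           # every state
    'key': 'YOUR-API-KEY',
}
url = base + '?' + urlencode(params)
```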

The U.S. Bureau of Labor Statistics also has an API. I have used it to get unemployment rates for some widgets I’ve programmed (example). I’m not sure whether there are series available through the BLS that are not available through FRED. There is also a third-party Python wrapper.
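The BLS API takes a JSON payload POSTed to its timeseries endpoint; here is a sketch. The payload field names follow the BLS documentation as I understand it, and LNS14000000 is the seasonally adjusted national unemployment rate series:

```python
import json
from urllib import request

# Sketch of a BLS API (v1) request: a JSON payload POSTed to the
# timeseries endpoint. LNS14000000 is the seasonally adjusted national
# unemployment rate; field names follow the BLS documentation.
payload = json.dumps({
    'seriesid': ['LNS14000000'],
    'startyear': '2012',
    'endyear': '2013',
})
req = request.Request(
    'https://api.bls.gov/publicAPI/v1/timeseries/data/',
    data=payload.encode('utf-8'),
    headers={'Content-Type': 'application/json'},
)
# response = request.urlopen(req)  # uncomment to actually send the request
```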

How to Install the FRED API Python Toolkit on a 32 Bit Windows 7 System

I mentioned in my last post that the St. Louis Fed’s FRED system has an API and that there is a Python wrapper. I just installed it on our 32-bit Windows 7 computer, and here are the steps I followed:

  1. Download and install Python – I am running the 3.3 version of Python.
  2. Download and install Setuptools – I chose setuptools‑2.2.win32‑py3.3.exe since it matches my system.
  3. Open a command prompt – There are lots of ways to do this. I click the Start button, type “command” in the “Search programs and files” field, and hit Enter.
  4. Install fred using easy_install – I typed “C:\Python33\Scripts\easy_install.exe fred” and voilà! The Python FRED Toolkit (or wrapper) is on my machine.

You will need an API key to access the service. I have only been successful getting XML responses with the wrapper; based on my research, it looks like a Python 3 issue.
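For what it’s worth, the XML responses are easy to work with directly. A sketch of parsing a FRED-style observations response, where the sample XML is illustrative and modeled on the format the API documentation describes:

```python
import xml.etree.ElementTree as ET

# Parse a FRED-style XML observations response. The sample below is
# illustrative, modeled on the documented response format.
sample = '''<observations>
  <observation date="2013-01-01" value="7.9"/>
  <observation date="2013-02-01" value="7.7"/>
</observations>'''

root = ET.fromstring(sample)
observations = [(o.get('date'), float(o.get('value')))
                for o in root.findall('observation')]
```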