My First ŷhat Model

As part of the Developing Data Products Coursera course I was introduced to ŷhat. ŷhat is a great product that allows developers to create models in R or Python and then publish them to its platform. You can then send parameters to the hosted model and get a prediction back.

My model (available on GitHub, of course) is a k-nearest counties model. It is loosely based on the idea of k-nearest neighbors; however, the only dimensions it compares on are latitude and longitude. You provide the model with a latitude, a longitude, and the number of counties you want it to return (k), and the model gives you the name, FIPS code, and distance of each of the k counties nearest to your point.
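To make the idea concrete, here is a minimal sketch of a k-nearest counties lookup in Python. It is not the code from my published model: the function names, the tiny lookup table of approximate county centroids, and the choice of haversine (great-circle) distance in miles are all assumptions made for illustration. A real version would load a full table of county centroids and FIPS codes.

```python
import math

# Illustrative sample only: a handful of counties with APPROXIMATE centroids.
# Tuple layout (assumed for this sketch): (name, FIPS code, latitude, longitude).
COUNTIES = [
    ("Los Angeles County, CA", "06037", 34.32, -118.22),
    ("Cook County, IL", "17031", 41.84, -87.82),
    ("Harris County, TX", "48201", 29.86, -95.39),
    ("New York County, NY", "36061", 40.78, -73.97),
    ("King County, WA", "53033", 47.49, -121.83),
]


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    earth_radius_miles = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_miles * math.asin(math.sqrt(a))


def k_nearest_counties(lat, lon, k=1, counties=COUNTIES):
    """Return the k counties whose centroids are closest to (lat, lon)."""
    # Compute the distance to every centroid, then sort nearest first.
    ranked = sorted(
        (haversine_miles(lat, lon, c_lat, c_lon), name, fips)
        for name, fips, c_lat, c_lon in counties
    )
    return [
        {"county": name, "fips": fips, "distance_miles": dist}
        for dist, name, fips in ranked[:k]
    ]


if __name__ == "__main__":
    # A point in downtown Chicago; ask for the two nearest counties.
    for row in k_nearest_counties(41.88, -87.63, k=2):
        print(row)
```

Note that comparing against centroids only tells you which county center is closest. A point near a county border can be closer to a neighboring county's centroid than to its own, so this is an approximation rather than a true point-in-polygon lookup.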

I could see myself using this when developing visualizations where I have a series of points and want to know which county each point falls in. I could also see using it to support geocoded information. I recently had to aggregate geocoded information into metropolitan areas. I used Google's geocoding API to try to tease out the county name for each point, but I didn't have the FIPS code I needed to aggregate to the MSA level. ŷhat is a great product that I recommend to anyone looking for a simple yet effective way to make awesome data products.

