Thoughts from Big Data Economic Papers

I have started reading The Data Revolution and Economic Analysis, a paper written by two Stanford professors.  It is a great paper and I highly recommend it.

One passage spurred some thoughts for me.  It reads:

In health care, it is now common for insurers to adjust payments and quality measures based on “risk scores”, which are derived from predictive models of individual health costs and outcomes. An individual’s risk score is typically a weighted sum of health indicators that identify whether an individual has different chronic conditions, with the weights chosen based on a statistical analysis.

I thought of computing macroeconomic risk scores for, say, the risk of entering a recession.  Economic indicators would serve as the risk factors.

I also thought of Varian’s paper that gave a methodology for determining the importance of a variable for inclusion in a model.  That process could inform the weighting decision.

I would love to write a Python program that uses the FRED API to pull the data and compute the risk score.
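As a starting point, here is a minimal sketch of that idea, mirroring the paper's description of a risk score as a weighted sum of indicators.  The indicator names, flag values, and weights below are all hypothetical placeholders; in a real version the series would be pulled from FRED (for instance with the fredapi package) and the weights would come from a statistical analysis of historical recessions.

```python
# Hypothetical sketch: a recession risk score as a weighted sum of
# binary indicator flags, analogous to the chronic-condition dummies
# in the health-care risk scores described in the paper.

def recession_risk_score(indicators, weights):
    """Return the weighted sum of the given risk indicators."""
    return sum(weights[name] * value for name, value in indicators.items())

# Hypothetical indicator flags (1 = condition present).  In practice these
# would be derived from FRED series rather than hard-coded.
indicators = {
    "yield_curve_inverted": 1,          # e.g. 10yr-3mo spread below zero
    "unemployment_rising": 0,           # e.g. a Sahm-rule-style trigger
    "industrial_production_falling": 1,
}

# Hypothetical weights; in the paper's setting these are estimated
# statistically, not chosen by hand.
weights = {
    "yield_curve_inverted": 0.5,
    "unemployment_rising": 0.25,
    "industrial_production_falling": 0.25,
}

score = recession_risk_score(indicators, weights)
print(score)  # 0.75
```

The structure keeps the data-fetching and the scoring separate, so swapping the placeholder flags for real FRED-derived indicators would not change the scoring function.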


I Am Excited to Learn R

I continued reading Varian’s paper tonight.  He mentions using R to build predictive models, including rpart (short for recursive partitioning and regression trees) to grow a tree.  My only experience with R was using it to compute Gini coefficients on Census 2000 data (I have issues with people’s interpretations of a Gini coefficient, but that is a subject for another time).  I am excited to hit DataCamp to learn R once I wrap up the Python course on Codecademy.
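To get a feel for what rpart does under the hood, here is a toy sketch of the recursive partitioning idea in plain Python: greedily pick the split threshold on a single feature that minimizes the squared error of the two leaf means.  The data and the single-split ("stump") setup are made up for illustration; rpart itself handles many features, recursion, and pruning.

```python
# Toy illustration of one step of recursive partitioning: find the
# threshold on x that minimizes total squared error when each side of
# the split predicts its own mean.

def fit_stump(x, y):
    """Return (sse, threshold, left_mean, right_mean) for the best split."""
    best = None
    for t in sorted(set(x))[1:]:  # candidate thresholds between data points
        left = [yi for xi, yi in zip(x, y) if xi < t]
        right = [yi for xi, yi in zip(x, y) if xi >= t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((v - left_mean) ** 2 for v in left)
               + sum((v - right_mean) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best

# Made-up data with two clear clusters, so the best split is obvious.
x = [1, 2, 3, 10, 11, 12]
y = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9]

sse, threshold, left_mean, right_mean = fit_stump(x, y)
print(threshold)  # 10 -- splits the low cluster from the high cluster
```

A full tree algorithm simply applies this search recursively to each resulting leaf until a stopping rule kicks in, which is exactly the "recursive" in recursive partitioning.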

I am also impressed by Varian’s practice of making his “source code” available.  It is a great practice, and there needs to be more of it.  I will be following his lead and making mine available on this blog (I would share my Gini coefficient code, but it is lost to time).