Predicting Titanic Survivors

Udacity's Intro to Data Science course has a project that involves predicting whether a passenger on the Titanic survived. To complete the task successfully, your predictions need an accuracy higher than 80%.

The following is the heuristic that I programmed. I can’t take credit for it, as I got my inspiration from Hal Varian’s paper. The heuristic has an 80.47% accuracy rate.

# Assume they aren't a survivor by default
survivor = False
# Prediction model variables
passenger_id = passenger["PassengerId"]
sex = passenger["Sex"]
pclass = passenger["Pclass"]
age = passenger["Age"]
sibsp = passenger["SibSp"]
# Let's find the Survivors
if sex == "female" and pclass <= 2:
    survivor = True
elif sex == "male" and pclass > 1 and age <= 9 and sibsp <= 2:
    survivor = True
# Set the prediction for the passenger
if survivor:
    predictions[passenger_id] = 1
else:
    predictions[passenger_id] = 0
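
The snippet above is only the body of a loop over passengers, so it won't run on its own. Here is a rough, self-contained sketch of the same rules wrapped in a function, plus a small accuracy check. The names predict_survivor and accuracy are my own for illustration, the four sample rows are made up (they are not from the real data set), and treating a missing age as "not a child" is an assumption on my part.

```python
import math


def predict_survivor(passenger):
    """Return 1 if the heuristic predicts survival, otherwise 0."""
    sex = passenger["Sex"]
    pclass = passenger["Pclass"]
    age = passenger["Age"]
    sibsp = passenger["SibSp"]

    # Women travelling in first or second class
    if sex == "female" and pclass <= 2:
        return 1

    # Assumption: a missing age (None or NaN) never passes the age test
    age_known = age is not None and not math.isnan(age)

    # Young boys outside first class with at most two siblings/spouses aboard
    if sex == "male" and pclass > 1 and age_known and age <= 9 and sibsp <= 2:
        return 1

    return 0


def accuracy(passengers):
    """Fraction of passengers whose prediction matches the 'Survived' column."""
    correct = sum(predict_survivor(p) == p["Survived"] for p in passengers)
    return correct / len(passengers)


# Made-up rows for illustration only
sample = [
    {"PassengerId": 1, "Sex": "female", "Pclass": 1, "Age": 38.0, "SibSp": 1, "Survived": 1},
    {"PassengerId": 2, "Sex": "male", "Pclass": 3, "Age": 22.0, "SibSp": 1, "Survived": 0},
    {"PassengerId": 3, "Sex": "male", "Pclass": 2, "Age": 4.0, "SibSp": 1, "Survived": 1},
    {"PassengerId": 4, "Sex": "female", "Pclass": 3, "Age": None, "SibSp": 0, "Survived": 1},
]

predictions = {p["PassengerId"]: predict_survivor(p) for p in sample}
print(predictions)
print(accuracy(sample))
```

On these four rows the heuristic misses passenger 4 (a woman in third class), which shows the kind of case the rules give up on.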


4 thoughts on “Predicting Titanic Survivors”

    • Mike Silva says:

      He explained in his paper: “One might summarize this tree by the following principle: ‘women and children first . . . particularly if they were traveling first class.’”

      I just had to translate that into Python. The decision tree gave the parameters so it wasn’t hard to do. Best of luck with the course. I know I enjoyed it!
