The Costanza Occupation Prediction Engine (C.O.P.E.)

This data science project references a classic Seinfeld episode and demonstrates web-scraping, natural language processing and machine learning techniques with Python, Jupyter Notebooks and an API.

The algorithm classifies Reddit headlines as coming from either the architecture or marine biology Subreddits - George Costanza’s favorite “pretend” occupations.

The model accuracy improved from 0.51 to 0.89 (a good result), though Larry David (fittingly) refers to it as a data science project about nothing…

Previous
Previous

GreenScan

Next
Next

Behind the Mask