This project was developed by two students for the course "From Design To Software 1". The goal was to retrieve the lyrics for as many songs in the Million Song Dataset as possible and to determine, based on their content, the emotion each song conveys.

The Million Song Dataset is a freely available collection of audio features and metadata for a million contemporary popular music tracks. Using Java, Python and open-source software, the lyrics were fetched, stripped of special characters, formatted and saved. The retrieved lyrics were then assigned to one of four basic emotions (angry, sad, love or happy) by testing their content against four dictionaries, one for each emotion.
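
The cleaning and categorization steps described above can be illustrated with a minimal Python sketch. It is not the project's actual code: the function names `clean_lyrics` and `classify`, the tiny word lists in `EMOTION_WORDS`, and the rule "pick the emotion with the most matching words" are all assumptions made for illustration. The real dictionaries and scoring rule may differ.

```python
import re

# Hypothetical miniature dictionaries, one per emotion.
# The project's real word lists are not reproduced here.
EMOTION_WORDS = {
    "angry": {"hate", "rage", "fight", "angry"},
    "sad": {"cry", "tears", "lonely", "sad"},
    "love": {"love", "kiss", "heart", "darling"},
    "happy": {"happy", "smile", "dance", "sunshine"},
}


def clean_lyrics(raw):
    """Lowercase the text and replace special characters with spaces."""
    # Keep only letters, apostrophes and whitespace.
    text = re.sub(r"[^a-z'\s]", " ", raw.lower())
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def classify(raw):
    """Return the emotion whose dictionary matches the most words, or None."""
    words = clean_lyrics(raw).split()
    # Count, per emotion, how many words of the lyrics appear in its dictionary.
    scores = {
        emotion: sum(1 for word in words if word in vocab)
        for emotion, vocab in EMOTION_WORDS.items()
    }
    # Assumed rule: highest count wins; ties go to the first emotion listed.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

With these assumed word lists, `classify("I cry, tears on my lonely heart")` returns `"sad"`, because three words match the sad dictionary and only one matches the love dictionary. Lyrics without any dictionary word yield `None`.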