The Big Lebowski Sentiment Analysis

#dataviz


Like a lot of folks, I count The Big Lebowski among my favourite movies. The dialogue kills me: in spite of everything that comes his way, The Dude remains The Dude. So I figured the script would make an interesting dataset for getting some experience with sentiment analysis.

To begin, I found a copy of the script that seemed right (enough). After downloading the script, I split each character's dialogue into separate sentences. Then, for each sentence I used the VADER sentiment analyzer, which scores the content of the provided text on a scale from -1 to 1 (most negative to most positive).
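For anyone curious what that step might look like, here is a rough sketch, not the exact code behind this post. It assumes the script has already been parsed into (character, dialogue) pairs, splits sentences with a naive regex, and takes the scoring function as a parameter. The helper names (`split_sentences`, `make_vader_scorer`, `score_dialogue`) are made up for illustration. The VADER call assumes the `vaderSentiment` package and uses its `compound` value, which is the one that ranges from -1 to 1.

```python
import re

# Naive splitter: break after ., ! or ? when followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(dialogue):
    """Split one block of dialogue into sentences (rough heuristic)."""
    parts = _SENTENCE_END.split(dialogue.strip())
    return [p for p in parts if p]


def make_vader_scorer():
    """Return a function mapping text to VADER's compound score in [-1, 1]."""
    # Imported lazily so the rest of the sketch runs without the package.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]


def score_dialogue(lines, scorer):
    """Turn (character, dialogue) pairs into one scored row per sentence."""
    rows = []
    for character, dialogue in lines:
        for sentence in split_sentences(dialogue):
            rows.append(
                {
                    "character": character,
                    "sentence": sentence,
                    "score": scorer(sentence),
                }
            )
    return rows
```

With the package installed, `score_dialogue(pairs, make_vader_scorer())` would give one row per sentence. A regex splitter will trip over abbreviations and ellipses, so a proper sentence tokenizer is probably a better choice for a real script.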

Below I've plotted each character's mean score with the marker size corresponding to the number of sentences spoken.
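The aggregation behind a plot like this is simple. Here is a minimal sketch, assuming one row per sentence with `character` and `score` keys (my own naming, not necessarily what the real code uses). The `n_sentences` count is the value that would drive the marker size.

```python
from collections import defaultdict
from statistics import mean


def summarize(rows):
    """Group sentence scores by character; return mean score and sentence count."""
    by_character = defaultdict(list)
    for row in rows:
        by_character[row["character"]].append(row["score"])
    return {
        character: {"mean_score": mean(scores), "n_sentences": len(scores)}
        for character, scores in by_character.items()
    }
```

Each character then becomes one marker: `mean_score` sets its position and `n_sentences` (scaled however looks good) sets its size.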

Interactive visualizations are best rendered on desktop.

Unsurprisingly, Jesus Quintana is the most negative and, on the flip side, Brandt is the most positive. Meanwhile, Donny, Walter, and The Dude all scored just below 0, which got me thinking...

Does their mean score approach 0 (neutral) because they are the most well-rounded characters? Sentiment diversity should contribute to a character's wholeness, right? Even the most depressed among us can crack a joke, after all. So, could this kind of sentiment analysis be used to assess the depth of a character? How would that vary across entertainment genres, mediums, or characters' genders?
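One way to start poking at that question is to look at the spread of a character's scores as well as the mean, since a mean near 0 can come from uniformly neutral lines or from a mix of very positive and very negative ones. The sketch below uses the population standard deviation as a stand-in for "sentiment diversity". That choice is my assumption, not an established measure of character depth, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def sentiment_spread(scores):
    """Return (mean, population standard deviation) for a list of scores."""
    return mean(scores), pstdev(scores)


# Two made-up characters whose scores both average out to 0.
flat = [0.0, 0.0, 0.0, 0.0]       # every line neutral
swingy = [-0.8, 0.8, -0.6, 0.6]   # strong highs and lows

flat_mean, flat_spread = sentiment_spread(flat)
swingy_mean, swingy_spread = sentiment_spread(swingy)
```

Both characters would land in the same spot on a mean-score plot, but only the second has any spread to speak of.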

That's all beyond the scope of this little foray, but the topic is extremely interesting. Below is a plot showing the score of each line spoken by some of the characters in the movie (the markers are randomly scattered vertically for added depth). Hopefully you can find your favourite line.
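The random vertical scatter is usually called jitter. Here is a minimal sketch of the idea, assuming each score goes on the x axis and gets a random y offset. The `spread` and `seed` parameters are hypothetical, and the real plot may do this differently.

```python
import random


def jitter_points(scores, spread=0.5, seed=0):
    """Pair each score (x) with a random vertical offset (y) in [-spread, spread]."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    return [(score, rng.uniform(-spread, spread)) for score in scores]
```

The y value carries no meaning here. It only keeps markers with similar scores from stacking on top of each other.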


Notes

Check out the code on GitHub and feel free to reach out to me on social media.