South Park is one of TV’s longest-running scripted primetime shows. It has won Emmys, sparked controversy, earned lots of money and given me some of my best laughs. The fact that it has survived so long is remarkable. Fans might suspect that the show has made changes that have let it survive. It is impossible to say what actually caused the show to survive. However, it is possible to analyze what has changed, which is what the rest of this post does.
My only goals were to explore a show I love and to play with some text and visualization libraries. I had a good time so I decided to share what I did. This is meant to be a fun read for any South Park fan. Data nerds can check out the gory details on GitHub.
A Change in Jokes
We can start exploring the evolution of South Park by analyzing jokes. Over time, some of South Park’s best-known jokes have stopped appearing as often. The number of times someone says “Killed Kenny” has declined dramatically since the show’s early years:
Actually, in later episodes, the writers will sometimes dance around these signature jokes. They tease the audience by putting Kenny in dangerous situations, but don’t kill him.
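For data nerds, here is a minimal sketch of how a phrase count like the “Killed Kenny” one could be computed per season. It is not the code behind the charts. The sample rows and the function name `phrase_counts_by_season` are made up for illustration, and the real input would be the episode scripts.

```python
from collections import Counter

# Hypothetical sample rows: (season, character, line).
lines = [
    (1, "Stan", "Oh my God, they killed Kenny!"),
    (1, "Kyle", "You bastards!"),
    (1, "Stan", "They killed Kenny again."),
    (10, "Stan", "Dude, this is pretty messed up."),
]

def phrase_counts_by_season(rows, phrase):
    """Count the lines in each season that contain `phrase` (case-insensitive)."""
    counts = Counter()
    needle = phrase.lower()
    for season, _character, text in rows:
        if needle in text.lower():
            counts[season] += 1
    return counts

kenny = phrase_counts_by_season(lines, "killed Kenny")
print(kenny[1], kenny[10])
```

The same function would work for “learned something” or “grounded” by changing the phrase. Seasons with no match simply count as zero.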
Similarly, all characters have also stopped saying “learned something” as frequently. This was a recurring joke, often said by Kyle. However, it has faded as the show has progressed:
What has replaced killing Kenny and Kyle learning lessons? One new joke is Butters getting grounded by his parents:
Changes in Character Roles
Jokes are not the only part of the show that has changed over time. The characters and their roles have shifted. To explore this, I assigned a value of -1 to lines a character said in the first half of the series (seasons 1-9) and a value of +1 to lines they said in the second half (seasons 10-18). If a character has an equal number of lines in both halves of the show, this should sum to 0*. If it’s negative then they had more lines in the first half than the second and vice versa.
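Here is a rough sketch of that scoring idea. The weighting is an assumption on my part: each second-half line is scaled by the ratio of first-half to second-half totals, so a character whose share of the dialogue stays constant scores 0. The function name `half_scores` and the toy rows are hypothetical.

```python
from collections import Counter

def half_scores(rows, split_season=9):
    """rows: iterable of (season, character). Returns {character: score}.

    First-half lines count -1. Second-half lines count +w, where
    w = (first-half total) / (second-half total). This weighting scheme is
    an assumption, not necessarily the one used for the charts.
    """
    first, second = Counter(), Counter()
    for season, character in rows:
        (first if season <= split_season else second)[character] += 1
    total_first, total_second = sum(first.values()), sum(second.values())
    w = total_first / total_second if total_second else 0.0
    return {c: w * second[c] - first[c] for c in set(first) | set(second)}

# Toy data: 6 first-half lines and 3 second-half lines, so w = 2.
rows = [(1, "Kyle")] * 4 + [(2, "Randy")] * 2 + [(12, "Kyle")] + [(15, "Randy")] * 2
scores = half_scores(rows)
print(scores["Kyle"], scores["Randy"])
```

In the toy data Kyle comes out negative (more of his lines are in the first half) and Randy comes out positive, which is the direction the real scores point as well.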
Here are scores for some of the most important characters:
This doesn’t explain character progression across seasons. However, it is useful in determining which characters are worth exploring further. Stan, Kyle, Chef, Mr. Garrison and Jimbo have all declined substantially. In the second half of the series, Randy and Butters became more featured characters.
Kyle gave up lines. The decline in lines for Kyle also meant a decline in lines for his parents (Sheila and Gerald). Here is the whole family:**
Butters became a much larger character. Stephen is Butters’ dad and that meant more lines for him. Linda, Butters’ mom, had a brief spike in some of the earlier seasons, but faded as the show progressed. Even though Butters is a more important character, his importance fluctuates season to season. See here:**
Jimbo is featured much less often. Chef was killed off in the 10th season after the actor objected to the Scientology episode. Here is their progress across seasons:
Mr. Garrison became Mrs. Garrison in the 9th season. After a few seasons, Mrs. Garrison changed back to Mr. Garrison. You can see both of them on this plot:
As any fan of the show knows, Randy has become more important. In the first season, Randy accounts for less than 1% of the dialogue. By the 18th season, he is speaking almost as much as his son. While Stan remains an important character, he talks less. Randy’s wife, Sharon, fluctuates throughout the series:**
Changes in Lead Characters
Who is the lead character for an episode’s main plot and how has that changed over time? We can look at which characters had the most words for an episode. Excluding the narrator, only 12 characters had the most words of an episode more than twice. Here they are:
As you can see, Cartman is the main character the most often by a wide margin. However, other characters have changed roles over time. This chart shows the changes in episode leadership from the first half of episodes (1 – 128) to the second (129 – 256).
Originally, the show was carried by Cartman, Stan and Kyle. Mr. / Mrs. Garrison and Chef played large supporting roles and even led some episodes. Randy and Jimmy were minor characters that occasionally got an episode. Butters began the show as a background character. He led his first episode in Two Guys Naked in a Hot Tub (episode #39). He started becoming a major character in the fifth and sixth seasons when he temporarily replaced Kenny. He faded a bit in seasons 9 – 11 and has since come back strong. And while he has become a major character, he isn’t the lead character of the main plot as often as I expected.
In the second part of the show, Cartman takes up even more episodes. South Park is clearly Cartman’s show. Even so, it’s diverse with different characters leading episodes. The positioning of other characters has changed.
Stan and Kyle lead less often though still remain core characters. Chef was killed off. Mr. / Mrs. Garrison is still a major supporting character, but leads fewer episodes now. Jimmy continues to be a supporting character who occasionally leads an episode. Randy has gone from a minor character to one of the show’s most important. In fact, only Cartman has led more episodes than Randy in the show’s second half.
Caveat: the show often has a more complicated structure than just a single lead. There can be an additional subplot or two. However, it’s hard to disentangle the 2nd most important character from the core plot and the lead character of the subplot.
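The “most words in an episode” rule used in this section can be sketched as follows. This is only an illustration: the rows are invented, the function name `episode_leads` is hypothetical, and the speaker label "Narrator" is an assumption about how the scripts tag narration.

```python
from collections import Counter, defaultdict

def episode_leads(rows, exclude=("Narrator",)):
    """rows: iterable of (episode, character, text). Returns {episode: lead}.

    The lead is the character with the most words in the episode. Ties are
    broken by whoever appears first in the data.
    """
    words = defaultdict(Counter)
    for episode, character, text in rows:
        if character not in exclude:
            words[episode][character] += len(text.split())
    return {ep: counts.most_common(1)[0][0] for ep, counts in words.items()}

# Hypothetical sample rows.
rows = [
    (1, "Cartman", "I want my cheesy poofs right now"),
    (1, "Stan", "Dude"),
    (1, "Narrator", "And so the boys went home that very long night indeed"),
    (2, "Randy", "I thought this was America"),
    (2, "Cartman", "Screw you guys"),
]
leads = episode_leads(rows)
lead_counts = Counter(leads.values())  # episodes led per character
print(leads, lead_counts)
```

As the caveat says, this picks a single lead per episode and cannot tell a subplot lead from the second most important character in the main plot.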
Interlude — Fun With Words
Let’s take a break for some fun. Look at some of the words and phrases that can best predict when a specific character is speaking:
If you are a fan of South Park, those probably aren’t very surprising. However, some are quite funny, such as Cartman saying “seriously” or “ey”. You can see a lot of characters’ signature phrases such as Kenny saying “woohoo” or Towelie saying “high”. You can also see specific jokes from memorable episodes such as Kanye saying “fish”. It’s also funny to see how many of Butters’ words are non-committal sounds such as “uh”, “wuh” and “ih”.
An interesting takeaway: a character saying “Eric Cartman” is predictive that Cartman is talking. His alter ego, Coon, also likes saying his own name. For most other characters, saying their name is not nearly as predictive that the character is talking.
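One simple way to get a feel for “predictive” words is to ask what fraction of a word’s occurrences belong to each character. The sketch below does that. It is a stand-in technique chosen for illustration, not necessarily the method behind the chart above, and the rows, the `min_count` threshold and the function name `word_shares` are all assumptions.

```python
import re
from collections import Counter, defaultdict

def word_shares(rows, min_count=2):
    """rows: iterable of (character, text).

    Returns {character: [(word, share), ...]} where share is the fraction of
    a word's occurrences spoken by that character, highest share first.
    """
    by_word = defaultdict(Counter)
    for character, text in rows:
        for word in re.findall(r"[a-z']+", text.lower()):
            by_word[word][character] += 1
    ranked = defaultdict(list)
    for word, speakers in by_word.items():
        total = sum(speakers.values())
        if total < min_count:
            continue  # too rare to say anything about
        for character, n in speakers.items():
            ranked[character].append((word, n / total))
    for character in ranked:
        # Sort by share (descending), then alphabetically for stable output.
        ranked[character].sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(ranked)

# Hypothetical sample rows.
rows = [
    ("Cartman", "Seriously you guys, seriously"),
    ("Cartman", "Screw you guys"),
    ("Kyle", "Seriously dude"),
    ("Kyle", "Dude, you know what"),
]
top = word_shares(rows)
print(top["Cartman"][:3])
```

A word that only one character ever says gets a share of 1.0, which is why a rarity threshold like `min_count` matters on real scripts.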
Changes in Show Structure
The show’s structure has changed. The seasons now have fewer episodes:
The spread of the dialogue has changed as well. The lead character’s median share of the dialogue each season started at 20% (the blue line). It has since risen to 25%, a gain of 5 percentage points, or a 25% relative increase. Who is speaking less?
It’s not the number 2 most talkative character of the show (the red line). That has held fairly steady at 15%. However, all of the other characters besides the two most talkative are speaking less (the yellow line):
Each episode is more concentrated around the lead character than it used to be.
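A sketch of how the lead character’s median share per season could be computed, along with the relative-change arithmetic. The word counts are invented and the function name `median_lead_share` is hypothetical.

```python
from collections import Counter, defaultdict
from statistics import median

def median_lead_share(rows):
    """rows: iterable of (season, episode, character, n_words).

    Returns {season: median over episodes of the top speaker's word share}.
    """
    per_episode = defaultdict(Counter)
    for season, episode, character, n_words in rows:
        per_episode[(season, episode)][character] += n_words
    shares = defaultdict(list)
    for (season, _episode), counts in per_episode.items():
        shares[season].append(max(counts.values()) / sum(counts.values()))
    return {season: median(values) for season, values in shares.items()}

# Hypothetical word counts for one episode in each of two seasons.
rows = [
    (1, 1, "Cartman", 40), (1, 1, "Stan", 30), (1, 1, "Kyle", 30),
    (18, 1, "Cartman", 50), (18, 1, "Stan", 30), (18, 1, "Kyle", 20),
]
share = median_lead_share(rows)
point_change = share[18] - share[1]        # change in share, as a fraction
relative_change = point_change / share[1]  # change relative to the start
print(share, point_change, relative_change)
```

Keeping the two kinds of change apart avoids confusion: going from 20% to 25% is 5 percentage points, and (25 - 20) / 20 = 25% in relative terms.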
South Park rarely had multi-part episodes at the show’s start. In the first 9 seasons, there were only 7 multi-part episodes. In seasons 10 – 18, there were 21 multi-part episodes. Since there were fewer episodes in later seasons, this is an extreme proportional difference. Look at this:
You might wonder how fans feel about different types of episodes. I scraped IMDB using import.io to get some fan ratings. Here are the average ratings for the show when each character is the lead:
Clearly, fans love Cartman. Butters is almost tied with Cartman and Randy is tied with Stan. This is impressive considering that Butters and Randy started as background characters. They now lead episodes and get comparable ratings to the show’s most important characters. Even so, Cartman is the show’s all star. He carries the most episodes and has the highest ratings. Kyle is the least popular of the boys. In fact, Matt and Trey almost killed him off in the fifth season. Instead they just shrank his role. That seems to fit audience preferences.
How do fans feel about multi-part episodes? Here is a matrix of the average ratings. The rows split the show into seasons 1 – 9 and 10 – 18. The columns compare single-part and multi-part average episode ratings.
As you can see, the multi-part episodes have been substantially better rated than the single-part episodes since season 10 started. This trend towards more multi-part episodes is well received by audiences.
Note: Since the ratings came from a user submitted internet poll, take them with a grain of salt.
Changes in Speech Patterns
How has the dialogue changed over time? Well there are actually fewer lines per episode:
To make up for that, there are more words per line:
Some Things Never Change…
As fans might suspect, South Park often gets involved in politics. In the graph below, the blue line shows that the show makes more political references during election years. The red line is the number of times a character says “president”. In between election years, there is a rise in jokes about the incumbent. Making fun of politics and politicians will always be a South Park signature.
A Bit More Fun
Here are some statistics for different characters:
Here are a couple of fun facts:
- Cartman has the most lines. He is followed by Stan and Kyle.
- Kanye is tied with Towelie for the lowest average word length. They both use a lot of small words when they talk.
- Kenny has the lowest average words per line by a wide margin. He doesn’t talk for long when he does say something.
- A high percentage of self-referential words is claimed*** to be a sign of narcissism. Towelie scores the highest. Kanye West scores almost as high as Satan. Unsurprisingly, Cartman is towards the upper end of this list. Surprisingly, so is his mom, Liane. The Announcer has the lowest percentage of self-referential words. In the world of South Park, Jesus falls in the middle of the pack for self-referential words. He is barely above a would-be vigilante (Coon) who masterminded a threat against a hospital to try to raise his public profile.
- Cartman has the most monologues and the most final lines of an episode. He is followed by Kyle, Stan and Butters in both categories.
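Per-character statistics like the ones in this list could be computed along these lines. This is a sketch under stated assumptions: the set `SELF_WORDS` is my guess at what counts as self-referential, the tokenizer is deliberately crude, and the rows and the function name `character_stats` are made up.

```python
import re
from collections import defaultdict

# Assumed list of self-referential words. Contractions such as "i'm" are
# kept as single tokens by the regex below and are NOT counted here.
SELF_WORDS = {"i", "me", "my", "mine", "myself"}

def character_stats(rows):
    """rows: iterable of (character, text). Returns {character: stats dict}."""
    words_by_char = defaultdict(list)
    line_counts = defaultdict(int)
    for character, text in rows:
        line_counts[character] += 1
        words_by_char[character].extend(re.findall(r"[a-z']+", text.lower()))
    stats = {}
    for character, words in words_by_char.items():
        n = len(words)
        stats[character] = {
            "lines": line_counts[character],
            "words_per_line": n / line_counts[character],
            "avg_word_length": sum(map(len, words)) / n if n else 0.0,
            "self_ref_pct": 100 * sum(w in SELF_WORDS for w in words) / n if n else 0.0,
        }
    return stats

# Hypothetical sample rows.
rows = [
    ("Cartman", "I do what I want"),
    ("Cartman", "Respect my authority"),
    ("Kenny", "Woohoo"),
]
stats = character_stats(rows)
print(stats["Cartman"], stats["Kenny"])
```

With a table like this, “who has the most lines” or “who has the lowest words per line” is just a sort on the relevant column.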
South Park has changed over time. Kyle and Stan remain core characters, but carry fewer episodes. Kyle episodes got lower ratings on IMDB than Stan and Cartman episodes. Popular characters like Randy and Butters were given more lines. Randy has led many more episodes. The change in jokes mirrors these changes in characters. Signature jokes of Kyle (learned something) have faded to make room for new signature jokes such as Butters getting grounded.
Episodes tend to have fewer, but longer lines. The writers give the lead character a larger percentage of the dialogue. The writers are more confident in concentrating the dialogue around fewer characters and the words around fewer lines. The show has fewer episodes each season. They also have more multi-part episodes. In the 19th season (not included in this analysis) almost every episode is connected.
For all of the changes, one constant is Cartman. The show has given him more lines and more episodes to lead. Fans loved it as they rate his episodes the highest.
That’s it. I can’t tell you that South Park survived because of these changes. However, I always loved it and still do. Hopefully, you do too. Feel free to share your thoughts on the show’s evolution.
This analysis only goes up to Season 18 Episode 9. All code and data can be found on GitHub.
*There are actually more words in the first half of the series. The numbers are weighted to account for that.
**To muffle noise, this uses a 4 episode rolling average.
***There is research that suggests self-referential words are not correlated with narcissism.
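For the curious, the 4 episode rolling average from the second footnote can be sketched like this. Two details are assumptions: the window trails (each point averages itself and the three episodes before it), and the first few points average whatever is available. The values and the function name `rolling_mean` are made up.

```python
def rolling_mean(values, window=4):
    """Trailing rolling mean over `window` points.

    Early points, where fewer than `window` values exist, average what is
    available (an assumption; they could also be dropped).
    """
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical per-episode dialogue shares (in percent) for one character.
shares = [10, 30, 20, 40, 0, 60]
smoothed = rolling_mean(shares)
print(smoothed)
```

The smoothed series swings less than the raw one, which is the point: a single episode where a character barely speaks no longer dominates the plot.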
3 thoughts on “South Park Text Analysis: They Stopped Killing Kenny and the Rise of Randy”
Great job. I would assume you used scripts from the show as data. What program did you use for this (besides import.io)?
What about Timmy?? Surely he would have the least words per line of dialogue.
I think the amount of focus episodes for each character yet is a bit off. For example, I don’t believe Stan has only had 24 episodes at the time you wrote this. Cartman isn’t the main character, but other than that, good job.