library(tidyverse)
Who is the Mightiest of the Mighty Nein?
This is another post using data from Crit Role Stats. In this post, I will be looking at the damage dealt by each member of the Mighty Nein. Spoilers for campaign 2, but that ended a couple of years ago so it seems reasonable to spoil. Now Crit Role Stats already has a spreadsheet detailing damage dealt, and I want to recreate some of that and more. I will mostly be using the All Rolls spreadsheet. I have downloaded the CSV for just the sheet that I need: All Rolls. I will be looking back at the full spreadsheets online to verify my findings, but this is the only one that I need to pull in to work with.
By the way, my soundtrack for this post is Critical Role’s own Mighty Vibes YouTube playlist. It just feels right.
I am interested in damage dealt by character and by episode to learn who had the highest average damage per hit and the highest average damage per episode. I expect these to differ because of the number of attacks that different characters get. Spellcasters can do a lot of damage, and many of their spells still do half damage even if the opponent successfully saves, but they rarely get more than one damage-dealing action per turn. Beauregard, as a monk, deals less damage per hit but can hit an astonishing number of times in a round, so I expect her per-hit average to be much lower than, say, Yasha’s, but her average per episode may be quite a bit higher.
Before we get to that, there’s a fair bit of cleaning to be done with the All Rolls table.
Data Cleaning
First, read it in.
<- read_csv("all_rolls_wildemount_all_episodes.csv")
all_rolls glimpse(all_rolls)
Rows: 15,364
Columns: 10
$ Episode <chr> "C2E001", "C2E001", "C2E001", "C2E001", "C2E001", "C2E…
$ Time <chr> "0:35:40", "0:39:22", "0:39:30", "0:43:04", "0:50:37",…
$ Character <chr> "Fjord", "Jester", "Beau", "Beau", "Molly", "Molly", "…
$ `Type of Roll` <chr> "Perception", "Perception", "Perception", "Insight", "…
$ `Total Value` <chr> "Nat20", "22", "19", "13", "16", "16", "14", "19", "14…
$ `Natural Value` <chr> "20", "19", "16", "8", "11", "11", "14", "14", "9", "1…
$ `Crit?` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ `Damage Dealt` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ `# Kills` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ Notes <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "C…
A few things to note:
First, we will need to clean up the Episode
column to use as numeric. Not a big deal.
Second, there is a Damage Dealt
column, which turns out to be very useful. Looking in the source data, we can see that it tells you the total damage dealt for a given roll. In the example of episode 43, wherein the Mighty Nein fight Avantika and her crew, Caleb uses Wall of Fire and rolls 17 damage. However, the Damage Dealt
column tells us that he dealt a total of 101 damage when you consider all of the enemies he was able to hit. Moreover, we can also see that even in cases where there is only one target, the Damage Dealt
column tells us whether the damage was taken in full or halved, whereas the Total Value
column would just tell us the value of the roll. This does mean that we have a lot of parsing to do on that column because it contains a lot of free text as well, but it does seem to usually start with the actual number so that may be relatively easy. Let’s hope.
First, I will pare down the data set to only work with what we need. This isn’t an enormous data frame, but saving that space will always help speed things up.
print(object.size(all_rolls), units = "MB")
1.9 Mb
<- all_rolls |>
damage ::clean_names() |>
janitorfilter(type_of_roll == "Damage") |>
select(episode, character, damage_dealt)
What sort of space did that save?
print(object.size(damage), units = "MB")
0.2 Mb
It was still < 2 MB to begin with, but getting down to 0.2 MB will just make things run faster. Not a huge deal, but a nice benefit.
Let’s take a look at the damage_dealt
column to see what we need to do to parse it.
select(damage, damage_dealt)
# A tibble: 2,119 × 1
damage_dealt
<chr>
1 18 to Creature 1
2 11 to Creature 1
3 9 Cold to Creature 1
4 11 to Creature 1
5 9 (8 + 1 Cold) to Creature 1
6 3 Necrotic to Creature 1
7 12 to Creature 2
8 13 to Creature 1
9 17 (12 + 5 Necrotic) to Creature 2
10 14 (10 + 4 SA) to Creature 2
# ℹ 2,109 more rows
Let’s first look to see if every row begins with a number. I’ll use stringr::str_match()
to pull out the leading numbers. If any entries begin with non-numeric values, the result will be NA
. I can then see if there are any NA
s in the resulting vector to see if there will be a problem with just pulling out the initial digits.
<- str_match(damage$damage_dealt, "^\\d+")
matches anyNA(matches)
[1] TRUE
Ok, so there are NA
s. How many?
sum(is.na(matches))
[1] 50
50 is a lot but not insurmountable. Where are they?
is.na(matches), ] damage[
# A tibble: 50 × 3
episode character damage_dealt
<chr> <chr> <chr>
1 C2E010 Nott Unknown to Rat 5
2 C2E013 Molly Unknown to Molly
3 C2E013 Molly Unknown to Shadow Assassin 3
4 C2E018 Molly <NA>
5 C2E019 Molly <NA>
6 C2E019 Molly Unknown to Molly
7 2-20 Fjord <NA>
8 2-20 Fjord <NA>
9 2-22 Fjord <NA>
10 2-23 Molly <NA>
# ℹ 40 more rows
There are a lot of Unknowns and a lot of NA
s. Looking at a few of these in the original table, I can see that some of them show “Disregarded” in the notes. Others are Molly’s own Blood Maledicts, which damage him but he doesn’t always announce the result. What this tells me is that I need to re-read the damage
data frame, this time including Notes
from the original all_rolls
data. I can look for “Disregarded” and “Blood Curse” in the notes and omit them, because they shouldn’t count toward the total number of hits. After I do that, I can re-run the above code to find non-matches and see where things stand.
<- all_rolls |>
damage ::clean_names() |>
janitorfilter(type_of_roll == "Damage") |>
select(episode, character, damage_dealt, notes)
|>
damage filter(notes != "Disregarded")
# A tibble: 624 × 4
episode character damage_dealt notes
<chr> <chr> <chr> <chr>
1 C2E001 Jester 3 Necrotic to Creature 1 Toll the Dead
2 C2E001 Beau 13 to Creature 2 HDYWTDT
3 C2E003 Molly 3 to Molly Blood Curse of the Eyeless (ampl…
4 C2E003 Nott 2 to Zombie 2 Sneak Attack
5 C2E003 Molly 1 to Molly Rite of the Frozen
6 C2E003 Molly 4 Psychic to Imp 1 Vicious Mockery
7 C2E003 Fjord 14 to Kylre HDYWTDT
8 C2E003 Jester 6 to Imp 1 Word of Radiance
9 C2E005 Molly 3 Psychic to Witherling 1 Vicious Mockery
10 C2E005 Jester 9 Cold to Witherling 4 Hellish Rebuke
# ℹ 614 more rows
Wait, that just dropped a ton of rows. We’ve gone from 2119 to 624. Turns out that what’s happening is that all of the NA
values are getting dropped in addition to those notes that are “Disregarded” so we’re losing a ton of data. I need to create a way to eliminate the “Disregarded” rows but still keep the NA
rows, because having no note is pretty much the norm, so we don’t want to drop those. I do this by passing a logical vector to filter()
even though the vector isn’t exactly a conditional in the format of <column> <operator> <value>
. Instead, I’m using dplyr::if_else()
to check equality against “Disregarded” but to also accommodate NA
values. To do that, I use the argument missing = TRUE
to tell if_else()
that I want NA
values to be included in the results.
<- damage |>
damage filter(
if_else(
!str_detect(tolower(notes), "disregard"),
true = TRUE,
false = FALSE,
missing = TRUE
) )
Next, I need to remove the rows that start with “blood curse.” I will again use if_else()
to account for the many NA
values in notes
.
<- damage |>
damage filter(
if_else(
!str_starts(tolower(notes), "blood"),
true = TRUE,
false = FALSE,
missing = TRUE
) )
So let’s repeat the search for those NA
s and unknowns:
<- str_match(damage$damage_dealt, "^\\d+")
matches sum(is.na(matches))
[1] 33
is.na(matches), ] damage[
# A tibble: 33 × 4
episode character damage_dealt notes
<chr> <chr> <chr> <chr>
1 C2E010 Nott Unknown to Rat 5 HDYWTDT
2 C2E013 Molly Unknown to Shadow Assassin 3 Dissipates
3 2-28 Nott <NA> <NA>
4 C2E035 Beau <NA> <NA>
5 C2E035 Caduceus <NA> <NA>
6 C2E035 Jester <NA> <NA>
7 C2E039 Beau <NA> <NA>
8 C2E044 Caduceus <NA> Prayer of Healing, Caleb, Bea…
9 C2E044 Fjord <NA> <NA>
10 C2E047 Beau <NA> <NA>
# ℹ 23 more rows
The rest of these are judgment calls. Some of them are cases where there was clearly some damage, such as in episode 10 when Nott gets a “How do you want to do this?” against a rat, but it doesn’t say the damage done. I would imagine that the rat had something like 1HP left and any attack would kill it, so Sam didn’t even have to roll. One is a healing roll which should have been omitted to begin with. Some of them are by pseudo-NPCs that a few players are controlling, and those should be omitted. In other cases, I’ve looked back to the original all_rolls
table and found that there are values in the total value
column. Because of that, I’m going to re-read the data (yet again), and include the natural roll and total value columns for use in my manual cleaning. Because there’s no easy way to code through every judgment, I’m going to write out the current data to a .csv, hand-clean these remaining rows, and re-read it back in. In instances where it was clear there was some damage, I’m just going to assign a value of 1. In cases that should be disregarded, I am going to delete the row. In cases where damage_dealt
is NA
but there is a total_value
or a natural_value
, I will populate damage_dealt
with the higher of those two values. I hate to edit data files by hand but this situation calls for it, and it’s low stakes so I’m not that worried about it.
|>
all_rolls ::clean_names() |>
janitorfilter(type_of_roll == "Damage") |>
select(episode, character, total_value, natural_value, damage_dealt, notes) |>
filter(
if_else(
!str_detect(tolower(notes), "disregard"),
true = TRUE,
false = FALSE,
missing = TRUE
)|>
) filter(
if_else(
!str_starts(tolower(notes), "blood"),
true = TRUE,
false = FALSE,
missing = TRUE
)|>
) write_csv(file = "damage.csv")
Edit, edit, edit. Clean, clean, clean.
Reimport:
<- read_csv("damage_cleaned.csv") damage
Let’s check if all of the damage_dealt
values start with a number now:
<- str_match(damage$damage_dealt, "^\\d+")
matches sum(is.na(matches))
[1] 0
is.na(matches), ] damage[
# A tibble: 0 × 7
# ℹ 7 variables: episode <chr>, character <chr>, total_value <chr>,
# natural_value <chr>, damage_dealt <chr>, notes <chr>, ...7 <chr>
We’re good!
Now I will once again pull out just the episode
, character
, and damage_dealt
columns.
<- damage |>
damage select(episode, character, damage_dealt)
Now I can extract that numeric value from the damage_dealt
column and convert the whole thing to numeric.
<- damage |>
damage mutate(damage_dealt = parse_number(str_extract(damage_dealt, "^\\d+")))
We need to pull out the episode numbers to clean that up:
<- damage |>
damage mutate(episode = parse_number(str_extract(episode, "\\d+$")))
Now we can start doing some analysis!
Most Total Damage
Who has done the most total damage?
|>
damage group_by(character) |>
summarize(total_damage = sum(damage_dealt)) |>
arrange(desc(total_damage))
# A tibble: 22 × 2
character total_damage
<chr> <dbl>
1 Beau 5922
2 Fjord 5069
3 Yasha 4741
4 Caleb 3638
5 Nott 3636
6 Jester 3465
7 Veth 1854
8 Caduceus 1732
9 Molly 641
10 Keg 203
# ℹ 12 more rows
Oops! I still have some non-core characters.
<- damage |>
damage filter(character %in% c(
"Beau",
"Caduceus",
"Caleb",
"Fjord",
"Jester",
"Molly",
"Veth",
"Yasha"
))
|>
damage group_by(character) |>
summarize(total_damage = sum(damage_dealt)) |>
arrange(desc(total_damage))
# A tibble: 8 × 2
character total_damage
<chr> <dbl>
1 Beau 5922
2 Fjord 5069
3 Yasha 4741
4 Caleb 3638
5 Jester 3465
6 Veth 1854
7 Caduceus 1732
8 Molly 641
Many of these numbers are pretty close to what Crit Role Stats themselves calculated – I calculate Veth’s total damage at 5490 and they have it at 5583, so pretty close. But some are quite different – I calculate Caleb’s damage at just 3638 but they have him at 5281, a big difference. I really can’t explain what’s going on here, and would love some external validation for why some findings are so different. Most are pretty close so that’s what I’m going with. In another post I could work on reading in data from their Damage Dealt spreadsheet and parsing that within R, but that’s not today. So, sorry Caleb, you’re getting the shaft here.
Episode High
What episode had the most damage dealt?
|>
damage group_by(episode) |>
summarize(total_damage = sum(damage_dealt)) |>
slice_max(n = 1, order_by = total_damage)
# A tibble: 1 × 2
episode total_damage
<dbl> <dbl>
1 98 849
Episode 98 features a big fight on the deck of the Balleater and ends with a dead Fjord and Orly. Big episode, good fight, makes sense that it would have the most damage. This doesn’t totally agree with what Crit Role Stats has, but it’s in the same ballpark.
Damage Over Time
How does damage dealt increase over time?
|>
damage group_by(episode) |>
summarize(total_damage = sum(damage_dealt)) |>
ggplot(aes(x = episode, y = total_damage)) +
geom_bar(stat = "identity")
It’s kind of all over the place, to be honest. There are ups and downs, which is to be expected with the various arcs, and the highs tend to be higher later in the campaign, but it’s not a terribly clear trend.
|>
damage group_by(episode) |>
summarize(total_damage = sum(damage_dealt)) |>
ggplot(aes(x = episode, y = total_damage)) +
geom_point() +
geom_smooth(method = "lm")
It’s clearly a positive slope, but not terribly strong. That’s interesting to me.
Average Damage by Character
We’ve already seen that Beau dealt the most total damage. But who dealt the most damage per hit? Per episode?
|>
damage group_by(character) |>
summarize(avg_damage = mean(damage_dealt)) |>
arrange(desc(avg_damage))
# A tibble: 8 × 2
character avg_damage
<chr> <dbl>
1 Veth 29.0
2 Caleb 20.6
3 Yasha 20.2
4 Caduceus 15.9
5 Jester 15.8
6 Fjord 15.5
7 Beau 10.5
8 Molly 8.43
It’s interesting, but not entirely unsurprising, that Caleb has the highest average damage. His spells can hit multiple targets, sometimes with very high damage for each one, and that all gets counted together. Yasha being close to the top is unsurprising, since she’s a barbarian and her whole role is to hit things hard. It’s also unsurprising that Beau, the overall champ, is nearly last. Monks don’t hit very hard, they just hit a lot. And poor Mollymauk didn’t get a chance to level up enough to do really hard hitting.
Who has the highest per-episode average?
|>
damage summarize(episode_total = sum(damage_dealt), .by = c(character, episode)) |>
summarize(episode_average = mean(episode_total), .by = character) |>
arrange(desc(episode_average))
# A tibble: 8 × 2
character episode_average
<chr> <dbl>
1 Veth 88.3
2 Yasha 76.5
3 Beau 69.7
4 Fjord 64.2
5 Jester 55
6 Caleb 52.7
7 Molly 37.7
8 Caduceus 35.3
No real surprises here, but it’s interesting that Yasha edges out Beau when we look at episode total damage.
Characters’ Hardest Hits
Where did each of the characters most shine? First, I’ll look at which episode had the most damage for each player.
|>
damage group_by(character, episode) |>
summarize(total_damage = sum(damage_dealt)) |>
slice_max(n = 1, order_by = total_damage)
# A tibble: 8 × 3
# Groups: character [8]
character episode total_damage
<chr> <dbl> <dbl>
1 Beau 140 210
2 Caduceus 98 122
3 Caleb 43 297
4 Fjord 82 243
5 Jester 82 209
6 Molly 26 98
7 Veth 139 239
8 Yasha 39 300
Beau and Veth both have their best episodes in the final arc, fighting through Cognouza, and Caduceus has his on the bloodiest episode of the series, 98. Caleb’s comes in the Bad Guys arc, facing Avantika in Darktow. Fjord and Jester both hit their high water marks in episode 82, in the depths of the Happy Fun Ball. Molly does the most damage, sadly, in the episode in which he dies. Lastly, Yasha hits a whopping 300 damage (just beating out Caleb for the highest single episode damage) in the Temple of the False Serpent, walloping a hydra.
Next, I will look at the single hardest hits for each character.
|>
damage group_by(character) |>
slice_max(order_by = damage_dealt)
# A tibble: 8 × 3
# Groups: character [8]
episode character damage_dealt
<dbl> <chr> <dbl>
1 133 Beau 55
2 73 Caduceus 42
3 43 Caleb 172
4 123 Fjord 106
5 119 Jester 106
6 23 Molly 21
7 139 Veth 100
8 138 Yasha 78
- Beau hits for 55 points using an Intuit charge against the Tomb Takers. Makes sense that this was with a device, I don’t know how she could possibly do that much with a single hit.
- Caduceus does 42 necrotic damage on a Remorhaz using the blight spell
- Caleb gets his 172 damage from a big fireball in episode 43, which I’ve mentioned before. Worth noting that this hit 10 separate targets.
- Fjord and Jester tie with 106 damage. Fjord gets his in episode 123, fighting Gelidon, a massive white dragon. The spreadsheet doesn’t give me any details, unfortunately.
- Jester gets her hit in on episode 119, hitting two Aeorian Abominations for 53 damage each. Again, sadly no details.
- Poor Molly yet again bringing up the rear at 21 points, with a big hit on a troll
- Veth has one of the final arc episodes as her highest, hitting for 100 points against Lucien. I don’t have details other than
100 (50x2)
so I’m not sure exactly what happened there. - Yasha has her all-time high on the episode prior, and surprisingly low damage for her hardest hit.
Conclusion
This has been an interesting exercise. The data cleaning portion had some challenges but it gave me the chance to work with the stringr package and exercise some very basic regex skills. I learned that even with careful coding, I couldn’t get things to line up 100% with Crit Role Stats, which is a valuable lesson. And I learned that while Beau dealt the most damage over time, it’s Caleb who can pack the single hardest punch. And it’s no surprise that he does so with fire.