Statistics and data science have been developing rapidly in the sports industry over the past few decades. At first, many athletes couldn’t accept that people who had never played professional sports could define their performance with a few numbers. Now, data science is used everywhere in sports.
Esports is no exception. As a card game, Hearthstone has even more statistics available to everyone. However, competitive players rarely reach the same conclusion when interpreting the same numbers.
Back around 2014, Hearthstone’s data was defined by the opinions of the most popular streamers. ZachO, the founder and head of the Vicious Syndicate Data Reaper project, started writing Data Reaper Reports to analyze the game based on the numbers, not on how the game feels.
We got a chance to talk with ZachO about Hearthstone, data science, and his experience over half a decade with more than 200 Data Reaper reports.
What motivated you to start writing Data Reaper reports?
Well... Hearthstone was in the stone age when we started out, right? There was no public information about matchups and win rates. Players didn't have a way to pick decks based on what they wanted to beat. Tier lists were driven by player perceptions and feelings. Nobody knew what the best decks were. Sleeper strong decks struggled to get traction unless they were driven by a content creator.
It just made sense to start out with this content if we had the capabilities of doing that. We had the capabilities, and I thought it would be content that could change the scene. It did.
The first time I read your article was when Che0nsu brought Reno Mage to a Korean tournament after reading a Data Reaper report, at a time when no one thought Reno Mage was a real deck.
Yes. Back in early Mean Streets of Gadgetzan, everyone thought that Reno Warlock was the best deck because Savjz hit #1 legend with it.
But in the first report of Mean Streets of Gadgetzan, we showed that wasn't the case, and Reno Mage was actually stronger. Reno Mage started to gain traction after that, blew up and became a core choice in every tournament lineup, while Reno Warlock stagnated.
Were there any sleeper decks that players missed before Mean Streets of Gadgetzan?
I don’t know about "missed", but there are two decks that might not have emerged without the reports. Dragon Warrior and Yogg Druid in Whispers were discovered as Tier 1 decks back then, with very low play rates. Very few people talked about them. The reports caused players to pick them up and it snowballed into them becoming pillars of that format.
Nowadays, these sleeper decks are played more often thanks to your reports. However, there are still cases where competitive players don't believe them. Many of these cases seem to come down to the "skill cap", which Hearthstone players argue about every meta.
I think cases in which sleeper decks don't get picked up have more to do with whether people are interested in playing them.
For example, Taunt Druid was really strong pre-patch, but people were like "oh whatever, just some aggro deck", until it got to the point where you couldn't ignore how good it was post-patch. [Interview was conducted in early October — Ed.]
Do you think there are ever cases where skill cap can actually matter when choosing a lineup?
Taunt Druid wasn't particularly good for a tournament lineup. Even now, its place in tournaments is more awkward than on ladder because the field is different. I don't think it has to do with the skill cap here. But the skill cap shouldn't be treated as this invisible force. Skill cap can be seen in the data. We can tell which decks improve and which decks stagnate at higher levels of play.
We've just had this Contact Garrote Rogue deck show unprecedented skill scaling and you can identify it in the data. The things people envisioned for numerous decks in the past that were overrated due to skill perception, you can actually find in Rogue. And it's there, in black and white.
Is Contact Garrote Rogue the deck with the highest skill cap ever? Were there any other decks that had a close or higher skill cap?
It might be. I've honestly never seen anything like it. I've seen plenty of decks improve their performance at higher levels. But I don't think any deck has reached these kinds of numbers, where the improvement is well over 5% from upper Diamond to top legend. Decks rarely if ever reach a 3% overall matchup improvement.
Rogue blows D6 Warlock out of the water. D6 Warlock wasn't particularly special. It was fairly skill intensive, but on normal levels of a 2-3% improvement. It still struggled to break the 50% barrier at top legend. It was pretty sensationalized. And then everything some people thought D6 Warlock was, Contact Garrote Rogue is the real thing.
So this case of Rogue is kind of eye-opening in the sense that "Oh you know, when the mythical Patron Warrior comes, you can actually identify it in the data. You don't need to imagine it, and it won't conflict with reality. It will just happen." And I don't even know how Patron Warrior performs, but it's often brought up as this mythical deck.
Were there ever cases of specific cards having a high skill cap? Specifically asking about Battle Rage and Spirit of the Shark, which were controversial cards back in 2018 and 2019.
It's easy to argue about something that the other person can't disprove. Try to disprove God when talking to a religious person.
Spirit of the Shark was a very dramatic case of a card that plainly sucked. The deck was better without it. There was a better way to tech for Control Warrior if you wanted to target that matchup.
And then after that short window where we were led to believe it was the best card in a deck, it disappeared and never saw play in another meta ever again. It was bait that the entire pro community fell for and refused to accept.
There are plenty of silly things Grandmasters have netdecked in the past that made zero sense and you could even logically tell were bad, but they brought them to lineups. They're not immune to making poor judgment calls, like running Safety Inspector in Paladin.
Was Battle Rage just a bad card too, or did it have some meta or matchup specific reasons to ever run it?
Remember that there were two timelines? There was the Descent of Dragons meta before Galakrond's Awakening and Risky Skipper. People still swore by Battle Rage before Skipper. The card sucked.
After Risky Skipper was printed and people started to run more 1-drops alongside Skipper and Eternium Rover, then it became more reasonable.
Players tend to overvalue card draw in general, because losing after you ran out of cards feels really bad. A good example now, Rustrot Viper. Super overrated card, super overplayed — just because tradeable feels good.
Many players got baited by Acidic Swamp Ooze alone. Ooze with tradeable is too good not to be bait.
Well, I fully expected that to happen. But you know, for Team 5, if they got you to play a sub-optimal card and it convinced you that it was actually powerful, then for them, it's a success. Their design goal is to basically bait you with cards that feel overpowered but aren't.
Remember Tickatus? That was a great example of a deck that sucked. So for Team 5, it made no sense to nerf it. Like think about it, they just designed a deck that sucked but people played because they either thought it was fun and/or powerful. If you're a card game designer, you go to the colleague who made Tickatus and tell them "congrats, you did it." That's their perspective in many ways.
People don't think about that, but that's often the reason Tier 3/4 decks that people complain about for being too powerful or warping or whatever don't get nerfed so quickly. Because in many ways, that's their design goal: getting players to play decks that aren't powerful but still have fun doing it.
Talking about baits, there are times when players have to take some baits in a new meta before a large enough data sample is gathered. What do you think is the minimum sample size for a deck to be analyzed?
Well, you test a deck and see if it's working for the matchups you're focused on. If you can't find good data available for it, which was the case early in Rogue's development, you have to make do with your own experience.
That's what theorycrafting is, basically, and it's a very important element of a meta's development. I can't create decks out of thin air. I can only work with the data available.
Hearthstone's history with data science looks similar to that of other sports like baseball. Players refused to believe the numbers at first, but now almost every player relies on data science. Also, players need to contribute data to the statisticians for data science to work.
Vicious Syndicate versus Tempo Storm used to be this big debate in Hearthstone in the past — "Raw data or Expert opinion". The truth is, vS is expert analysis of data. You don't have to choose between the two.
Data and objective metrics will always win the debate in the long term, because even if you have an instance where the data doesn't reflect your experience at one point in time, that's just some variance. Over the long term, a player will find more in common with objective reality than word of mouth.
There are lots of pros who regularly read our reports, and contribute data. Sometimes, there will be something that a pro doesn't agree with. Maybe there's a good reason. Maybe there's a card that changes something about a deck and we couldn't evaluate. Maybe some matchup spread is different where he's playing.
But to write the whole thing off because of one or two of these instances is foolish, in my opinion, because we're very good at refining decks. I personally think we're the best in the business at doing that.
The report is very useful for players at every level. I remember when Highlander Mage was being worked on, and we identified Imprisoned Observer as this game changing card. We basically screamed "Play Observer!" for a couple of weeks.
Afterwards, BoarControl started playing Observer on his stream, hit a high legend rank, and only then did people start believing it. Then, Highlander Mage's win rate climbed by nearly 2%. Observer was the best card in the deck outside of Zephrys and Reno. These cases happen all the time, with more modest improvements.
Do you have any advice to casual Hearthstone players about using statistics and data?
I will say this to most players who are not at a competitive level: Pick the deck you're comfortable with and enjoy playing the most.
You don't have to netdeck the highest win rate archetype to climb. The differences between decks are very small over a 100-game sample. An individual won't feel those changes. Those win rates have a bigger impact on the macro landscape of the game.
It's likely that you will do better with a 50% WR deck than a 52% deck, if you're more comfortable and enjoy playing with the first. The biggest difference is your play, not the deck. Optimizing cards and deck choices makes a difference, but improving as a player makes a bigger difference.
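To put rough numbers on the 100-game point, here is a minimal sketch. It is an illustration only, not the Data Reaper methodology: it assumes every game is an independent event with a fixed win probability, which is a simplification, and the function names are hypothetical. It computes the standard error of a win rate observed over a given number of games, and the chance that a 50% deck finishes with at least as many wins as a 52% deck over the same number of games.

```python
import math


def win_rate_standard_error(true_win_rate: float, games: int) -> float:
    """Standard error of an observed win rate over `games` independent games."""
    return math.sqrt(true_win_rate * (1.0 - true_win_rate) / games)


def binomial_pmf(wins: int, games: int, win_rate: float) -> float:
    """Probability of exactly `wins` wins in `games` independent games."""
    return math.comb(games, wins) * win_rate**wins * (1.0 - win_rate) ** (games - wins)


def prob_weaker_keeps_up(weak_rate: float, strong_rate: float, games: int) -> float:
    """P(wins with the weaker deck >= wins with the stronger deck).

    Both decks are assumed (hypothetically) to be played for the same
    number of games, independently of each other.
    """
    weak_pmf = [binomial_pmf(k, games, weak_rate) for k in range(games + 1)]
    strong_pmf = [binomial_pmf(k, games, strong_rate) for k in range(games + 1)]

    # weak_tail[k] = P(weaker deck wins at least k games)
    weak_tail = [0.0] * (games + 2)
    for k in range(games, -1, -1):
        weak_tail[k] = weak_tail[k + 1] + weak_pmf[k]

    return sum(strong_pmf[k] * weak_tail[k] for k in range(games + 1))


if __name__ == "__main__":
    print("Standard error at 50% over 100 games:", win_rate_standard_error(0.50, 100))
    print("P(50% deck keeps up with 52% deck):", prob_weaker_keeps_up(0.50, 0.52, 100))
```

Under these assumptions, the standard error of a 50% win rate over 100 games is 5 percentage points, which is larger than the 2-point gap between the two decks. The second function shows how often the nominally weaker deck matches or beats the stronger one over the same sample. This is one way to see why an individual player is unlikely to feel such a difference, while it remains visible in aggregate data.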