It’s in the Game: the Commercial Uses of Esports Player Data
The best-known early case of prioritizing statistics and data in professional sports came from Billy Beane, the former baseball player turned general manager of the Oakland Athletics. His use of empirical baseball data (known as sabermetrics) revolutionized the way data is used in sports, from scouting talent to measuring a player’s responsive behavior.
Competitive video gaming has not always relied on analysts in the same way, but the introduction of data-heavy titles such as StarCraft and Dota 2 has made analysts indispensable among team staff. The escalating audience numbers and prize money of tournaments have now created a clear commercial case for traditional analytics companies such as SAP, Microsoft, and IBM to build esports-ready versions of their existing data software and outfit teams with them. In addition, several of these companies are developing tools that integrate analytics into esports broadcasting.
What is Being Measured?
Those familiar with any given traditional sport can make assumptions about the types of player data that are recorded. In tennis, this includes serve direction and placement. In basketball, Bayesian statistics have been used to measure a team’s defensive ability on the court. But for those with little understanding of esports games and their respective genres, it is first important to explain what is being measured, and why it affects a player’s performance.
Spatial movement and player reactions
All competitive video games feature a finite area of playing space, in which the movements of the player are further limited (e.g. by terrain, obstacles). Whether it’s a shooter or strategy title, a player’s positioning and distance will affect how much reaction time is required, and this, in turn, can be measured game by game.
Commercial Example: Tobii and SteelSeries each offer eye-tracking solutions that let developing players compare their eye movements against those of professionals.
Combinatorial space of possibilities
Even during complex strategy games with hundreds of different units on screen at any given moment, there is still a hierarchical order to every possible action. Players must constantly modify and augment their live strategies, and account for random variables (if any). The right software can identify which of a player’s decisions were optimal in a given gameplay scenario.
Commercial Example: Community-run resources such as LoLSkill, League of Graphs, and Oracle’s Elixir list the most picked and banned champions in League of Legends, as well as their win rates. Some also provide calculators that can estimate a team’s probability of winning a game after the 15-minute mark.
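To make those figures concrete, here is a minimal sketch of how pick, ban, and win rates could be derived from draft records. The match data, champion selections, and the champion_rates function are hypothetical illustrations, not the method used by any of the sites named above.

```python
from collections import Counter

# Hypothetical draft records: champions picked, champions banned,
# and the champions that ended up on the winning side.
matches = [
    {"picks": ["Ahri", "Jinx", "Zed", "Thresh"], "bans": ["Yasuo"], "winners": ["Ahri", "Jinx"]},
    {"picks": ["Ahri", "Yasuo"], "bans": ["Zed"], "winners": ["Yasuo"]},
    {"picks": ["Jinx", "Zed"], "bans": ["Yasuo"], "winners": ["Jinx"]},
    {"picks": ["Ahri", "Thresh"], "bans": ["Zed"], "winners": ["Ahri"]},
]


def champion_rates(matches):
    """Return pick rate, ban rate, and win rate per champion."""
    picks, bans, wins = Counter(), Counter(), Counter()
    for match in matches:
        picks.update(match["picks"])
        bans.update(match["bans"])
        wins.update(match["winners"])

    total = len(matches)
    rates = {}
    for champion in set(picks) | set(bans):
        picked = picks[champion]
        rates[champion] = {
            "pick_rate": picked / total,
            "ban_rate": bans[champion] / total,
            # Win rate is only defined for champions that were actually picked.
            "win_rate": wins[champion] / picked if picked else None,
        }
    return rates


rates = champion_rates(matches)
for champion, stats in sorted(rates.items()):
    print(champion, stats)
```

Real resources work from far larger samples and usually split the numbers by patch, role, and rank, which this sketch leaves out.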
Management of player economy and resources
Though some esports titles do not require the player to obtain and/or upgrade items and equipment, several of the most widely played games feature some kind of in-match economy. Essentially, players earn currency by defeating enemies, which they can use to purchase or modify weapons. Even battle royale titles, which do not have an in-match economy, limit the ammunition and healing equipment available to the player.
Commercial Example: The Counter-Strike: Global Offensive app CSGO Scout includes a smart economy tool, while TrackDota logs data on items purchased and buybacks in Dota 2.
Patch changes and game updates
A recurring challenge for professional esports players is adapting to frequent changes in their respective games. While traditional sports make incremental changes to rules and regulations, esports titles see almost weekly alterations. Analysts and software developers alike need to factor in the impact of these “patches” and ensure their strategies match the current version of the game.
Commercial Example: The analytics platform Mobalytics maintains multiple “tier lists,” updated after every patch, that list the most viable champion choices for both regular players and those considered High-ELO (i.e. in the top 1%-2% by ranking).
Acquiring Match and Player Data
On the one hand, the fact that every video game action is entirely “digitized” makes esports inherently better suited to empirical analysis than traditional sports. The latter have to be digitized before they can be analyzed, and esports skips that potentially time-intensive step. However, this leads to two issues: there is a deluge of digital information to sift through, and for some games, there are immovable barriers to entry.
“The central source of truth is always going to be from the game publisher, and the application programming interface (API) they provide,” said Matthew Gunnin, CEO of Esports One, whose products use computer vision and machine learning to generate statistics from live esports broadcasts. In the early days of the company, Gunnin relied on the API as well as manual data capture. “Now, we rely heavily on computer vision for a lot of that data, but it varies on a case-by-case basis, depending on the game and how early supported it is,” he said.
The richness of the data, and its preferred usage, depends both on the game itself and the competition structure around it. Dota 2, a multiplayer online battle arena (MOBA) title, has been one of the more accessible titles for data and analytics, as the publisher Valve has put up fewer barriers in front of open source projects. Data can be accessed at the most granular level, including advanced match data extracted from match replays.
On the professional level, Dota 2 is played in a tournament cycle, with open circuits and several highlight events per year, as opposed to a weekly league format. This means players are less concerned with analyzing their own practice sessions, and instead there is high relative value in scouting opponents and finding patterns.
“Right now we have around 65K matches, that are historical in the sense they are past matches,” said Melvin S. Metzger, an esports developer for SAP HANA, who noted that the company can fully analyze any Dota 2 match played in a public pro setting. “We put these into context with what happens in matches that are played now, or in the future.”
When comparing an individual player’s data to that of an entire team or competition, how much data is involved depends on the request being made. “If I have one player and I want to retrieve every kill that he or she has done on a Tuesday afternoon with this champion, it will only have x number of matches,” said Gunnin.
“If I want to see that same sort of information for the League of Legends Championship Series (LCS), for example, on a certain champion, it’s having to do the same thing for one individual player, and do that across all players. But that’s a unique case, in the sense that the different schemas that you’re building can make that process easier, so it’s being able to preempt those sorts of requests before they’re made.”
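The kind of request Gunnin describes can be sketched as a simple filter over stored events. This is an illustration only: the KillEvent record, the sample data, and the build_index and kills_for helpers are hypothetical and do not reflect Esports One’s actual schema. Grouping events by champion ahead of time is one assumed way a schema might anticipate league-wide requests.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class KillEvent:
    """Hypothetical record of a single kill."""
    player: str
    champion: str
    match_id: str
    timestamp: datetime


def build_index(events):
    """Group events by champion so league-wide requests avoid a full scan."""
    index = defaultdict(list)
    for event in events:
        index[event.champion].append(event)
    return index


def kills_for(index, champion, player=None, weekday=None, hours=None):
    """Filter kills on a champion, optionally by player, weekday, and hour.

    weekday follows datetime.weekday(): Monday is 0, so Tuesday is 1.
    """
    result = []
    for event in index.get(champion, []):
        if player is not None and event.player != player:
            continue
        if weekday is not None and event.timestamp.weekday() != weekday:
            continue
        if hours is not None and event.timestamp.hour not in hours:
            continue
        result.append(event)
    return result


events = [
    KillEvent("PlayerA", "Ahri", "m1", datetime(2019, 4, 2, 14, 5)),   # Tuesday
    KillEvent("PlayerA", "Ahri", "m1", datetime(2019, 4, 2, 14, 20)),  # Tuesday
    KillEvent("PlayerA", "Ahri", "m2", datetime(2019, 4, 3, 15, 0)),   # Wednesday
    KillEvent("PlayerB", "Ahri", "m3", datetime(2019, 4, 2, 16, 0)),   # Tuesday
    KillEvent("PlayerA", "Zed", "m4", datetime(2019, 4, 2, 13, 0)),    # Tuesday
]
index = build_index(events)

# One player, one champion, Tuesday afternoons only.
one_player = kills_for(index, "Ahri", player="PlayerA", weekday=1, hours=range(12, 18))
# The same champion across every player in the data set.
all_players = kills_for(index, "Ahri")
print(len(one_player), len(all_players))
```

The single-player request touches a handful of matches, while the league-wide version repeats the same work for every player, which is why the choice of grouping matters as the data grows.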
As with most subsections of the esports industry, player development through data is still in its infancy. For many established tech companies, the goal, for now, is brand attachment. Whatever practical results emerge from, say, Cloud9’s partnership with Microsoft, the marketing campaign is what will resonate with the casual esports fan.
“The future of data, and how it’s going to be used from our perspective, is how we will correlate the stats we have on users to the information, data, and events that are happening on screen,” said Gunnin. “We’re now looking at all of your gameplay, recognizing how the pros play […] and start storing and tagging events in the game for you to reference while you’re in the game.”
Case Study: SAP, Team Liquid, and Dota 2
In April 2018, enterprise software company SAP announced it would be the official innovation partner of esports team organization Team Liquid. Focused around the latter’s Dota 2 team, one of the core aspects of the partnership was to develop software based on game-derived data, helping to analyze player performance and scout new talent.
After a year of collaboration, SAP has learned to develop tools that are actually relevant for a professional esports team, and likewise Team Liquid has learned what SAP needs to know to create such tools. “That’s basically your common software development process, but it is, because of the complexity of the topic, very challenging,” said Metzger.
The core team developing the software comprises just two people, both with full-stack development skills paired with data science knowledge. One of the earliest tasks for SAP was to assist Team Liquid in the drafting phase: the pre-match portion of Dota 2 where players select their own heroes, while banning choices for the other team. The potential synergies between heroes make this one area where teams are seeking to gather the most data possible.
“One of the challenges is that none of the public channels, essentially, gives you what you need or a real comprehensive version of the truth,” said Milan Cerny, SAP’s property owner and innovation lead for esports. “We can provide full transparency in what we’re working with. We can cater to the needs of the team itself, and the person interacting with it, in terms of how much they need to narrow down, filter, slice, and dice, and what they want to get out of it.”
Cerny explained that the final drafting solution gives the player, coach, or analyst a good idea of what they are looking at, while still offering room to draw their own conclusions. “Obviously we were looking at in-game aspects as well. Heat mapping for all kinds of events within a game, whether it be hero movement or wards,” said Cerny.
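Heat mapping of the kind Cerny mentions generally comes down to counting events per region of the map. The sketch below assumes a square map with coordinates running from 0 to map_size and divides it into a coarse grid; the coordinate system, grid size, and sample ward positions are hypothetical, not SAP’s implementation.

```python
from collections import Counter


def heatmap(positions, map_size=100.0, cells=4):
    """Count events per grid cell, keyed by (row, col).

    Assumes every coordinate lies between 0 and map_size inclusive.
    """
    cell = map_size / cells
    grid = Counter()
    for x, y in positions:
        # Clamp so a point exactly on the far edge lands in the last cell.
        col = min(int(x // cell), cells - 1)
        row = min(int(y // cell), cells - 1)
        grid[(row, col)] += 1
    return grid


# Hypothetical ward placements as (x, y) map coordinates.
wards = [(10, 10), (12, 20), (80, 90), (100, 100), (55, 30)]
grid = heatmap(wards)

for (row, col), count in sorted(grid.items()):
    print(f"row {row}, col {col}: {count} ward(s)")
```

A finer grid shows more detail but needs more events per cell before any pattern becomes meaningful.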
While the Team Liquid tech has remained largely behind closed doors, one area where SAP is a bit more public is its broadcast partnerships. Since late 2018, the company has worked with every notable Dota 2 tournament organizer, including PGL, EPICENTER, DreamHack, and ESL, providing backend services and presenting on-screen data on hero picks, win rates, and so on. Casters and production analysts are also provided with insights during the pick-and-ban phase, and in segments before and after a match.
“We have our strength in digesting, analyzing, and processing those big amounts of historical data, and putting that into context with the data coming out of the match you’re currently looking at,” said Cerny. For these events, SAP partnered with a startup called Layerth, which produces esports spectator tools, and which already had observers and other staff directly on-site and integrated into the broadcast team.
“We identified use cases where it makes sense to query a historical database and get some information,” said Cerny. “Whether an item timing is exceptionally good or bad. Whether a net worth is exceptional in a good or bad way across the entire match.”
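One simple way to judge whether a timing is exceptional is to compare it with the spread of historical timings for the same item. The sketch below uses a z-score with a cutoff of two standard deviations; the function name, the threshold, and the sample timings are assumptions for illustration, not SAP’s actual approach.

```python
from statistics import mean, stdev


def rate_timing(current_seconds, historical_seconds, threshold=2.0):
    """Label an item timing against historical purchases of the same item.

    Returns (label, z), where z is the distance from the historical mean
    in sample standard deviations. Needs at least two historical values.
    """
    mu = mean(historical_seconds)
    sigma = stdev(historical_seconds)
    if sigma == 0:
        # Every historical timing is identical, so no spread to compare against.
        return "typical", 0.0
    z = (current_seconds - mu) / sigma
    if z <= -threshold:
        label = "exceptionally early"
    elif z >= threshold:
        label = "exceptionally late"
    else:
        label = "typical"
    return label, z


# Hypothetical purchase times, in seconds of game time, from past matches.
history = [900, 960, 1020, 1080, 1140]

for timing in (780, 1050, 1300):
    label, z = rate_timing(timing, history)
    print(f"{timing}s: {label} (z = {z:.2f})")
```

The same comparison could be applied to net worth at a given minute, although skewed distributions may be better served by percentiles than by a z-score.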
While Dota 2 is notably open with regard to data collection, there are still limitations for a third-party software developer looking to build solutions around the game. “There is a lot of reverse engineering in place,” said Metzger. “That’s a challenge that you have in any form of data analysis, I guess. But it is a challenge that is ongoing, especially with patches changing the game, changing file structures.”
As an example, Metzger said that the first time a new hero was added to Dota 2 (which already boasted over 100) while SAP was working on the tool, nothing worked afterwards. “We look at the patch notes, try to understand what the changes mean for our system, what do we have to adapt, and we’ve become very fast at that.” Though specific details could not be shared, SAP is in constant exchange with Dota 2’s developer, Valve.