Scripted Automated Tests Vs. AI Driven Tests: Finding the Right Approach for Game Testing

June 30, 2023, 22:26

Over time I have seen a few companies sink a lot of money into trying to automate tests in games and get no results. I believe there are many reasons these companies failed, but the main one is what I want to discuss here: trying to use scripted automated tests to test a game. What works for regular software often doesn't work for games.

The Limitations of Scripted Automated Tests

My definition of a scripted automated test is:

A sequence of steps that verifies that the game shows the expected behaviour. This set of instructions is well defined, and the script is able to identify elements on screen, like a user would do, and interact with them in order to control the game.

I have seen first hand, a few times, companies try this approach and fail to get a return on investment (ROI). Here I want to share why I think this approach is not the best one.

Example 1:
Company K tried this approach and invested around two million euros in equipment and people in order to run the scripts on multiple devices, OS versions and languages. After 3 years, the tests were still breaking all the time and requiring maintenance in every sprint, and stability was far from what was expected. In the end, teams did not trust the automated tests, and manual testers were cheaper and faster.

Example 2:
Company W invested in automating their multiplayer games for 4 years. They had a team of 10 engineers developing a test framework and automated tests. The automation team spent most of its time creating POCs (proofs of concept) to convince the teams of the value of automation.
The value of running these tests was very low: only the simplest tests ran frequently. A manual tester could execute the same tests in 10 minutes or less for every new build.

These experiences showed me that the cost of writing these tests is too high and the benefit they bring is not enough to cover it. Scripted automated tests can still be valuable when you automate a deterministic area of the game that is easy to interact with, like:

Access a shop inside the game, buy a power up, apply the power up and verify that player stats improved according to the specification.
Start the game on different devices and check that the game loads and displays the start screen.
Go through the tutorials.
Create a new user profile, change preferences, save the changes and verify that the changes have been saved.

By automating these deterministic scenarios you make changes safer and can cover important scenarios (like any profit-related scenario). I believe they should always be part of a sanity check that confirms that a build is ready for testing. And that is where the value ends: when test scenarios get more complex, these tests do not deliver enough ROI.
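To make the shop scenario from the list above concrete, here is a minimal sketch of a scripted test in Python. Everything in it is an assumption for illustration: FakeGame is a hypothetical in-memory stand-in for whatever driver controls the real game, and the item name, price and "+5 speed" rule are invented, not taken from any real specification.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGame:
    """Hypothetical in-memory stand-in for a real game client driver."""
    coins: int = 100
    speed: int = 10
    inventory: list = field(default_factory=list)

    def buy(self, item: str, price: int) -> None:
        # Purchases fail loudly when the player cannot afford the item.
        if price > self.coins:
            raise ValueError("not enough coins")
        self.coins -= price
        self.inventory.append(item)

    def apply(self, item: str) -> None:
        # Using a power up consumes it and changes the player stats.
        self.inventory.remove(item)
        if item == "speed_boost":
            self.speed += 5  # assumed specification: +5 speed


def test_speed_boost_purchase() -> None:
    # Given a fresh profile, when the player buys and applies a boost,
    # then coins, stats and inventory match the (assumed) specification.
    game = FakeGame()
    game.buy("speed_boost", price=30)
    game.apply("speed_boost")
    assert game.coins == 70
    assert game.speed == 15
    assert game.inventory == []


test_speed_boost_purchase()
```

The script is short because the scenario is deterministic: the same steps always produce the same state, so a handful of fixed assertions is enough.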

Example 3:
You have a racing game and you write a script that is able to accelerate, keep the car on the road, get to the finish line and on the way execute some verifications.

What if you're hit by an opponent car and you start going backwards?
What if one of the tracks has no clear delimitation of the road?

Many different scenarios start to pile up and exceptions become the norm. This leads to a test script full of if conditions and code readability decreases. Over time, debugging the script becomes a real challenge.
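A small sketch of how those exceptions pile up in code. The state keys and action names below are hypothetical, and a real driving script would need many more branches than this one.

```python
def choose_input(state: dict) -> str:
    """Hand-written driving rules; every new edge case adds another branch."""
    # Edge case: an opponent hit us and the car now faces the wrong way.
    if state["facing_backwards"]:
        return "turn_around"
    # Edge case: the track has no clear delimitation of the road.
    if not state["road_edges_visible"]:
        return "slow_down"
    # Normal case: keep the car near the center of the road.
    # Assumed convention: a positive offset means the car is right of center.
    if state["offset_from_center"] > 0.2:
        return "steer_left"
    if state["offset_from_center"] < -0.2:
        return "steer_right"
    return "accelerate"


# Example: the car was spun around by a collision.
print(choose_input({
    "facing_backwards": True,
    "road_edges_visible": True,
    "offset_from_center": 0.0,
}))
```

Each branch encodes one assumption about the game design, so each design change can silently invalidate one of them.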

The main problem: scripted tests are fragile and they require frequent maintenance. Even with continuous integration, it takes too long to notice that the tests need updating, because they are too slow to run on every commit.

I became convinced that for more complex scenarios, scripted automated tests should not be the focus: AI driven tests are easier to create and more robust.

Shifting to AI-Driven Tests for Complex Scenarios

We can define AI driven tests as automated tests that use machine learning algorithms to simulate a player and verify that the game works as expected. These tests can handle more complex scenarios and a wider variety of them, which makes them more robust and flexible than scripted tests.

Games already collect a lot of player data through telemetry in order to understand the players. This data is labeled and could easily be used for supervised learning. We can even expand the learning if the game has replay functionality or multiplayer.

Example:
Imagine a multiplayer game with synchronous sessions. All the clients will need to constantly send their location in the world (coordinates) and update different player stats (points, health, ammo). The server then has to broadcast the status of the entire game to all players. All this data can feed the learning AI and generate test cases. Besides the data generated by players, we can count on data generated during manual tests, data from previous iterations of the same IP, replay data and others. Live games can benefit even more from this approach, since they are already in production and generating the data we require to train the models.
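As a rough illustration of how labeled telemetry can drive a simulated player, here is a minimal sketch using a 1-nearest-neighbour lookup, which is one of the simplest forms of supervised learning. The telemetry records, the two features and the action names are all invented for this example; a real setup would use far more data, normalised features and a stronger model.

```python
import math

# Hypothetical telemetry: (distance_to_opponent, own_health) -> action the player took.
TELEMETRY = [
    ((2.0, 90.0), "attack"),
    ((3.0, 80.0), "attack"),
    ((2.5, 15.0), "retreat"),
    ((40.0, 20.0), "heal"),
    ((45.0, 95.0), "explore"),
]


def predict_action(state, samples=TELEMETRY):
    """1-nearest-neighbour policy: copy the action of the most similar recorded state."""
    # math.dist computes the Euclidean distance between two points (Python 3.8+).
    _, action = min(samples, key=lambda sample: math.dist(sample[0], state))
    return action


# The simulated player asks the recorded data what a human did in a similar situation.
print(predict_action((2.2, 88.0)))
print(predict_action((42.0, 22.0)))
```

There is no hand-written branch per situation here: adding more recorded sessions changes the behaviour without anyone editing the test code.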

With AI driven tests we don't need the if statements. The AI driven tests are able to make decisions, deal with different game feedback and handle changes in design without requiring constant maintenance. This decreases the costs of development and maintenance and increases coverage, which in turn improves the ROI.

Striking the Right Balance: Scripted Tests and AI-Driven Tests

There is no silver bullet when it comes to testing games. However, based on my experience, this is the strategy that I recommend:

Write scripted automated tests to cover the more deterministic areas of the game, the ones that resemble regular software the most. These are the tests where you do A and then B should happen (given, when, then, for those who like Gherkin).
- UI, Menus and Navigation
- Game Progress and Retention
- Player Profiles and File I/O
- Identities and Cloud Save
- Tutorials
- Telemetry and Analytics
- Settings and Configuration
- In Game Purchases
Focus on AI driven tests when it comes to testing gameplay, multiplayer sessions, performance… They will provide more return on the investment of time and money. You will have more robust and flexible tests.
Use the data acquired from players for supervised learning. Keep the algorithm learning: the more data, the better. Use the AI driven tests for repetitive work and let the human testers focus on the more creative work.

~~The photo is a random whiteboard discussion. Something I miss from the times before hybrid and remote work.~~

Enjoy life,
Leonardo Ribeiro Oliveira
