Mobile Game: A/B Testing & Player Retention

This project analyzes A/B testing results from "Cookie Cats" to evaluate how moving the game's first gate affects player retention. It examines player return data 1 day and 7 days after installation, providing insights into how game design changes affect player engagement.

Year
2024

Application
Cookie Cats

Project Background


Game Overview

Cookie Cats is a popular mobile puzzle game developed by Tactile Entertainment.

It's a classic "connect three"-style puzzle game where players must connect tiles of the same color to clear the board and win levels.

The game features singing cats, adding a unique charm to the gameplay.

Project Introduction

In Cookie Cats, players encounter gates that require them to wait or make an in-app purchase to progress. These gates aim to enhance player enjoyment by providing enforced breaks, and they also drive in-app purchases.

This project analyzes an A/B test where the first gate was moved from level 30 to level 40. I will investigate the impact of this change on player retention, specifically focusing on 1-day and 7-day retention rates.


Data Structure


Dataset Description

This project's dataset contains information about player behavior and retention in different versions of a mobile game.

The data comes from 90,189 players who installed the game while the A/B test was running. The variables are:

  • userid: A unique identifier for each player.

  • version: The version of the game the player was assigned to (either "gate_30" or "gate_40").

  • sum_gamerounds: The total number of game rounds played by the player during the first week after installation.

  • retention_1: A boolean value indicating whether the player returned to play 1 day after installation.

  • retention_7: A boolean value indicating whether the player returned to play 7 days after installation.
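The column descriptions above can be turned into a quick sanity check. The sketch below is illustrative only: the sample rows are hypothetical, and `check_schema` is a helper name introduced here, not part of the original analysis.

```python
import pandas as pd

EXPECTED_COLUMNS = ["userid", "version", "sum_gamerounds", "retention_1", "retention_7"]

def check_schema(df):
    """Return a list of problems; an empty list means the frame matches the description."""
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    if not df["userid"].is_unique:
        problems.append("userid is not unique")
    if not set(df["version"]).issubset({"gate_30", "gate_40"}):
        problems.append("unexpected version label")
    for col in ("retention_1", "retention_7"):
        if not pd.api.types.is_bool_dtype(df[col]):
            problems.append(f"{col} is not boolean")
    if df[EXPECTED_COLUMNS].isna().any().any():
        problems.append("missing values present")
    return problems

# Hypothetical rows that follow the documented layout (not real player data)
sample = pd.DataFrame({
    "userid": [1, 2, 3, 4],
    "version": ["gate_30", "gate_30", "gate_40", "gate_40"],
    "sum_gamerounds": [3, 38, 165, 1],
    "retention_1": [False, True, True, False],
    "retention_7": [False, False, True, False],
})
print(check_schema(sample))
```

An empty list means the frame has the five expected columns, unique player IDs, only the two version labels, boolean retention flags, and no missing values.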

Data Cleaning

After ensuring there are no missing values in the dataset, I identified a significant outlier. This player played approximately 50,000 game rounds during the first week, which is highly unusual.

Due to this outlier, the boxplot is extremely skewed and not very informative.

Therefore, I removed this record from the dataset to improve the accuracy and clarity of our analysis.
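One way to drop such a record, sketched on hypothetical toy data, is to remove the row holding the maximum of sum_gamerounds. The helper name `drop_max_outlier` is an assumption made for illustration; the original analysis may have removed the record differently.

```python
import pandas as pd

def drop_max_outlier(df, column="sum_gamerounds"):
    """Return a copy of df without the row holding the largest value in `column`."""
    return df.drop(index=df[column].idxmax()).reset_index(drop=True)

# Toy frame with one extreme value (illustrative numbers only)
toy = pd.DataFrame({
    "userid": [1, 2, 3, 4],
    "sum_gamerounds": [12, 50000, 3, 160],
})
cleaned = drop_max_outlier(toy)
print(len(toy), "->", len(cleaned), "rows; new max:", cleaned["sum_gamerounds"].max())
```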

Player Behavior

Half of the players played 16 or fewer game rounds during the first week after installation, and 75% played 51 or fewer.

Nearly 4,000 players did not play a single round after installation. Possible reasons may include:

  • They downloaded several new games simultaneously and were more attracted to other games.

  • They opened the app but did not like the design, interface, or music, so they quit before playing.

  • They have not started playing the game yet.

Another noteworthy statistic is that over 14,000 players played fewer than three rounds. For these players, the reasons for leaving may include:

  • They did not enjoy the game, which is probably the most common reason.

  • The game was different from what they expected.

  • The game was too easy and they got bored.

From the line chart showing the distribution of game rounds played (0-100 rounds) during the first week, we can observe that the distribution is highly skewed with a long tail on the right.

  • Skewed Distribution: Many players played fewer than 20 rounds before leaving the game. This indicates that the majority of players did not engage with the game for an extended period.

  • Early Drop-off: The steep decline in the number of players from 0 to 20 rounds suggests that there might be initial gameplay issues or unmet player expectations that cause early drop-off.

  • Steady Engagement: Beyond 60 rounds, the number of players at each round count levels off at approximately 300. This suggests that players who cross this threshold tend to stay engaged with the game.
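The distribution described above is a count of players at each number of game rounds. A minimal sketch of that tabulation, using hypothetical toy data and an illustrative helper name (`players_per_round_count`), could look like this:

```python
import pandas as pd

def players_per_round_count(df, max_rounds=100):
    """Count players at each value of sum_gamerounds from 0 to max_rounds."""
    counts = df.groupby("sum_gamerounds")["userid"].count()
    # Fill in round counts nobody hit so the index runs 0..max_rounds without gaps
    return counts.reindex(range(max_rounds + 1), fill_value=0)

# Toy data (illustrative only)
toy = pd.DataFrame({
    "userid": range(8),
    "sum_gamerounds": [0, 0, 1, 1, 1, 5, 16, 250],
})
dist = players_per_round_count(toy)
never_played = int((toy["sum_gamerounds"] == 0).sum())
print(dist.loc[0], dist.loc[1], dist.loc[2], never_played)
```

Calling `dist.plot()` on the result would draw a line chart of this kind for the 0-100 range.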

Understanding why a large number of players quit the game at an early stage is crucial. Tactile Entertainment could try to collect player feedback through an in-app survey to gain more insights.


Statistical Analysis


Methodology

An A/B test was conducted to determine the impact of moving the first gate from level 30 to level 40 on player retention. Players were randomly assigned to one of two groups:

  • Control Group: Players encounter the first gate at level 30.

  • Treatment Group: Players encounter the first gate at level 40.

I measured player retention at two time windows:

  • 1-day Retention: The percentage of players who returned to the game 1 day after installation.

  • 7-day Retention: The percentage of players who returned to the game 7 days after installation.
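Because the retention columns are booleans, each group's retention rate is simply the mean of the column. The following sketch shows this on hypothetical toy data; `retention_by_version` is a name chosen here for illustration.

```python
import pandas as pd

def retention_by_version(df):
    """Mean of the boolean retention flags per A/B group, i.e. the retention rates."""
    return df.groupby("version")[["retention_1", "retention_7"]].mean()

# Toy data (illustrative only)
toy = pd.DataFrame({
    "version": ["gate_30"] * 4 + ["gate_40"] * 4,
    "retention_1": [True, True, False, False, True, False, False, False],
    "retention_7": [True, False, False, False, False, False, False, False],
})
rates = retention_by_version(toy)
print(rates)
```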

1 Day Retention Analysis

Bootstrap Method

Bootstrap is a robust statistical technique used to estimate the distribution of a statistic by resampling the original data with replacement. This method is especially valuable when the sample size is limited.

In our A/B testing project, we used the bootstrap method to estimate the distribution of the 1-day retention rates for two different game versions (gate_30 and gate_40).

I performed 10,000 bootstrap resamples to gain insights into player retention.

Steps involved:

  1. Resampling: Randomly sample with replacement from the original dataset 10,000 times, creating new samples of the same size.

  2. Compute Statistic: For each new sample, calculate the mean 1-day retention rate for both the control group (gate_30) and the test group (gate_40).

  3. Construct Distribution: Use these bootstrap samples to build the distribution of the mean 1-day retention rates, which provides insights into the variability and reliability of our estimates.

Bootstrap Method Example

import pandas as pd
import matplotlib.pyplot as plt

# df is the cleaned DataFrame described above (outlier removed)
# Creating a list with bootstrapped means for each group
boot_1d = []
for i in range(10000):
    # Resample all rows with replacement, then take the mean retention per version
    boot_mean = df.sample(frac=1, replace=True).groupby('version')['retention_1'].mean()
    boot_1d.append(boot_mean)

# Transform list into a dataframe (one row per resample, one column per version)
boot_1d = pd.DataFrame(boot_1d)

# A Kernel Density Estimate plot of the bootstrap distributions
ax = boot_1d.plot(kind='density')

# Adding labels and title
ax.set_title('Bootstrap Distributions of 1-day Retention Rates')
ax.set_xlabel('Mean 1-day Retention Rate')
ax.set_ylabel('Density')

plt.show()
        

Application in Our Project

I applied the bootstrap method to compare the 1-day retention rates between two game versions: gate_30 and gate_40. The specific steps are:

  1. Data Resampling: From the cleaned data of 90,188 players (the original 90,189 minus the removed outlier), we created 10,000 new samples by resampling with replacement.

  2. Calculate Mean: For each resampled dataset, we computed the mean 1-day retention rate for both groups.

  3. Construct Distribution: We generated the distribution of these means and visualized it using a Kernel Density Estimate (KDE) plot.

This chart shows the percentage difference in 1-day retention rates between two A/B test groups: one with the gate at level 30 and the other with the gate at level 40. I used the Bootstrap method to estimate and visualize this difference.

Key Findings:

  1. Percentage Difference: The difference between the two groups is most likely to be around 1% to 2%.

  2. Distribution Analysis: 96% of the distribution is above 0%.

Conclusion:

In 96% of the bootstrap resamples, the 1-day retention rate was higher with the gate at level 30 than with the gate at level 40.

Put plainly, players are more likely to come back the next day when the gate is set at level 30, and the bootstrap puts the probability of that advantage at about 96%.
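The percentage difference and the share of the distribution above 0% can be sketched as follows. This is a minimal illustration on synthetic data, not the original computation: it assumes the difference is expressed relative to the gate_40 rate, and the function name `bootstrap_pct_diff` and its parameters are introduced here for the example.

```python
import numpy as np
import pandas as pd

def bootstrap_pct_diff(df, column="retention_1", n_boot=1000, seed=0):
    """Bootstrap the relative difference (%) in mean retention, gate_30 vs gate_40."""
    rng = np.random.default_rng(seed)
    is_30 = (df["version"] == "gate_30").to_numpy()
    retained = df[column].to_numpy(dtype=float)
    n = len(df)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # row positions sampled with replacement
        sample_30 = retained[idx][is_30[idx]]
        sample_40 = retained[idx][~is_30[idx]]
        # Assumption: difference measured relative to the gate_40 rate
        diffs[b] = (sample_30.mean() - sample_40.mean()) / sample_40.mean() * 100
    return diffs

# Synthetic data for illustration only: 50% vs 40% retention, 2,000 players per group
toy = pd.DataFrame({
    "version": ["gate_30"] * 2000 + ["gate_40"] * 2000,
    "retention_1": [True] * 1000 + [False] * 1000 + [True] * 800 + [False] * 1200,
})
diffs = bootstrap_pct_diff(toy, n_boot=500)
share_above_zero = (diffs > 0).mean()
print(f"median difference: {np.median(diffs):.1f}%, share above 0: {share_above_zero:.2%}")
```

The share of resamples with a positive difference is the quantity reported as "percent of the distribution above 0%".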

Application of Z-Test in This Project

I will also apply the Z-test to compare the retention rates of the two groups:

  • Control Group: Players encounter the gate at level 30.

  • Treatment Group: Players encounter the gate at level 40.

Hypotheses

  • Null Hypothesis (H0): p30 − p40 ≤ 0

    • The retention rate for the gate at level 30 is less than or equal to the retention rate for the gate at level 40.

  • Alternative Hypothesis (H1): p30 − p40 > 0

    • The retention rate for the gate at level 30 is greater than the retention rate for the gate at level 40.

The test is one-sided and uses a 5% significance level (Type I error rate).

Z-Test for Proportions

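import statsmodels.api as sm
# Retained-player counts (c_*) and group sizes (n_*); groupby sorts gate_30 before gate_40
c_30, c_40 = df.groupby('version')['retention_1'].sum()
n_30, n_40 = df.groupby('version')['retention_1'].count()
# Perform a one-sided Z-test (H1: p30 > p40)
z_score, p_value = sm.stats.proportions_ztest([c_30, c_40], [n_30, n_40], alternative='larger')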
        
  • Z-score: 1.787
  • P-value: 0.037

Interpretation:

The p-value is less than 0.05, indicating that the observed difference in retention rates is statistically significant. Therefore, we reject the null hypothesis.

This result suggests that the retention rate for players encountering the gate at level 30 is significantly higher than for those encountering it at level 40.
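For readers who want to see what the test computes, here is a hand-rolled sketch of a one-sided two-proportion Z-test with a pooled standard error, which is the textbook form of this test. The function name and the toy counts are illustrative assumptions, not values from the dataset.

```python
from math import erfc, sqrt

def two_proportion_ztest_larger(c1, n1, c2, n2):
    """One-sided Z-test of H1: p1 > p2, using the pooled proportion for the standard error."""
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 0.5 * erfc(z / sqrt(2))  # upper-tail probability P(Z > z)
    return z, p_value

# Toy counts (illustrative only): 500 of 1,000 retained vs 450 of 1,000 retained
z, p = two_proportion_ztest_larger(500, 1000, 450, 1000)
print(f"z = {z:.3f}, one-sided p = {p:.4f}")
```

A larger z means the observed gap is large relative to sampling noise, and the p-value is the chance of seeing a gap at least that large if the two rates were actually equal.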

Comparison of Methods

I used two methods to analyze the 1-day retention rate, and both yielded the same result.

Interestingly, the share of the bootstrap distribution above 0% (96%) is close to the complement of the one-sided p-value (1 - 0.037 = 0.963). The two methods agree, which supports the robustness of our findings.

7 Day Retention Analysis

Application of Bootstrap

I applied the same bootstrap method used in our analysis of the 1-day retention rate to analyze the 7-day retention rate.

Distributions of 7-day Retention Rates:

  • The 7-day retention rate is higher for players encountering the gate at level 30 (around 19%) compared to those at level 40 (around 18.5%).

  • The distributions are distinct, showing a clear difference between the two groups.

Percentage Difference in 7-day Retention:

Key Findings:

  • Percentage Difference: The 7-day retention rate for players encountering the gate at level 30 is about 5% higher than those at level 40.

  • Probability Analysis: 99.93% of the distribution is above 0, indicating a higher retention rate for gate_30.

Conclusion:

In 99.93% of the bootstrap resamples, the 7-day retention rate was higher with the gate at level 30 than with the gate at level 40.

Put plainly, players are more likely to still be playing after seven days when the gate is set at level 30, and the evidence for this is stronger than it is for 1-day retention.

Both the 1-day and 7-day retention rates are significantly higher when the gate is set at level 30 compared to level 40.


Exploratory Data Analysis


Relationship and Correlation Between 1-Day Retention and 7-Day Retention

The percentages below are conditional on day-1 behavior: each pair describes what happened on day 7 within one day-1 group, so each pair sums to 100%.

For players who encounter the gate at level 30:

  • Of the players who did not return on day 1, 7.40% came back on day 7 and 92.60% did not.

  • Of the players who returned on day 1, 33.32% also returned on day 7 and 66.68% did not.

For players who encounter the gate at level 40:

  • Of the players who did not return on day 1, 6.99% came back on day 7 and 93.01% did not.

  • Of the players who returned on day 1, 32.34% also returned on day 7 and 67.66% did not.

For the total dataset:

  • Of the players who did not return on day 1, 7.19% came back on day 7 and 92.81% did not.

  • Of the players who returned on day 1, 32.83% also returned on day 7 and 67.17% did not.
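Figures of this kind are day-7 return rates computed separately for players who did and did not return on day 1. A minimal sketch on hypothetical toy data follows; the helper name `day7_rate_given_day1` is an assumption made for illustration.

```python
import pandas as pd

def day7_rate_given_day1(df):
    """Day-7 return rate (%) split by whether the player returned on day 1."""
    returned = df.loc[df["retention_1"], "retention_7"].mean() * 100
    missed = df.loc[~df["retention_1"], "retention_7"].mean() * 100
    return {"returned_day1": returned, "missed_day1": missed, "ratio": returned / missed}

# Toy data (illustrative only): 4 players returned on day 1, 8 did not
toy = pd.DataFrame({
    "retention_1": [True] * 4 + [False] * 8,
    "retention_7": [True, True, False, False] + [True] + [False] * 7,
})
result = day7_rate_given_day1(toy)
print(result)
```

To get the per-group figures, the same function can be applied to the rows of each version separately, for example `day7_rate_given_day1(df[df["version"] == "gate_30"])`.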

Conclusion

As the analysis shows, the retention rates for gate 30 are slightly better than those for gate 40. However, despite this small difference, the overall trend remains consistent: players who return after 1 day are significantly more likely to be retained on day 7, regardless of the gate level.

Specifically, players returning on day 1 are about 4.5 times as likely to return on day 7 as those who don't return on day 1 (32.83% versus 7.19% across the total dataset).

Recommendations

  1. Enhance First-Day Experience:

    Design an engaging tutorial and early levels that showcase the game's best features. Implement a "First Week Challenge" where players unlock special rewards for completing daily tasks over the first 7 days.

  2. Re-engage Lapsed Players:

    For players who don't return on day 1, send a push notification with a tempting offer, such as "Your progress is waiting! Return now to receive a free level skip and double coins for your next completed level!"

  3. Optimize Early Game Progression:

    Analyze player behavior in the first 7 days to fine-tune the difficulty curve and reward frequency. Introduce new game mechanics gradually to keep the experience fresh. Consider adding bonus levels or mini-games every few stages to break up the main gameplay and maintain interest.

  4. Implement Social Features:

    Even in a single-player game, add social elements like leaderboards, score sharing, or the ability to send and receive "lives" or power-ups from friends. Encourage players to connect their accounts to social media to compare progress with friends, fostering friendly competition.

By implementing these recommendations, we aim to leverage the strong correlation between 1-day and 7-day retention to improve overall player engagement and long-term game success.

Analysis of Game Rounds and Retention Rate

Since the number of game rounds represents data from the first week, we use the 7-day retention rate as the reference point for our analysis.

Distribution of Game Rounds

Summary statistics for the game rounds:

  • Count: 90,188

  • Mean: 51.32

  • Std: 102.68

  • Min: 0

  • 25%: 5

  • 50%: 16

  • 75%: 51

  • Max: 2,961

Based on this distribution, game rounds are categorized into:

  • '0-5'

  • '6-16'

  • '17-51'

  • '52-100'

  • '101-200'

  • '201-500'

  • '500+'

By analyzing the retention rate for each group, we found that:

  1. Positive Trend: The retention rate increases as the number of game rounds played increases.

  2. Low Engagement: Players who played '0-5' rounds have the lowest retention rate.

  3. High Engagement: Retention keeps climbing through the '52-100', '101-200', and '201-500' groups, each of which retains more players than the groups below it.

  4. Highest Retention: Players who played more than 500 rounds have the highest retention rate, close to 100%.

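Rounds played and 7-day retention are therefore strongly associated. Players who stay longer naturally accumulate more rounds, so the link is not necessarily causal, but encouraging players to play more game rounds is a plausible way to improve the 7-day retention rate.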
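The grouping described above can be sketched with `pd.cut`, using bin edges that reproduce the listed categories. The toy data and the helper name `retention_by_round_bin` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Bin edges chosen so that the right-closed intervals match the listed categories
BINS = [-1, 5, 16, 51, 100, 200, 500, np.inf]
LABELS = ["0-5", "6-16", "17-51", "52-100", "101-200", "201-500", "500+"]

def retention_by_round_bin(df, column="retention_7"):
    """Mean retention per game-round category (NaN for categories with no players)."""
    groups = pd.cut(df["sum_gamerounds"], bins=BINS, labels=LABELS)
    return df.groupby(groups, observed=False)[column].mean()

# Toy data (illustrative only)
toy = pd.DataFrame({
    "sum_gamerounds": [0, 5, 6, 16, 17, 60, 150, 300, 501, 800],
    "retention_7": [False, False, False, True, False, True, True, True, True, True],
})
rates = retention_by_round_bin(toy)
print(rates)
```

Note that a player with exactly 500 rounds falls into '201-500', while '500+' covers anything above 500.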

High Loyalty Players Analysis

I define "high loyalty players" as those who return to play the game on both the first and seventh days.

I found:

  • Players who engage in 101-200 game rounds show the highest loyalty, with the largest number of high-loyalty players.

  • The 201-500 and 52-100 game-round groups also have a strong number of loyal players.

  • Very few players in the 0-5 and 6-16 game round groups are high-loyalty players.

This suggests that players in the mid-range of game rounds (52-500) account for most of the high-loyalty players and are the most likely to keep returning.
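Counting high-loyalty players per game-round category follows directly from the definition above: flag players who returned on both days, then sum the flags within each category. The sketch below uses hypothetical toy data, and `loyal_players_per_bin` is a name introduced for the example.

```python
import numpy as np
import pandas as pd

BINS = [-1, 5, 16, 51, 100, 200, 500, np.inf]
LABELS = ["0-5", "6-16", "17-51", "52-100", "101-200", "201-500", "500+"]

def loyal_players_per_bin(df):
    """Number of players who returned on both day 1 and day 7, per game-round category."""
    loyal = df["retention_1"] & df["retention_7"]
    groups = pd.cut(df["sum_gamerounds"], bins=BINS, labels=LABELS)
    return loyal.groupby(groups, observed=False).sum()

# Toy data (illustrative only)
toy = pd.DataFrame({
    "sum_gamerounds": [2, 10, 60, 150, 150, 300],
    "retention_1": [True, False, True, True, True, True],
    "retention_7": [False, True, True, True, True, False],
})
counts = loyal_players_per_bin(toy)
print(counts)
```

Because this is a count rather than a rate, it reflects both how loyal a category is and how many players fall into it.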


Executive Summary


Key Findings and Strategic Recommendations

  1. Optimize Level Gating: Keeping the first gate at level 30 rather than moving it to level 40 yields higher player return rates. Both day 1 and day 7 retention are higher with the gate at level 30, which suggests the earlier gate gives players a more balanced gaming experience.

  2. Prioritize Early Experience: Data shows that players who return on day 1 are about 4.5 times as likely to be retained on day 7 as those who don't. This highlights the critical importance of optimizing the game experience from day 1 to day 7 to establish player habits.

  3. Focus on Mid-game Players: Analysis indicates that players who complete a moderate number of game rounds (52-500) are most likely to continue playing. This phase is crucial for building player loyalty, and we should design more engaging content and reward mechanisms for these players.

