https://ift.tt/Agw8CtV A more customized approach to event analytics beyond Firebase and Google Analytics Photo by ROBIN WORRALL on Uns...
A more customized approach to event analytics beyond Firebase and Google Analytics
If you are operating an Android app business or are working as an analyst for an app business you probably use Google Analytics and Firebase on a regular basis. These tools help you monitor app performance and give you insight into your app users as well.
However, not all questions can be answered by GA and Firebase, and if they do, they may not perfectly capture the way that you want to analyze and present your data. In cases like these, it may be worth exporting the historical data to a data warehouse, and then you should be able to do more advanced analytics yourself. And since BigQuery belongs to the same Google ecosystem as GA and Firebase, the export is much more seamless.
This article talks about sample business questions that are answered by running simple queries in BigQuery. The prerequisites are some basic SQL skills, a Google Cloud account, and a Firebase-connected app; then you are ready to make these queries yourself!
The first step is to export Firebase data into BigQuery. Once you have done this once, the events data automatically flows to your data warehouse ready for data wrangling and analysis. To learn more about how to enable Firebase to BigQuery export, you may refer to this documentation.
The dataset
For this demo, I’ll be using sample gaming app data from Flood-it!, a puzzle game that is available on both Android and iOS platforms. It is an obfuscated dataset that mimics what an actual implementation in GA and Firebase would look like.
You may check this link for more information on the dataset and its schema.
Just a note on the user identifier: “user_pseudo_id” was used to identify individual users in this demo data. If you want to associate user data across multiple apps, devices, and analytics providers, you should set a “user_id”.
Now we are prepared to analyze the data.
Sample user experience-related questions that Firebase and BigQuery can answer
How many users have installed the app?
A query like this can reveal how many users have installed the app within a specific period of time:
SELECT COUNT(DISTINCT user_pseudo_id) as users_installed
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = "first_open"
AND _TABLE_SUFFIX BETWEEN '20180927' and '20181003'
As the tables come in an intraday format (one table for each day), you can combine several tables in one query using the wildcard symbol “*”. Be extremely careful using this code though, as it may be very expensive to execute especially if you have a lot of data. Remember to always use “_TABLE_SUFFIX” to define the date boundaries and limit the data.
By running this code we saw that there are 321 unique users installing the app within the period specified.
To have a daily breakdown of users installing, you can also run this code:
SELECT
FORMAT_DATE('%Y-%m-%d', PARSE_DATE('%Y%m%d', event_date)) AS date,
COUNT(DISTINCT user_pseudo_id) as users_installed
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = "first_open"
AND _TABLE_SUFFIX BETWEEN '20180927' and '20181003'
GROUP BY date
ORDER BY date
This query produces these results:
What are the demographics of the users?
Some demographic data is available in Firebase, such as location and device information.
For example, this query gives us the top 10 countries where our users are located, with corresponding percentages:
WITH
--Compute for the numerators
country_counts AS (
SELECT
geo.country,
COUNT(DISTINCT user_pseudo_id) AS users
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = "first_open"
AND _TABLE_SUFFIX BETWEEN '20180927' and '20181003'
AND geo.country <> ""
GROUP BY geo.country
),
--Compute for the denominators
user_counts AS (
SELECT
COUNT(DISTINCT user_pseudo_id)
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = "first_open"
AND _TABLE_SUFFIX BETWEEN '20180927' and '20181003'
),
--Compute for the percentages
percent AS (
SELECT
country,
ROUND(users / (SELECT * FROM user_counts), 4) AS percent_users
FROM country_counts
)
SELECT * FROM percent
ORDER BY percent_users DESC
LIMIT 10
We get this result from running the code:
From this, we know that 2 out of 5 of our user base come from the US, while India has also a good representation at 15%.
We can also get some insights as to what type of device they use when playing the game (mobile or tablet) by running this query:
WITH
device_counts AS (
SELECT
device.category,
COUNT(DISTINCT user_pseudo_id) AS users
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = "first_open"
AND _TABLE_SUFFIX BETWEEN '20180927' and '20181003'
AND device.category <> ""
GROUP BY device.category
),
user_counts AS (
SELECT
COUNT(DISTINCT user_pseudo_id)
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = "first_open"
AND _TABLE_SUFFIX BETWEEN '20180927' and '20181003'
),
percent AS (
SELECT
category,
ROUND(users / (SELECT * FROM user_counts), 4) AS percent_users
FROM device_counts
)
SELECT * FROM percent
ORDER BY percent_users DESC
From this example, we see that 81% are using mobile phones, while only 19% use tablets.
How many users are actively using the app on a daily basis?
The daily active users (DAU) metric gives an insight as to how engaged your current app users are. How it is measured depends on how you define an active user. In this example, we define an active user as a user who did any “user_engagement” action in the day.
To compute for DAU for a specified time period, simply run this query:
WITH
daily_user_count AS (
SELECT
FORMAT_DATE('%Y-%m-%d', PARSE_DATE('%Y%m%d', event_date)) AS date,
COUNT(DISTINCT user_pseudo_id) AS active_users
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = "user_engagement"
AND _TABLE_SUFFIX BETWEEN '20180901' and '20180930'
GROUP BY date
)
SELECT AVG(active_users) AS daily_active_users
FROM daily_user_count
On average, 496 users engaged with the app during the month of September.
How much are users spending on the app?
We can also get insight into how much users spend on in-app purchases by running this query:
SELECT SUM(user_ltv.revenue) AS revenue
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = "in_app_purchase"
AND geo.country = "United States"
AND _TABLE_SUFFIX BETWEEN '20180901' and '20180930'
That’s just 5.94 USD for users in the US— not a lot, but knowing this we can formulate better strategies to upsell in-app purchases.
How many users are encountering app crashes?
To prevent churn and uninstalls it is necessary to monitor unpleasant user experiences such as app crashes.
We can gain insight into this by querying the “app_exception” event:
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `firebase-public-project.analytics_153293282.events_*`,
UNNEST(event_params) e
WHERE event_name = 'app_exception'
AND _TABLE_SUFFIX BETWEEN '20180901' and '20180930'
AND e.key = 'fatal' AND e.value.int_value = 1
Notice the UNNEST() function. Event parameters are stored as arrays in BigQuery, hence the need to unnest it first. Check out this article for a more in-depth discussion on unnesting event parameters in BigQuery.
From this query, we have determined that 269 users have experienced crashes during the month of September.
How many users are uninstalling the app?
App removal is an undesirable outcome and therefore should be monitored regularly. The corresponding event name for this in Firebase is “app_remove”.
The following query outputs the percentage of users who still have the app after a week, among the cohort who installed it in September:
WITH
--List of users who installed in Sept
sept_cohort AS (
SELECT DISTINCT user_pseudo_id,
FORMAT_DATE('%Y-%m-%d', PARSE_DATE('%Y%m%d', event_date)) AS date_first_open,
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = 'first_open'
AND _TABLE_SUFFIX BETWEEN '20180901' and '20180930'
),
--Get the list of users who uninstalled
uninstallers AS (
SELECT DISTINCT user_pseudo_id,
FORMAT_DATE('%Y-%m-%d', PARSE_DATE('%Y%m%d', event_date)) AS date_app_remove,
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = 'app_remove'
AND _TABLE_SUFFIX BETWEEN '20180901' and '20181007'
),
--Join the 2 tables and compute for # of days to uninstall
joined AS (
SELECT a.*,
b.date_app_remove,
DATE_DIFF(DATE(b.date_app_remove), DATE(a.date_first_open), DAY) AS days_to_uninstall
FROM sept_cohort a
LEFT JOIN uninstallers b
ON a.user_pseudo_id = b.user_pseudo_id
)
--Compute for the percentage
SELECT
COUNT(DISTINCT
CASE WHEN days_to_uninstall > 7 OR days_to_uninstall IS NULL THEN user_pseudo_id END) /
COUNT(DISTINCT user_pseudo_id)
AS percent_users_7_days
FROM joined
This tells us that the 7-day retention rate among September installers is at 76%, or 24% of the September cohort has uninstalled the app.
How good is this number? This can be answered by doing some research on industry benchmarks, or by monitoring how the trend progresses over time.
You may also change the timeframe of the retention rate to whatever is most useful by editing the final “CASE WHEN” condition to the number of days you want instead of 7 days.
Are crashes possibly affecting the user experience, causing them to uninstall?
We can further drill-down among those who uninstalled the app to determine probable reasons for removing it. As we have access to several events, we can look at them and check correlations with uninstalls.
For example, we may want to check how many of the uninstallers have encountered crashes before they uninstall. We can’t necessary conclude that this caused their frustration and led them to remove the app, but just looking at the numbers may help. A query like this can be ran:
WITH
--List of users who installed in Sept
sept_cohort AS (
SELECT DISTINCT user_pseudo_id,
FORMAT_DATE('%Y-%m-%d', PARSE_DATE('%Y%m%d', event_date)) AS date_first_open,
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = 'first_open'
AND _TABLE_SUFFIX BETWEEN '20180901' and '20180930'
),
--Get the list of users who uninstalled
uninstallers AS (
SELECT DISTINCT user_pseudo_id,
FORMAT_DATE('%Y-%m-%d', PARSE_DATE('%Y%m%d', event_date)) AS date_app_remove,
FROM `firebase-public-project.analytics_153293282.events_*`
WHERE event_name = 'app_remove'
AND _TABLE_SUFFIX BETWEEN '20180901' and '20181007'
),
--Get the list of users who experienced crashes
users_crashes AS (
SELECT DISTINCT user_pseudo_id,
FORMAT_DATE('%Y-%m-%d', PARSE_DATE('%Y%m%d', event_date)) AS date_crash,
FROM `firebase-public-project.analytics_153293282.events_*`,
UNNEST(event_params) e
WHERE event_name = 'app_exception'
AND _TABLE_SUFFIX BETWEEN '20180901' and '20181007'
AND e.key = 'fatal' AND e.value.int_value = 1
),
--Join the 3 tables
joined AS (
SELECT a.*,
b.date_app_remove,
DATE_DIFF(DATE(b.date_app_remove), DATE(a.date_first_open), DAY) AS days_to_uninstall,
c.date_crash
FROM sept_cohort a
LEFT JOIN uninstallers b
ON a.user_pseudo_id = b.user_pseudo_id
LEFT JOIN users_crashes c
ON a.user_pseudo_id = c.user_pseudo_id
)
--Compute the percentage
SELECT
COUNT(DISTINCT
CASE WHEN days_to_uninstall <= 7 AND date_crash IS NOT NULL
THEN user_pseudo_id END)
/ COUNT(DISTINCT
CASE WHEN days_to_uninstall <= 7 THEN user_pseudo_id END)
AS percent_users_crashes
FROM joined
This reveals that only 2.4% of the uninstallers experienced a crash. A small number, it seems, but it may be useful to compare it versus non-installers to make a more reasonable conclusion. Furthermore, we can create an app removal rate model to determine the predictors of app removal. I won’t cover these things in this article as it’s beyond the scope, but I recommend it as a powerful next step.
Conclusion
In this article, we have discovered the enormous possibilities that BigQuery can bring about in your app analytics. We touched on some simple queries, but as you explore more about the available data in Firebase, I encourage you to dig deeper and see for yourself how this powerful resource can help you grow your app business.
If you liked this article or have found it useful, please follow my blog for more resources on marketing analytics and data science. Happy learning!
References:
BigQuery sample dataset for Google Analytics 4 gaming app implementation. Google Analytics Sample Datasets. Available online: https://developers.google.com/analytics/bigquery/app-gaming-demo-dataset. License information: Creative Commons Attribution 4.0 License
Know More about Your App Users through BigQuery was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
from Towards Data Science - Medium https://ift.tt/ciW4KQr
via RiYo Analytics
No comments