Google Data Analytics Capstone: Bellabeat Case Study

Tamyris Gimenez
13 min read · Jan 30, 2022

How can a wellness technology company play it smart?

Photo by Denys Nevozhai on Unsplash

INTRODUCTION

This capstone is part of the eighth and final course of the Google Data Analytics Professional Certificate, where you have the opportunity to complete an optional case study. In this project, we will apply the six steps of the data analysis process taught in the program: Ask, Prepare, Process, Analyze, Share, and Act. The project was done with Python, and it is posted in its entirety on Kaggle. If you wish to check it out, you can access it here or on GitHub.

ABOUT THE COMPANY

Bellabeat is a high-tech company that manufactures health-focused smart products. Founded in 2013 by artist Urška Sršen and mathematician Sando Mur, the company has many products carefully designed to monitor activity, stress, sleep, and reproductive data to help women better understand how their bodies work and make healthier choices. While small in size, Bellabeat has quickly positioned itself as a tech-driven wellness company for women. Chief Creative Officer, Urška Sršen, believes that analyzing smart device fitness data could help unlock new growth opportunities for the company.

BUSINESS TASK

Focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers use non-Bellabeat smart devices.

PRODUCT

Bellabeat app: The Bellabeat app provides users with health data related to their activity, sleep, stress, menstrual cycle, and mindfulness habits. This data can help users better understand their current habits and make healthy decisions. The Bellabeat app connects to their line of smart wellness products.

STAKEHOLDERS

  • Urška Sršen — Bellabeat’s co-founder and Chief Creative Officer.
  • Sando Mur — Mathematician and Bellabeat’s cofounder; a key member of the Bellabeat executive team.
  • Bellabeat marketing analytics team — A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Bellabeat’s marketing strategy.

ASK

  • What are some trends in smart device usage?
  • How could these trends apply to Bellabeat customers?
  • How could these trends help influence Bellabeat marketing strategy?

PREPARE

Urška Sršen encourages the use of public data that explores smart device users’ daily habits. Therefore, the dataset used in this project will be the FitBit Fitness Tracker Data (CC0: Public Domain, dataset made available through Möbius).

About the data
According to its source, the dataset was generated by respondents to a survey distributed via Amazon Mechanical Turk and covers dates between 03–12–2016 and 05–12–2016. Thirty FitBit users consented to the submission of their personal tracker data, which includes minute-level output for physical activity, heart rate, and sleep monitoring. Variation between output represents the use of different types of FitBit trackers and individual tracking behaviors/preferences.

This public dataset comprises 18 CSV files, each containing specific tracking information, such as daily calories, daily steps, etc. The data is organized in the long format, where an Id column identifies each user, and the remaining columns contain different attributes about that user.

In regard to data bias, considering that the dataset only gathers information on 33 FitBit users (the source description mentions thirty, while the data files contain 33 distinct Ids) over the span of two months, it is important to keep in mind that the data may not be representative of all FitBit users. However, we can still analyze the dataset and learn interesting information about the surveyed users and how they used their smart tracking devices over that time period.

PROCESS

I began this phase by loading the libraries and datasets that were going to be used. Then, I performed data exploration by getting an overview of the datasets, checking the data types, gathering a statistics summary, as well as cleaning the data.
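
A minimal sketch of what such an exploration-and-cleaning pass could look like, assuming pandas. The small inline table stands in for one of the CSV files (a real run would load it with `pd.read_csv`), its rows are made up for illustration, and the column names `Id`, `ActivityDate`, and `TotalSteps` are assumed to match the daily activity file.

```python
import pandas as pd

# Made-up rows standing in for a CSV file; a real run would use pd.read_csv(...).
raw = pd.DataFrame(
    {
        "Id": [1, 1, 1, 2],
        "ActivityDate": ["4/12/2016", "4/13/2016", "4/13/2016", "4/12/2016"],
        "TotalSteps": [13162, 10735, 10735, 8163],
    }
)

# Overview: column types and summary statistics.
print(raw.dtypes)
print(raw.describe())

# Cleaning: count missing values, drop exact duplicate rows, parse the dates.
missing_per_column = raw.isna().sum()
clean = raw.drop_duplicates().copy()
clean["ActivityDate"] = pd.to_datetime(clean["ActivityDate"], format="%m/%d/%Y")

print(missing_per_column)
print(clean)
```

Parsing the date column early makes later steps, such as deriving the day of the week, a one-line operation.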

ANALYZE

In the Analyze phase, I performed data transformation by creating a DayOfWeek column, changing data formatting, and merging datasets in order to perform analysis. I also aggregated and grouped data to do calculations and find key information, such as the average of total steps by day of the week, the average number of daily steps taken by the user, the most active time of day, etc. I divided the merged datasets into two parts: Activity Data and Sleep Data. In the Activity Data part, I analyzed the following variables:

The Average of Total Steps and Calories
How many steps do our users take daily? How many calories do they burn on average?

As we can see, the users take an average of 7,638 steps per day. According to a study published in 2011 by BMC/BioMed Central, taking 10,000 steps a day is a reasonable target for healthy adults, helping reduce the risk of certain health conditions, such as high blood pressure and heart disease. In order to compare daily steps to an activity level, the following categories can be considered:

  • Inactive — Less than 5,000 steps/day
  • Average — Between 7,500 and 9,999 steps/day
  • Very Active — More than 12,500 steps/day

Given the information above, we can conclude that our users fall into the Average category. With this in mind, it becomes clear that the users should increase their number of daily steps for optimal results regarding their health and well-being.
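
The three categories listed above can be expressed as a small helper. This is only a sketch: the function name `activity_level` is hypothetical, and step counts that fall outside the three listed ranges are returned as "Uncategorized" rather than given a label the list does not provide.

```python
def activity_level(daily_steps: float) -> str:
    """Map a daily step count to one of the three categories listed above."""
    if daily_steps < 5000:
        return "Inactive"
    if 7500 <= daily_steps <= 9999:
        return "Average"
    if daily_steps > 12500:
        return "Very Active"
    # The list leaves these ranges unnamed, so no label is invented here.
    return "Uncategorized"


print(activity_level(7638))
```

With the reported average of 7,638 steps, the helper returns the Average category.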

The Average of Total Steps by Day of the Week
Are the users consistent with the number of total steps taken throughout the week? Or are there significant changes as the days go by?
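
One way to answer this, sketched with pandas: derive a DayOfWeek column from the date and average the steps per day name. The rows are made up for illustration, and the column names `ActivityDate` and `TotalSteps` are assumed to match the daily activity file.

```python
import pandas as pd

# Made-up daily rows; 4/16/2016 was a Saturday and 4/17/2016 a Sunday.
daily = pd.DataFrame(
    {
        "Id": [1, 1, 2, 2],
        "ActivityDate": ["4/16/2016", "4/17/2016", "4/16/2016", "4/17/2016"],
        "TotalSteps": [12000, 6000, 10000, 4000],
    }
)

# Parse the dates, then derive the day name for each row.
daily["ActivityDate"] = pd.to_datetime(daily["ActivityDate"], format="%m/%d/%Y")
daily["DayOfWeek"] = daily["ActivityDate"].dt.day_name()

# Average of total steps for each day of the week.
steps_by_day = daily.groupby("DayOfWeek")["TotalSteps"].mean()
print(steps_by_day)
```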

The Most Active Time of Day
What are the most active hours of the users’ day on average? Are they occurring during the day or at night?

Time
0 42.188437
1 23.102894
2 17.110397
3 6.426581
4 12.699571
5 43.869099
6 178.508056
7 306.049409
8 427.544576
9 433.301826
10 481.665231
11 456.886731
12 548.642082
13 537.698154
14 540.513572
15 406.319126
16 496.845645
17 550.232892
18 599.169978
19 583.390728
20 353.905077
21 308.138122
22 237.987832
23 122.132890
Name: StepTotal, dtype: float64

I grouped the data by the hour of the day, extracted from the Time column, and averaged the StepTotal column for each hour.
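
A sketch of how an hourly average like the one above can be produced with pandas, assuming the `Time` column has already been parsed as datetimes. The rows are made up for illustration.

```python
import pandas as pd

# Made-up hourly rows; "Time" and "StepTotal" are the column names used above.
hourly = pd.DataFrame(
    {
        "Time": pd.to_datetime(
            [
                "2016-04-12 08:00:00",
                "2016-04-13 08:00:00",
                "2016-04-12 18:00:00",
                "2016-04-13 18:00:00",
            ]
        ),
        "StepTotal": [400, 500, 700, 500],
    }
)

# Group by the hour of day (0-23) and average the step totals per hour.
avg_steps_by_hour = hourly.groupby(hourly["Time"].dt.hour)["StepTotal"].mean()
print(avg_steps_by_hour)
```

Grouping on `hourly["Time"].dt.hour` keeps the index name `Time`, which matches the shape of the output shown above.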

The Time of Day Users Burn the Most Calories
When are the users burning the most calories? In the morning, afternoon, or night?

Time
0 71.805139
1 70.165059
2 69.186495
3 67.538049
4 68.261803
5 81.708155
6 86.996778
7 94.477981
8 103.337272
9 106.142857
10 110.460710
11 109.806904
12 117.197397
13 115.309446
14 115.732899
15 106.637158
16 113.327453
17 122.752759
18 123.492274
19 121.484547
20 102.357616
21 96.056354
22 88.265487
23 77.593577
Name: Calories, dtype: float64

I grouped the data by the hour of the day, extracted from the Time column, and averaged the Calories column for each hour.
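
To read the peak and the overnight low straight from these averages, a short plain-Python sketch is enough. The values are copied from the output above and rounded to one decimal; the variable names are hypothetical.

```python
# Hourly calorie averages copied from the output above, rounded to one decimal.
avg_calories_by_hour = {
    0: 71.8, 1: 70.2, 2: 69.2, 3: 67.5, 4: 68.3, 5: 81.7,
    6: 87.0, 7: 94.5, 8: 103.3, 9: 106.1, 10: 110.5, 11: 109.8,
    12: 117.2, 13: 115.3, 14: 115.7, 15: 106.6, 16: 113.3, 17: 122.8,
    18: 123.5, 19: 121.5, 20: 102.4, 21: 96.1, 22: 88.3, 23: 77.6,
}

# Hour with the highest and the lowest average calorie burn.
peak_hour = max(avg_calories_by_hour, key=avg_calories_by_hour.get)
lowest_hour = min(avg_calories_by_hour, key=avg_calories_by_hour.get)

print(peak_hour, avg_calories_by_hour[peak_hour])
print(lowest_hour, avg_calories_by_hour[lowest_hour])
```

The peak falls in the 6 pm hour and the low in the 3 am hour.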

Analyzing the Different Categories of Activity
How active are the users throughout the day? Do they spend a significant portion of their day being very active or, perhaps, not active enough?

If we take a closer look into the average time spent on the different categories of active minutes, we can see that the amount of time the users spend being sedentary is quite significant. If we do the math, we learn that the users are Sedentary for, on average, 16.5 hours of their day. They spend about 3.2 hours of their day being Lightly Active, about 14 minutes being Fairly Active, and about 21 minutes being Very Active.

The Average of Minutes Spent in Each Activity Category
Let’s analyze how our users are spending their day on average.

VeryActiveMinutes 21.265822
FairlyActiveMinutes 13.660436
LightlyActiveMinutes 194.197545
SedentaryMinutes 992.791446
dtype: float64
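
The conversion from these averages to hours and to a share of tracked time can be checked with a few lines of plain Python. The numbers are copied from the output above; the variable names are hypothetical.

```python
# Average minutes per category, copied from the output above.
avg_minutes = {
    "VeryActiveMinutes": 21.265822,
    "FairlyActiveMinutes": 13.660436,
    "LightlyActiveMinutes": 194.197545,
    "SedentaryMinutes": 992.791446,
}

total_minutes = sum(avg_minutes.values())

# Convert each average to hours and to a percentage of all tracked minutes.
for category, minutes in avg_minutes.items():
    hours = minutes / 60
    share = 100 * minutes / total_minutes
    print(f"{category}: {minutes:.1f} min = {hours:.2f} h ({share:.1f}% of tracked time)")

sedentary_share = 100 * avg_minutes["SedentaryMinutes"] / total_minutes
```

Sedentary time works out to about 16.5 hours, or roughly 81% of the tracked minutes. Very Active time is about 21 minutes (0.35 hours) and Fairly Active time about 14 minutes (0.23 hours).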

In the Sleep Data part, I analyzed some of the users’ sleeping habits.

The Average of Total Minutes Asleep and Total Time in Bed

According to the CDC, an adult between 18–60 years old should get 7 or more hours of sleep per night. By analyzing the TotalMinutesAsleep column, we learn that our users’ average sleeping time is 419.5 minutes (about 7 hours). We can, therefore, conclude that, on average, they meet this recommendation.

The Average of Minutes Asleep by Day of the Week
We can group the data to find out how much sleep, on average, the users are getting throughout the week.

The Average Awake Time in Bed by Day of the Week
Now, let’s take a look at how much time the users are spending in bed without being asleep throughout the week.

If we take the mean of the AwakeTimeInBed column, we learn that the users spent, on average, 39.5 minutes awake in bed.
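
A sketch of how such a column could be built and averaged with pandas. It assumes that AwakeTimeInBed is the difference between TotalTimeInBed and TotalMinutesAsleep, which the text does not spell out. The rows are made up for illustration, and the `SleepDay` column name is an assumption about the sleep file.

```python
import pandas as pd

# Made-up sleep rows; 2016-04-16 was a Saturday.
sleep = pd.DataFrame(
    {
        "SleepDay": pd.to_datetime(["2016-04-16", "2016-04-17", "2016-04-18"]),
        "TotalMinutesAsleep": [400, 450, 410],
        "TotalTimeInBed": [440, 500, 440],
    }
)

# Assumed definition: awake time is the gap between time in bed and time asleep.
sleep["AwakeTimeInBed"] = sleep["TotalTimeInBed"] - sleep["TotalMinutesAsleep"]
sleep["DayOfWeek"] = sleep["SleepDay"].dt.day_name()

# Average awake time per day of the week, then the overall mean.
awake_by_day = sleep.groupby("DayOfWeek")["AwakeTimeInBed"].mean()
overall_awake = sleep["AwakeTimeInBed"].mean()

print(awake_by_day)
print(overall_awake)
```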

SHARE

The Share phase is all about visualizing the insights found in the Analyze phase. I started by plotting a scatter plot graph to identify the relationship between the total number of steps and the total number of burned calories.

Visualizing Total Steps and Calories

As we can see on the scatter plot above, there is a positive relationship between the two variables, which indicates that the greater the number of steps taken, the more calories the user burns. As mentioned above, keeping active is crucial for maintaining good health, and the number of steps the individual takes daily has a significant impact on that.
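
The strength of a relationship like this can also be summarized with a Pearson correlation coefficient. The sketch below uses pandas and made-up rows purely for illustration; they are not values from the dataset, and the column names are assumed to match the daily activity file.

```python
import pandas as pd

# Made-up daily rows for illustration; not values from the dataset.
daily = pd.DataFrame(
    {
        "TotalSteps": [2000, 5000, 8000, 11000, 15000],
        "Calories": [1700, 2000, 2250, 2600, 3000],
    }
)

# Pearson correlation: values near +1 indicate a strong positive linear relationship.
r = daily["TotalSteps"].corr(daily["Calories"])
print(round(r, 3))
```

A positive coefficient supports the reading of the scatter plot, although correlation alone does not show that the extra steps cause the extra calories burned.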

Visualizing the Average of Total Steps by Day of the Week

This graph shows us the average total steps taken by the users throughout the week. We can see that the most active day in terms of steps taken is Saturday, followed by Tuesday. The least active day is Sunday — most likely a rest day for the users. This visualization shows us that the users seem to be consistent with their total steps scores.

Visualizing the Most Active Time of Day

This chart tells us that the users tend to start becoming more active between 6 and 8 am. The level of activity doesn’t change dramatically throughout the day, but the peak hours seem to occur between 5 and 7 pm. Our users are likely choosing to work out after work/school hours.

Note that their levels of activity start decreasing significantly at 8 pm and keep going down as the hours progress. The users seem to go to bed at a reasonable time, as lower levels of activity are recorded at late hours of the night.

Visualizing the Time of Day Users Burn the Most Calories
We learned from a previous visualization that there is a positive correlation between the total amount of steps walked and burned calories. Let’s see if the burned calories are fluctuating as much as the activity levels.

According to the Sleep Foundation, we burn around 50 calories an hour while we sleep, and the graph above shows a comparable overnight baseline of roughly 67 to 72 calories per hour between midnight and 4 am. Observe that as the users wake up and start moving, the number of burned calories increases, peaking around the same time that the users are most active during the day.

Visualizing the Average of Minutes Spent in Each Activity Category
We can visualize the distribution of time in percentages by plotting the following pie chart:

The Department of Health and Human Services recommends at least 30 minutes of moderate daily physical activity. It is also said that reducing sitting time is important to lower one’s risk of developing metabolic problems. We can conclude from our data that, even though our users are fairly or very active for about 35 minutes every day (roughly 14 and 21 minutes, respectively), they are still spending an enormous amount of time being sedentary, which could result in future health issues.

Visualizing Total Minutes Asleep x Total Time in Bed
We can use a scatter plot to better visualize the relationship between TotalMinutesAsleep and TotalTimeInBed:

The scatter plot shows a strong positive linear association between the total number of minutes asleep and the total time spent in bed, which means that most users are usually in bed only when they are sleeping, and not much longer before or after that. However, we can still see that, at times, they do spend a larger amount of time in bed without being asleep. This could be related to the weekends when many people choose to sleep in or relax.

Visualizing the Average of Minutes Asleep by Day of the Week

In the column chart above, we can see that the average Total Minutes Asleep goes above the 400-minute mark, at about 420 minutes, with Sunday (7.6 hours) and Wednesday (7.2 hours) being the days when the users seem to have slept the most. It is clear from this visualization that there are no significant changes in sleeping time throughout the week. This information shows us that the users have a consistent sleeping schedule.

By analyzing the chart above, we can see that Sunday records the highest average Total Minutes Asleep. If we go back to the “Average of Total Steps by Day of the Week” graph, we learn that Sunday also recorded the lowest average total steps of the week, showing us that Sunday is likely the users’ choice of a rest day.

Visualizing Awake Time in Bed by Day of the Week

As we can see, the users are very consistent with their time in bed throughout the days of the week, and the same goes for the time they are awake in bed. Of their total time in bed, they spend, on average, 39.5 minutes awake. The longest recorded times occurred on the weekend, which is perfectly understandable.

ACT

Now, it is time to present the key findings and recommendations.

KEY FINDINGS

  • The average user takes 7,638 steps and burns 2,304 calories per day.
  • There is a positive relationship between the total number of steps and the total number of burned calories.
  • The users seem to be consistent with their total steps scores throughout the week. The most active day is Saturday, and the least active is Sunday.
  • The users start their day between 6 am and 8 am. They are most active between 5 pm and 7 pm, and become less active at 8 pm.
  • The highest number of burned calories occurs between 5 pm and 7 pm, when the users are most active.
  • Although the average user is fairly or very active for about 35 minutes every day, they still spend 81% of their tracked time being sedentary.
  • There is a strong, positive relationship between the total number of minutes asleep and the total time spent in bed, with users only spending an average of 39.5 minutes of their total time in bed being awake.
  • The users have a consistent sleeping schedule, with an average sleeping time of about 420 minutes (or 7 hours) per night, and Sunday (7.6 hours) is the day when the users seem to have slept the most.
  • Recording the lowest number of steps and the highest number of minutes asleep, Sunday is likely a rest day chosen by the users.

RECOMMENDATIONS

The following recommendations were carefully created to help guide Bellabeat’s marketing strategy:

  • Personalized Notifications to Promote Activity: The users’ average of total steps is 7,638, a mark well below the 10,000 daily steps cited earlier as a reasonable target for healthy adults. In addition, our analysis has shown that the average user spent about 81% of their day being sedentary. Bellabeat could incorporate personalized notifications on its app to motivate users to keep moving throughout the day. Such notifications could include real-time information regarding the number of steps taken so far, or even the number of steps left in order to reach the daily goal.
  • Dynamic Calorie Counter: The app could also provide the user with an elegantly designed, easy-to-use interface that displays the number of calories that are being burned throughout the day to improve motivation. The user could also have the option to set their customized daily calories goal and be able to follow their progress throughout the day.
  • Detailed Sleeping Log: The average user has a consistent sleeping schedule, but those who may want to improve the quality of their sleep — or simply keep track of it — could benefit from a sleeping log. The app could offer this feature and record sleep quality, the number of times one wakes up during the night, the total amount of awake time in bed, anxiety, and/or stress levels.
  • Weekly and Monthly Achievement Reports: To keep the users motivated, the Bellabeat app could provide customized weekly and monthly reports regarding the total number of steps, burned calories, sleeping habits, weight loss, and total time spent on the different activity levels. The app could send congratulatory messages to those who keep up with good habits, as well as motivational tips for improvement depending on the user’s overall performance.
  • Meditation and Relaxation Services: The Bellabeat app could also offer meditation and relaxation tips and services — either for free or on a premium basis — to those who are looking to improve their sleep quality and/or reduce stress and anxiety levels. By tracking the time of day when the user decreases their activity levels, their rest days, or perhaps around bedtime, the app could send notifications to the person’s phone or smart device and suggest different meditation or relaxation techniques.
  • Discounts on Other Bellabeat Wellness Products and Services: Another way to keep customers motivated is to offer special discounts on the different Bellabeat products, as well as their premium membership. This way, the users could become more inclined to get more active and purchase more products from the company.

Tamyris Gimenez

Information Systems student. Passionate about data science, history, and genealogy.