Case Study

This Chart, from Home Depot, Dramatically Demonstrates the Power of a Cloud Data Warehouse



When Home Depot moved its gigantic enterprise data warehouse to Google Cloud, it could not have imagined how much faster it could crunch data for a variety of use cases.

The Home Depot (THD) is the world’s largest home-improvement chain, growing to more than 2,200 stores and 700,000 products in four decades. Much of that success was driven through the analysis of data. This included developing sales forecasts, replenishing inventory through the supply chain network, and providing timely performance scorecards.

However, to compete in today’s business world, THD has taken this data-driven approach to an entirely new level of success on Google Cloud, providing capabilities not practical on legacy technologies.

The Home Depot BigQuery installation performance table
Percent reduction in time that specific workloads took using BigQuery versus on-premises data warehousing.
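
As a rough illustration of the metric in the caption above, a percent reduction in workload time can be computed from a before and after runtime. This is a minimal sketch: the function name and the timings are hypothetical and are not figures from The Home Depot's chart.

```python
def percent_reduction(before_seconds: float, after_seconds: float) -> float:
    """Return the percent reduction in runtime going from before to after."""
    if before_seconds <= 0:
        raise ValueError("before_seconds must be positive")
    return (before_seconds - after_seconds) / before_seconds * 100.0


# Hypothetical timings for one workload (assumed values, not from the source):
# one hour on the legacy warehouse versus three minutes after migration.
example = percent_reduction(before_seconds=3600.0, after_seconds=180.0)
print(f"{example:.1f}% reduction")
```

A value close to 100 means the workload now takes a small fraction of its original time, while a value of 0 means no change.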

The pressures of contemporary growth that drove much of the work are familiar to many businesses. In addition to everything it was doing, THD needed to better integrate the complexities in its related businesses, like tool rental and home services. It needed to better empower teams, including a fast-growing data analysis staff and store associates with mobile computing devices. It wanted to better use online commerce and artificial intelligence to meet customer needs, while maintaining better security.

Even before addressing these new challenges, THD’s existing on-premises data warehouse was under stress as analytics required more data and analysts applied that data to increasingly complex use cases. This drove rapid growth of the data warehouse, but also created constant challenges for the team in managing priorities, performance, and cost.

Adding capacity to the environment was a major planning, architecture, and testing effort. In one case, adding on-premises capacity took six months of planning and a three-day service outage. Within a year, capacity was again scarce, hurting performance and the ability to run all the required reporting and analytics workloads. The capacity refresh cycles were shrinking, and the expectations for data were growing. There had to be a better way.

Still, THD did not take its move to the cloud lightly. A large-scale enterprise data warehouse migration involves tremendous effort among people, process, and technology. After careful consideration, THD chose Google Cloud’s BigQuery for its cloud enterprise data warehouse.

BigQuery, a scalable serverless data warehouse, was better on cost, infrastructure agility, and analytics capability, driving better insights with improved performance. There are no service interruptions when capacity is added, and that capacity can be added within a week (and soon same day). It doesn’t require complex system administration, and its standard SQL support means people can ramp up quickly. Google Cloud capabilities like Identity and Access Management (IAM) meant THD could create many separate Google Cloud projects, while ensuring that different teams weren’t interfering with each other or accessing protected data.

THD also utilizes BigQuery’s flat-rate monthly pricing model that allows teams to budget their capacity based on need and provides billing predictability. The capacity not being used by a given project is available for enterprise use. This ensures no surprises when the monthly bill arrives and provides all analytical users access to significant computing power.

While THD’s legacy data warehouse contained 450 terabytes of data, the BigQuery enterprise data warehouse has over 15 petabytes. That means better decision-making by utilizing new datasets like website clickstream data and by analyzing additional years of data.

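As for performance, the chart above shows the percent reduction in time that specific workloads took on BigQuery versus the on-premises data warehouse.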

With the cloud EDW migration complete, and the legacy on-premises data warehouse retired, analysts now execute more complex and demanding workloads that they could not have completed before. These include using Datalab to orchestrate analytics through Python notebooks, using BigQuery ML for machine learning directly against the BigQuery data (no movement of large datasets), and using AutoML to help determine the best model for predictions.

Additionally, engineers at THD have adapted BigQuery to monitor, analyze, and act on application performance data across all its stores and warehouses in real time, something that was not practical in the on-premises system.

With over 600 THD projects now on Google Cloud, the BigQuery story is just one of the many ways that Google Cloud is working with THD to deliver meaningful business results, every day.
