16.2. Why Do We Need BigQuery?

Volume of Data

Let's begin with a pain point: the volume of data we can handle with our current data stack. A few years ago, I was working with a client who wanted to report and analyze transactional data. They had about 3,000 to 4,000 transactions daily, each with 50 columns of information. Even if we only kept one month of data in a Google Sheet, we were close to hitting the limit (5 million cells at that time). Now the limit is 10 million cells, but it's still not enough for larger datasets.

Moreover, even a sheet with only 100,000 or 200,000 cells becomes very slow once it contains calculations and functions.
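To see why one month of data was already close to the limit, here is a rough back-of-the-envelope calculation. It is only a sketch: the transaction counts, column count, and cell limits come from the example above, while the 30-day month and the function name `monthly_cells` are assumptions made for illustration.

```python
# Rough cell count for one month of transactions kept in a spreadsheet.
# Transaction counts, columns, and limits come from the example in the text;
# the 30-day month is an assumption.
COLUMNS = 50
DAYS_IN_MONTH = 30  # assumed month length
OLD_CELL_LIMIT = 5_000_000
NEW_CELL_LIMIT = 10_000_000


def monthly_cells(transactions_per_day: int) -> int:
    """One row per transaction, one cell per column."""
    return transactions_per_day * COLUMNS * DAYS_IN_MONTH


low = monthly_cells(3_000)   # quieter days
high = monthly_cells(4_000)  # busier days

print(f"One month: {low:,} to {high:,} cells")
print(f"Fits under the old limit on busy months? {high <= OLD_CELL_LIMIT}")
print(f"Full busy months under the new limit: {NEW_CELL_LIMIT // high}")
```

Under these assumptions a single month lands between 4.5 and 6 million cells, so the old limit could be exceeded within one month, and the current limit leaves room for only one or two months of history.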


Speed of ETL

ETL stands for extract, transform, and load. It's the process of extracting data from a tool or source system, transforming it (modeling it or cleaning it), and loading it into another system or providing it to a visualization tool like Looker Studio.

The speed of ETL matters because even if your tool (Google Sheets, for example) can hold a large amount of data, processing that data might take a long time.
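The three ETL steps can be sketched in a few lines of Python. This is a minimal illustration and not a production pipeline: the sample rows, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the destination system.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (made-up sample rows).
RAW_CSV = """order_id,amount,country
1, 19.90 ,fr
2,5.00,US
3,,de
"""


def extract(text):
    """Extract: read rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount, then fix types and casing."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete rows
        cleaned.append(
            (int(row["order_id"]), float(amount), row["country"].strip().upper())
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


# In-memory SQLite stands in for the destination system (an assumption).
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
result = conn.execute(
    "SELECT country, amount FROM orders ORDER BY order_id"
).fetchall()
print(result)
```

Each step is a separate function, which makes it easy to time them individually and find out where a slow pipeline actually spends its time.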

Data Modeling Capabilities

The number of functions available in Google Sheets and Looker Studio is limited compared to what you have in SQL and BigQuery. BigQuery offers more flexibility and can handle more use cases for data analysis and data science.

Lack of Data Ownership

When you connect Looker Studio to an external tool or API like Google Analytics or Facebook Ads API, you don't own your data anymore. You need a place where you can store your own dataset securely.

Limited Ownership Over Modeling

Data modeling performed in Looker Studio is exposed to anyone using the report or dashboard. If you want to control who owns and can see your data modeling, you need a proper place to perform it.

API Limitations

When working with APIs, there are several factors to consider:

  1. Volume of data: How many rows are you requesting? How quickly can the tool process your request?
  2. API speed: Some tools have slow APIs, which can slow down the entire process.