Let's begin with a pain point: the volume of data we can handle with our current data stack. A few years ago, I was working with a client who wanted to report on and analyze transactional data. They had about 3,000 to 4,000 transactions daily, each with 50 columns of information. Even if we only kept one month of data in a Google Sheet, we were already close to the limit of 5 million cells at the time: 3,000 to 4,000 rows per day × 50 columns × 30 days comes to roughly 4.5 to 6 million cells. Now the limit is 10 million cells, but it's still not enough for larger datasets.
Moreover, even at 100,000 or 200,000 cells, a sheet becomes very slow once you have calculations and functions to perform in it.
ETL stands for extract, transform, and load. It's the process of extracting data from a tool or source system, transforming it (modeling it or cleaning it), and loading it into another system or providing it to a visualization tool like Looker Studio.
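To make this concrete, here is a minimal sketch of that extract-transform-load flow in Python, assuming a hypothetical transactions API and a BigQuery table you control; the endpoint URL, field names, and table ID are placeholders rather than a real setup.

```python
import requests
import pandas as pd
from google.cloud import bigquery

# Extract: pull raw records from a hypothetical transactions API.
API_URL = "https://api.example.com/v1/transactions"  # placeholder endpoint
response = requests.get(API_URL, params={"date": "2024-01-01"}, timeout=30)
response.raise_for_status()
records = response.json()["transactions"]  # assumed response shape

# Transform: clean and model the data with pandas.
df = pd.DataFrame(records)
df["created_at"] = pd.to_datetime(df["created_at"])
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
df = df.dropna(subset=["transaction_id", "amount"])

# Load: append the cleaned rows to a BigQuery table you own,
# where Looker Studio (or any other tool) can read them later.
client = bigquery.Client()  # uses Application Default Credentials
table_id = "my-project.analytics.transactions"  # placeholder table
job = client.load_table_from_dataframe(df, table_id)
job.result()  # wait for the load job to finish
print(f"Loaded {job.output_rows} rows into {table_id}")
```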
The speed of ETL matters because even if your tool (Google Sheets, for example) can hold a large amount of data, processing that data might take a long time.
The number of functions available in Google Sheets and Looker Studio is limited compared to what you have in SQL and BigQuery. BigQuery offers more flexibility and can handle more use cases for data analysis and data science.
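As an illustration, here is the kind of analysis that is straightforward in BigQuery but hard to reproduce with Sheets or Looker Studio functions: a running total per customer computed with a window function. This is a sketch with hypothetical project, dataset, and column names.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses Application Default Credentials

# A window function computes a running total per customer -- something
# that is cumbersome to express in Google Sheets or Looker Studio.
sql = """
SELECT
  customer_id,
  transaction_date,
  amount,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY transaction_date
  ) AS running_total
FROM `my-project.analytics.transactions`  -- placeholder table
ORDER BY customer_id, transaction_date
"""

df = client.query(sql).to_dataframe()  # results come back as a pandas DataFrame
print(df.head())
```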
When you connect Looker Studio directly to an external tool or API, such as Google Analytics or the Facebook Ads API, you don't own your data anymore. You need a place where you can store your own dataset securely.
Data modeling performed in Looker Studio is exposed to anyone who uses the report or dashboard. If you want an ownership strategy for your data modeling, you need a proper place to perform it, as sketched below.
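One common way to keep that modeling under your control is to define it in the warehouse itself, for example as a BigQuery view that exposes only the fields you want the dashboard to see. The sketch below assumes hypothetical project, dataset, and column names.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Define the modeling logic as a view in BigQuery instead of inside
# Looker Studio: the transformation stays in a place you control,
# and the dashboard only sees the view's output columns.
ddl = """
CREATE OR REPLACE VIEW `my-project.analytics.daily_revenue` AS  -- placeholder names
SELECT
  DATE(transaction_date) AS day,
  SUM(amount) AS revenue
FROM `my-project.analytics.transactions`
GROUP BY day
"""

client.query(ddl).result()  # run the DDL statement and wait for it to finish
print("View created; point Looker Studio at the view, not the raw table.")
```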
When working with APIs, there are several factors to consider: