Chapter 12: External Data Processing

By this point you should be comfortable using BigQuery and understand how it works. Even so, it is not always convenient to use BigQuery directly via the API, generated clients, or web UI. At other times, you need to do something with your data that isn't possible inside BigQuery.

This chapter shows you how to handle both of these situations. In data warehousing, the process of taking data out of one storage system and adding it to another is called ETL, for Extract, Transform, and Load. The first section in this chapter is about Extract: You've got your data in BigQuery and you want to take it out. The next section describes different ways to Transform your data, such as running a Hadoop MapReduce job on Google Compute Engine. The Load component of ETL was covered in Chapter 6, “Loading Data.”
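To make the three ETL stages concrete, here is a minimal sketch in Python. It does not call BigQuery or Hadoop; the CSV strings stand in for the source and destination systems, and the names SOURCE_CSV, extract, transform, and load are hypothetical, chosen only for illustration.

```python
import csv
import io

# Hypothetical stand-in for rows exported from a source table.
SOURCE_CSV = "word,count\nthe,3\nThe,2\nking,1\n"


def extract(text):
    """Extract: read rows out of the source system (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: lowercase each word and sum the counts per word."""
    totals = {}
    for row in rows:
        word = row["word"].lower()
        totals[word] = totals.get(word, 0) + int(row["count"])
    return totals


def load(totals):
    """Load: write the results to the destination (here, a CSV string)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["word", "count"])
    for word in sorted(totals):
        writer.writerow([word, totals[word]])
    return out.getvalue()


result = load(transform(extract(SOURCE_CSV)))
print(result)
```

Each stage is a separate function, so any one of them can be swapped for a real implementation (for example, an export from BigQuery in place of extract) without touching the others.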

The last portion of the chapter shows some alternative interfaces to BigQuery that let you access your data from two popular spreadsheet programs: Google Spreadsheets and Microsoft Excel. You can use Google Apps Script to run BigQuery queries, which lets you fill your Google Spreadsheets (or even Google Forms) with BigQuery data. If you prefer Microsoft Excel to Google Spreadsheets, the final portion of the chapter describes the BigQuery Excel Connector, which lets you run BigQuery queries directly from Microsoft Excel.

This chapter covers only the Google-provided interfaces. Chapter 13, “Using ...
