Use Jupyter magic commands for BigQuery

To avoid boilerplate code in Jupyter notebooks, it is possible to use magic commands with the BigQuery library.

This simplifies the code from this:

sql = """
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT 5
"""

(
    bqclient
    .query(sql)
    .result()
    .to_dataframe()
)

to this:

%%bigquery
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT 5

Load module

If you run a cloud-based JupyterLab notebook with AI Platform, you won’t need this step, as the BigQuery module is already installed and loaded.

Otherwise, open a Jupyter notebook. Make sure you have installed the google-cloud-bigquery Python library. Load the magic commands module with the following:

# Load magic commands from the library
%load_ext google.cloud.bigquery

Run query in a magic cell

Create a magic cell with %%bigquery on the first line (don’t add anything above, not even commented lines) and directly enter your query below:

%%bigquery
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT 5
year birth_count
0 2008 4255156
1 2007 4324008
2 2006 4273225
3 2005 4145619
4 2004 4118907

Magic command options

To save the results to a pandas DataFrame instead of displaying them directly, add a variable name after %%bigquery

%%bigquery df
# Save results to `df`
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT 5
# Print the resulting DataFrame
df
year birth_count
0 2008 4255156
1 2007 4324008
2 2006 4273225
3 2005 4145619
4 2004 4118907

It is possible to set parameters with --params, and use them in the query with @param:

%%bigquery --params {"lim": 10}
# Use a parameter for query limit
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT @lim
year birth_count
0 2008 4255156
1 2007 4324008
2 2006 4273225
3 2005 4145619
4 2004 4118907
5 2003 4096092
6 2002 4027376
7 2001 4031531
8 2000 4063823
9 1999 3963465