BigQuery client does not recognize dates on query

I am trying to run queries to BigQuery using the Python SDK. This is my code:

    client = bigquery.Client()
    query = """
SELECT 
    SUM(Success) AS Success,
    COUNT(*) AS Total,
    SUM(Success)/COUNT(*) as KPI
FROM `dpto-bigdata.bbva.altas`
WHERE
    Date >= DATE_SUB(current_date(), INTERVAL 30 DAY) 
    and Date <= current_date()
    """
    df = client.query(query).to_dataframe()

This works without any problem. However, I have found out that if I use the SQL function current_date() I am loosing the partitioning functionality my table has, so I have changed the code to the following one:

    client = bigquery.Client()
    today = datetime.now().strftime("%Y-%m-%d")
    query = """
SELECT 
    SUM(Success) AS Success,
    COUNT(*) AS Total,
    SUM(Success)/COUNT(*) as KPI
FROM `dpto-bigdata.bbva.altas`
WHERE
    Date >= DATE_SUB({0}, INTERVAL 30 DAY) 
    and Date <= {0}
    """.format(today)
    df = client.query(query).to_dataframe()

With this, I am calculating the current date on my Python code and passing it to the query explicitly, but I am getting the following error:

BadRequest: 400 No matching signature for function DATE_SUB for argument types: INT64, INTERVAL INT64 DATE_TIME_PART. Supported signatures: DATE_SUB(DATE, INTERVAL INT64 DATE_TIME_PART); DATE_SUB(DATETIME, INTERVAL INT64 DATE_TIME_PART); DATE_SUB(TIMESTAMP, INTERVAL INT64 DATE_TIME_PART) at [8:19]

It seems as if the today variable is being received as an INT64 type instead of the string it is. I have tried using DATE_SUB(DATE({0}), INTERVAL 30 DAY), but I then receive the same error but for the SQL function DATE(), which is the first one on being executed.

I have performed some verifications:

print(today)
2021-06-16

print(type(today))
<class 'str'>

print(query)
SELECT 
    SUM(Success) AS Success,
    COUNT(*) AS Total,
    SUM(Success)/COUNT(*) as KPI
FROM `dpto-bigdata.bbva.altas`
WHERE
    Date >= DATE_SUB(2021-06-16, INTERVAL 30 DAY) 
    and Date <= 2021-06-16

I am executing the code on a Cloud Run instance. Anyone knows what is happening?

1 answer

  • answered 2021-06-16 18:20 Luiscri

    I finally solved it. The problem was that although today var was a string, when I was inserting it to the query it was considered as an INT64 since it started by a number. The solution was declaring it as a string adding the following change:

        query = """
    SELECT 
        SUM(Success) AS Success,
        COUNT(*) AS Total,
        SUM(Success)/COUNT(*) as KPI
    FROM `dpto-bigdata.bbva.altas`
    WHERE
        Date >= DATE_SUB("{0}", INTERVAL 30 DAY) 
        and Date <= "{0}"
        """.format(today)
    

    Notice that now I am using DATE_SUB("{0}", INTERVAL 30 DAY) instead of DATE_SUB({0}, INTERVAL 30 DAY).