Lag Function for Dates

I want to lag on date to get a value from 7 days ago. However, there is not a value for everyday in the dataset, but I want the value from 7 days ago rather then the value 7 before the current.

What do I need to do to make this work?

Currently doing LAG(field, 7) OVER (partition by x,y,z ORDER By date asc)

1 answer

  • answered 2020-06-02 12:17 Sabri Karagönen

    You need to use range within window function definition instead of rows(default). To make it possible, you can convert the column to Unix seconds by applying UNIX_SECONDS(TIMESTAMP(date)) conversion.

    In the following code 604800 is equal to 7 days in seconds. You can change that value to use another range.

    FIRST_VALUE(field) OVER (partition by x,y,z ORDER By UNIX_SECONDS(TIMESTAMP(date)) asc RANGE BETWEEN 604800 PRECEDING AND 604800 PRECEDING)
    

    Example query:

    WITH rawdata AS
     (
      SELECT 'cat1' as name, DATE '2019-01-01' as date, 1 as val UNION ALL 
      SELECT 'cat1', DATE '2019-01-08', 3 UNION ALL 
      SELECT 'cat1', DATE '2019-01-15', 7 UNION ALL 
      SELECT 'cat2', DATE '2019-10-18', 10
    )
    SELECT 
      name,
      date,
      val,
      FIRST_VALUE(val)
        OVER (PARTITION BY name ORDER BY UNIX_SECONDS(TIMESTAMP(date)) ASC
        RANGE BETWEEN 604800 PRECEDING AND 604800 PRECEDING) AS last_week_val
    FROM rawdata