back/front filling values in SQL

I have table in PrestoSQL that looks like this:

 test
    id value           timestamp
1  foo  blue 2019-10-17 17:42:52
2  foo  <NA> 2019-10-17 17:43:52
3  foo  <NA> 2019-10-17 17:44:52
4  foo   red 2019-10-17 17:45:52
5  foo  <NA> 2019-10-17 17:46:52
6  bar  <NA> 2019-10-17 17:47:52
7  bar green 2019-10-17 17:48:52
8  bar  <NA> 2019-10-17 17:49:52
9  bar  <NA> 2019-10-17 17:50:52
10 bar  <NA> 2019-10-17 17:51:52

My objective is to fill in the values after they appear in value for example:

 output
    id value           timestamp
1  foo  blue 2019-10-17 17:42:52
2  foo  blue 2019-10-17 17:43:52
3  foo  blue 2019-10-17 17:44:52
4  foo   red 2019-10-17 17:45:52
5  foo   red 2019-10-17 17:46:52
6  bar  <NA> 2019-10-17 17:47:52
7  bar green 2019-10-17 17:48:52
8  bar green 2019-10-17 17:49:52
9  bar green 2019-10-17 17:50:52
10 bar green 2019-10-17 17:51:52

I understand how to use lead() and lag(), but how would one write a query to fill in current values from the previous known value by timestamp (and id) if it is not NA ?

Any suggestions would be appreciated

1 answer

  • answered 2019-10-17 21:55 Piotr Findeisen

    You need to use lag() with IGNORE NULLS. Example:

    presto:default> SELECT
                 ->   a, t, v,
                 ->   coalesce(v, lag(v, 1) IGNORE NULLS OVER (PARTITION BY a ORDER BY t))
                 -> FROM (VALUES
                 ->   ('a', 1, 'red'),
                 ->   ('a', 2, NULL),
                 ->   ('a', 3, 'blue'),
                 ->   ('a', 4, NULL),
                 ->   ('a', 5, NULL)
                 -> ) t(a, t, v);
                 ->
     a | t |  v   | _col3
    ---+---+------+-------
     a | 1 | red  | red
     a | 2 | NULL | red
     a | 3 | blue | blue
     a | 4 | NULL | blue
     a | 5 | NULL | blue
    (5 rows)
    

    (tested in Presto 322)