Query target conditions

I need to query a table accounting for multiple change events. The table (seen below) is partitioned by Date where a snapshot of is taken every day of employees. I would like to create a table that shows milestone changes.

Namely I want the final export to show:

  1. First Date they appear (hire date)
  2. Any record when the Type changes
  3. Last Date they appear (termination date)

This ultimately shows the changes in Type along with the hire/termination date.

I'm wondering what a good way to build this is? I can see a query that takes the UNION of the 3 criteria listed above and then sorts by date then employee but am not sure if this is efficient.


Table

+-----------+------+----------+--------+
| Employee  | Type | Date     | Active |
+-----------+------+----------+--------+
| urdearboy | 1    | 1/1/2019 | 1      |    '<---- Want
+-----------+------+----------+--------+
| urdearboy | 1    | 1/2/2019 | 1      |
+-----------+------+----------+--------+
| urdearboy | 4    | 1/3/2019 | 1      |    '<---- Want
+-----------+------+----------+--------+
| urdearboy | 4    | 1/4/2019 | 1      |
+-----------+------+----------+--------+
| urdearboy | 4    | 1/5/2019 | 1      |
+-----------+------+----------+--------+
| urdearboy | 4    | 1/6/2019 | 1      |
+-----------+------+----------+--------+
| urdearboy | 4    | 1/7/2019 | 0      |    '<---- Want
+-----------+------+----------+--------+

In the above it can be deduced I was:

  • Hired 1/1/19
  • Changed Type 1/3/19
  • Terminated 1/7/19

1 answer

  • answered 2019-10-30 22:17 Gordon Linoff

    One method is to use lag():

    select t.*
    from (select t.*, 
                 lag(date) over (partition by employee, type, active order by prev_date) as prev_date_eta,
                 lag(date) over (partition by employee order by date) as prev_date
          from t
         ) t
    where prev_date_eta is null or
          prev_date_eta <> prev_date;
    

    This approach compares the previous date with the same attributes to the overall previous date for the employee. When these are the same, nothing has changed, so the row is filtered out.

    The use of partition by is a big convenience when you want to compare multiple columns. The alternative is basically to compare each column individually.