SQL: Count rows until criteria met

I need a way using SQL-Server to count the number of rows after the current row that are of a certain type, but only count up until the next row of a certain type, at which point the count would restart. In the below example I need to count 'Sub-Clase Section' after a 'Sub-Clause' or 'Schedule Section' after a 'Schedule'.

Example data:

Type                  Title
--------------------------------------------------
Clause                20
Sub-Clause            20.1 A Sub-Clause
Sub-Clause Section    20.1-1 A Sub-Clause Section
Schedule              Schedule 1 to Blah
Schedule Section      1.0 X
Schedule Section      2.0 X
Schedule              Schedule 2 to Blah
Sub-Clause            20.2 A Sub-Clause
Clause                21
Schedule              Schedule 1 to Clause 21
Schedule Section      1.0 X
Schedule Section      2.0 X

The output I require would be something like this:

Type                  Title                          Count
---------------------------------------------------------------
Clause                20                             0
Sub-Clause            20.1 A Sub-Clause              1
Schedule              Schedule 1 to Blah             2
Schedule              Schedule 2 to Blah             0
Sub-Clause            20.2 A Sub-Clause              0
Clause                21                             0
Schedule              Schedule 1 to Clause 21        2

I have come close to a solution using row_number() to identify each row, and then counting the difference between the current row and the next row with the criteria rows filtered out. This is quite a crude method and I am sure there is a better way to achieve this. Plus, my method does not work for the last row as it would return a NULL for the count. A stripped back version of my code is this:

SELECT
    Type
  , Title
  , LEAD(myTable.Row#, 1) OVER(ORDER BY Row#) - Row# -1 AS 'Count'
FROM
  (
  SELECT
    ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Row#
  , Type
  , Title
  FROM Content      
  ORDER BY Row# 
  ) AS myTable
WHERE myTable.Type in ('Clause', 'Sub-Clause', 'Schedule')

Any suggestions on a better method that will achieve what I need?

1 answer

  • answered 2018-07-11 04:47 Squirrel

    i make some assumption. There is a ID column that can define the sequence of rows

    -- Schema
    declare @Content table
    (
        ID  int identity,
        Type    varchar(20),
        Title   varchar(50)
    )
    
    -- Sample Data
    insert into @Content
    SELECT  'Clause',                '20'                       union all
    SELECT  'Sub-Clause',            '20.1 A Sub-Clause'        union all
    SELECT  'Sub-Clause Section',    '20.1-1 A Sub-Clause Section'  union all
    SELECT  'Schedule',              'Schedule 1 to Blah'       union all
    SELECT  'Schedule Section',      '1.0 X'                    union all
    SELECT  'Schedule Section',      '2.0 X'                    union all
    SELECT  'Schedule',              'Schedule 2 to Blah'       union all
    SELECT  'Sub-Clause',            '20.2 A Sub-Clause'        union all
    SELECT  'Clause',                '21'                       union all
    SELECT  'Schedule',              'Schedule 1 to Clause 21'  union all
    SELECT  'Schedule Section',      '1.0 X'                    union all
    SELECT  'Schedule Section',      '2.0 X'
    
    -- Query
    ; with 
    cte as
    (
        -- grp is to identify group by rows with same Type
        -- first set of `Schedule Section` 
        --     and last set of `Schedule Section` will have diff `grp`
        SELECT  *, grp = ID - row_number() over (PARTITION BY Type ORDER BY ID)
        FROM    @Content c
    ),
    type_cnt as
    (
        -- count no of rows for each type
        SELECT  ID = min(ID), Type, cnt = count(*)
        FROM    cte c
        WHERE   c.Type not in ('Clause', 'Sub-Clause', 'Schedule')
        GROUP BY Type, grp
    )
    SELECT  c.ID, c.Type, c.Title, cnt = isnull(n.cnt, 0)
    FROM    cte c
            outer apply
            (
                SELECT  TOP 1 x.cnt
                FROM    type_cnt x
                WHERE   x.ID    = c.ID + 1
            ) n
    WHERE   c.Type in ('Clause', 'Sub-Clause', 'Schedule')
    ORDER BY ID
    
    /*  Result
    1   Clause      20          0
    2   Sub-Clause  20.1 A Sub-Clause   1
    4   Schedule    Schedule 1 to Blah  2
    7   Schedule    Schedule 2 to Blah  0
    8   Sub-Clause  20.2 A Sub-Clause   0
    9   Clause      21          0
    10  Schedule    Schedule 1 to Clause 21 2
    */