How can I select the last index of a column split with bigquery

There are a lot of questions about splitting a BigQuery, MySQL column, but I can't find one that fits my situation.

I am processing a large dataset (3rd party) that includes a freeform location field to normalize it for my Android app. When I run a select I'd like to split the column data by commas, take only the last segment and trim it of whitespace.

So far I've come up with the following by Googling documentation:

SELECT RTRIM(LOWER(SPLIT(location, ',')[OFFSET(-1)])) FROM `users` WHERE location <> ''

But the -1 trick to split at last element does not work (with either offset or ordinal). I can't use ARRAY_LENGTH with the same array inline and I'm not exactly sure how to structure a nested query and know the last column index of the row.

I might be approaching this from the wrong angle, I work with Android and NoSQL now so I haven't used MySQL in a long time

How do I structure this query correctly?

1 answer

  • answered 2018-10-11 19:34 Mikhail Berlyant

    I'd like to split the column data by commas, take only the last segment ...

    You can use below approach (BigQuery Standard SQL)

    SELECT ARRAY_REVERSE(SPLIT(location))[SAFE_OFFSET(0)]
    

    Below is an example illustrating it:

    #standardSQL
    WITH `project.dataset.table` AS (
      SELECT '1,2,3,4,5' location UNION ALL
      SELECT '6,7,8'
    )
    SELECT location, ARRAY_REVERSE(SPLIT(location))[SAFE_OFFSET(0)] last_segment
    FROM `project.dataset.table`   
    

    with result

    Row location    last_segment     
    1   1,2,3,4,5   5    
    2   6,7,8       8      
    

    For trimming - you can use LTRIM(RTRIM()) - like in

    SELECT LTRIM(RTRIM(ARRAY_REVERSE(SPLIT(location))[SAFE_OFFSET(0)]))