Unexpectedly slow MYSQL query on newly indexed data

I've got the following query:

SELECT DISTINCT
        CONCAT(COALESCE(location.google_id, ''),
                '-',
                COALESCE(locationData.resolution, ''),
                '-',
                COALESCE(locationData.time_slice, '')) AS google_id
    FROM
        LocationData AS locationData
            JOIN
        Location AS location ON location.id = locationData.location_id

    WHERE
        location.company_google_id = 5679037876797440
            AND location.google_id IN (4679055472328704, 6414382784315392, 5747093579759616)
            AND locationData.resolution = 8
            AND locationData.time_slice >= ((SELECT max(s.time_slice) FROM LocationData as s WHERE s.location_id = location.id ORDER BY s.time_slice ASC) - 255)
            AND location.active = TRUE
    ORDER BY location.google_id ASC , locationData.time_slice ASC
    LIMIT 0 , 101

I've got indices on all columns in the WHERE and ORDER BY clauses and I've added a compound index for (LocationData.time_slice, LocationData.location_id)

Running explain gives (which gave some challenges formatting here, so hopefully it shows up nicely):

   id | select_type        | table        | type  | possible_keys                              | key                |  key_len | ref                | rows | Extra
    1 | PRIMARY            | location     | range | PRIMARY,google_id_UNIQUE                   | google_id_UNIQUE   | 8        | NULL               |    3 | Using index condition; Using where; Using temporary; Using filesort
    1 | PRIMARY            | locationData | ref   | max_time_slice_idx,max_time_slice_idx_desc | max_time_slice_idx | 5        | index2.location.id |  301 | Using where
    2 | DEPENDENT SUBQUERY | s            | ref   | max_time_slice_idx,max_time_slice_idx_desc | max_time_slice_idx | 5        | index2.location.id |  301 | Using index

I know the dependent subquery is slow, and I'm open to suggestions for getting similar behavior, but I'm seeing this query take about 92 seconds to run, which is about 4 orders of magnitude different than test data I ran before adding the new compound index to production.

Is there index building that happens after the ALTER statement is run? Is there some way to check that the index is performing correctly?

Row counts for the two tables:

Production:

Location: 6,814

LocationData: 13,070,888

Test Data:

Location: 626

LocationData: 594,780

Any thoughts or suggestions are appreciated. Thanks in advance!

1 answer

  • answered 2018-10-11 19:51 scaisEdge

    Just a suggestion
    you could avoid the subselect using an inner join

    SELECT DISTINCT
        CONCAT(COALESCE(location.google_id, ''),
                '-',
                COALESCE(locationData.resolution, ''),
                '-',
                COALESCE(locationData.time_slice, '')) AS google_id
    FROM LocationData AS locationData
    INNER JOIN Location AS location ON location.id = locationData.location_id
    INNER JOIN (
                SELECT s.location_id, max(s.time_slice)  -255 my_max_time_slice
                FROM LocationData as s
                GROUP BY s.location_id
            ) t on t.location_id = Location.id
    
    WHERE
        location.company_google_id = 5679037876797440
            AND location.google_id IN (4679055472328704, 6414382784315392, 5747093579759616)
            AND locationData.resolution = 8
            AND locationData.time_slice >= t.my_max_time_slice
            AND location.active = TRUE
    ORDER BY location.google_id ASC , locationData.time_slice ASC
    LIMIT 0 , 101
    

    In this way you shoudl avoid the repeteation of the subquery for each id using just one query for buil the aggregated result for the max_time_slice

    hope this is useful