SQL improvement

I realized the below SQL using DISTINCT is a bad thing for performance.

However, if that is not included it will return too many rows (due to the one-to-many relationship between Instrument and Party tables).

I have searched the internet and see a few suggestions about getting rid of the DISTINCT, but I honestly have no idea what the best solution is.

Can you please provide recommendations for improving this SQL and ensuring it only returns one row per instrument (as opposed to multiple rows for each instrument)?

Just FYI - some things I saw as performance suggestions:

  1. Get rid of IN statement that is easy and will be done

  2. Replace DISTINCT with GROUP BY does this really improve performance?

  3. Perhaps do an outer SELECT that includes DISTINCT around the big SQL without the DISTINCT. My thought behind this is that the DISTINCT would be performed on a much smaller subset than the 1M++ rows in INSTRUMENT table.

My query:

select * 
from
(
  SELECT DISTINCT
    I.UOID,
    I.INSTRUMENT_ID,
    I.STATUS,
    I.A_ACTIVITY_ORIG,
    I.A_CURRENCY,
    I.A_OPER_BK_ORG_ORIG,
    I.A_POSITION_ACTIVE,
    I.A_PRODUCT,
    I.A_PRODUCT_TYPE,
    I.A_TERMS_ACTIVE,
    I.CURR_COI,
    I.BOUT_COI,
    I.A_CUST_RELATIONSHP,
    I.DATE_START,
    I.DATE_END,
    I.CLIB_COI,
    I.CLIB_BASE,
    I.BLIB_COI,
    I.BLIB_BASE,
    I.BOUT_BASE,
    I.AVAL_COI,
    I.SEQUENCE_NUM,
    I.AVAL_BASE,
    I.MAXU_COI,
    I.MAXU_BASE,
    I.A_CURRENCY_BASE,I.A_CLIENT_BANK,
    I.A_PRODUCT_CATEGORY,
    I.A_ASSIGNMENT_ACTV,
    I.A_RELATED_ACTIVITY,
    C.CUSTOMER_ID,
    C.SHORT_NAME
  FROM instrument I 
  INNER JOIN PARTY P ON P.A_INSTRUMENT = I.UOID 
  INNER JOIN CUSTOMER C ON P.A_CUSTOMER = C.UOID AND C.CUSTOMER_ID = :customerId
  WHERE (I.STATUS <> 'TMP') 
    AND (I.A_CLIENT_BANK = :clientBank)  
    AND I.A_PRODUCT_CATEGORY <> 'CM'  
    AND I.STATUS NOT IN ( 'CAN','CLO','DEA','LIQ')
)
where rownum <= :maxSize;

1 answer

  • answered 2018-11-08 07:05 Thorsten Kettner

    The customer you are looking for is related to an instrument by a party. As a customer is also related to multiple parties, they can be related to the same instrument multiple times. You could use an IN or EXISTS clause to get all instruments that occur in the customer's parties, but you also want to select from the customer table. So you want to join instead, and this is one of the few situations where using DISTINCT really makes sense.

    As to your ideas:

    Get rid of IN statement that is easy and will be done

    I see no reason why you would do that. Are you saying that you want to replace the IN with multiple OR? That only gets the query less readable and does the same thing internally.

    Replace DISTINCT with GROUP BY does this really improve performance?

    Don't. It does the same thing, but lessens readability. GROUP BY is for aggregation. As you are not using any aggregation function, use DISTINCT. The execution plan should be exactly the same.

    Perhaps do an outer SELECT that includes DISTINCT around the big SQL without the DISTINCT. My thought behind this is that the DISTINCT would be performed on a much smaller subset than the 1M++ rows in INSTRUMENT table.

    No, the steps will be just the same: find the customer's parties, find these parties' instruments that match your criteria, show only some of them.

    I see nothing to improve your query, except for the use of indexes. I don't know, whether it is better to get from the customer via parties to instruments or vice versa, so I'd create indexes for both cases:

    create index idxc1 on customer( customer_id, uoid );
    create index idxp1 on party ( a_customer, a_instrument );
    create index idxi1 on instrument ( uoid, a_client_bank, status, a_product_category );
    
    create index idxi2 on instrument ( a_client_bank, status, a_product_category, uoid );
    create index idxp2 on party ( a_instrument, a_customer );
    create index idxc2 on customer( uoid, customer_id );
    

    Then run the query or retrieve an explain plan and see which indexes are actually used and drop the others.