trying to compare the average of a column with each value of column- WITHOUT USING Nested or SubQUERY
this is how my data looks.
I want to find out the Business Entities who has Rate greater than the average rate. There is more data in the database- this is just part of it
Query I used :
SELECT A.BusinessEntityID, A.Rate, AVG(B.Rate) as AverageRate FROM HumanResources.EmployeePayHistory AS A INNER JOIN HumanResources.EmployeePayHistory AS B ON B.BusinessEntityID = A.BusinessEntityID - 1 GROUP BY A.BusinessEntityID, A.Rate HAVING A.Rate > AVG(B.Rate) Order By A.BusinessEntityID
I got this result
I get different average for each row/id but it should be the average of the whole Rate column and be 1 value
You can use
CROSS APPLYhere, and then get the average rate for all entities from the cross applied table. Basically you change your
INNER JOINto a
CROSS APPLYand remove the join condition. The query would look like this:
SELECT A.BusinessEntityID ,A.Rate ,AverageRate = AVG(B.Rate) FROM HumanResources.EmployeePayHistory AS A CROSS APPLY HumanResources.EmployeePayHistory AS B GROUP BY A.BusinessEntityID ,A.Rate HAVING A.Rate > AVG(B.Rate) ORDER BY A.BusinessEntityID;
As GMB mentioned,
CROSS APPLYonly works in SQL Server. But
CROSS JOINshould also work in this query in MySql.
SELECT A.BusinessEntityID ,A.Rate ,AverageRate = AVG(B.Rate) FROM HumanResources.EmployeePayHistory AS A CROSS JOIN HumanResources.EmployeePayHistory AS B GROUP BY A.BusinessEntityID ,A.Rate HAVING A.Rate > AVG(B.Rate) ORDER BY A.BusinessEntityID;
I don't know why you have been forbidden to use subqueries. The accepted solution (and you original attempt) to involve a self join and aggregation, which are both expensive tasks.
On the other hand, if you are running MySQL 8.0 (or another database/version that supports window functions - possibly SQL Server since you accepted a solution that uses
CROSS APPLY), you can do this with a window
avg()and a subquery.
select * from ( select BusinessEntityID, Rate, avg(Rate) over() AverageRate from mytable ) t where Rate > AverageRate
I would expect that this should be significantly more efficient than the self-join/aggregation solution on a large dataset: