Order of SQL Query Execution
The SQL query below is based on the AdventureWorks2019 database that we loaded during our SQL class.

    -- Retrieve aggregated information about transaction types and associated work orders
    SELECT
        PT.TransactionType,
        AVG(PW.StockedQty) AS AVGQTY,
        COUNT(PT.Quantity) AS TOTALQTY,
        SUM(PT.ActualCost) AS TOTALCOST
    FROM [Production].[TransactionHistory] AS PT
    INNER JOIN [Production].[WorkOrder] AS PW
        ON PT.ProductID = PW.ProductID
    WHERE PT.Quantity > 13               -- Keeps only rows where quantity is greater than 13
    GROUP BY PT.TransactionType          -- Groups results by transaction type (the only non-aggregated column in the SELECT list)
    HAVING SUM(PT.ActualCost) >= 0.00    -- Keeps only groups whose total actual cost is greater than or equal to 0.00
    ORDER BY TOTALCOST ASC               -- Orders the result set by total cost in ascending order

Each query begins by finding the data that we need in a database, and then filters that data down into something that can be processed and understood as quickly as possible. Because each part of the query is executed sequentially, it is important to understand the order of execution so that you know which results are accessible at which point.

Query order of execution

1. FROM and JOINs

The FROM clause and any subsequent JOINs are executed first to determine the total working set of data being queried. This includes subqueries in this clause, and it can cause temporary tables to be created under the hood that contain all the columns and rows of the tables being joined.

2. WHERE

Once we have the total working set of data, the first-pass WHERE constraints are applied to the individual rows, and rows that do not satisfy the constraints are discarded. Each constraint can only access columns directly from the tables named in the FROM clause. Aliases defined in the SELECT part of the query are not accessible in most databases, since they may include expressions that depend on parts of the query that have not yet executed.

3. GROUP BY

The rows that remain after the WHERE constraints are applied are then grouped based on common values in the column (or columns) specified in the GROUP BY clause. As a result of the grouping, there will be only as many rows as there are unique values (or unique combinations of values) in the grouping columns. Implicitly, this means that you should only need GROUP BY when your query uses aggregate functions.
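The first three steps can be watched one at a time on a toy dataset. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than SQL Server, and the two tables are hypothetical, heavily simplified stand-ins for [Production].[TransactionHistory] and [Production].[WorkOrder] with made-up rows. It runs the same FROM/JOIN, then adds the WHERE, then adds the GROUP BY, and counts the rows that survive each stage.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the two AdventureWorks tables.
# The columns and rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TransactionHistory (
        ProductID INTEGER, TransactionType TEXT, Quantity INTEGER, ActualCost REAL
    );
    CREATE TABLE WorkOrder (ProductID INTEGER, StockedQty INTEGER);
    INSERT INTO TransactionHistory VALUES
        (1, 'W', 20, 5.0), (1, 'S', 10, 7.5), (2, 'W', 30, 2.0), (2, 'P', 15, 4.0);
    INSERT INTO WorkOrder VALUES (1, 8), (1, 12), (2, 25);
    """
)

# The clauses are kept as separate strings so each step can be added in turn.
FROM_JOIN = (
    " FROM TransactionHistory AS PT"
    " INNER JOIN WorkOrder AS PW ON PT.ProductID = PW.ProductID"
)
WHERE = " WHERE PT.Quantity > 13"
GROUP_BY = " GROUP BY PT.TransactionType"

# Step 1: FROM and JOINs build the full working set.
joined = conn.execute(
    "SELECT PT.TransactionType, PT.Quantity" + FROM_JOIN
).fetchall()

# Step 2: WHERE discards individual rows from that working set.
filtered = conn.execute(
    "SELECT PT.TransactionType, PT.Quantity" + FROM_JOIN + WHERE
).fetchall()

# Step 3: GROUP BY collapses the surviving rows to one row per transaction type.
grouped = conn.execute(
    "SELECT PT.TransactionType, COUNT(*)" + FROM_JOIN + WHERE + GROUP_BY
).fetchall()

print(len(joined))      # 6 rows in the joined working set
print(len(filtered))    # 4 rows left after the WHERE filter
print(sorted(grouped))  # [('P', 1), ('W', 3)]
```

Note that the 'S' transaction type never appears in the grouped output: its only row was discarded by the WHERE clause before any grouping took place, which is the practical consequence of WHERE running ahead of GROUP BY.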