h3mm3's blog

Stories from behind the keyboard

  • RSS
  • Twitter

Recently I meant to demo how to use two T-SQL logical operators: ANY (a.k.a. SOME) and ALL. Tipically these two operators are used in the right-side of a comparison expression, combined with a sub-query. For instance, let’s consider a sample query against AdventureWorksLT database:

SELECT * FROM SalesLT.Product WHERE ListPrice < [ALL|ANY] (
SELECT UnitPrice FROM SalesLT.SalesOrderDetails WHERE ProductID = 123 )

As Microsoft TechNet says, ANY…

…returns TRUE when the comparison specified is TRUE for any pair (scalar_expression, x) where x is a value in the single-column set; otherwise, returns FALSE.

while ALL…

…returns TRUE when the comparison specified is TRUE for all pairs (scalar_expression, x), when x is a value in the single-column set; otherwise returns FALSE.

Referring to the the previous code, you can use “ALL” to select all the products that cost less than every unit price stored in the SalesOrderDetails table. On the other hand, you can use “ANY” to select all the products having a list price inferior than any unit price.

What You See is *not* What You Get

After this simple sample, I decided to entangle things a little bit, and I wrote the following code:

SELECT p.ProductId, p.ListPrice
FROM SalesLT.Product p
WHERE p.ListPrice > ANY (
  SELECT d.UnitPrice
  FROM SalesLT.SalesOrderDetail d
  WHERE d.ProductID = p.ProductID
)

The inner query is now a correlated sub-query and it gives different result-sets for every record evaluated by the outer query. With this composed query I meant to select all the products with a list price greater than – at least – any related unit price. Note that in this case, the sub-query returns only the list of unit prices related to every single product; for instance, when evaluating the outer query for the ProductID 780, the sub-query is equivalent to the following one:

SELECT d.UnitPrice 
FROM SalesLT.SalesOrderDetail d 
WHERE d.ProductID = 780

When I executed the query I found 142 records [0]. Then I executed a similar query, using “ALL” in place of “ANY”. I expected much less records. The result was astonishing: 295 records [0]!!! How could be possible that the number of products that cost more than EVERY sold product is greater than the number of products that cost more than ANY sold product?!

This couldn’t be possible…

Yet another devil hidden in the details…

At first I thought I found a bug in ANY and ALL implementations Sorpresa

Of course, I was wrong Occhiolino

The point is that not every product listed in the Product table has been sold. For such products, the correlated sub-query returns an empty result-set.

So, how does SQL Server evaluates ANY/ALL when doing comparisons against an empty set? It turned out that SQL Server handles such cases in a simplistic way [1]:

  • when evaluating an “ALL-ized” comparison operator, it returns TRUE if it doesn’t find any case for which the comparison is FALSE
  • when evaluating an “ANY-ized” comparison operator, it returns FALSE if it doesn’t find any case for which the comparison is TRUE

As a consequence, the ALL operator evaluates TRUE against empty sets and the ANY operator evaluates FALSE against empty sets. Good to know!

You can try this yourself:

-- Selecting all products:
SELECT * FROM SalesLT.Product 
WHERE ListPrice > ALL ( 
  SELECT ‘who cares?’ WHERE 1=0 
) 

-- Selecting no product:
SELECT * FROM SalesLT.Product 
WHERE ListPrice > ANY ( 
  SELECT ‘who cares?’ WHERE 1=0 
)

Since my Product table contains 153 records which are not referenced by any record in SalesOrderDetail, all these unreferenced products where selected in the “ALL-ized” version of the query and where discarded in the “ANY-ized” version (142+153=295, math cannot lie.)

By the way, this verbose query works as expected:

SELECT p.ProductId, p.ListPrice FROM SalesLT.Product p 
WHERE p.ListPrice > ANY ( 
SELECT d.UnitPrice FROM SalesLT.SalesOrderDetail d 
WHERE d.ProductID = p.ProductID )
AND EXISTS (
SELECT * FROM SalesLT.SalesOrderDetail d
WHERE d.ProductID = p.ProductID )

Happy programming!

[0] Numbers may vary in your database (I can’t remember if I used a clean version of AW)

[1] This is how every sane programmer would solve such a problem: check for a counterexample and exit the loop as soon as you find one.

No comments: