Prev: SQLjobvis 3.0 (free) - Visualize your SQL agent job histories...
Next: non-equal DRI: a possible new sql feature
From: Antti Nivala on 4 Mar 2010 00:44 We use some CLR UDFs to process BLOBs. In many cases, the optimizer seems to choose a bad plan because it guesses that our UDF will limit the results much more than it does. For example, when we have a WHERE condition such as OurUDF( Data ) = 1, the optimizer apparently guesses that the number of rows will drop to one third (33%). In reality, the number of rows remains almost the same because the UDF will return 1 for 95% to 100% of rows. Is it possible to tell the optimizer that this UDF has poor selectivity and that it is thus better to evaluate almost any other restricting conditions prior to evaluating the condition that involves the UDF? Or, as a related matter, is it possible to tell the optimizer that the execution of the UDF is somewhat slow and thus it would be wise to optimize for executing the UDF the smallest possible number of times? -Antti
From: Gert-Jan Strik on 4 Mar 2010 17:16
Antti Nivala wrote: > > We use some CLR UDFs to process BLOBs. In many cases, the optimizer seems to > choose a bad plan because it guesses that our UDF will limit the results much > more than it does. For example, when we have a WHERE condition such as > OurUDF( Data ) = 1, the optimizer apparently guesses that the number of rows > will drop to one third (33%). In reality, the number of rows remains almost > the same because the UDF will return 1 for 95% to 100% of rows. > > Is it possible to tell the optimizer that this UDF has poor selectivity and > that it is thus better to evaluate almost any other restricting conditions > prior to evaluating the condition that involves the UDF? > > Or, as a related matter, is it possible to tell the optimizer that the > execution of the UDF is somewhat slow and thus it would be wise to optimize > for executing the UDF the smallest possible number of times? > > -Antti I am afraid not. At least not in any direct way. Depending on your query, you might be able to rewrite it. For example using derived tables with grouping or replacing Inner Join with Outer Join, etc. But without seeing the query, there is not much to say about that. Also, you need good SQL skills to handle this. Maybe it is an option to create a computed column for OurUDF(Data) and persist it (index it). It would change the performance dynamics (performance hit when "Data" is changed, no performance hit when selecting from the table). It would also add statistics to the column that the optimizer can use. -- Gert-Jan |