From: jbdhl on 2 Mar 2010 06:09 Consider a table and a query referring to only a subset of the columns in that table. How early in the query evaluation is the projection carried out? Are the columns to be selected filtered out as early as in the very access method that reads the table rows from the buffer, or are the projection handled later, after the whole row has been fetched by the access method? Does it depend on the complexity of the query, how far down the three that the projection is handled out? Thanks!
From: vic on 2 Mar 2010 11:41 On Mar 2, 6:09 am, jbdhl <jbirksd...(a)gmail.com> wrote: > Consider a table and a query referring to only a subset of the columns > in that table. How early in the query evaluation is the projection > carried out? > > Are the columns to be selected filtered out as early as in the very > access method that reads the table rows from the buffer, or are the > projection handled later, after the whole row has been fetched by the > access method? > > Does it depend on the complexity of the query, how far down the three > that the projection is handled out? > > Thanks! That's an interesting topic: examining the query plan might help answer this to some degree. Each query can be handled differently by the optimizer because, as Plamen pointed out, the answer depends on many factors. Also consider that a one-table select statement that retrieves only one row, via an index, will still result in one entire datapage of I/O plus whatever index I/O was needed. That is, every column of every row on the datapage will be I/O'ed. So a partial answer to your question is that, at the point that the query engine access the datapages, extraneous columns have not yet been filtered out. Exceptions to this occur, as you probably already know, whenever a covering index is used. FYI, there are "column-based" or "vertical" database products on the market that do not operate this way, so that when data is accessed only the queried columns are being I/O'ed. It's as if every data access is actually done via a covering index. I believe that these products are used in the data warehouse/BI context rather than the OLTP context, and I assume that this is because there are performance trade-offs involved. - victor
From: jbdhl on 2 Mar 2010 15:58 > But in practice the query optimizer may move steps up ro down the execution plan to find efficient way to retrieve data. > One example is using covering index for the query where no unnecessary data access will be incurred. Will the projection ever happen directly in the access method that fetch rows from the buffer? Or will it always be done at some point after the initial fetch?
From: Erland Sommarskog on 2 Mar 2010 16:21 jbdhl (jbirksdahl(a)gmail.com) writes: > Will the projection ever happen directly in the access method that > fetch rows from the buffer? Or will it always be done at some point > after the initial fetch? I'm not sure that your question even does make sense. SQL Server reads rows from pages, either from disk or from the buffer cache. When a plan operator reads a row, it will read the columns that the parent operator asks for. These columns can be in the SELECT list, or be needed for a join elsewhere in the plan. I would suggest that you study some query plans in SQL Server Management Studio. This may not answer all your questions, but rather it may raise new questions, but hopefully questions that are more on the mark. -- Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se Links for SQL Server Books Online: SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
From: Gert-Jan Strik on 2 Mar 2010 16:23 jbdhl wrote: > > Consider a table and a query referring to only a subset of the columns > in that table. How early in the query evaluation is the projection > carried out? > > Are the columns to be selected filtered out as early as in the very > access method that reads the table rows from the buffer, or are the > projection handled later, after the whole row has been fetched by the > access method? > > Does it depend on the complexity of the query, how far down the three > that the projection is handled out? > > Thanks! In general, the optimizer is written to parse, compile and execute a query as fast as possible. As a rule it will cut out every unnecessary part that is possible, as long as there is a gain in the end, and it will try to do so as early as possible. Typically, the operator that locates the rows will also read the table data. The exceptions are operators like index joining or index intersection. Of course, the storage engine will only fetch the columns that are needed to process the query. However, since the relative cost is low (mainly memory space for intermediate results), other factors are typically much more important, such as access path and method, because they usually have a much bigger impact on performance. As you may know, physical I/O is orders of magnitude slower than any reading from the buffer cache, so avoiding I/O has the highest priority in the optimization strategy. Important cost factors (not necessarily in this order) are locating and locking a page, the I/O to fetch a (range of) pages, the cost associated with sorting, hashing and hash joins, etc. Compared to this any performance difference between selecting all colums or just a subset of columns from a buffered page is insignificant (if it is even possible to measure any effect). -- Gert-Jan
|
Pages: 1 Prev: Split a period of time Next: Whitepaper on Location Intelligence |