Prev: createing db 11gr1 on grid 11gr2
Next: Documentation
From: bugbear on 4 Dec 2009 11:48 Can anyone tell me (or point me at information) about how statistics are generated/used for multi-column indexes? We have a 2 column status model (major and minor status, if you like). This is key to our application. "Naturally" we have put a 2 column index on these two columns. But we have "doubts" about whether the CBO is always doing as good as job as we'd like. The key question seems to be: Are the columns "treated as a pair" so that the frequency estimate for a pair of statuses depends on both of them or... Are the columns treated in isolation so that the frequency estimate is simply obtained by getting the estimate for each column, and combining them mathematically. This would make a big difference in our app, since our major/minor status (in practice) have interesting and complex correlations. BugBear
From: bugbear on 4 Dec 2009 12:09 bugbear wrote: > Can anyone tell me (or point me at information) about > how statistics are generated/used for multi-column indexes? > Apologies: I should have said "in 10g". I've also found a reference to a new feature in 11g; "Multi-column histograms" If this is new in 11g it has "unfortunate" implications for our running on 10g... BugBear
From: joel garry on 4 Dec 2009 12:29 On Dec 4, 9:09 am, bugbear <bugbear(a)trim_papermule.co.uk_trim> wrote: > bugbear wrote: > > Can anyone tell me (or point me at information) about > > how statistics are generated/used for multi-column indexes? > > Apologies: I should have said "in 10g". > > I've also found a reference to a new feature in 11g; > "Multi-column histograms" > > If this is new in 11g it has "unfortunate" implications > for our running on 10g... > > BugBear Check this out: http://richardfoote.wordpress.com/2008/09/16/index-monitoring-and-index-statistics-the-great-gig-in-the-sky/#comments jg -- @home.com is bogus. "...Network hardware includes Dell servers, Dell workstations, IBM RS6000 and P570 RISK based systems, Cisco routers, and HP ProCurve switchs. MINIMUM REQUIREMENTS Qualified candidates must have a Bachelors degree in Computer Science or a related degree with 4-6 years programming experience with HTML/ CGI/Java, ASP.Net and an understanding of internet working and knowledge of applications available on the WEB, Business Objects XIR2, Crystal Reports XIR2, VB6 Programming Language, Visual Basic and Studio, in an Oracle/AIX/2003 environment, or..." - metrolink job ad.
From: Mladen Gogala on 4 Dec 2009 17:21 On Fri, 04 Dec 2009 16:48:43 +0000, bugbear wrote: > Can anyone tell me (or point me at information) about how statistics are > generated/used for multi-column indexes? > > We have a 2 column status model (major and minor status, if you like). > This is key to our application. > > "Naturally" we have put a 2 column index on these two columns. > > But we have "doubts" about whether the CBO is always doing as good as > job as we'd like. > > The key question seems to be: > > Are the columns "treated as a pair" so that the frequency estimate for a > pair of statuses depends on both of them or... > > Are the columns treated in isolation > so that the frequency estimate > is simply obtained by getting the estimate for each column, and > combining them mathematically. > > This would make a big difference in our app, since our major/minor > status (in practice) have interesting and complex correlations. > > BugBear 10g doesn't do multi-column histograms. 10g will create histograms for every single column of the index and will estimate the number of entries retrieved for each column by using those histograms. After that, it will calculate the selectivity of the conditions by multiplying the selectivities for every single column. Selectivity is the estimated number of retrieved values divided by the total number of values in that column. When the CBO gets the selectivity, it will multiply it by the index clustering factor to estimate the number of blocks that need to be retrieved. The correlation between the number of blocks and the price of the query is known only to the God and Jonathan, but there definitely is one. -- http://mgogala.byethost5.com
From: Robert Klemme on 5 Dec 2009 08:18
On 12/04/2009 11:21 PM, Mladen Gogala wrote: > On Fri, 04 Dec 2009 16:48:43 +0000, bugbear wrote: > >> Can anyone tell me (or point me at information) about how statistics are >> generated/used for multi-column indexes? >> >> We have a 2 column status model (major and minor status, if you like). >> This is key to our application. >> >> "Naturally" we have put a 2 column index on these two columns. >> >> But we have "doubts" about whether the CBO is always doing as good as >> job as we'd like. >> >> The key question seems to be: >> >> Are the columns "treated as a pair" so that the frequency estimate for a >> pair of statuses depends on both of them or... >> >> Are the columns treated in isolation >> so that the frequency estimate >> is simply obtained by getting the estimate for each column, and >> combining them mathematically. >> >> This would make a big difference in our app, since our major/minor >> status (in practice) have interesting and complex correlations. > > 10g doesn't do multi-column histograms. 10g will create histograms for > every single column of the index and will estimate the number of entries > retrieved for each column by using those histograms. After that, it will > calculate the selectivity of the conditions by multiplying the > selectivities for every single column. Selectivity is the estimated > number of retrieved values divided by the total number of values in that > column. When the CBO gets the selectivity, it will multiply it by the > index clustering factor to estimate the number of blocks that need to be > retrieved. The correlation between the number of blocks and the price of > the query is known only to the God and Jonathan, but there definitely is > one. Are there cases where it makes sense to use a FBI to get "multi column" histogram information? Of course, queries then would also have to use that concatenated value as query criteria which makes usage of this quite nasty (especially if the SQL is generated by some kind of persistence container). Cheers robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ |