From: Walter Roberson on 12 Aug 2010 19:32

Safa wrote:
> I am more of an Excel user, and I already know that the frequency
> command counts the instance of numbers that are less than or equal to
> the upper limit of each bin. Obviously Matlab is using the formula: n(k)
> counts the value x(i) if edges(k) <= x(i) < edges(k+1). It is unclear
> however what Matlab does for the last bin, does it just count instances
> of 1 exactly?

What it does is precisely documented:

http://www.mathworks.com/access/helpdesk/help/techdoc/ref/histc.html

"n(k) counts the value x(i) if edges(k) <= x(i) < edges(k+1). The last bin counts any values of x that match edges(end). Values outside the values in edges are not counted. Use -inf and inf in edges to include all non-NaN values."

To repeat for emphasis: the last bin counts any values of x that match edges(end).

> As I mentioned in my previous message, I already looked histc in Matlab
> help, and I requested a way to CHANGE the histc so that it matches
> Excel. Histc is an inbuild command in Matlab and I don't know how to
> change the above inbuilt equation.

Excel does not appear to follow a consistent method with regard to its lower bound. You might be able to duplicate Excel's inconsistent method via

    T = histc(Y,[Bins(1)*(1+eps) Bins(2:end)*(1-eps)]);
    T(1:end-1)

However, I am basing this on a single example, and there might be a deeper, more subtle reason why the 2 is not matched.
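The upper-inclusive rule Safa describes can also be written out directly instead of perturbing the edges. The following is only a sketch using the example data posted in this thread. The names lowerLimit and counts are made up for illustration, and treating the first bin as having no lower limit is an assumption inferred from the reported Excel result of 3,4,4,4,4, not something taken from Excel's documentation.

```matlab
% Example data as posted in the thread.
Y = [0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 1.2];
Bins = [0.2 0.4 0.6 0.8 1];

% Upper-inclusive counting: bin k holds values with lowerLimit(k) < x <= Bins(k).
% ASSUMPTION: the first bin has no lower limit, so it counts everything <= Bins(1).
lowerLimit = [-inf Bins(1:end-1)];

counts = zeros(size(Bins));
for k = 1:numel(Bins)
    counts(k) = sum(Y > lowerLimit(k) & Y <= Bins(k));
end

disp(counts)   % counts is [3 4 4 4 4] for this data; 1.2 exceeds the last limit and is not counted
```

Because Y and Bins are typed in here as identical literals, the comparisons at the bin limits are exact. With computed data the boundary cases may fall on either side of a limit.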
From: Steven_Lord on 13 Aug 2010 10:02

"Safa " <enxss10(a)nottingham.ac.uk> wrote in message
news:i41uug$6ln$1(a)fred.mathworks.com...
> "Roger Stafford" <ellieandrogerxyzzy(a)mindspring.com.invalid> wrote in
> message <i3vgp9$io5$1(a)fred.mathworks.com>...
>> > Thanks for your suggestion. I have read histc documentation several
>> > times. It gives the following equation:
>> > n(k) counts the value x(i) if edges(k) <= x(i) < edges(k+1). What I
>> > would like to be able to do, is to tweak the histc command so that it
>> > gives the same frequency distribution as in excel. Is this possible? At
>> > the moment, the Excel and Matlab are counting the numbers differently,
>> > and I am at a loss to why it is doing this. Appreciate any further
>> > advice you could give on this matter. Thanks in advance.
>> - - - - - - - -
>> I repeat! This something you are entirely capable of finding out for
>> yourself with the use of the sort function. Pin down the individual data
>> value or values where Excel made one decision and histc a different one
>> and then you are well on your way to solving your own problem.
>>
>> Roger Stafford
>
> I must say I wasn't happy with the tone of your second message as
> this is a serious query about the operation of Matlab. I will respond to
> it, by submitting the following example.
>
> Y=[0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85
> 0.9 0.95 1 1.2]
> Bins=[0.2 0.4 0.6 0.8 1]
> Histc(Y,Bins) gives 4,4,4,4,1
>
> Frequency command in Excel gives 3,4,4,4,4
>
> I am more of an Excel user, and I already know that the frequency command
> counts the instance of numbers that are less than or equal to the upper
> limit of each bin. Obviously Matlab is using the formula: n(k) counts the
> value x(i) if edges(k) <= x(i) < edges(k+1).

Indeed; that is the documented behavior of the function.
http://www.mathworks.com/access/helpdesk/help/techdoc/ref/histc.html

> It is unclear however what Matlab does for the last bin, does it just
> count instances of 1 exactly?

That too is documented on the page above:

"The last bin counts any values of x that match edges(end). Values outside the values in edges are not counted."

> As I mentioned in my previous message, I already looked histc in Matlab
> help, and I requested a way to CHANGE the histc so that it matches Excel.
> Histc is an inbuild command in Matlab and I don't know how to change the
> above inbuilt equation.

We do not provide the source code for HISTC, so you cannot change what HISTC does short of shadowing it (which I do NOT recommend). I suppose you could create your own HISTC subfunction or private function to shadow it without making your shadowed version globally visible, but I'd still be wary of doing so.

What I would recommend is creating your own function that does the equivalent of Excel's FREQUENCY command. You would likely need to test it thoroughly, as the documentation for the FREQUENCY command leaves unsaid (at least in my cursory glance) what the command does in certain (potentially common) scenarios.

For your particular issue, you will need to be extra careful, as many of the numbers in your Y vector and almost all of the numbers in your Bins vector cannot be exactly represented in floating-point double precision. I'm assuming that for your real problem (rather than the demonstration example above) you will not be typing in the Y vector but are computing it somehow. If that's the case, you may think the third element of Y is the same as the first element of the Bins vector, but it may not be.

    x = 0:0.1:1;
    x == 0.3 % returns all false values, even though it _looks_ like 0.3 is the fourth element of x.

This is the CORRECT behavior. See question 6.1 in the newsgroup FAQ for more information on this issue related to floating-point arithmetic.
--
Steve Lord
slord(a)mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
To contact Technical Support use the Contact Us link on http://www.mathworks.com
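The floating-point point above can be illustrated by comparing with a tolerance instead of with ==. This is a rough sketch only: the tolerance value tol and the variable names are hypothetical choices for illustration, not something recommended in the posts above.

```matlab
x = 0:0.1:1;

% Exact comparison: no element of x is bit-for-bit equal to the literal 0.3.
exactMatch = any(x == 0.3);            % false, as described in the post above

% Tolerance-based comparison.
% ASSUMPTION: tol is an illustrative value, chosen to be far smaller than
% the 0.1 spacing of x and far larger than the rounding error involved.
tol = 1e-12;
nearMatch = find(abs(x - 0.3) < tol);  % 4, the element that looks like 0.3

disp(exactMatch)
disp(nearMatch)
```

The same idea applies when deciding whether a computed value sits on a bin limit. Any such tolerance needs to be small relative to the bin width, otherwise values that genuinely belong to the neighboring bin will be moved.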