From: Frédéric Bergeron on
"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <i0afpa$3is$1(a)fred.mathworks.com>...
> "Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <i0ad6n$91i$1(a)fred.mathworks.com>...
>
> >
> > I could be wrong, but it seems like you can solve this by simple algebra. Let's make the following definitions
> >
> > N=length(data); %known
> > S=sum(data); %known
> > O=sum(data(over)-x)
> > U=sum(x-data(under))
> >
> > Then you have 2 equations that allow you to solve for O and U:
> >
> > O-R*U=0
> > O+U=S;
> ========
>
> Forget it. I pulled the equation O+U=S from my imagination somehow.
>
> In any case, there are at most N different possibilities for the sets
> data(over) and data(under). For each one, x ranges over a known interval separating data(over) from data(under) and in that interval O -R*U is a known linear function of x. You could simply loop over each of these N sets and test whether
> O(x)-R*U(x) has a root in that interval.


Hey thank you,


The O+U=S equation is fine for me, I just don't get the O-U=S-N*x.
Anyway, I decided to do a loop for a lot of values in the interval [min(data) max(data)], than calulate the ratio R for each of them and then take the value associated with the ratio R the closest than the previously defined one.

So I started by making a for loop, than I tought a while loop will be better and more efficient to locate the value with the least error on R. I redefine the upper or lower value of the interval for each while loop, and it's way faster. Here's my code if someone interested:

*****************************
clear all;

%Input
R=2; %Cut/fill ratio
datamin=55;
datamax=65;
data=datamin+(datamax-datamin)*rand(1,50);

%% First method
tic;
x=min(data):(max(data)-min(data))/200:max(data);
for i=1:length(x)
y(i)=abs(R*sum(x(i)-data(data<x(i)))-sum(data(data>=x(i))-x(i)));
end
x_best1=x(y==min(y));
R2_1=sum(data(data>=x_best1)-x_best1)/sum(x_best1-data(data<x_best1));
erreur_ratio1=abs(R2_1-R)/R;
t1=toc;

%% Second method
tic;
erreur_ratio2=1; %error on R
x1=min(data); %Lower value of the interval around the x_best
x2=max(data); %Upper value

while erreur_ratio2>0.01

%Calculs
x_best2=x1+(x2-x1)/2; %Taking the middle value of the interval

R2_2=sum(data(data>=x_best2)-x_best2)/sum(x_best2-data(data<x_best2));
if R2_2==R;
break %If the good ratio is found
elseif R2_2<R
x2=x_best2;
else %R2>R
x1=x_best2;
end
erreur_ratio2=abs(R2_2-R)/R;
end
t2=toc;

%% Methods comparison
fprintf('\nt1= %g\nerreur_ratio1= %g\nx_best1= %g\n',t1,erreur_ratio1,x_best1);
fprintf('\nt2= %g\nerreur_ratio2= %g\nx_best2= %g\n\n',t2,erreur_ratio2,x_best2);

****************
and I get:

t1= 0.030075
erreur_ratio1= 0.0113545
x_best1= 59.3732

t2= 0.0003564
erreur_ratio2= 0.00205718
x_best2= 59.3845

Thanks again for the answers!!
Fred
From: Frédéric Bergeron on
"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message
> The following code implements this, assuming that all data points are unique. I leave the generalization as an exercise.
> ...
> %data
> R=0.75;
> data=rand(1000,1);
>
>
> %engine
> data=sort(data);
> S=sum(data);
> N=length(data);
>
> Sunder=cumsum(data(1:end-1));
> Sover=S-Sunder;
> Nunder=(1:N-1).';
> Nover=N-Nunder;
>
>
> X=(Sover+R*Sunder)./(Nover+R*Nunder);
>
> x=X(X>=data(1:end-1) & X<=data(2:end))
>
> %Check
> over=(data>=x);
> under=~over;
> Rcheck=sum(data(over)-x)/sum(x-data(under))

=========

Hey,

Thanks for this code, but since I got a working code in my last message, I think I'll use the code with the while loop. It's maybe longer than your code but since I don't understand all the things you do, I'll use my code, while referring to yours if I get any problem. Thanks!
From: Matt J on
"Frédéric Bergeron" <frederic.bergeron(a)logiag.com> wrote in message <i0aj3h$e0m$1(a)fred.mathworks.com>...

> Thanks for this code, but since I got a working code in my last message, I think I'll use the code with the while loop. It's maybe longer than your code but since I don't understand all the things you do, I'll use my code, while referring to yours if I get any problem. Thanks!
==============

That's fine, as long as you only need a modestly accurate result. Just for your reference though, I reran your test, but substituting your Method #1 with my closed-form method. As you can see, the closed-form method accuracy achieves machine precision accuracy in approximately the same amount of time.

%Closed form
t1= 0.000141917
erreur_ratio1= 2.96059e-016
x_best1= 0.524563

%Iterative approx.
t2= 0.000346971
erreur_ratio2= 0.00396906
x_best2= 0.52504
From: Frédéric Bergeron on
> That's fine, as long as you only need a modestly accurate result. Just for your reference though, I reran your test, but substituting your Method #1 with my closed-form method. As you can see, the closed-form method accuracy achieves machine precision accuracy in approximately the same amount of time.
>
> %Closed form
> t1= 0.000141917
> erreur_ratio1= 2.96059e-016
> x_best1= 0.524563
>
> %Iterative approx.
> t2= 0.000346971
> erreur_ratio2= 0.00396906
> x_best2= 0.52504

Cool thanks again... Yes I only need modestly accurate result. This was for a GUI leveling agricultural fields. So I have just finished transforming my program in 3D and instead of finding a single scalar, it finds the constant of the plane equation, i.e. c in z=ax+by+c, that respect the ratio R.
From: Roger Stafford on
"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <i0ahod$f3u$1(a)fred.mathworks.com>...
> The following code implements this, assuming that all data points are unique. I leave the generalization as an exercise.
>
> R=0.75;
> data=rand(1000,1);
> %engine
> data=sort(data);
> S=sum(data);
> N=length(data);
> Sunder=cumsum(data(1:end-1));
> Sover=S-Sunder;
> Nunder=(1:N-1).';
> Nover=N-Nunder;
> X=(Sover+R*Sunder)./(Nover+R*Nunder);
> x=X(X>=data(1:end-1) & X<=data(2:end))
- - - - - - - - - -
Hello Matt J. I want to congratulate you on that solution you came up with so quickly. I worked for quite a while on the problem before looking at yours and came up with something that was based on the same principles but a lot more complicated. That is, I directly solved for all the R values that correspond to boundaries between the discrete (and sorted) data values, and then checked to see which pair of these the given R lay in. Then I solved for x using the appropriate Sunder/Sover pair. Your way is more efficient.

However, I notice that both our methods are subject to a possible error which, with an unfortunate choice of R, can actually occur. Because of round off errors it is possible that the x you obtain in

x=X(X>=data(1:end-1) & X<=data(2:end))

can come up with either two answers which are essentially the same or no answer. I have checked and found this to happen when the given R value is very near one of the R boundary values mentioned above. The case of two values is easily dealt with by just choosing the first one since they are essentially equal. However when no answer is provided, that requires a little more effort. Perhaps the two inequalities above can be slightly widened by a sufficient "epsilon" amount based on possible round off errors to always provide at least one value. Or possibly using a more robust method of defining an "interval distance" function which is zero within an interval but equal to the distance from the closer end otherwise, and then choosing the nearest interval with the min function.

Hello Fred. My advice to you is to make use of Matt's solution. It is a very efficient and accurate algorithm. The rare difficulty I have discussed here is quite easily overcome.

Roger Stafford