From: Paul on
Hello! I am looking for a bit of help.

I need to find the fastest algorithm to search a sorted list of numbers so I can calculate a linear interpolation. So far I have been using a binary search which finds the two values that bracket the number desired. Essentially I have a series of x and y values describing a arbitrary function and am trying to find the y value for an arbitrary x value not in the list but within the bounds of the list. Right now I'm achieving 4E-7 seconds per search on a Intel Q9550 w/ 4GB RAM in Matlab R2010a.

Here is a sample of the data I am working with:
x=[-9.2103 -6.0756 -4.9966 -4.3418 -3.8737 -3.511 -3.216 -2.9681 -2.7549 -2.5683 -2.4028 -2.2545 -2.1203 -1.9981 -1.8861 -1.7829 -1.6875 -1.5988 -1.5162 -1.439 -1.3666 -1.2987 -1.2349 -1.1747 -1.1179 -1.0643 -1.0136 -0.9656 -0.9202 -0.8772 -0.8363 -0.7976 -0.7609 -0.7261 -0.6931 -0.6619 -0.6322 -0.6041 -0.5776 -0.5524 -0.5287 -0.5063 -0.4853 -0.4654 -0.4468 -0.4294 -0.4132 -0.3981 -0.384 -0.3711 -0.3592 -0.3484 -0.3385 -0.3297 -0.3219 -0.315 -0.3092 -0.3042 -0.3003 -0.297 -0.2937 -0.2904 -0.2871 -0.2831 -0.2782 -0.2724 -0.2656 -0.258 -0.2494 -0.24 -0.2298 -0.2187 -0.2067 -0.194 -0.1804 -0.166 -0.1509 -0.1351 -0.1185 -0.1012 -0.0833 -0.0646 -0.0453 -0.0254 -0.0049 0.0162 0.0379 0.0601 0.0828 0.1061 0.1298 0.154 0.1786 0.2037 0.2292 0.255 0.2813 0.3078 0.3348 0.362 0.3895 0.4174 0.4455 0.4738 0.5025 0.5313 0.5603 0.5896 0.619 0.6487 0.6785 0.7084 0.7385 0.7688 0.7991 0.8296 0.8602 0.8909 0
..9217 0.9525 0.9835 1.0145 1.0456 1.0767 1.1079 1.1391 1.1704 1.2017 1.233 1.2644 1.2957 1.3271 1.3586 1.39 1.4214 1.4528 1.4843 1.5157 1.5471 1.5785 1.61 1.6413 1.6727 1.7041 1.7354 1.7667 1.798 1.8293 1.8606 1.8918 1.923 1.9542 1.9853 2.0164 2.0475 2.0785 2.1096 2.1406 2.1715 2.2025 2.2333 2.2642 2.295 2.3258 2.3566 2.3873 2.418 2.4486 2.4792 2.5098 2.5404 2.5709 2.6014 2.6318 2.6622 2.6926 2.723 2.7533 2.7835 2.8138 2.844 2.8742 2.9043 2.9344 2.9645 2.9946 3.0246 3.0546 3.0846 3.1145 3.1444 3.1743 3.2042 3.234 3.2638 3.2935 3.3233 3.353 3.3826 3.4123 3.4419 3.4715 3.5011 3.5307 3.5602 3.5897 3.6192 3.6486 3.6781 3.7075 3.7368 3.7662 3.7956 3.8249 3.8542 3.8835 3.9127 3.942 3.9712 4.0004 4.0296 4.0588 4.0879 4.117 4.1462 4.1753 4.2043 4.2334 4.2625 4.2915 4.3205 4.3495 4.3785 4.4075 4.4365 4.4654 4.4943 4.5233 4.5522 4.581 4.6099];

looking for value 1.6356.

The x values are not evenly spaced over the whole range, though are nearly so over a subrange. Any help or suggestions will be really appreciated!
From: Jan Simon on
Dear Paul,

> So far I have been using a binary search which finds the two values that bracket the number desired.

If you post the code, there would be the possibility, that somebody accelerates it.
Do you search just one or plenty of values? How many?
Did you try a binary search in a MEX function?
Please explain more details.

Kind regards, Jan
From: Paul on
"Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message <i2psrk$2v6$1(a)fred.mathworks.com>...
> Dear Paul,
>
> > So far I have been using a binary search which finds the two values that bracket the number desired.
>
> If you post the code, there would be the possibility, that somebody accelerates it.
> Do you search just one or plenty of values? How many?
> Did you try a binary search in a MEX function?
> Please explain more details.
>
> Kind regards, Jan

I search for just one value each time the function is called. The function is just a .m file as I dont have a mex file for binary search. I should probably look for one. The first comparison deals with the idea that the gap between the first and second points in x is large compared to the others and so in a even distribution of values in the interval, likely to be the one in a fair number of the incidences.

function Fprime=interpolation(R,F,Rprime)
%R is the list of x-values from the function
%F is the list of y-values from the function
%Rprime is the x-value of interest
%Fprime is the value of the function at Rprime

if Rprime<R(2)
Fprime=((F(2)-F(1))/(R(2)-R(1)))*(Rprime-R(1))+F(1);
else
searching=true;
low=2;
high=length(R);
mid=round(high/2);
while searching
if R(mid)>Rprime
high=mid;
mid=round((low+high)/2);
else
low=mid;
mid=round((low+high)/2);
end
searching=(high-low)>1;
end
Fprime=((F(high)-F(low))/(R(high)-R(low)))*(Rprime-R(low))+F(low);
end

Hope this helps!
From: Jan Simon on
Dear Paul,

> > Do you search just one or plenty of values? How many?

> I search for just one value each time the function is called.

Again: How often do you search for a number in a certain vector?
If you call the function millions of times, it would a good idea to create an equally spaced vector once together with an index vector pointing to the original indices.
Or you can divide the vector in 4 almost equally sized intervals and perform the first 2 binary search steps in a vectorized way.

But if you search in a certain vector once only, such pre-processing does not help. So please post the real (magnitudes of the) dimensions and numbers of calls.

Kind regards, Jan
From: Paul on
"Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message <i2q3c8$ebk$1(a)fred.mathworks.com>...
> Dear Paul,
>
> > > Do you search just one or plenty of values? How many?
>
> > I search for just one value each time the function is called.
>
> Again: How often do you search for a number in a certain vector?
> If you call the function millions of times, it would a good idea to create an equally spaced vector once together with an index vector pointing to the original indices.
> Or you can divide the vector in 4 almost equally sized intervals and perform the first 2 binary search steps in a vectorized way.
>
> But if you search in a certain vector once only, such pre-processing does not help. So please post the real (magnitudes of the) dimensions and numbers of calls.
>
> Kind regards, Jan

Jan,

Thanks for all your questions and advice so far! Sorry for my lack of coherence.

Each vector is searched something like 150,000 times. Each Rprime is used on average 12 times, in 12 different R and F vectors. I search something like 1,800,000 Rprimes.

How would this get preprocessed?

What this function is trying to do it recreate the spherical bessel functions i and k but cut down on the time significantly. I made a list of x and y values for spherical bessel functions to order 25 so that the error of a linear interpolation over a subinterval defined by adjacent x-values is below some bound. Each of the R and F vectors are those for a particular order of the bessel function. Essentially F_i=sbesselk(i,R_i). This computation takes really really long especially for higher order bessel functions. Hence the interpolation and search.