From: Ray Koopman on
Zach,

The point I was trying to make was that inefficiencies in the code
that was wrapped around MovingAverage were costing substantially more
than compiling was saving.

Ray

----- Zach Bjornson <bjornson(a)mit.edu> wrote:
> Ray,
>
> Critical statement there is "under your test conditions." I played with
> Stefan's problem for quite a while and came up with a few moving average
> functions, and tried them all with and without compiling. His function
> in particular was only about 15% slower compiled than uncompiled on my
> his data set. The functions I came up with were usually faster when
> compiled, depending on the data set. Also depending on the data set,
> some were faster than the built-in MovingAverage function. They were
> never faster than the inbuilt function with his data set, however, so I
> never sent my functions along. Since this came up though, my futzing is
> below.
>
> My initial response to Stefan's inquiry was the thought that Compile
> would have no effect on MovingAverage, or would just add kernel time
> while Mathematica decides to execute it with normal Mathematica code,
> but I'm not sure that's true.
>
> -Zach
>
> (*data-set dependencies are illustrated between the top and bottom half
> of this*)
>
> $HistoryLength=0 (*to prevent artificially high speeds*)
>
> 1.1 Your function
> movAverageOwn2FCorig =
> Compile[{{dataInput, _Real,
> 1}, {days, _Integer}, {length, _Integer}},
> N[Mean[dataInput[[1 + # ;; days + #]]]] & /@
> Range[0, length - days, 1]]
>
> In[165]:=
> First@Timing[
> Do[movAverageOwn2FCorig[Range[1000000], 2, 1000000];, {10}]]/10
> Out[165]= 1.7347
>
> 1.2 Inbuilt Mathematica function
> In[164]:= First@Timing[Do[MovingAverage[Range[1000000], 2];, {10}]]/10
> Out[164]= 1.6942
>
> 1.3 My variation #1
> movAverageOwn2Fa =
> Compile[{{dataInput, _Real, 1}, {days, _Integer}},
> Table[Mean[dataInput[[i ;; i + days - 1]]], {i,
> Length@dataInput - days + 1}]]
>
> In[166]:=
> First@Timing[Do[movAverageOwn2Fa[Range[1000000], 2];, {10}]]/10
> Out[166]= 1.6146
>
> The non-compiled version of this function gives a time of 4.0311
> for this same data set.
>
> 1.4 My variation #2
> movAverageOwn2Fb =
> Compile[{{dataInput, _Real, 1}, {days, _Integer}},
> With[{innerdata = Partition[dataInput, days, 1]},
> Table[Mean[innerdata[[i]]], {i, Length@innerdata}]
> ]]
>
> In[167]:=
> First@Timing[Do[movAverageOwn2Fb[Range[1000000], 2];, {10}]]/10
> Out[167]= 1.6287
>
> Note that this *is* data-set dependent... for example, the same
> functions tested on your data symbol give:
> In[169]:= First@Timing[Do[MovingAverage[data, 2];, {10}]]/10
>
> Out[169]= 0.0015
>
> In[170]:= First@Timing[Do[movAverageOwn2Fa[data, 2];, {10}]]/10
>
> Out[170]= 0.0171
>
> In[171]:= First@Timing[Do[movAverageOwn2Fb[data, 2];, {10}]]/10
>
> Out[171]= 0.0156
>
> In[173]:=
> First@Timing[Do[movAverageOwn2FCorig[data, 2, Length@data];, {10}]]/10
>
> Out[173]= 0.0171
>
> On 4/4/2010 7:45 AM, Ray Koopman wrote:
>> Your compiled movAverageC takes 25% more time than the uncompiled
>>
>> movAv[data_, start_, end_, incr_] := Transpose@PadRight@Join[{data},
>> Table[MovingAverage[data, r], {r, start, end, incr}]]
>>
>> under your test conditions.
>>
>> On Apr 1, 3:59 am, sheaven<shea...(a)gmx.de> wrote:
>>
>>> Hello everyone!
>>>
>>> I am new to Mathematica and am trying to get an understanding of its power. I
>>> plan to use Mathematica mainly for financial data analysis (large
>>> lists...).
>>>
>>> Currently, I am trying to optimize calculation time for calculations
>>> based on some sample data. I started with a moving average of
>>> share prices, because Mathematica already has a built in moving
>>> average function for benchmarking.
>>>
>>> I know that the built-in functions are always more efficient than any
>>> user-built function. Unfortunately, I have to create functions not
>>> built in (e.g. something like "moving variance") in the future.
>>>
>>> I have tried numerous ways to calc the moving average as efficiently
>>> as possible. So far, I found that a function based on Span (or
>>> List[[x;;y]]) is most efficient. Below are my test results.
>>> Unfortunately, my UDF is still more than 5x slower than the built in
>>> function.
>>>
>>> Do you have any ideas on how to further speed up the function? I am already
>>> using Compile and Parallelize.
>>>
>>> This is what I got so far:
>>>
>>> 1. Functions for moving average:
>>>
>>> 1.1. Moving average based on built in function:
>>>
>>> (*Function calcs moving average based on built in function for
>>> specified number of days, e.g. 30 days to 250 days in steps of 10*)
>>> movAverageC = Compile[{{inputData, _Real, 1}, {start, _Integer}, {end,
>>> _Integer}, {incr, _Integer}}, Module[{data, size, i},
>>> size = Length[inputData];
>>> Transpose[Join[{inputData}, PadRight[MovingAverage[inputData, #],
>>> size]& /@ Table[x, {x, start, end, incr}]]]
>>> ]
>>> ]
>>>
>>> 1.2. User defined function based on Span:
>>> (*UDF for moving average based on Span*)
>>> movAverageOwn2FC = Compile[{{dataInput, _Real, 1}, {days, _Integer},
>>> {length, _Integer}},
>>> N[Mean[dataInput[[1 + # ;; days + #]]]]& /@ Range[0, length - days,
>>> 1]
>>> ]
>>>
>>> (*Function calcs moving average based on UDF "movAverageOwn2FC" for
>>> specified number of days, e.g. 30 days to 250 days in steps of 10*)
>>> movAverageOwn2C = Compile[{{dataInput, _Real, 1}, {start, _Integer},
>>> {end, _Integer}, {incr, _Integer}}, Module[{length},
>>> length = Length[dataInput];
>>> Transpose[Join[{dataInput}, PadRight[movAverageOwn2FC[dataInput, #,
>>> length], length]& /@ Range[start, end, incr]]]
>>> ]
>>> ]
>>>
>>> 2. Create sample data:
>>> data = 100 + #& /@ Accumulate[RandomReal[{-1, 1}, {10000}]];
>>>
>>> 3. Test if functions yield same results:
>>> Test1 = movAverageC[data, 30, 250, 10]; (*Moving average for 30 days
>>> to 250 days in steps of 10*)
>>>
>>> Test2 = movAverageOwn2C[data, 30, 250, 10]; (*Moving average for 30
>>> days to 250 days in steps of 10*)
>>>
>>> Test1 == Test2
>>> Out = True
>>>
>>> 4. Performance testing (Single Core):
>>> AbsoluteTiming[Table[movAverageC[data, 30, 250, 10], {n, 1, 20, 1}];]
>>> (*Repeat function 20x for testing purposes*)
>>> Out = {1.3030000, Null}
>>>
>>> AbsoluteTiming[Table[movAverageOwn2C[data, 30, 250, 10], {n, 1, 20,
>>> 1}];] (*Repeat function 20x for testing purposes*)
>>> Out = {11.4260000, Null}
>>>
>>> => Result UDF 9x slower
>>>
>>> 5. Performance testing (multi core):
>>> LaunchKernels[]
>>>
>>> Out = {KernelObject[1, "local"], KernelObject[2, "local"]}
>>>
>>> DistributeDefinitions[data, movAverageOwn2C, movAverageOwn2FC,
>>> movAverageC]
>>>
>>> AbsoluteTiming[Parallelize[Table[movAverageC[data, 30, 250, 10], {n,
>>> 1, 20, 1}]];]
>>> Out = {1.3200000, Null}
>>>
>>> AbsoluteTiming[Parallelize[Table[movAverageOwn2C[data, 30, 250, 10],
>>> {n, 1, 20, 1}]];]
>>> Out = {6.7170000, Null}
>>>
>>> => Result UDF 5x slower
>>> It is very strange that the built-in function does not get faster with
>>> Parallelize
>>>
>>> I would very much appreciate any input on how to decrease calculation
>>> time based on the user defined function.
>>>
>>> Many thanks
>>> Stefan
>>>
>>
>

From: sheaven on
Thanks everyone for the input!

Ray is absolutely correct (Compile does not speed up the calculation
based on my function movAverageOwn2FC). I used Compile with a function
similar to variation#2 from Zach with Partition where Compile resulted
in a speed up of c. 50%. Therefore, I thought that Compile is always
faster.
Obviously, this is a mistake :-)

I did some testing and here is the outcome:

1. Moving average functions


1.1 Built-in Mathematica function

In[1]:= maMathematica =
Compile[{{inputData, _Real,
1}, {start, _Integer}, {end, _Integer}, {incr, _Integer}},
Module[{data, size, i},
size = Length[inputData];
Transpose[
Join[{inputData},
PadRight[MovingAverage[inputData, #], size] & /@
Table[x, {x, start, end, incr}]]]
]
]


1.2 Function based on Span

In[2]:= maSpanF[dataInput_, days_, length_] :=
N[Mean[dataInput[[1 + # ;; days + #]]]] & /@
Range[0, length - days, 1]

In[3]:= maSpan[dataInput_, start_, end_, incr_] := Module[{length},
length = Length[dataInput];
Transpose[
Join[{dataInput},
PadRight[maSpanF[dataInput, #, length], length] & /@
Range[start, end, incr]]]
]


1.3 Function based on Convolve

In[4]:= maConvolveF[data_, windowLen_] :=
Module[{ker = Table[1, {windowLen}]/windowLen},
ListConvolve[ker, data]]

In[5]:= maConvolve =
Compile[{{dataInput, _Real,
1}, {start, _Integer}, {end, _Integer}, {incr, _Integer}},
Module[{length},
length = Length[dataInput];
Transpose[
Join[{dataInput},
PadRight[maConvolveF[dataInput, #], length] & /@
Range[start, end, incr]]]
]
]
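The ListConvolve formulation has a direct analogue in other array
languages. As a rough sketch of the same uniform-kernel idea in NumPy
(the name ma_convolve is mine, not from this thread):

```python
import numpy as np

def ma_convolve(data, days):
    # A length-`days` kernel of 1/days turns convolution into a windowed mean.
    kernel = np.full(days, 1.0 / days)
    # 'valid' keeps only positions where the kernel fully overlaps the data,
    # matching ListConvolve's default output length (n - days + 1).
    return np.convolve(data, kernel, mode='valid')

ma_convolve(np.array([1.0, 2.0, 3.0, 4.0]), 2)
# -> array([1.5, 2.5, 3.5])
```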


1.4 Function based on Drop

In[6]:= maDropF =
Function[{vData, days}, With[{vAcc = Prepend[Accumulate@vData, 0.]},
Developer`ToPackedArray[(Drop[vAcc, days] - Drop[vAcc, -days])/
days, Real]]];

In[7]:= maDrop[dataInput_, start_, end_, incr_] := Module[{length},
length = Length[dataInput];
Transpose[
Join[{dataInput},
PadRight[maDropF[dataInput, #], length] & /@
Range[start, end, incr]]]
]
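The Drop-based method is the classic prefix-sum trick: the sum over any
window is the difference of two cumulative sums, so all windows together
cost O(n). A minimal NumPy sketch of the same idea (function name is
illustrative):

```python
import numpy as np

def moving_average(data, days):
    # acc[i] holds the sum of the first i elements (note the leading zero).
    acc = np.concatenate(([0.0], np.cumsum(data)))
    # The sum over data[i:i+days] is acc[i+days] - acc[i]; vectorized below.
    return (acc[days:] - acc[:-days]) / days

moving_average(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), 2)
# -> array([1.5, 2.5, 3.5, 4.5])
```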


2. Dataset

data = 100 + Accumulate[RandomReal[{-1, 1}, {10000}]];


3. Test for equality

In[15]:= maMathematica[data, 30, 250, 10] ==
maConvolve[data, 30, 250, 10] == maSpan[data, 30, 250, 10] ==
maDrop[data, 30, 250, 10]

Out[15]= False

This is very strange to me. The results only compare equal after
being reduced to a precision of 8:

In[16]:= SetPrecision[maMathematica[data, 30, 250, 10], 8] ==
SetPrecision[maConvolve[data, 30, 250, 10], 8] ==
SetPrecision[maSpan[data, 30, 250, 10], 8] ==
SetPrecision[maDrop[data, 30, 250, 10], 8]

Out[16]= True

Any ideas why this is happening? Numbers of the different functions
start to differ at the 10th decimal. My understanding was that
Mathematica was only testing equality up to the precision it knows to
be correct!?


4. Performance

In[17]:= AbsoluteTiming[
Table[maMathematica[data, 30, 250, 10], {n, 1, 10, 1}];]

Out[17]= {0.8230823, Null}

In[18]:= AbsoluteTiming[
Table[maSpan[data, 30, 250, 10], {n, 1, 10, 1}];]

Out[18]= {6.0676067, Null}

In[21]:= AbsoluteTiming[
Table[maDrop[data, 30, 250, 10], {n, 1, 10, 1}];]

Out[21]= {0.4480448, Null}

In[22]:= AbsoluteTiming[
Table[maConvolve[data, 30, 250, 10], {n, 1, 10, 1}];]

Out[22]= {0.7780778, Null}

In essence:
Span is by far the slowest (12 times slower than Drop)
Built-in function is slower than functions based on Convolve and Drop
(2 times slower than Drop)
Convolve is slightly faster than the built-in function but is slower
than Drop (2 times slower than Drop)
Drop is by far the fastest function

From a performance standpoint, it seems necessary to think a lot about
different algorithms for doing the same thing…
Is there any good reference you can recommend that gives some best
practices for dealing with large lists?

For those interested, I also prepared functions for the coefficient
of variation (CV) based on the variance of the sample (many thanks to
Raffy for his proposal). The impact of the algorithm on the
performance of the function is more or less the same compared to the
moving average.

cvDropF =
Function[{vData, days},
With[{v1 = Prepend[Accumulate[vData], 0.],
v2 = Prepend[Accumulate[vData^2], 0.]},
Developer`ToPackedArray[
Sqrt[(Drop[v2, days] - Drop[v2, -days])/(days -
1) - ((Drop[v1, days] - Drop[v1, -days])/
days)^2*(days/(days - 1))]/(Sqrt[days]*
maDropF[vData, days]), Real]]];

cvDrop[dataInput_, start_, end_, incr_] := Module[{length},
length = Length[dataInput];
Transpose[
Join[{dataInput},
PadRight[cvDropF[dataInput, #], length] & /@
Range[start, end, incr]]]
]

cvSpanF[dataInput_, days_] := Module[{},
Sqrt[Variance[dataInput[[1 + # ;; days + #]]] & /@
Range[0, Length[dataInput] - days, 1]]/(Sqrt[days]*
maSpanF[dataInput, days, Length[dataInput]])
]

cvSpan[dataInput_, start_, end_, incr_] := Module[{length},
length = Length[dataInput];
Transpose[
Join[{dataInput},
PadRight[cvSpanF[dataInput, #], length] & /@
Range[start, end, incr]]]
]

cvConvolveF[data_, days_] :=
Module[{ker1 = Table[1, {days}]/(days - 1),
ker2 = Table[1, {days}]/(days), conv1, conv2},
conv1 = ListConvolve[ker1, data^2];
conv2 = (ListConvolve[ker2, data]^2)*(days/(days - 1));
Sqrt[conv1 - conv2]/(Sqrt[days]*maConvolveF[data, days])
]

cvConvolve =
Compile[{{dataInput, _Real,
1}, {start, _Integer}, {end, _Integer}, {incr, _Integer}},
Module[{length},
length = Length[dataInput];
Transpose[
Join[{dataInput},
PadRight[cvConvolveF[dataInput, #], length] & /@
Range[start, end, incr]]]
]
]
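The cvDropF and cvConvolveF functions above rest on the identity
Var = (Σx² − (Σx)²/n)/(n − 1), evaluated with windowed sums of x and x².
A NumPy sketch of the moving sample standard deviation built the same
way (names are mine; the CV would divide this by Sqrt[days] times the
moving mean). The subtraction of two large running sums here is exactly
the kind of cancellation that costs machine precision:

```python
import numpy as np

def moving_std(data, days):
    # Prefix sums of x and of x^2, each with a leading zero.
    s1 = np.concatenate(([0.0], np.cumsum(data)))
    s2 = np.concatenate(([0.0], np.cumsum(np.square(data))))
    win_sum = s1[days:] - s1[:-days]      # windowed sum of x
    win_sumsq = s2[days:] - s2[:-days]    # windowed sum of x^2
    # Sample variance: (sum(x^2) - sum(x)^2 / n) / (n - 1)
    var = (win_sumsq - win_sum**2 / days) / (days - 1)
    return np.sqrt(var)
```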

@Raffy:
I think there is an error in your moving average function. It should
read like this (only for a single period):
mv =
Function[{vData, days}, With[
{v1 = Prepend[Accumulate[vData], 0.],
v2 = Prepend[Accumulate[vData^2], 0.]},
Developer`ToPackedArray[
Sqrt[(Drop[v2, days] - Drop[v2, -days])/(days -
1) - ((Drop[v1, days] - Drop[v1, -days])/
days)^2*(days/(days - 1))]
, Real]]];



From: Ray Koopman on
On Apr 7, 4:26 am, sheaven <karg.ste...(a)googlemail.com> wrote:
>
> I did some testing and here is the outcome:
>
> [...]
>
> Drop is by far the fastest function

movAc is just as fast as maDropF, and its worst case
has one digit more precision than maDropF's worst case.

In[1]:= << Developer`

In[2]:= movAc[data_, days_] :=
Accumulate[
Prepend[Drop[data, days] - Drop[data, -days],
Tr@Take[data, days]]]/days

In[3]:= maDropF =
Function[{vData, days},
With[{vAcc = Prepend[Accumulate@vData, 0.]},
ToPackedArray[(Drop[vAcc, days] - Drop[vAcc, -days])/days,
Real]]];

In[4]:= PackedArrayQ[
data = 100 + Accumulate[RandomReal[{-1, 1}, {1*^4}]] ]

Out[4]= True

In[5]:= AbsoluteTiming[
m0 = Table[MovingAverage[data, days], {days, 30, 250}];]

Out[5]= {1.500583, Null}

In[6]:= AbsoluteTiming[
m1 = Table[movAc[data, days], {days, 30, 250}];]
{m1 == m0, Max@Abs[m1/m0 - 1]}

Out[6]= {0.792954, Null}
Out[7]= {False, 1.60982*10^-14}

In[8]:= AbsoluteTiming[
m2 = Table[maDropF[data, days], {days, 30, 250}];]
{m2 == m0, Max@Abs[m2/m0 - 1]}

Out[8]= {0.820129, Null}
Out[9]= {False, 1.93734*10^-13}
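movAc builds the first window sum directly, then gets every later sum
by adding the entering element and subtracting the leaving one, which
is what Accumulate over the differences does. The same scheme sketched
in NumPy (hypothetical names):

```python
import numpy as np

def mov_ac(data, days):
    # Each window sum differs from the previous by (entering - leaving).
    deltas = data[days:] - data[:-days]
    # Cumulating the deltas from the first window's sum yields all sums.
    sums = np.cumsum(np.concatenate(([data[:days].sum()], deltas)))
    return sums / days

mov_ac(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), 2)
# -> array([1.5, 2.5, 3.5, 4.5])
```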

From: Bill Rowe on
On 4/7/10 at 7:26 AM, karg.stefan(a)googlemail.com (sheaven) wrote:

>Ray is absolutely correct (Compile does not speed up the calculation
>based on my function movAverageOwn2FC). I used Compile with a
>function similar to variation#2 from Zach with Partition where
>Compile resulted in a speed up of c. 50%. Therefore, I thought that
>Compile is always faster. Obviously, this is a mistake :-)

Definitely. Compile does improve the execution speed of some code.
But in my experience, spending a bit more time understanding the
problem and avoiding constructs like For is more effective at
improving execution speed than simply using Compile. And as you've
noted, Compile will degrade the execution speed of some code. Compile
is definitely something that needs to be used intelligently.

<snip>

>3. Test for equality
>
>In[15]:= maMathematica[data, 30, 250, 10] ==
>maConvolve[data, 30, 250, 10] == maSpan[data, 30, 250, 10] ==
>maDrop[data, 30, 250, 10]

>Out[15]= False

>This is very strange to me. Equality is only given based on
>Precision of 8:

>In[16]:= SetPrecision[maMathematica[data, 30, 250, 10], 8] ==
>SetPrecision[maConvolve[data, 30, 250, 10], 8] ==
>SetPrecision[maSpan[data, 30, 250, 10], 8] ==
>SetPrecision[maDrop[data, 30, 250, 10], 8]

>Out[16]= True

>Any ideas why this is happening? Numbers of the different functions
>start to differ at the 10th decimal. My understanding was that
>Mathematica was only testing equality up to the precision it knows
>to be correct!?

Per the documentation, Equal treats two machine-precision values
as equal when they match except for the last 7 binary
digits (~2 decimal digits). The reason you are not getting True
for the comparison above is that each method does a different set
of operations on the data, which leads to differences due to the
limits of machine precision. You can gain a bit more insight as follows:

In[27]:= data = 100 + Accumulate[RandomReal[{-1, 1}, {10000}]];

In[28]:= a = maMathematica[data, 30, 250, 10];
b = maConvolve[data, 30, 250, 10];
c = maSpan[data, 30, 250, 10];
d = maDrop[data, 30, 250, 10];

In[29]:= Outer[Equal, {a, b, c, d}, {a, b, c, d}, 1]

Out[29]= {{True, True, True, False}, {True, True, True, False}, {True,
True, True, False}, {False, False, False, True}}

which shows the method based on Drop is the one causing your
comparison to yield False rather than True.

In[30]:= Outer[
Union[Flatten@Chop[#1 - #2]] == {0} &, {a, b, c, d}, {a, b, c,
d}, 1]

Out[30]= {{True, True, True, True}, {True, True, True, True}, {True,
True, True, True}, {True, True, True, True}}

which shows that each method returns the same result to within
the limitations of machine-precision computation.

Note,

In[34]:= data = 100 + Accumulate[RandomReal[{-1, 1}, {500}]];

In[35]:= a = maMathematica[data, 30, 250, 10];
b = maConvolve[data, 30, 250, 10];
c = maSpan[data, 30, 250, 10];
d = maDrop[data, 30, 250, 10];

In[36]:= Outer[Equal, {a, b, c, d}, {a, b, c, d}, 1]

Out[36]= {{True, True, True, True}, {True, True, True, True}, {True,
True, True, True}, {True, True, True, True}}

showing that when the number of operations is reduced, all methods
yield True when compared using Equal. So, the issue is clearly
loss of significance as computations are done with machine
precision values.

One other thought you may want to consider: you tested for
execution times only. Although I've not looked at each of the
methods in detail, I suspect the memory requirements for each
will also differ. When working with large data sets, this might
be a more significant factor than execution time.
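The machine-precision behaviour Bill describes is easy to reproduce
outside Mathematica. A NumPy sketch comparing a direct windowed mean
against the prefix-sum version (seed and sizes arbitrary): the two
typically differ in the last bits yet agree within a relative tolerance.

```python
import numpy as np

rng = np.random.default_rng(0)
data = 100 + np.cumsum(rng.uniform(-1, 1, 10000))
days = 30

# Direct per-window means: each window is summed independently.
direct = np.array([data[i:i + days].mean()
                   for i in range(len(data) - days + 1)])

# Prefix-sum means: large running totals are subtracted,
# discarding low-order bits of each window sum.
acc = np.concatenate(([0.0], np.cumsum(data)))
prefix = (acc[days:] - acc[:-days]) / days

# Bitwise equality usually fails, but the relative error is tiny.
np.allclose(direct, prefix)  # True
```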


From: Ray Koopman on
On Apr 7, 4:26 am, sheaven <karg.ste...(a)googlemail.com> wrote:
> I did some testing and here is the outcome:
>
> [...]
>
> Drop is by far the fastest function

You didn't give ListConvolve a fair test, because you defined the
kernel as a list of exact rationals. If you define it as a list of
reals, ListConvolve is just as fast as drop-type methods and does
not lose precision.

In[1]:= movAc[data_, days_] :=
Accumulate[
Prepend[Drop[data, days] - Drop[data, -days],
Tr@Take[data, days]]]/days

In[2]:= maDropF =
Function[{vData, days},
With[{vAcc = Prepend[Accumulate@vData, 0.]},
Developer`ToPackedArray[(Drop[vAcc, days] - Drop[vAcc, -days])/
days, Real]]];

In[3]:= moCon[data_, days_] :=
ListConvolve[ConstantArray[1./days, days], data]

In[4]:= data = 100 + Accumulate[RandomReal[{-1, 1}, {1*^4}]] ;

In[5]:= AbsoluteTiming[
m0 = Table[MovingAverage[data, days], {days, 30, 250}];]

Out[5]= {1.514497, Null}

In[6]:= AbsoluteTiming[
m1 = Table[movAc[data, days], {days, 30, 250}];]
{m1 == m0, Max@Abs[m1/m0 - 1]}

Out[6]= {0.831964, Null}
Out[7]= {False, 4.89608*10^-14}

In[8]:= AbsoluteTiming[
m2 = Table[maDropF[data, days], {days, 30, 250}];]
{m2 == m0, Max@Abs[m2/m0 - 1]}

Out[8]= {0.892764, Null}
Out[9]= {False, 5.38902*10^-13}

In[10]:= AbsoluteTiming[
m3 = Table[moCon[data, days], {days, 30, 250}];]
{m3 == m0, Max@Abs[m3/m0 - 1]}

Out[10]= {0.823370, Null}
Out[11]= {True, 1.11022*10^-16}