I was doing some signal processing in R a few weeks ago and was somewhat disappointed in how long it was taking to run my experiments. Digging around a little, I found that the vast majority of the time was being taken up by autocorrelation and cross-correlation computations.
In particular, I was computing a “Type-II” autocorrelation, defined as
The inefficient way is to compute this directly in R.
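A minimal sketch of that direct computation, assuming the usual unnormalized lagged-product form a_k = sum(x_i * x_{i+k}, i = 1..n-k); the helper name acf_naive is mine:

```r
# Direct O(n^2) autocorrelation: for each lag k, sum the lagged products.
acf_naive <- function(x) {
  n <- length(x)
  sapply(0:(n - 1), function(k) sum(x[1:(n - k)] * x[(1 + k):n]))
}
```

For even modestly long vectors, the quadratic number of multiplications is what makes this slow.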
But autocorrelation can be computed efficiently via the FFT, since correlation in the time domain becomes pointwise multiplication in the frequency domain.
You can get this efficient computation by calling the convolve function the right way.
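Something along these lines (a sketch: type = "open" cross-correlates x with itself, producing all 2n - 1 lags, with lag 0 landing at index n):

```r
# FFT-based autocorrelation via stats::convolve.
acf_convolve <- function(x) {
  n <- length(x)
  # positions n .. 2n-1 of the "open" result hold lags 0 .. n-1
  convolve(x, x, type = "open")[n:(2 * n - 1)]
}
```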
But this version was still taking a long time, and I had to poke into the convolve implementation to finally figure out why.
When convolve is called for type = 'open' with real arguments, it simplifies to the following.
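Paraphrasing the R sources, with the other types stripped out so it stands alone:

```r
# stats::convolve, reduced to the type = "open", real-input path.
convolve_open <- function(x, y, conj = TRUE) {
  n  <- length(x)
  ny <- length(y)
  # zero-pad both inputs out to the full output length, n + ny - 1
  x <- c(rep.int(0, ny - 1), x)
  n <- length(y <- c(y, rep.int(0, n - 1)))
  # two forward FFTs plus one inverse FFT on every call
  Re(fft(fft(x) * (if (conj) Conj(fft(y)) else fft(y)), inverse = TRUE)) / n
}
```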
Can you see the two problems?
The thing that jumped out at me first is that it computes the same FFT twice when it is used for autocorrelation, since x and y are the same vector. Eliminating the duplicate should give a 2x speedup.
The second took me a bit more trial-and-error investigation, including trying to benchmark the internal fft function against FFTW. I was testing them with different sized vectors when it finally dawned on me: the length of my input vector was a power of 2, but the zero padding inside of convolve was creating a length that was no longer a power of 2. The FFT implementation is a bit slower for those lengths, so convolve is doing three slow FFT operations when I just need two fast ones.
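A quick timing check makes the effect visible (a sketch; exact numbers vary by machine and R build):

```r
n <- 2^20
x_fast <- rnorm(n)          # power-of-2 length
x_slow <- rnorm(2 * n - 1)  # the padded length convolve produces
system.time(fft(x_fast))    # fast case
system.time(fft(x_slow))    # noticeably slower
```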
With that knowledge, I whipped up the following. Finally, I have a fast autocorrelation that runs in about 35% of the time the convolve version took.
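A sketch of that final version: pad up to the next power of 2 so the transforms stay on a fast length, and reuse the single forward FFT (the helper name and the exact padding rule are my choices):

```r
# Fast autocorrelation: one forward FFT on a power-of-2 length, reused
# for both sides of the product, plus one inverse FFT.
acf_fast <- function(x) {
  n <- length(x)
  # pad so all 2n - 1 lags fit without circular wraparound,
  # rounded up to a power of 2 to keep the FFT fast
  n_pad <- 2^ceiling(log2(2 * n - 1))
  x_pad <- c(x, rep.int(0, n_pad - n))
  X <- fft(x_pad)
  # inverse transform of the power spectrum gives the autocorrelation
  r <- Re(fft(X * Conj(X), inverse = TRUE)) / n_pad
  r[1:n]  # lags 0 .. n-1
}
```

A quick spot check that it agrees with the convolve version:

```r
x <- rnorm(2^10)
all.equal(acf_fast(x), acf_convolve(x))  # should be TRUE
```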