Introduction
Convolution and correlation are foundational operations in signal processing. They underpin filtering, feature extraction, pattern matching, and system identification, and they form the mathematical bedrock of transform-domain methods such as the FFT and the wavelet transform.
This article reviews the continuous and discrete definitions of convolution and correlation, clarifies how the two are related, and uses the convolution theorem to bridge the time and frequency domains. We then implement and benchmark np.convolve, scipy.signal.correlate, and FFT-based fast convolution in Python. Finally, we show that the moving average filter is simply a convolution with a box window.
Definition of Convolution
Continuous Convolution
For two continuous functions \(f(t)\) and \(g(t)\), convolution is defined as
\[(f * g)(t) = \int_{-\infty}^{\infty} f(\tau) \cdot g(t - \tau) \, d\tau \tag{1}\]
Intuitively, we flip \(g\) in time, slide it by \(t\), multiply with \(f\), and integrate. Because the response of a linear time-invariant (LTI) system is the convolution of its input with its impulse response, filtering is convolution in the most fundamental sense.
Discrete Convolution
For discrete signals \(x[n]\) and \(h[n]\),
\[y[n] = (x * h)[n] = \sum_{k=-\infty}^{\infty} x[k] \cdot h[n - k] \tag{2}\]
In practice, we time-reverse \(h\), slide it across \(x\), take elementwise products, and sum. If \(x\) has length \(N\) and \(h\) has length \(M\), the linear convolution output has length \(N + M - 1\).
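To make equation \((2)\) concrete, here is a small illustrative helper (`conv_direct` is our own name, not a library function) that evaluates the sum with explicit loops and checks both the values and the \(N + M - 1\) output length against np.convolve:

```python
import numpy as np

def conv_direct(x, h):
    """Direct evaluation of equation (2) for finite-length signals."""
    N, M = len(x), len(h)
    y = np.zeros(N + M - 1)
    for n in range(N + M - 1):
        # only k with 0 <= k < N and 0 <= n-k < M contribute
        for k in range(max(0, n - M + 1), min(N, n + 1)):
            y[n] += x[k] * h[n - k]
    return y

x = np.array([1.0, 2.0, 3.0])
h = np.array([1.0, -1.0])
print(conv_direct(x, h))   # length 3 + 2 - 1 = 4
print(np.convolve(x, h))   # NumPy agrees
```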
Algebraic Properties
Convolution satisfies several useful identities.
- Commutativity: \(x * h = h * x\)
- Associativity: \((x * h) * g = x * (h * g)\)
- Distributivity: \(x * (h + g) = x * h + x * g\)
Cascaded filters can therefore be merged into a single equivalent kernel.
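All three identities are easy to verify numerically on random signals (the lengths below are arbitrary; distributivity requires \(h\) and \(g\) to have equal length so they can be added):

```python
import numpy as np

rng = np.random.default_rng(0)
x, h, g = rng.standard_normal(8), rng.standard_normal(4), rng.standard_normal(4)

# Commutativity: x * h == h * x
assert np.allclose(np.convolve(x, h), np.convolve(h, x))
# Associativity: (x * h) * g == x * (h * g)
assert np.allclose(np.convolve(np.convolve(x, h), g),
                   np.convolve(x, np.convolve(h, g)))
# Distributivity: x * (h + g) == x * h + x * g
assert np.allclose(np.convolve(x, h + g),
                   np.convolve(x, h) + np.convolve(x, g))
print('all three identities hold numerically')
```

Associativity is exactly why a cascade of filters can be pre-combined into one kernel (`np.convolve(h, g)`) before touching the data.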
Definition of Correlation
Cross-Correlation
The cross-correlation of \(x[n]\) and \(y[n]\) at lag \(\ell\) is
\[R_{xy}[\ell] = \sum_{n=-\infty}^{\infty} x[n + \ell] \cdot y[n] \tag{3}\]
(Sign conventions for the lag vary across texts; the one above matches what NumPy's and SciPy's correlate compute.) It measures how similar \(x\) is to a shifted version of \(y\), and is widely used for delay estimation and template matching.
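A sketch of delay estimation with this machinery: we delay a noise signal by a known number of samples (37 here, an arbitrary choice for illustration) and recover it from the lag of the cross-correlation peak, using scipy.signal.correlation_lags to map array indices to lags:

```python
import numpy as np
from scipy.signal import correlate, correlation_lags

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
delay = 37
y = np.concatenate([np.zeros(delay), x])  # y is x delayed by 37 samples

corr = correlate(y, x, mode='full')                  # R_yx over all lags
lags = correlation_lags(len(y), len(x), mode='full') # lag axis for corr
print('estimated delay:', lags[np.argmax(corr)])     # → 37
```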
Auto-Correlation
When \(y = x\), we obtain the auto-correlation, which detects periodicities and self-similarity in a single signal:
\[R_{xx}[\ell] = \sum_{n=-\infty}^{\infty} x[n + \ell] \cdot x[n] \tag{4}\]
\(R_{xx}[0]\) equals the total energy (and, for a zero-mean signal, is proportional to the variance) and is the global maximum of \(R_{xx}\).
Relationship to Convolution
Cross-correlation and convolution differ only in whether the second argument is time-reversed. For real signals,
\[R_{xy}[\ell] = (x * \tilde{y})[\ell], \quad \tilde{y}[n] = y[-n] \tag{5}\]
so correlation can be implemented as convolution with a time-reversed kernel. For complex signals, also take the complex conjugate (\(\tilde{y}[n] = y^*[-n]\)).
The Convolution Theorem
Time-Domain Convolution ↔ Frequency-Domain Multiplication
The most consequential property of convolution is that it becomes pointwise multiplication in the frequency domain. With the discrete Fourier transform (DFT) denoted \(\mathcal{F}\),
\[\mathcal{F}\{x * h\}[k] = X[k] \cdot H[k] \tag{6}\]
Conversely, multiplication in time corresponds to (circular) convolution in frequency — the very mechanism behind spectral leakage discussed in the windowing/PSD article:
\[\mathcal{F}\{x \cdot h\}[k] = \frac{1}{N}(X * H)[k] \tag{7}\]
Fast Convolution Complexity
Direct convolution of two length-\(N\) signals costs \(O(N^2)\). Combined with the \(O(N \log N)\) FFT, the convolution theorem gives
\[y = \mathcal{F}^{-1}\{ \mathcal{F}\{x\} \cdot \mathcal{F}\{h\} \} \tag{8}\]
at total cost \(O(N \log N)\). For long kernels this is often orders of magnitude faster. To produce linear convolution (rather than circular), zero-pad both inputs to length at least \(N + M - 1\).
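The need for zero-padding is easy to see on a tiny example: a length-\(N\) DFT implements circular convolution, so without padding the tail of the linear result wraps around and corrupts the first samples, while padding to \(N + M - 1\) recovers np.convolve exactly:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, 1.0, 1.0])
L = len(x) + len(h) - 1  # 6

# No padding: length-4 FFTs give circular convolution (wrap-around)
y_circ = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, n=len(x))))
# Padded to L: equation (8) now reproduces linear convolution
y_lin = np.real(np.fft.ifft(np.fft.fft(x, n=L) * np.fft.fft(h, n=L)))

print('circular   :', np.round(y_circ, 3))  # [8. 7. 6. 9.] — tail wrapped in
print('padded     :', np.round(y_lin, 3))   # [1. 3. 6. 9. 7. 4.]
print('np.convolve:', np.convolve(x, h))    # [1. 3. 6. 9. 7. 4.]
```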
Correlation Theorem
A similar identity holds for correlation. With the lag convention \(R_{xy}[\ell] = \sum_n x[n+\ell] \cdot y[n]\) (the one implemented by SciPy's correlate),
\[\mathcal{F}\{R_{xy}\}[k] = X[k] \cdot Y^*[k] \tag{9}\]
In particular, the Fourier transform of the auto-correlation equals the power spectral density \(|X[k]|^2\) (Wiener–Khinchin theorem), which is the theoretical foundation of PSD estimation.
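The Wiener–Khinchin relation can be checked numerically for the circular (DFT) case: the inverse FFT of \(|X|^2\) should match the circular auto-correlation computed directly in the time domain (the signal and length below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 256
x = rng.standard_normal(N)

# Auto-correlation via the correlation theorem: F{R_xx} = |X|^2
X = np.fft.fft(x)
r_freq = np.real(np.fft.ifft(X * np.conj(X)))

# Same quantity computed directly at circular lags:
# R_xx[l] = sum_n x[n] * x[(n + l) mod N]
r_time = np.array([np.sum(x * np.roll(x, -ell)) for ell in range(N)])

print('max difference:', np.max(np.abs(r_freq - r_time)))  # ~ round-off
```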
Python Implementation
np.convolve and scipy.signal.correlate
We start by validating the basic APIs.
import numpy as np
from scipy.signal import correlate
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
h = np.array([1.0, 0.5, 0.25])
# Linear convolution (mode='full' yields length N+M-1)
y_conv = np.convolve(x, h, mode='full')
print('convolve:', y_conv)
# Cross-correlation (mode='full')
y_corr = correlate(x, h, mode='full')
print('correlate:', y_corr)
# Verify equation (5): correlation = convolution with flipped h
y_corr_via_conv = np.convolve(x, h[::-1], mode='full')
print('correlate via flipped conv:', y_corr_via_conv)
correlate(x, h) and np.convolve(x, h[::-1]) agree, confirming equation \((5)\).
FFT-Based Fast Convolution
We benchmark direct convolution against FFT-based convolution for a long kernel.
import numpy as np
import time
from scipy.signal import fftconvolve
np.random.seed(0)
N = 1 << 16 # 65536
M = 1 << 12 # 4096
x = np.random.randn(N)
h = np.random.randn(M)
# Direct convolution
t0 = time.perf_counter()
y_direct = np.convolve(x, h, mode='full')
t_direct = time.perf_counter() - t0
# SciPy FFT convolution
t0 = time.perf_counter()
y_fft = fftconvolve(x, h, mode='full')
t_fft = time.perf_counter() - t0
# Manual FFT convolution implementing equation (8)
L = N + M - 1
n_fft = 1 << int(np.ceil(np.log2(L)))
t0 = time.perf_counter()
X = np.fft.rfft(x, n=n_fft)
H = np.fft.rfft(h, n=n_fft)
y_manual = np.fft.irfft(X * H, n=n_fft)[:L]
t_manual = time.perf_counter() - t0
print(f'direct : {t_direct*1e3:7.2f} ms')
print(f'fftconv : {t_fft*1e3:7.2f} ms')
print(f'manual : {t_manual*1e3:7.2f} ms')
print(f'max err : {np.max(np.abs(y_direct - y_fft)):.2e}')
For \(N = 65536\) and \(M = 4096\), np.convolve typically takes hundreds of milliseconds, while fftconvolve finishes in roughly a dozen milliseconds. The maximum error is at floating-point round-off level (\(\sim 10^{-10}\)).
Connection to the Moving Average
A simple moving average (SMA) of length \(L\) is a convolution with the box window \(h_{\text{box}}[n]\) whose coefficients are all \(1/L\):
\[h_{\text{box}}[n] = \frac{1}{L}, \quad 0 \leq n \leq L - 1 \tag{10}\]
\[y[n] = \frac{1}{L}\sum_{k=0}^{L-1} x[n - k] = (x * h_{\text{box}})[n] \tag{11}\]
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
fs = 1000
t = np.arange(0, 1, 1/fs)
signal = np.sin(2*np.pi*5*t) + 0.4*np.random.randn(len(t))
L = 21
h_box = np.ones(L) / L
# Moving average = convolution with the box window
sma = np.convolve(signal, h_box, mode='same')
# Cross-check with scipy.ndimage.uniform_filter1d
from scipy.ndimage import uniform_filter1d
sma_ref = uniform_filter1d(signal, size=L, mode='nearest')
plt.figure(figsize=(10, 4))
plt.plot(t, signal, alpha=0.4, label='Noisy signal')
plt.plot(t, sma, label=f'Convolution with box (L={L})')
plt.plot(t, sma_ref, '--', label='uniform_filter1d (reference)')
plt.xlabel('Time [s]')
plt.ylabel('Amplitude')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Up to boundary handling, both outputs coincide, confirming that the moving average is a special case of convolution. Replacing the kernel by a triangular or Hann window naturally extends the idea to more general FIR filters.
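As a minimal sketch of that extension, the box kernel can be swapped for a Hann window (via np.hanning), normalized to unit sum so the filter preserves the signal's mean level; the signal parameters mirror the example above:

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 1000
t = np.arange(0, 1, 1/fs)
signal = np.sin(2*np.pi*5*t) + 0.4*rng.standard_normal(len(t))

L = 21
h_hann = np.hanning(L)
h_hann /= h_hann.sum()   # normalize to unit DC gain, like the 1/L box

# Same convolution as the SMA, just with a tapered kernel
smoothed = np.convolve(signal, h_hann, mode='same')
print('kernel sum:', h_hann.sum())  # 1.0 -> mean level preserved
```

Tapered kernels like this trade a slightly wider main lobe for much lower sidelobes than the box, which is exactly the FIR design trade-off explored in the filter-comparison article.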
Summary
- Convolution flips one signal, slides it, and accumulates the elementwise product; it fully describes any LTI system response.
- Correlation is similar but without the time reversal: cross-correlation \(R_{xy}\) enables template matching, while auto-correlation \(R_{xx}\) uncovers periodic structure.
- The convolution theorem (equation \((6)\) ) turns time-domain convolution into frequency-domain multiplication, enabling \(O(N \log N)\) fast convolution via the FFT.
- Moving average filters are convolutions with a box window — the simplest FIR filter.
- Wiener–Khinchin shows that auto-correlation and PSD are two views of the same quantity.
Mastering these primitives makes filter design, spectral analysis, and time-frequency analysis feel like one coherent mathematical framework.
Related Posts
- Fast Fourier Transform (FFT): Theory and Python Implementation — The DFT/FFT foundation underlying the convolution theorem.
- Window Functions and Power Spectral Density (PSD): Theory and Python Implementation — PSD estimation built on the auto-correlation / Wiener–Khinchin perspective.
- Wavelet Transform: Theory and Python Implementation — A convolution-based time-frequency extension of the FFT.
- Comparison of Moving Average Filters — Frames SMA/WMA/EMA through the convolution-with-a-window lens.
- Autocorrelation and Cross-Correlation: Theory and Python Implementation — Builds on the correlation definitions here for ACF/PACF time-series analysis and delay estimation (GCC-PHAT).
- Cepstrum Analysis: Theory and Python Implementation — A canonical application that factors convolution through the log spectrum, enabling F0 estimation and spectral envelope analysis.
- FIR and IIR Filter Comparison — Contrasts FIR filters that implement convolution directly with the recursive structure of IIR filters.