What Is the Savitzky-Golay Filter?
The Savitzky-Golay filter performs local polynomial least-squares fitting within a moving window and uses the polynomial value as the smoothed result. It was proposed by Abraham Savitzky and Marcel J.E. Golay in 1964.
While moving average and EMA filters tend to blur peaks and sharp transitions, the Savitzky-Golay filter can remove noise while preserving signal features (peak positions, widths, and heights).
Key Applications
- Smoothing spectroscopic data
- Peak detection in chromatography
- Preprocessing biosignals (ECG, EEG)
- Derivative estimation of signals
Mathematical Foundation
Local Polynomial Fitting
Within a window of width \(2m+1\) (\(m\) is the number of data points on each side), a polynomial of degree \(p\) is fitted by least squares.
Setting the window center at \(n=0\), for data points \(x_{-m}, x_{-m+1}, \dots, x_m\), we fit:
\[ \hat{x}(n) = \sum_{k=0}^{p} a_k n^k = a_0 + a_1 n + a_2 n^2 + \cdots + a_p n^p \tag{1} \]The least squares problem minimizes the sum of squared residuals:
\[ \min_{\mathbf{a}} \sum_{n=-m}^{m} \left(x_n - \sum_{k=0}^{p} a_k n^k \right)^2 \tag{2} \]Matrix Formulation and Normal Equations
Defining the Vandermonde matrix \(\mathbf{A}\):
\[ \mathbf{A} = \begin{bmatrix} (-m)^0 & (-m)^1 & \cdots & (-m)^p \\ (-m+1)^0 & (-m+1)^1 & \cdots & (-m+1)^p \\ \vdots & & \ddots & \vdots \\ m^0 & m^1 & \cdots & m^p \end{bmatrix} \tag{3} \]The normal equation solution is:
\[ \mathbf{a} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{x} \tag{4} \]The smoothed value at the center is \(\hat{x}(0) = a_0\), which can be expressed as a linear transformation:
\[ \hat{x}_0 = \mathbf{c}^T \mathbf{x} \tag{5} \]where \(\mathbf{c}\) is the first row of \((\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T\) (corresponding to \(a_0\)). This \(\mathbf{c}\) represents the convolution coefficients of the filter.
Key Properties
- Window width \(2m+1\) vs polynomial degree \(p\): Higher \(p\) preserves signal shape better but reduces noise removal. When \(p = 0\), the Savitzky-Golay filter reduces to a simple moving average.
- FIR filter: The convolution coefficients \(\mathbf{c}\) depend only on window width and polynomial degree, not on the signal. It is therefore a type of FIR filter.
- Derivative estimation: Extracting \(a_1, a_2, \dots\) yields simultaneous estimates of the signal’s first and second derivatives.
Frequency Response
The frequency response of the Savitzky-Golay filter is the DTFT of the convolution coefficients \(\mathbf{c}\):
\[ H(e^{j\omega}) = \sum_{n=-m}^{m} c_n e^{-j\omega n} \tag{6} \]Comparison with EMA and Moving Average
| Filter | Passband | Stopband | Shape Preservation | Group Delay |
|---|---|---|---|---|
| Savitzky-Golay | Wide | Gentle | Excellent | Constant (linear phase) |
| EMA | Parameter-dependent | Gentle | Moderate | Frequency-dependent |
| Moving Average | Width-dependent | Gentle | Poor | Constant (linear phase) |
| Butterworth | Sharp (order-dependent) | Sharp | Design-dependent | Nonlinear |
The Savitzky-Golay filter has linear phase response, so peak positions do not shift. This is a key difference from EMA (nonlinear phase).
Python Implementation
scipy.signal.savgol_filter
With scipy.signal’s savgol_filter, you can apply the filter in a single line.
import numpy as np
from scipy.signal import savgol_filter
import matplotlib.pyplot as plt
# Generate test data
np.random.seed(42)
t = np.linspace(0, 1, 500)
# Signal with a peak + noise
signal_clean = np.sin(2 * np.pi * 3 * t) + 0.5 * np.exp(-((t - 0.3) / 0.02)**2)
noise = 0.2 * np.random.randn(len(t))
signal_noisy = signal_clean + noise
# Apply Savitzky-Golay filter
# window_length: window width (odd), polyorder: polynomial degree
y_sg = savgol_filter(signal_noisy, window_length=21, polyorder=3)
# Compare with moving average
kernel = np.ones(21) / 21
y_ma = np.convolve(signal_noisy, kernel, mode='same')
# Plot
fig, axes = plt.subplots(2, 1, figsize=(12, 8))
axes[0].plot(t, signal_clean, 'g', label='Original', linewidth=2)
axes[0].plot(t, signal_noisy, 'gray', alpha=0.5, label='Noisy')
axes[0].plot(t, y_sg, 'b', label='Savitzky-Golay (w=21, p=3)')
axes[0].plot(t, y_ma, 'r--', label='Moving Average (w=21)')
axes[0].set_title('Smoothing Comparison')
axes[0].legend()
axes[0].grid(True)
# Zoom in on peak region
mask = (t > 0.2) & (t < 0.4)
axes[1].plot(t[mask], signal_clean[mask], 'g', label='Original', linewidth=2)
axes[1].plot(t[mask], y_sg[mask], 'b', label='Savitzky-Golay')
axes[1].plot(t[mask], y_ma[mask], 'r--', label='Moving Average')
axes[1].set_title('Peak Region (Zoomed)')
axes[1].legend()
axes[1].grid(True)
plt.tight_layout()
plt.show()
Zooming into the peak region reveals that while the moving average significantly blunts the peak, the Savitzky-Golay filter preserves both peak position and height.
Parameter Effects
Visualize how window length and polynomial order affect filter characteristics.
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Effect of window length (fixed polyorder=3)
for w in [5, 11, 21, 41]:
y = savgol_filter(signal_noisy, window_length=w, polyorder=3)
axes[0].plot(t, y, label=f'w={w}')
axes[0].plot(t, signal_clean, 'k--', alpha=0.5, label='Original')
axes[0].set_title('Effect of Window Length (polyorder=3)')
axes[0].legend()
axes[0].grid(True)
# Effect of polynomial order (fixed window=21)
for p in [0, 1, 2, 3, 5]:
y = savgol_filter(signal_noisy, window_length=21, polyorder=p)
axes[1].plot(t, y, label=f'p={p}')
axes[1].plot(t, signal_clean, 'k--', alpha=0.5, label='Original')
axes[1].set_title('Effect of Polynomial Order (window=21)')
axes[1].legend()
axes[1].grid(True)
plt.tight_layout()
plt.show()
Observations:
- Larger window width provides stronger smoothing but may lose fine signal structure
- Polynomial order \(p = 0\) is equivalent to a simple moving average
- When \(p\) approaches the window width, noise removal becomes negligible
- In practice, \(p = 2\) or \(p = 3\) works well in most scenarios
Derivative Estimation
A key feature of the Savitzky-Golay filter is simultaneous smoothing and derivative estimation.
# Estimate 1st and 2nd derivatives
dy_sg = savgol_filter(signal_noisy, window_length=21, polyorder=3, deriv=1, delta=t[1]-t[0])
d2y_sg = savgol_filter(signal_noisy, window_length=21, polyorder=4, deriv=2, delta=t[1]-t[0])
# True derivative (analytical)
dy_true = 2 * np.pi * 3 * np.cos(2 * np.pi * 3 * t) + \
0.5 * (-2 * (t - 0.3) / 0.02**2) * np.exp(-((t - 0.3) / 0.02)**2)
fig, axes = plt.subplots(2, 1, figsize=(12, 6))
axes[0].plot(t, dy_true, 'g', label='True derivative')
axes[0].plot(t, dy_sg, 'b', label='SG 1st derivative')
axes[0].set_title('First Derivative Estimation')
axes[0].legend()
axes[0].grid(True)
axes[1].plot(t, d2y_sg, 'b', label='SG 2nd derivative')
axes[1].set_title('Second Derivative Estimation')
axes[1].legend()
axes[1].grid(True)
plt.tight_layout()
plt.show()
Numerical differentiation usually amplifies noise, but the Savitzky-Golay filter performs smoothing and differentiation simultaneously, enabling stable derivative estimation.
Frequency Response Visualization
from scipy.signal import savgol_coeffs, freqz
# Get filter coefficients
for w, p in [(11, 2), (21, 3), (41, 3)]:
coeffs = savgol_coeffs(w, p)
freq, response = freqz(coeffs, worN=2048)
plt.plot(freq / np.pi, 20 * np.log10(np.abs(response)),
label=f'w={w}, p={p}')
plt.xlabel('Normalized Frequency (×π rad/sample)')
plt.ylabel('Gain (dB)')
plt.title('Savitzky-Golay Filter Frequency Response')
plt.legend()
plt.grid(True)
plt.ylim(-60, 5)
plt.tight_layout()
plt.show()
Choosing the Right Filter
| Use Case | Recommended Filter | Reason |
|---|---|---|
| Smoothing while preserving peaks | Savitzky-Golay | Polynomial fitting preserves shape |
| Real-time smoothing | EMA | \(O(1)\) computation, causal |
| Removing specific frequency bands | Butterworth / Notch | Sharp cutoff characteristics |
| Known noise statistics | Wiener / Kalman | Statistically optimal |
| Derivative estimation | Savitzky-Golay | Integrated smoothing and differentiation |
| Non-stationary signals | Wavelet | Time-frequency analysis |
Related Articles
- Exponential Moving Average (EMA) Filter Frequency Response - EMA is a first-order IIR filter with low computational cost, but lacks the shape preservation capability of Savitzky-Golay.
- Moving Average Filter Types and Comparison - Setting the Savitzky-Golay polynomial order to 0 yields a simple moving average.
- FFT: Algorithm and Python Implementation - The foundational algorithm for analyzing filter frequency characteristics.
- Low-Pass Filter Design and Comparison - Compares frequency response and group delay of various low-pass filters.
- FIR and IIR Filter Comparison - The Savitzky-Golay filter is a type of FIR filter with linear phase characteristics.
- Butterworth Filter Design and Python Implementation - An IIR filter for cases requiring sharper cutoff characteristics.
- Wiener Filter: Theory and Python Implementation - Statistically optimal filter design, effective when noise statistics are known.
- Window Functions and Power Spectral Density (PSD) - Window function theory related to Savitzky-Golay window width selection.
- Matplotlib Tips: Creating Publication-Quality Plots - Tips for creating high-quality plots of frequency responses and smoothing results.
- Notch Filter Design and Python Implementation - A filter for removing specific frequencies, a different approach from Savitzky-Golay.
- Kalman Filter: Theory and Python Implementation - Probabilistic filtering based on state-space models, a fundamentally different paradigm from Savitzky-Golay.
References
- Savitzky, A., & Golay, M. J. E. (1964). Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Analytical Chemistry, 36(8), 1627-1639.
- Schafer, R. W. (2011). What Is a Savitzky-Golay Filter? IEEE Signal Processing Magazine, 28(4), 111-117.
- scipy.signal.savgol_filter documentation