What Is a Median Filter?
A median filter is a non-linear filter that sorts the values within a sliding window and outputs the median (middle value) of those sorted samples.
Unlike linear filters such as the EMA filter or moving average filter, the median filter is exceptionally effective at removing impulse noise (spike-like noise) while preserving signal edges.
Common Applications
- Impulse noise removal: Sensor glitches, bit-flip errors in digital transmission
- Image processing: Salt-and-pepper noise removal
- Medical signal processing: ECG artifact removal
- Financial data: Filtering of erroneous tick data in price series
Mathematical Definition
For a window of size \(2m+1\) (where \(m\) is the one-sided half-width), the output \(y[n]\) at time \(n\) is:
\[ y[n] = \text{median}\bigl(x[n-m],\, x[n-m+1],\, \dots,\, x[n],\, \dots,\, x[n+m]\bigr) \tag{1} \]The \(2m+1\) values are sorted, and the \((m+1)\)-th value (the middle one) is selected as the output.
Example (window size \(= 5\), \(m = 2\)):
Suppose a portion of the input is \([\ldots, 3, 4, \mathbf{100}, 4, 3, \ldots]\) where the bold value is a spike:
\[ \text{sort}(3, 4, 100, 4, 3) = (3, 3, \mathbf{4}, 4, 100) \Rightarrow y[n] = 4 \tag{2} \]The spike value \(100\) is completely eliminated, and the true signal value \(4\) is recovered.
Key Differences from Linear Filters
The Superposition Principle Does Not Hold
Linear filters compute outputs as weighted sums of inputs. Median filtering involves sorting, so the superposition principle fails:
\[ \text{median}(ax_1 + bx_2) \neq a\cdot\text{median}(x_1) + b\cdot\text{median}(x_2) \tag{3} \]This non-linearity is precisely what gives the median filter its unique advantage over linear filters for impulse noise.
Impulse Noise Robustness
Linear filters spread impulse noise across neighboring samples. The median filter confines the spike’s influence within the window and eliminates it through sorting.
| Filter | Gaussian Noise Removal | Impulse Noise Removal | Edge Preservation |
|---|---|---|---|
| Moving Average | Good | Poor (spreads) | Low |
| EMA | Good (weighted) | Poor (spreads) | Low |
| Savitzky-Golay | Good | Poor | High |
| Median | Moderate | Excellent | High |
Python Implementation
Manual Implementation
import numpy as np
def median_filter_manual(x: np.ndarray, window_size: int) -> np.ndarray:
"""Median filter with reflect padding at edges."""
assert window_size % 2 == 1, "window_size must be odd"
m = window_size // 2
n = len(x)
y = np.empty(n)
x_padded = np.pad(x, m, mode="reflect")
for i in range(n):
window = x_padded[i : i + window_size]
y[i] = np.median(window)
return y
Using scipy.signal.medfilt
In practice, scipy.signal.medfilt is the preferred choice:
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
# --- Signal generation ---
np.random.seed(42)
t = np.linspace(0, 1, 500)
x_clean = np.sin(2 * np.pi * 5 * t)
noise_gaussian = 0.2 * np.random.randn(len(t))
x_noisy = x_clean + noise_gaussian
# Add impulse noise
rng = np.random.default_rng(0)
impulse_indices = rng.choice(len(t), size=30, replace=False)
x_impulse = x_noisy.copy()
x_impulse[impulse_indices] += rng.choice([-4.0, 4.0], size=30)
# --- Apply filters ---
window_size = 11
y_median = signal.medfilt(x_impulse, kernel_size=window_size)
y_ma = np.convolve(x_impulse, np.ones(window_size) / window_size, mode="same")
# --- Plot ---
fig, axes = plt.subplots(3, 1, figsize=(12, 9), sharex=True)
axes[0].plot(t, x_impulse, color="gray", alpha=0.7, label="Noisy signal (with impulses)")
axes[0].plot(t, x_clean, color="blue", linestyle="--", label="Clean signal", linewidth=2)
axes[0].set_ylabel("Amplitude")
axes[0].legend()
axes[0].set_title("Input Signal (Impulse Noise Added)")
axes[1].plot(t, y_median, color="red", label=f"Median filter (window={window_size})")
axes[1].plot(t, x_clean, color="blue", linestyle="--", label="Clean signal", linewidth=2)
axes[1].set_ylabel("Amplitude")
axes[1].legend()
axes[1].set_title("Median Filter Output")
axes[2].plot(t, y_ma, color="green", label=f"Moving average (window={window_size})")
axes[2].plot(t, x_clean, color="blue", linestyle="--", label="Clean signal", linewidth=2)
axes[2].set_ylabel("Amplitude")
axes[2].set_xlabel("Time [s]")
axes[2].legend()
axes[2].set_title("Moving Average Output (Comparison)")
plt.tight_layout()
plt.show()
Window Size Selection
The window size \(W = 2m+1\) is the only parameter of the median filter.
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
np.random.seed(0)
t = np.linspace(0, 1, 300)
x_clean = np.sin(2 * np.pi * 3 * t)
x_noisy = x_clean.copy()
idx = np.random.choice(len(t), size=30, replace=False)
x_noisy[idx] += np.random.choice([-3, 3], size=30)
window_sizes = [3, 7, 15, 31]
fig, axes = plt.subplots(len(window_sizes), 1, figsize=(12, 10), sharex=True)
for ax, ws in zip(axes, window_sizes):
y = signal.medfilt(x_noisy, kernel_size=ws)
mse = np.mean((y - x_clean) ** 2)
ax.plot(t, x_noisy, color="gray", alpha=0.5, label="Noisy input")
ax.plot(t, y, color="red", label=f"window={ws}, MSE={mse:.4f}")
ax.plot(t, x_clean, color="blue", linestyle="--", label="Clean signal", linewidth=1.5)
ax.legend(loc="upper right")
ax.set_ylabel("Amplitude")
axes[-1].set_xlabel("Time [s]")
plt.suptitle("Median Filter with Different Window Sizes")
plt.tight_layout()
plt.show()
Selection guidelines:
- Small window (3–5): Removes fine spikes while preserving signal detail. Suitable for sparse noise.
- Large window (11–31): Stronger denoising but may blunt sharp peaks and edges.
- Consecutive noise: The window size must exceed the length of any consecutive noise run.
Combining with EMA for Mixed Noise
The median filter excels at impulse noise but is only moderate for Gaussian noise. A practical two-stage pipeline combines both:
from scipy import signal
import numpy as np
def hybrid_filter(x: np.ndarray, median_window: int = 5, ema_alpha: float = 0.2) -> np.ndarray:
"""
Two-stage filter: Median → EMA
Step 1: Remove impulse noise with median filter
Step 2: Smooth residual Gaussian noise with EMA
"""
# Step 1: Impulse removal
y_median = signal.medfilt(x, kernel_size=median_window)
# Step 2: EMA smoothing
y_ema = np.zeros_like(y_median)
y_ema[0] = y_median[0]
for i in range(1, len(y_median)):
y_ema[i] = ema_alpha * y_median[i] + (1 - ema_alpha) * y_ema[i - 1]
return y_ema
This two-stage design is widely used in sensor data pipelines and communication system denoising.
Computational Complexity and Edge Handling
Complexity
A naive implementation sorts \(W\) values per output sample, giving \(O(W \log W)\) per sample and \(O(NW \log W)\) overall. scipy.signal.medfilt uses efficient internal algorithms. For large-scale data, scipy.ndimage.median_filter is often faster and offers more flexible edge-handling options:
from scipy import ndimage
# reflect padding (generally recommended)
y_reflect = ndimage.median_filter(x, size=11, mode="reflect")
# nearest-neighbor padding
y_nearest = ndimage.median_filter(x, size=11, mode="nearest")
Edge Handling
scipy.signal.medfilt uses zero-padding by default, which can cause artifacts at signal boundaries. Use scipy.ndimage.median_filter with mode="reflect" to avoid this.
Summary
| Property | Median Filter |
|---|---|
| Filter type | Non-linear |
| Impulse noise removal | Excellent |
| Gaussian noise removal | Moderate |
| Edge preservation | High |
| Parameters | 1 (window size \(W\)) |
| Complexity | \(O(NW \log W)\) |
| Causality | Semi-causal (non-causal with symmetric window) |
Use the median filter when impulse noise dominates. For Gaussian-dominated noise, prefer the EMA filter or Butterworth filter. For peak-shape preservation, consider the Savitzky-Golay filter. For time-varying environments, use adaptive filters.
Related Articles
- Frequency Characteristics of the Exponential Moving Average Filter - The canonical linear filter; useful as a baseline comparison.
- Moving Average Filter Types and Comparison - SMA, WMA, EMA compared with the median filter.
- Savitzky-Golay Filter Theory and Python Implementation - Polynomial-fitting smoother with strong edge preservation.
- Adaptive Filter (LMS/RLS) Theory and Python Implementation - Overcomes the fixed-window limitation of the median filter.
- Butterworth Filter Design and Python Implementation - Linear filter with maximally flat passband.
- Time Series Anomaly Detection: From Statistical Methods to Kalman Filter - Median filtering as a preprocessing step for anomaly detection.
- Introduction to Signal Processing Filters - Systematic overview of filter families.
References
- Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley.
- Pitas, I., & Venetsanopoulos, A. N. (1990). Nonlinear Digital Filters: Principles and Applications. Springer.
- scipy.signal.medfilt documentation
- scipy.ndimage.median_filter documentation
Related Tools
- DevToolBox - Free Developer Tools - 85+ developer utilities including JSON formatter and regex tester
- CalcBox - Everyday Calculators - 61+ calculators including statistics and compound interest