Moving Average Filters Compared: SMA, WMA, and EMA in Python

Mathematical definitions and Python implementations of Simple Moving Average (SMA), Weighted Moving Average (WMA), and Exponential Moving Average (EMA), with comparisons of noise reduction, response speed, and frequency characteristics.

The moving average is one of the most commonly used filters for smoothing time-series data, with applications ranging from noise reduction and trend extraction to sensor data preprocessing.

This article defines three representative moving average filters mathematically, implements them in Python, and compares their performance.

Simple Moving Average (SMA)

Definition

SMA is the arithmetic mean of the most recent \(N\) data points.

\[ y_t = \frac{1}{N} \sum_{i=0}^{N-1} x_{t-i} \tag{1} \]

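All data points in the window receive equal weight \(1/N\). \(N\) is the window size: larger values produce stronger smoothing (for uncorrelated noise, the variance drops by a factor of \(N\)) at the cost of a slower response.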

Characteristics

  • Pros: Simple to implement. Effective for high-frequency noise removal
  • Cons: Slow response to rapid changes (delay of \(\frac{N-1}{2}\) samples). Requires a buffer of size \(N\)

Python Implementation

import numpy as np

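def sma(x, window):
    """Causal Simple Moving Average filter, Eq. (1)."""
    kernel = np.ones(window) / window
    # mode='same' would center the window (non-causal, no delay). Keeping the
    # first len(x) samples of the full convolution makes y[t] depend only on
    # x[t], ..., x[t-N+1]. The first window-1 outputs see implicit zeros.
    return np.convolve(x, kernel, mode='full')[:len(x)]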
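
For sample-by-sample use, Eq. (1) can also be evaluated with a running sum so that each update costs \(O(1)\). The following is a minimal sketch rather than part of the implementation above: the class name `StreamingSMA` and its `update` method are hypothetical, and during the first \(N-1\) samples it averages only the samples seen so far, which is one of several possible start-up conventions.

```python
from collections import deque


class StreamingSMA:
    """Hypothetical per-sample SMA that keeps a running sum of the window."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be >= 1")
        self.window = window
        self.buffer = deque()   # holds at most `window` recent samples
        self.total = 0.0        # running sum of the samples in the buffer

    def update(self, x):
        """Push one sample and return the mean of the current window."""
        self.buffer.append(x)
        self.total += x
        if len(self.buffer) > self.window:
            # Drop the oldest sample instead of re-summing the whole window.
            self.total -= self.buffer.popleft()
        return self.total / len(self.buffer)


f = StreamingSMA(window=3)
outputs = [f.update(v) for v in [3.0, 6.0, 9.0, 12.0]]
print(outputs)   # [3.0, 4.5, 6.0, 9.0]
```

The buffer of size \(N\) is still required. On very long streams the running sum can accumulate floating-point rounding error, so periodically recomputing it from the buffer may be worthwhile.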

Weighted Moving Average (WMA)

Definition

WMA assigns linearly decreasing weights, giving more importance to recent data.

\[ y_t = \frac{\sum_{i=0}^{N-1} (N - i) \cdot x_{t-i}}{\sum_{i=0}^{N-1} (N - i)} = \frac{2}{N(N+1)} \sum_{i=0}^{N-1} (N - i) \cdot x_{t-i} \tag{2} \]
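
As a quick check of Eq. (2), the sketch below builds the normalized weights for a small window using exact fractions. The helper name `wma_weights` and the sample values are illustrative assumptions.

```python
from fractions import Fraction


def wma_weights(n):
    """Normalized WMA weights 2 (N - i) / (N (N + 1)); i = 0 is the newest sample."""
    norm = Fraction(2, n * (n + 1))
    return [norm * (n - i) for i in range(n)]


weights = wma_weights(3)
print(weights)            # newest sample first: 1/2, 1/3, 1/6

# Apply Eq. (2) to the three most recent samples x_t, x_{t-1}, x_{t-2}.
recent = [10, 20, 30]     # illustrative values, newest first
y_t = sum(w * x for w, x in zip(weights, recent))
print(y_t)                # 50/3
```

The weights sum to one, and the newest sample carries \(N\) times the weight of the oldest.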

Characteristics

  • Pros: Faster tracking of recent data compared to SMA
  • Cons: Still requires a buffer of size \(N\). Weights drop to zero abruptly at the window boundary

Python Implementation

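def wma(x, window):
    """Causal Weighted Moving Average filter, Eq. (2)."""
    weights = np.arange(1, window + 1, dtype=float)
    weights /= weights.sum()
    # Reversed so kernel[0] (applied to x[t]) carries the largest weight.
    # Truncating the full convolution keeps the filter causal; mode='same'
    # would shift the output by about half a window.
    return np.convolve(x, weights[::-1], mode='full')[:len(x)]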

Exponential Moving Average (EMA)

Definition

EMA is a recursively computed moving average that assigns exponentially decaying weights to all past data.

\[ y_t = \alpha \cdot x_t + (1 - \alpha) \cdot y_{t-1} \tag{3} \]

\(\alpha\) (\(0 < \alpha \le 1\)) is the smoothing factor; smaller \(\alpha\) produces stronger smoothing. To match an SMA of window size \(N\), \(\alpha = \frac{2}{N + 1}\) is commonly used, which gives the EMA the same average lag of \(\frac{N-1}{2}\) samples.
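
Unrolling Eq. (3) shows that the sample \(i\) steps in the past receives weight \(\alpha (1 - \alpha)^i\). The sketch below checks numerically that, with \(\alpha = 2/(N+1)\), the weighted average age of the data in the EMA matches that of an SMA of window \(N\). The helper names `alpha_from_window` and `mean_lag` are assumptions made for this illustration.

```python
def alpha_from_window(n):
    """Common rule of thumb relating an SMA window N to an EMA alpha."""
    return 2.0 / (n + 1)


def mean_lag(weights):
    """Average age (in samples) of the data under the given weights."""
    return sum(i * w for i, w in enumerate(weights)) / sum(weights)


n = 15
alpha = alpha_from_window(n)

sma_weights = [1.0 / n] * n
# EMA weights alpha * (1 - alpha)**i, truncated once they are negligible.
ema_weights = [alpha * (1 - alpha) ** i for i in range(2000)]

print(alpha)                    # 0.125
print(mean_lag(sma_weights))    # close to (N - 1) / 2 = 7
print(mean_lag(ema_weights))    # close to (1 - alpha) / alpha = 7
```

Matching the average lag does not make the two filters identical: the EMA still spreads a small amount of weight over samples older than \(N\).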

Characteristics

  • Pros: No buffer needed (only stores the previous value). Memory efficient. Smooth weight decay
  • Cons: The parameter \(\alpha\) may be less intuitive than a window size

Python Implementation

def ema(x, alpha):
    """Exponential Moving Average filter"""
    y = np.zeros_like(x, dtype=float)
    y[0] = x[0]
    for t in range(1, len(x)):
        y[t] = alpha * x[t] + (1 - alpha) * y[t - 1]
    return y
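
Because Eq. (3) needs only the previous output, the same filter can also run one sample at a time. A minimal sketch, assuming a hypothetical class `StreamingEMA` that uses the same start-up choice as `ema` above (the first output equals the first input):

```python
class StreamingEMA:
    """Hypothetical per-sample EMA; the only state is the previous output."""

    def __init__(self, alpha):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must satisfy 0 < alpha <= 1")
        self.alpha = alpha
        self.y = None   # no output produced yet

    def update(self, x):
        """Consume one sample and return the updated average."""
        if self.y is None:
            self.y = float(x)   # same start-up choice as y[0] = x[0]
        else:
            self.y = self.alpha * x + (1.0 - self.alpha) * self.y
        return self.y


f = StreamingEMA(alpha=0.5)
outputs = [f.update(v) for v in [4.0, 8.0, 0.0, 2.0]]
print(outputs)   # [4.0, 6.0, 3.0, 2.5]
```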

For detailed frequency analysis of EMA, see EMA Filter Frequency Characteristics.

Comparison of the Three Filters

Comparison Experiment

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)
n = 200
t = np.linspace(0, 4 * np.pi, n)

signal = np.sin(t) + 0.5 * np.sin(5 * t)
noise = np.random.normal(0, 0.5, n)
observed = signal + noise

window = 15
alpha = 2 / (window + 1)

y_sma = sma(observed, window)
y_wma = wma(observed, window)
y_ema = ema(observed, alpha)

fig, axes = plt.subplots(2, 2, figsize=(12, 8), sharex=True)

axes[0, 0].plot(t, observed, alpha=0.4, label='Observed')
axes[0, 0].plot(t, signal, 'k--', label='True signal')
axes[0, 0].set_title('Original')
axes[0, 0].legend()

axes[0, 1].plot(t, observed, alpha=0.3)
axes[0, 1].plot(t, y_sma, label=f'SMA (N={window})')
axes[0, 1].set_title('SMA')
axes[0, 1].legend()

axes[1, 0].plot(t, observed, alpha=0.3)
axes[1, 0].plot(t, y_wma, label=f'WMA (N={window})')
axes[1, 0].set_title('WMA')
axes[1, 0].legend()

axes[1, 1].plot(t, observed, alpha=0.3)
axes[1, 1].plot(t, y_ema, label=f'EMA (α={alpha:.3f})')
axes[1, 1].set_title('EMA')
axes[1, 1].legend()

for ax in axes.flat:
    ax.grid(True, alpha=0.3)
    ax.set_ylabel('Amplitude')

axes[1, 0].set_xlabel('Time')
axes[1, 1].set_xlabel('Time')
plt.tight_layout()
plt.show()

Comparison Table

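| Property | SMA | WMA | EMA |
|---|---|---|---|
| Weight distribution | Uniform | Linear decay | Exponential decay |
| Memory | \(O(N)\) (window buffer) | \(O(N)\) | \(O(1)\) (previous value only) |
| Computation (per step) | \(O(N)\) (\(O(1)\) with a running-sum update) | \(O(N)\) | \(O(1)\) |
| Past data influence | Zero outside window | Zero outside window | Decays exponentially, never zero |
| Delay | \((N-1)/2\) samples | \((N-1)/3\) samples, less than SMA | \(\alpha\)-dependent: about \((1-\alpha)/\alpha\) samples at low frequencies, equal to \((N-1)/2\) for \(\alpha = 2/(N+1)\) |
| Step response | Linear ramp | Curved ramp | Exponential convergence |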
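
The step-response row can be checked directly from the weights, since the response to a unit step is the running sum of each filter's weights. The sketch below uses a small window (\(N = 4\)) for readability; the function name `step_response` is an assumption, and the EMA weights are truncated for display even though they extend indefinitely in theory.

```python
from itertools import accumulate

n = 4
alpha = 2.0 / (n + 1)          # 0.4

sma_w = [1.0 / n] * n
wma_w = [2.0 * (n - i) / (n * (n + 1)) for i in range(n)]
ema_w = [alpha * (1 - alpha) ** i for i in range(8)]   # truncated for display


def step_response(weights):
    """Output for a unit step applied at t = 0: cumulative sum of the weights."""
    return list(accumulate(weights))


print(step_response(sma_w))   # equal increments of 1/N, reaches 1 after N samples
print(step_response(wma_w))   # large first increments that shrink linearly
print(step_response(ema_w))   # approaches 1 geometrically without reaching it
```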

Frequency Characteristics

The SMA transfer function is:

\[ G_{SMA}(z) = \frac{1}{N} \sum_{i=0}^{N-1} z^{-i} = \frac{1}{N} \cdot \frac{1 - z^{-N}}{1 - z^{-1}} \tag{4} \]
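
Two standard consequences of Eq. (4) are sketched here, with \(\omega\) denoting the normalized angular frequency in radians per sample (a symbol introduced only for this note). Multiplying both sides by \(1 - z^{-1}\) and returning to the time domain gives a recursive update that costs \(O(1)\) per sample:

\[ y_t = y_{t-1} + \frac{1}{N} \left( x_t - x_{t-N} \right) \]

Evaluating Eq. (4) on the unit circle \(z = e^{j\omega}\) gives the gain:

\[ \left| G_{SMA}(e^{j\omega}) \right| = \frac{1}{N} \left| \frac{\sin(\omega N / 2)}{\sin(\omega / 2)} \right| \]

The gain is zero wherever \(\omega N / 2\) is a nonzero multiple of \(\pi\), that is, at \(\omega = 2\pi k / N\) for \(k = 1, \dots, N-1\).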

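SMA has zeros (frequencies where the gain drops to zero) at nonzero multiples of \(f_s/N\), where \(f_s\) is the sampling frequency, while the EMA gain decreases monotonically with frequency and never reaches zero. This means SMA can completely remove specific frequency components, while EMA provides a smooth overall attenuation.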
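
The difference can be checked numerically by evaluating each filter's frequency response. This is a rough sketch that assumes normalized frequency in cycles per sample and uses the hypothetical helper names `sma_gain` and `ema_gain`; it relies only on the standard library.

```python
import cmath
import math


def sma_gain(freq, n):
    """|G_SMA| at `freq` cycles/sample, summed directly from Eq. (4)."""
    w = 2 * math.pi * freq
    return abs(sum(cmath.exp(-1j * w * i) for i in range(n)) / n)


def ema_gain(freq, alpha):
    """|G_EMA| at `freq` cycles/sample, from the z-transform of Eq. (3)."""
    w = 2 * math.pi * freq
    return abs(alpha / (1 - (1 - alpha) * cmath.exp(-1j * w)))


n = 10
alpha = 2 / (n + 1)

for k in range(6):
    freq = k / (2 * n)      # 0.00, 0.05, ..., 0.25 cycles/sample
    print(f"{freq:.2f}  SMA {sma_gain(freq, n):.3f}  EMA {ema_gain(freq, alpha):.3f}")
```

With \(N = 10\), the SMA column drops to zero (up to rounding) at 0.10 and 0.20 cycles per sample, while the EMA column only decreases.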

For detailed comparison of frequency responses, see Lowpass Filter Design and Comparison.

Selection Guide by Use Case

| Use Case | Recommended | Reason |
|---|---|---|
| Sensor preprocessing | EMA | Memory efficient, suitable for real-time |
| Financial trend analysis | SMA / EMA | SMA is a standard technical indicator; EMA tracks short-term trends |
| Embedded systems | EMA | \(O(1)\) memory and computation |
| Offline signal processing | SMA / WMA | Batch processing relaxes memory constraints |
| Known noise frequency | SMA | Window size can target zeros at the noise frequency |
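
As an illustration of the last row, suppose (hypothetically) a 50 Hz hum sampled at 1000 Hz. Choosing \(N = f_s / f_{noise} = 20\) makes the window span exactly one period of the hum, so the average of any 20 consecutive samples of it cancels. The sampling rate, noise frequency, and variable names below are assumptions made for this example only.

```python
import math

fs = 1000.0        # assumed sampling rate in Hz
f_noise = 50.0     # assumed interference frequency in Hz
n = round(fs / f_noise)   # window spanning exactly one noise period

hum = [math.sin(2 * math.pi * f_noise * k / fs) for k in range(200)]

# Causal SMA over full windows only (the start-up samples are skipped).
filtered = [sum(hum[t - n + 1:t + 1]) / n for t in range(n - 1, len(hum))]

print(n)                                # 20
print(max(abs(v) for v in filtered))    # on the order of floating-point rounding
```

The cancellation is exact only when \(f_s / f_{noise}\) is an integer; otherwise the nearest window size attenuates the hum strongly but not completely.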
