Window Functions and Power Spectral Density (PSD): Theory and Python Implementation

Introduction

In the FFT article, we covered the DFT algorithm and basic frequency analysis. We briefly touched on using the Hann window to mitigate spectral leakage, but there are many types of window functions, and choosing the right one for the task at hand is essential.

Moreover, the amplitude spectrum from FFT shows “how much amplitude exists at each frequency,” but in practice we often need to know the signal power per unit frequency. This quantity is the Power Spectral Density (PSD).

This article dives deeper into the mathematics behind spectral leakage, compares the characteristics of major window functions, and implements PSD estimation via Welch’s method in Python.

Mathematical Understanding of Spectral Leakage

Finite-Length Signals and the Rectangular Window

In practice, we can never record a signal for infinite duration. We acquire a finite-length signal \(x[n]\) (\(n = 0, 1, \ldots, N-1\)). Mathematically, this is equivalent to multiplying an infinite-length signal \(x_\infty[n]\) by a rectangular window \(w_R[n]\).

\[x[n] = x_\infty[n] \cdot w_R[n] \tag{1}\]

The rectangular window is defined as:

\[w_R[n] = \begin{cases} 1, & 0 \leq n \leq N-1 \\ 0, & \text{otherwise} \end{cases} \tag{2}\]

Effect in the Frequency Domain

Multiplication in the time domain corresponds to convolution in the frequency domain.

\[X(f) = X_\infty(f) * W_R(f) \tag{3}\]

The Fourier transform of the rectangular window is the Dirichlet kernel, which in continuous approximation is proportional to the sinc function.

\[W_R(f) \approx N \cdot \text{sinc}(Nf) \cdot e^{-j\pi f(N-1)} \tag{4}\]

This sinc function has a broad mainlobe and slowly decaying sidelobes. Through the convolution in Eq. \((3)\), a spectrum that should be concentrated at a single frequency spreads to adjacent frequencies via these sidelobes. This is the mathematical origin of spectral leakage.

Mainlobe and Sidelobes

The frequency characteristics of a window function are described by two key elements:

Mainlobe: The primary peak around the center frequency. A narrower mainlobe yields higher frequency resolution.
Sidelobes: Secondary peaks on either side of the mainlobe. Lower sidelobe levels mean less spectral leakage.

The rectangular window has a peak sidelobe level of only \(-13\) dB, which means weak signal components near a strong tone can be buried under the sidelobes. By choosing different window functions, we can control this tradeoff.

Major Window Functions

Below are the mathematical definitions of commonly used window functions, all of length \(N\).

Rectangular Window

\[w[n] = 1, \quad 0 \leq n \leq N-1 \tag{5}\]

Equivalent to applying no window. It has the narrowest mainlobe and therefore the best frequency resolution, but the sidelobes are large and spectral leakage is severe.

Hann Window

\[w[n] = 0.5\left(1 - \cos\left(\frac{2\pi n}{N-1}\right)\right) \tag{6}\]

A cosine-based window that tapers to zero at both ends. The sidelobe decay rate of \(-18\) dB/oct is fast, making it the most widely used general-purpose window.

Hamming Window

\[w[n] = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right) \tag{7}\]

Similar to the Hann window, but the endpoints do not reach zero (they have a value of approximately 0.08). The first sidelobe is suppressed to \(-43\) dB, but the far sidelobe decay is only \(-6\) dB/oct, slower than the Hann window.

Blackman Window

\[w[n] = 0.42 - 0.5\cos\left(\frac{2\pi n}{N-1}\right) + 0.08\cos\left(\frac{4\pi n}{N-1}\right) \tag{8}\]

Composed of three cosine terms. The first sidelobe is extremely low at \(-58\) dB, making it suitable for high dynamic range analysis, but the wider mainlobe reduces frequency resolution.

Kaiser Window

\[w[n] = \frac{I_0\left(\beta\sqrt{1 - \left(\frac{2n}{N-1} - 1\right)^2}\right)}{I_0(\beta)} \tag{9}\]

Here \(I_0\) is the zeroth-order modified Bessel function of the first kind, and \(\beta\) is the shape parameter. By adjusting \(\beta\), you can continuously control the tradeoff between mainlobe width and sidelobe suppression. \(\beta = 0\) gives the rectangular window, and \(\beta \approx 5.4\) approximates the Hamming window.

Window Function Characteristics Comparison

Window	Mainlobe Width	Peak Sidelobe [dB]	Sidelobe Decay	ENBW (bins)
Rectangular	2 bins	-13	-6 dB/oct	1.00
Hann	4 bins	-32	-18 dB/oct	1.50
Hamming	4 bins	-43	-6 dB/oct	1.36
Blackman	6 bins	-58	-18 dB/oct	1.73
Kaiser (β=6)	Variable	Variable	Variable	Variable

ENBW (Equivalent Noise Bandwidth) indicates how many frequency bins of bandwidth a window effectively has for white noise. A larger value means more noise influence on the estimate.

Window Functions: Frequency Response Comparison (Python)

Let us generate all window functions and compare their frequency responses on a dB scale.

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal.windows import hann, hamming, blackman, kaiser

# Window length
N = 64
# FFT size (zero-padded for interpolation)
N_fft = 4096

# Generate each window function
windows = {
    'Rectangular': np.ones(N),
    'Hann': hann(N),
    'Hamming': hamming(N),
    'Blackman': blackman(N),
    'Kaiser (β=6)': kaiser(N, beta=6),
}

# Compute and plot frequency responses
fig, ax = plt.subplots(figsize=(10, 6))

for name, w in windows.items():
    # Compute frequency response via FFT
    W = np.fft.fft(w, n=N_fft)
    W_shift = np.fft.fftshift(W)
    # Normalize (mainlobe peak at 0 dB)
    W_dB = 20 * np.log10(np.maximum(np.abs(W_shift) / np.abs(W_shift).max(), 1e-12))
    # Frequency axis (in bins)
    freq_bins = np.linspace(-N / 2, N / 2, N_fft)
    ax.plot(freq_bins, W_dB, label=name)

ax.set_xlim(-15, 15)
ax.set_ylim(-120, 5)
ax.set_xlabel('Frequency [bins]')
ax.set_ylabel('Magnitude [dB]')
ax.set_title('Window Functions: Frequency Response Comparison')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

This plot visually confirms that the rectangular window has the narrowest mainlobe but the largest sidelobes, while the Blackman window has very low sidelobes at the cost of a wider mainlobe.

Power Spectral Density (PSD)

Definition and Physical Meaning

The amplitude spectrum \(|X[k]|\) shows the signal amplitude at each frequency bin, but it depends on the FFT length \(N\) and sampling rate \(f_s\), making direct comparison between spectra obtained under different conditions difficult.

Power Spectral Density (PSD) is defined as signal power per unit frequency and shows how a signal’s power is distributed across the frequency axis. Its unit is \(\text{V}^2/\text{Hz}\) (or the appropriate physical unit squared per Hz).

An important property of PSD is that integrating it over all frequencies yields the total signal power (variance).

\[\int_0^{f_s/2} S_{xx}(f) \, df = \sigma_x^2 \tag{10}\]

Difference from Amplitude Spectrum

Property	Amplitude Spectrum	Power Spectral Density
Value	\(\\|X[k]\\|\)	\(S_{xx}(f)\)
Unit	V (signal unit)	\(\text{V}^2/\text{Hz}\)
N dependency	Proportional to N	Independent of N
Use case	Check amplitude at a frequency	Quantitative power distribution

PSD Estimation Methods

Periodogram

The simplest PSD estimator is the periodogram, computed directly from the FFT result.

\[\hat{S}_{xx}[k] = \frac{|X[k]|^2}{N \cdot f_s \cdot U} \tag{11}\]

Here \(X[k]\) is the DFT of the windowed signal and \(U = \frac{1}{N}\sum_{n=0}^{N-1} w[n]^2\) is the window power correction factor (\(U=1\) for the rectangular window). The periodogram is asymptotically unbiased (unbiased as \(N \to \infty\)), but suffers from high variance. Even increasing the data length \(N\) does not reduce the variance, so the estimate fluctuates significantly due to noise.

Welch’s Method

Welch’s method (1967) is a practical technique for reducing periodogram variance. The procedure is:

Divide the signal into segments of length \(L\) with overlap
Apply a window function \(w[n]\) to each segment
Compute the modified periodogram for each segment
Average all segment results

The modified periodogram of the \(i\)-th segment is defined as:

\[\hat{S}_{xx}^{(i)}[k] = \frac{1}{L \cdot f_s \cdot U} \left| \sum_{n=0}^{L-1} x_i[n] \cdot w[n] \cdot e^{-j2\pi kn/L} \right|^2 \tag{12}\]

where \(U\) is the window power normalization factor:

\[U = \frac{1}{L}\sum_{n=0}^{L-1} w[n]^2 \tag{13}\]

Averaging over \(K\) segments gives the Welch PSD estimate:

\[\hat{S}_{xx}^{\text{Welch}}[k] = \frac{1}{K}\sum_{i=1}^{K} \hat{S}_{xx}^{(i)}[k] \tag{14}\]

Overlap is used to make effective use of the data and increase the number of segments available for averaging. A 50% overlap is typical. Excessive overlap increases inter-segment correlation and diminishes the averaging benefit.

Segment length tradeoff: longer segments improve frequency resolution but yield fewer segments for averaging (higher variance). Shorter segments reduce variance but degrade frequency resolution.

Practical PSD Estimation in Python

We generate a test signal with two sinusoids (50 Hz and 120 Hz) plus white noise, then compare PSD estimates from the periodogram and Welch’s method.

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import welch
from scipy.signal.windows import hann

# --- Generate test signal ---
np.random.seed(0)
fs = 1000            # Sampling frequency [Hz]
T = 4.0              # Signal duration [s]
t = np.arange(0, T, 1/fs)
N_total = len(t)

# 50 Hz (amplitude 1.0) and 120 Hz (amplitude 0.5) + white noise
signal = (np.sin(2 * np.pi * 50 * t)
          + 0.5 * np.sin(2 * np.pi * 120 * t)
          + 0.8 * np.random.randn(N_total))

# --- 1. Periodogram (manual implementation) ---
window = hann(N_total)
X = np.fft.fft(signal * window)
freqs_periodo = np.fft.fftfreq(N_total, 1/fs)
# Window power correction
U = np.mean(window**2)
# One-sided periodogram
psd_periodo = (np.abs(X[:N_total//2])**2) / (N_total * fs * U)
psd_periodo[1:-1] *= 2  # Double all bins except DC and Nyquist
freqs_periodo = freqs_periodo[:N_total//2]

# --- 2. Welch's method (scipy.signal.welch) ---
freqs_welch, psd_welch = welch(signal, fs=fs, window='hann',
                                nperseg=512, noverlap=256)

# --- 3. Window function comparison ---
windows_list = ['hann', 'hamming', 'blackman']
psd_results = {}
for win_name in windows_list:
    f, p = welch(signal, fs=fs, window=win_name,
                 nperseg=512, noverlap=256)
    psd_results[win_name] = (f, p)

# --- Plot ---
fig, axes = plt.subplots(3, 1, figsize=(10, 12))

# (a) Periodogram vs Welch's method
axes[0].semilogy(freqs_periodo, psd_periodo, alpha=0.5, label='Periodogram')
axes[0].semilogy(freqs_welch, psd_welch, label='Welch (nperseg=512)')
axes[0].set_xlabel('Frequency [Hz]')
axes[0].set_ylabel('PSD [V²/Hz]')
axes[0].set_title('Periodogram vs Welch Method')
axes[0].legend()
axes[0].set_xlim(0, 200)
axes[0].grid(True, alpha=0.3)

# (b) Welch's method with different windows
for win_name, (f, p) in psd_results.items():
    axes[1].semilogy(f, p, label=f'Welch ({win_name})')
axes[1].set_xlabel('Frequency [Hz]')
axes[1].set_ylabel('PSD [V²/Hz]')
axes[1].set_title('Welch Method: Window Function Comparison')
axes[1].legend()
axes[1].set_xlim(0, 200)
axes[1].grid(True, alpha=0.3)

# (c) Input signal waveform
axes[2].plot(t[:500], signal[:500])
axes[2].set_xlabel('Time [s]')
axes[2].set_ylabel('Amplitude')
axes[2].set_title('Input Signal (first 0.5 s)')
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

The periodogram shows large fluctuations in the estimate, whereas Welch’s method produces a much smoother result through averaging. The peaks at 50 Hz and 120 Hz are clearly visible, with the 120 Hz peak approximately 1/4 the power of the 50 Hz peak (since the amplitude ratio is 0.5, and power scales as amplitude squared).

Window Function Selection Guidelines

Use Case	Recommended Window	Reason
General-purpose analysis	Hann	Good balance of sidelobe suppression and resolution
Resolving close frequencies	Rectangular / Kaiser (low β)	Narrowest mainlobe
High dynamic range analysis	Blackman / Kaiser (high β)	Lowest sidelobes for detecting weak signals
Hardware / real-time	Hamming	Simple computation, non-zero at endpoints

In practice, it is recommended to start with the Hann window and switch to other windows as needed.

Summary

Spectral leakage occurs because finite-length signal truncation (rectangular windowing) causes convolution in the frequency domain
Window functions control the tradeoff between mainlobe width and sidelobe suppression; options include Hann, Hamming, Blackman, and Kaiser for different purposes
Power Spectral Density (PSD) expresses power per unit frequency, enabling comparison across different measurement conditions
The periodogram has high variance, so Welch’s method with segment averaging is the standard practical estimator
Window function selection also affects PSD estimation quality, so choosing appropriately for the analysis objective is important

Fast Fourier Transform (FFT): Algorithm and Python Implementation - Covers the DFT/FFT algorithms and basic frequency analysis that serve as prerequisites for this article.
Frequency Characteristics of the Exponential Moving Average (EMA) Filter - A reference for understanding filter frequency response concepts.
Lowpass Filter Design and Comparison - Compares frequency responses of various lowpass filters.
FIR and IIR Filter Comparison - Explains how window functions are used in FIR filter design.
Wavelet Transform: Theory and Python Implementation - Covers wavelet analysis for time-frequency localization that FFT/PSD cannot capture.
Types and Comparison of Moving Average Filters - Analyzes moving average filter frequency characteristics using FFT.
Matplotlib Practical Tips: Creating Publication-Quality Plots - Techniques for producing publication-quality spectral plots.

References

Harris, F. J. (1978). “On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform.” Proceedings of the IEEE, 66(1), 51-83.
Welch, P. D. (1967). “The Use of Fast Fourier Transform for the Estimation of Power Spectra.” IEEE Transactions on Audio and Electroacoustics, 15(2), 70-73.
Oppenheim, A. V., & Schafer, R. W. (2009). Discrete-Time Signal Processing (3rd ed.). Prentice Hall.
NumPy FFT documentation
SciPy Signal Processing documentation