What is the Extended Kalman Filter?
The Kalman Filter (KF) is the optimal state estimator for linear Gaussian systems, but most real-world systems are nonlinear. The Extended Kalman Filter (EKF) extends the KF framework to nonlinear systems by linearizing the nonlinear functions around the current estimate using Jacobian matrices (first-order Taylor expansion).
EKF is the most fundamental nonlinear filtering technique and serves as the starting point for more advanced methods:
- EKF: First-order linearization via Jacobian matrices
- UKF: Direct approximation of nonlinear transforms using sigma points (Unscented Transformation)
- CKF: Sigma point selection based on the spherical-radial cubature rule
- Particle Filter: Monte Carlo sampling to approximate arbitrary distributions
Nonlinear State-Space Model
Consider the following nonlinear state-space model:
\[ \mathbf{x}_k = f(\mathbf{x}_{k-1}) + \mathbf{w}_{k-1}, \quad \mathbf{w}_{k-1} \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}) \tag{1} \]\[ \mathbf{z}_k = h(\mathbf{x}_k) + \mathbf{v}_k, \quad \mathbf{v}_k \sim \mathcal{N}(\mathbf{0}, \mathbf{R}) \tag{2} \]Here \(f\) is the nonlinear state transition function and \(h\) is the nonlinear observation function. In the linear KF, these are expressed as matrices (\(f(\mathbf{x}) = \mathbf{A}\mathbf{x}\) and \(h(\mathbf{x}) = \mathbf{H}\mathbf{x}\)), but in the EKF they are general nonlinear functions.
Linearization via Jacobian Matrices
The key idea of the EKF is to perform a Taylor expansion of the nonlinear functions \(f\) and \(h\) around the current estimate, retaining only the first-order terms.
The Jacobian of the state transition function \(f\) is computed at the previous estimate \(\hat{\mathbf{x}}_{k-1|k-1}\):
\[ \mathbf{F}_k = \frac{\partial f}{\partial \mathbf{x}}\bigg|_{\mathbf{x}=\hat{\mathbf{x}}_{k-1|k-1}} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\\ \vdots & \ddots & \vdots \\\ \frac{\partial f_n}{\partial x_1} & \cdots & \frac{\partial f_n}{\partial x_n} \end{bmatrix} \tag{3} \]The Jacobian of the observation function \(h\) is computed at the predicted state \(\hat{\mathbf{x}}_{k|k-1}\):
\[ \mathbf{H}_k = \frac{\partial h}{\partial \mathbf{x}}\bigg|_{\mathbf{x}=\hat{\mathbf{x}}_{k|k-1}} \tag{4} \]This linearization leads to the characteristic structure of the EKF: the Jacobian matrices \(\mathbf{F}_k, \mathbf{H}_k\) are used for covariance propagation, while the original nonlinear functions \(f, h\) are used for state and observation computation.
EKF Algorithm
Prediction Step
The state prediction uses the nonlinear function \(f\) directly:
\[ \hat{\mathbf{x}}_{k|k-1} = f(\hat{\mathbf{x}}_{k-1|k-1}) \tag{5} \]The covariance prediction uses the Jacobian \(\mathbf{F}_k\) for linear approximation:
\[ \mathbf{P}_{k|k-1} = \mathbf{F}_k \mathbf{P}_{k-1|k-1} \mathbf{F}_k^T + \mathbf{Q} \tag{6} \]Update Step
The observation residual (innovation) uses the nonlinear function \(h\):
\[ \mathbf{y}_k = \mathbf{z}_k - h(\hat{\mathbf{x}}_{k|k-1}) \tag{7} \]The innovation covariance and Kalman gain use the Jacobian \(\mathbf{H}_k\):
\[ \mathbf{S}_k = \mathbf{H}_k \mathbf{P}_{k|k-1} \mathbf{H}_k^T + \mathbf{R} \tag{8} \]\[ \mathbf{K}_k = \mathbf{P}_{k|k-1} \mathbf{H}_k^T \mathbf{S}_k^{-1} \tag{9} \]State and covariance update:
\[ \hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k \mathbf{y}_k \tag{10} \]\[ \mathbf{P}_{k|k} = (\mathbf{I} - \mathbf{K}_k \mathbf{H}_k) \mathbf{P}_{k|k-1} \tag{11} \]The key difference from the standard KF is that the nonlinear functions \(f, h\) are used for state propagation and observation prediction, while the Jacobian matrices \(\mathbf{F}_k, \mathbf{H}_k\) are used for covariance propagation.
Python Implementation
We implement the EKF for a 2D tracking problem where a target moves in a plane and an observer at the origin measures range and bearing.
Problem Setup
The state vector is \(\mathbf{x} = [p_x, p_y, v_x, v_y]^T\) (position and velocity). The state transition follows a constant velocity model:
\[ f(\mathbf{x}) = \begin{bmatrix} p_x + v_x \Delta t \\\ p_y + v_y \Delta t \\\ v_x \\\ v_y \end{bmatrix} \]The observation model is a nonlinear function where the observer at the origin measures range \(r\) and bearing \(\theta\):
\[ h(\mathbf{x}) = \begin{bmatrix} \sqrt{p_x^2 + p_y^2} \\\ \arctan(p_y / p_x) \end{bmatrix} \]The Jacobian of the observation function \(h\) is, with \(r = \sqrt{p_x^2 + p_y^2}\):
\[ \mathbf{H} = \begin{bmatrix} p_x / r & p_y / r & 0 & 0 \\\ -p_y / r^2 & p_x / r^2 & 0 & 0 \end{bmatrix} \]Model Definition
import numpy as np
import matplotlib.pyplot as plt
dt = 1.0 # time step
def f(x):
"""State transition function (constant velocity model)"""
px, py, vx, vy = x
return np.array([px + vx * dt, py + vy * dt, vx, vy])
def jacobian_F(x):
"""Jacobian of the state transition function"""
return np.array([
[1, 0, dt, 0],
[0, 1, 0, dt],
[0, 0, 1, 0],
[0, 0, 0, 1]
])
def h(x):
"""Observation function (range and bearing)"""
px, py = x[0], x[1]
r = np.sqrt(px**2 + py**2)
theta = np.arctan2(py, px)
return np.array([r, theta])
def jacobian_H(x):
"""Jacobian of the observation function"""
px, py = x[0], x[1]
r = np.sqrt(px**2 + py**2)
return np.array([
[px / r, py / r, 0, 0],
[-py / r**2, px / r**2, 0, 0]
])
EKF Implementation
class EKF:
def __init__(self, x0, P0, Q, R):
"""
Parameters
----------
x0 : array, initial state estimate
P0 : array, initial covariance matrix
Q : array, process noise covariance
R : array, observation noise covariance
"""
self.x = x0.copy()
self.P = P0.copy()
self.Q = Q
self.R = R
def predict(self):
"""Prediction step (Eq. 5, 6)"""
F = jacobian_F(self.x)
self.x = f(self.x) # Use nonlinear f for state prediction
self.P = F @ self.P @ F.T + self.Q # Use Jacobian F for covariance
return self.x.copy()
def update(self, z):
"""Update step (Eq. 7-11)"""
H = jacobian_H(self.x)
y = z - h(self.x) # Use nonlinear h for innovation
# Normalize bearing difference to [-pi, pi]
y[1] = (y[1] + np.pi) % (2 * np.pi) - np.pi
S = H @ self.P @ H.T + self.R
K = self.P @ H.T @ np.linalg.inv(S)
self.x = self.x + K @ y
self.P = (np.eye(len(self.x)) - K @ H) @ self.P
return self.x.copy()
Simulation
np.random.seed(42)
T = 100 # number of time steps
# Noise parameters
Q = np.diag([0.1, 0.1, 0.01, 0.01])
R = np.diag([0.5, 0.01]) # range noise, bearing noise
# True initial state (velocity set to produce a curved trajectory)
x_true = np.array([10.0, 0.0, 0.5, 1.0])
# Generate true trajectory and observations
true_states = [x_true.copy()]
measurements = []
for k in range(T):
# Slowly rotate velocity direction (gentle curve)
omega_true = 0.03
vx, vy = x_true[2], x_true[3]
x_true[2] = vx * np.cos(omega_true) - vy * np.sin(omega_true)
x_true[3] = vx * np.sin(omega_true) + vy * np.cos(omega_true)
x_true = f(x_true) + np.random.multivariate_normal(np.zeros(4), Q)
true_states.append(x_true.copy())
z = h(x_true) + np.random.multivariate_normal(np.zeros(2), R)
measurements.append(z.copy())
true_states = np.array(true_states)
measurements = np.array(measurements)
# Run EKF
x0 = np.array([10.5, -0.5, 0.0, 0.0])
P0 = np.diag([2.0, 2.0, 1.0, 1.0])
ekf = EKF(x0, P0, Q, R)
estimates = [x0.copy()]
for k in range(T):
ekf.predict()
ekf.update(measurements[k])
estimates.append(ekf.x.copy())
estimates = np.array(estimates)
# Convert observations to Cartesian coordinates for visualization
obs_x = measurements[:, 0] * np.cos(measurements[:, 1])
obs_y = measurements[:, 0] * np.sin(measurements[:, 1])
Visualization
plt.figure(figsize=(10, 8))
plt.plot(true_states[:, 0], true_states[:, 1], "b-",
linewidth=1.5, label="True trajectory")
plt.scatter(obs_x, obs_y, c="gray", s=10, alpha=0.5,
label="Measurements (Cartesian)")
plt.plot(estimates[:, 0], estimates[:, 1], "r--",
linewidth=1.5, label="EKF estimate")
plt.plot(true_states[0, 0], true_states[0, 1], "go",
markersize=10, label="Start")
plt.xlabel("x")
plt.ylabel("y")
plt.title("Extended Kalman Filter - Range-Bearing Tracking")
plt.legend()
plt.axis("equal")
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig("ekf_result.png", dpi=150)
plt.show()
# RMSE
rmse_x = np.sqrt(np.mean((true_states[1:, 0] - estimates[1:, 0])**2))
rmse_y = np.sqrt(np.mean((true_states[1:, 1] - estimates[1:, 1])**2))
print(f"Position RMSE: x={rmse_x:.4f}, y={rmse_y:.4f}")
Limitations of the EKF
While the EKF is the most widely used nonlinear filtering technique, it has several limitations:
Linearization error: Only the first-order terms of the Taylor expansion are retained, so approximation accuracy degrades when nonlinearity is strong. The estimate can become unstable in regions where the observation function has high curvature.
Sensitivity to initial estimate: If the initial estimate is far from the true state, the linearization becomes inaccurate and the filter may diverge.
Analytic Jacobian requirement: For complex systems, deriving the Jacobian matrices of \(f\) and \(h\) analytically can be tedious and error-prone. Numerical differentiation can be used as an alternative but introduces a trade-off between computational cost and accuracy.
Gaussian assumption: The EKF assumes the posterior distribution is Gaussian, making it unable to handle multimodal distributions.
To address these issues, the Unscented Transformation introduces sigma points that approximate the nonlinear transform directly without linearization (UKF). The CKF improves numerical stability by selecting sigma points based on the cubature rule, and the Particle Filter removes the Gaussian assumption entirely.
Related Articles
- Kalman Filter: Theory and Python Implementation - The linear Kalman Filter that EKF extends.
- Filtering Methods in Signal Processing - A systematic overview of KF, EKF, UKF, and particle filters.
- Unscented Transformation (UT): Python Implementation - Introduces the UT that avoids EKF’s linearization problem by using sigma points.
- Cubature Kalman Filter (CKF): Theory and Python Implementation - Addresses UKF’s weight issues using the spherical-radial cubature rule.
- Particle Filter in Python: Comparing Resampling Methods - A Monte Carlo-based filtering method that handles non-Gaussian distributions.
References
- Thrun, S., Burgard, W., & Fox, D. (2005). “Probabilistic Robotics.” MIT Press.
- Simon, D. (2006). “Optimal State Estimation: Kalman, H-infinity, and Nonlinear Approaches.” Wiley.