Extended Kalman Filter (EKF): Theory and Python Implementation

What is the Extended Kalman Filter?

The Kalman Filter (KF) is the optimal state estimator for linear Gaussian systems, but most real-world systems are nonlinear. The Extended Kalman Filter (EKF) extends the KF framework to nonlinear systems by linearizing the nonlinear functions around the current estimate using Jacobian matrices (first-order Taylor expansion).

EKF is the most fundamental nonlinear filtering technique and serves as the starting point for more advanced methods:

EKF: First-order linearization via Jacobian matrices
UKF: Direct approximation of nonlinear transforms using sigma points (Unscented Transformation)
CKF: Sigma point selection based on the spherical-radial cubature rule
Particle Filter: Monte Carlo sampling to approximate arbitrary distributions

Nonlinear State-Space Model

Consider the following nonlinear state-space model:

\[ \mathbf{x}_k = f(\mathbf{x}_{k-1}) + \mathbf{w}_{k-1}, \quad \mathbf{w}_{k-1} \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}) \tag{1} \]\[ \mathbf{z}_k = h(\mathbf{x}_k) + \mathbf{v}_k, \quad \mathbf{v}_k \sim \mathcal{N}(\mathbf{0}, \mathbf{R}) \tag{2} \]

Here \(f\) is the nonlinear state transition function and \(h\) is the nonlinear observation function. In the linear KF, these are expressed as matrices (\(f(\mathbf{x}) = \mathbf{A}\mathbf{x}\) and \(h(\mathbf{x}) = \mathbf{H}\mathbf{x}\)), but in the EKF they are general nonlinear functions.

Linearization via Jacobian Matrices

The key idea of the EKF is to perform a Taylor expansion of the nonlinear functions \(f\) and \(h\) around the current estimate, retaining only the first-order terms.

The Jacobian of the state transition function \(f\) is computed at the previous estimate \(\hat{\mathbf{x}}_{k-1|k-1}\):

\[ \mathbf{F}_k = \frac{\partial f}{\partial \mathbf{x}}\bigg|_{\mathbf{x}=\hat{\mathbf{x}}_{k-1|k-1}} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\\ \vdots & \ddots & \vdots \\\ \frac{\partial f_n}{\partial x_1} & \cdots & \frac{\partial f_n}{\partial x_n} \end{bmatrix} \tag{3} \]

The Jacobian of the observation function \(h\) is computed at the predicted state \(\hat{\mathbf{x}}_{k|k-1}\):

\[ \mathbf{H}_k = \frac{\partial h}{\partial \mathbf{x}}\bigg|_{\mathbf{x}=\hat{\mathbf{x}}_{k|k-1}} \tag{4} \]

This linearization leads to the characteristic structure of the EKF: the Jacobian matrices \(\mathbf{F}_k, \mathbf{H}_k\) are used for covariance propagation, while the original nonlinear functions \(f, h\) are used for state and observation computation.

EKF Algorithm

Prediction Step

The state prediction uses the nonlinear function \(f\) directly:

\[ \hat{\mathbf{x}}_{k|k-1} = f(\hat{\mathbf{x}}_{k-1|k-1}) \tag{5} \]

The covariance prediction uses the Jacobian \(\mathbf{F}_k\) for linear approximation:

\[ \mathbf{P}_{k|k-1} = \mathbf{F}_k \mathbf{P}_{k-1|k-1} \mathbf{F}_k^T + \mathbf{Q} \tag{6} \]

Update Step

The observation residual (innovation) uses the nonlinear function \(h\):

\[ \mathbf{y}_k = \mathbf{z}_k - h(\hat{\mathbf{x}}_{k|k-1}) \tag{7} \]

The innovation covariance and Kalman gain use the Jacobian \(\mathbf{H}_k\):

\[ \mathbf{S}_k = \mathbf{H}_k \mathbf{P}_{k|k-1} \mathbf{H}_k^T + \mathbf{R} \tag{8} \]\[ \mathbf{K}_k = \mathbf{P}_{k|k-1} \mathbf{H}_k^T \mathbf{S}_k^{-1} \tag{9} \]

State and covariance update:

\[ \hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k \mathbf{y}_k \tag{10} \]\[ \mathbf{P}_{k|k} = (\mathbf{I} - \mathbf{K}_k \mathbf{H}_k) \mathbf{P}_{k|k-1} \tag{11} \]

The key difference from the standard KF is that the nonlinear functions \(f, h\) are used for state propagation and observation prediction, while the Jacobian matrices \(\mathbf{F}_k, \mathbf{H}_k\) are used for covariance propagation.

Python Implementation

We implement the EKF for a 2D tracking problem where a target moves in a plane and an observer at the origin measures range and bearing.

Problem Setup

The state vector is \(\mathbf{x} = [p_x, p_y, v_x, v_y]^T\) (position and velocity). The state transition follows a constant velocity model:

\[ f(\mathbf{x}) = \begin{bmatrix} p_x + v_x \Delta t \\\ p_y + v_y \Delta t \\\ v_x \\\ v_y \end{bmatrix} \]

The observation model is a nonlinear function where the observer at the origin measures range \(r\) and bearing \(\theta\):

\[ h(\mathbf{x}) = \begin{bmatrix} \sqrt{p_x^2 + p_y^2} \\\ \arctan(p_y / p_x) \end{bmatrix} \]

The Jacobian of the observation function \(h\) is, with \(r = \sqrt{p_x^2 + p_y^2}\):

\[ \mathbf{H} = \begin{bmatrix} p_x / r & p_y / r & 0 & 0 \\\ -p_y / r^2 & p_x / r^2 & 0 & 0 \end{bmatrix} \]

Model Definition

import numpy as np
import matplotlib.pyplot as plt

dt = 1.0  # time step

def f(x):
    """State transition function (constant velocity model)"""
    px, py, vx, vy = x
    return np.array([px + vx * dt, py + vy * dt, vx, vy])

def jacobian_F(x):
    """Jacobian of the state transition function"""
    return np.array([
        [1, 0, dt, 0],
        [0, 1, 0, dt],
        [0, 0, 1,  0],
        [0, 0, 0,  1]
    ])

def h(x):
    """Observation function (range and bearing)"""
    px, py = x[0], x[1]
    r = np.sqrt(px**2 + py**2)
    theta = np.arctan2(py, px)
    return np.array([r, theta])

def jacobian_H(x):
    """Jacobian of the observation function"""
    px, py = x[0], x[1]
    r = np.sqrt(px**2 + py**2)
    return np.array([
        [px / r,      py / r,  0, 0],
        [-py / r**2,  px / r**2, 0, 0]
    ])

EKF Implementation

class EKF:
    def __init__(self, x0, P0, Q, R):
        """
        Parameters
        ----------
        x0 : array, initial state estimate
        P0 : array, initial covariance matrix
        Q  : array, process noise covariance
        R  : array, observation noise covariance
        """
        self.x = x0.copy()
        self.P = P0.copy()
        self.Q = Q
        self.R = R

    def predict(self):
        """Prediction step (Eq. 5, 6)"""
        F = jacobian_F(self.x)
        self.x = f(self.x)           # Use nonlinear f for state prediction
        self.P = F @ self.P @ F.T + self.Q  # Use Jacobian F for covariance
        return self.x.copy()

    def update(self, z):
        """Update step (Eq. 7-11)"""
        H = jacobian_H(self.x)
        y = z - h(self.x)            # Use nonlinear h for innovation
        # Normalize bearing difference to [-pi, pi]
        y[1] = (y[1] + np.pi) % (2 * np.pi) - np.pi

        S = H @ self.P @ H.T + self.R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(len(self.x)) - K @ H) @ self.P
        return self.x.copy()

Simulation

np.random.seed(42)
T = 100  # number of time steps

# Noise parameters
Q = np.diag([0.1, 0.1, 0.01, 0.01])
R = np.diag([0.5, 0.01])  # range noise, bearing noise

# True initial state (velocity set to produce a curved trajectory)
x_true = np.array([10.0, 0.0, 0.5, 1.0])

# Generate true trajectory and observations
true_states = [x_true.copy()]
measurements = []

for k in range(T):
    # Slowly rotate velocity direction (gentle curve)
    omega_true = 0.03
    vx, vy = x_true[2], x_true[3]
    x_true[2] = vx * np.cos(omega_true) - vy * np.sin(omega_true)
    x_true[3] = vx * np.sin(omega_true) + vy * np.cos(omega_true)

    x_true = f(x_true) + np.random.multivariate_normal(np.zeros(4), Q)
    true_states.append(x_true.copy())

    z = h(x_true) + np.random.multivariate_normal(np.zeros(2), R)
    measurements.append(z.copy())

true_states = np.array(true_states)
measurements = np.array(measurements)

# Run EKF
x0 = np.array([10.5, -0.5, 0.0, 0.0])
P0 = np.diag([2.0, 2.0, 1.0, 1.0])
ekf = EKF(x0, P0, Q, R)

estimates = [x0.copy()]
for k in range(T):
    ekf.predict()
    ekf.update(measurements[k])
    estimates.append(ekf.x.copy())

estimates = np.array(estimates)

# Convert observations to Cartesian coordinates for visualization
obs_x = measurements[:, 0] * np.cos(measurements[:, 1])
obs_y = measurements[:, 0] * np.sin(measurements[:, 1])

Visualization

plt.figure(figsize=(10, 8))
plt.plot(true_states[:, 0], true_states[:, 1], "b-",
         linewidth=1.5, label="True trajectory")
plt.scatter(obs_x, obs_y, c="gray", s=10, alpha=0.5,
            label="Measurements (Cartesian)")
plt.plot(estimates[:, 0], estimates[:, 1], "r--",
         linewidth=1.5, label="EKF estimate")
plt.plot(true_states[0, 0], true_states[0, 1], "go",
         markersize=10, label="Start")
plt.xlabel("x")
plt.ylabel("y")
plt.title("Extended Kalman Filter - Range-Bearing Tracking")
plt.legend()
plt.axis("equal")
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig("ekf_result.png", dpi=150)
plt.show()

# RMSE
rmse_x = np.sqrt(np.mean((true_states[1:, 0] - estimates[1:, 0])**2))
rmse_y = np.sqrt(np.mean((true_states[1:, 1] - estimates[1:, 1])**2))
print(f"Position RMSE: x={rmse_x:.4f}, y={rmse_y:.4f}")

Limitations of the EKF

While the EKF is the most widely used nonlinear filtering technique, it has several limitations:

Linearization error: Only the first-order terms of the Taylor expansion are retained, so approximation accuracy degrades when nonlinearity is strong. The estimate can become unstable in regions where the observation function has high curvature.
Sensitivity to initial estimate: If the initial estimate is far from the true state, the linearization becomes inaccurate and the filter may diverge.
Analytic Jacobian requirement: For complex systems, deriving the Jacobian matrices of \(f\) and \(h\) analytically can be tedious and error-prone. Numerical differentiation can be used as an alternative but introduces a trade-off between computational cost and accuracy.
Gaussian assumption: The EKF assumes the posterior distribution is Gaussian, making it unable to handle multimodal distributions.

To address these issues, the Unscented Transformation introduces sigma points that approximate the nonlinear transform directly without linearization (UKF). The CKF improves numerical stability by selecting sigma points based on the cubature rule, and the Particle Filter removes the Gaussian assumption entirely.

Kalman Filter: Theory and Python Implementation - The linear Kalman Filter that EKF extends.
Unscented Kalman Filter (UKF): Nonlinear State Estimation Theory and Python Implementation - Sigma-point based nonlinear filter that avoids EKF’s linearization.
Cubature Kalman Filter (CKF): Theory and Python Implementation - Addresses UKF’s weight issues using the spherical-radial cubature rule.
Particle Filter Python Implementation: Comparing Resampling Methods - A Monte Carlo-based filtering method that handles non-Gaussian distributions.
Kalman Smoother (RTS Smoother): Theory and Python Implementation - Smoothing using observations from all time steps.
Filtering Methods in Signal Processing - A systematic overview of KF, EKF, UKF, and particle filters.

References

Thrun, S., Burgard, W., & Fox, D. (2005). “Probabilistic Robotics.” MIT Press.
Simon, D. (2006). “Optimal State Estimation: Kalman, H-infinity, and Nonlinear Approaches.” Wiley.