Unscented Transformation: Algorithm and Python Implementation

A Python implementation of the Unscented Transformation for estimating the mean and covariance of nonlinearly transformed random variables using sigma points.

What is the Unscented Transformation?

The Unscented Transformation (UT) is a method for estimating the mean \(\bar{y}\) and covariance \(P_y\) of a random variable \(y = f(x)\), where \(x\) is transformed by a nonlinear function \(f\), given the known mean \(\bar{x}\) and covariance \(P_x\) of \(x\).

One approach is to estimate these quantities using Monte Carlo sampling:

\[ \bar{y} \simeq \frac{1}{N}\sum_{i=1}^N f(x_i) \]

\[ P_y \simeq \frac{1}{N}\sum_{i=1}^N \left(f(x_i)-\bar{y}\right)\left(f(x_i)-\bar{y}\right)^T \]

However, this requires a very large number of samples \(N\) for accurate estimation, making it computationally expensive.
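For concreteness, a brute-force Monte Carlo estimate might look like the following sketch (the input statistics, the Gaussian sampling assumption, and the product function \(f(x) = x_1 x_2\) here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative input statistics (assumed jointly Gaussian for sampling)
x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])

# Draw N samples and push each through the nonlinearity f(x) = x1 * x2
N = 200_000
samples = rng.multivariate_normal(x_mean, x_P, size=N)
y_samples = samples[:, 0] * samples[:, 1]

y_mean_mc = y_samples.mean()
y_var_mc = y_samples.var()

# For this product, E[Y] = Cov[X1, X2] + E[X1] E[X2] = 1 exactly
print(y_mean_mc, y_var_mc)
```

Even for this tiny 2-dimensional example, a reasonably tight estimate takes on the order of \(10^5\) function evaluations; the UT below gets by with just \(2n+1 = 5\).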

The UT solves this problem by using a small set of carefully chosen representative points (sigma points) to efficiently and accurately estimate the statistical properties of the transformed variable – without any linear approximation. This retains the benefits of Monte Carlo methods (handling nonlinearity) while drastically reducing computational cost.

Reference: What is UKF (Unscented Kalman Filter)?


Python Implementation

Here we consider transforming a 2-dimensional input \(X=(X_1, X_2)\) to a 1-dimensional output \(Y=f(X)=X_1 \cdot X_2\).

1. Input Random Variable Setup

Define the mean vector \(\bar{x}\) and covariance matrix \(P_x\) of the input \(X\).

\[ \bar{x} = [E[X_1], E[X_2]] \]

\[ P_x = \begin{pmatrix} \text{Var}[X_1] & \text{Cov}[X_1, X_2] \\ \text{Cov}[X_2, X_1] & \text{Var}[X_2] \end{pmatrix} \]

As an example, assume \(X_1\) has mean 0 and variance 1, and \(X_2\) has mean 1 and variance 4, with covariance 1. (Note that \(P_x\) must be positive definite, since the sigma point construction below relies on its Cholesky decomposition.)

import numpy as np
import scipy.linalg

# Dimension of input x
n = 2
# Dimension of output y
m = 1

# Mean vector and covariance matrix of x
x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])

print(f"Input x mean: {x_mean}")
print(f"Input x covariance matrix:\n{x_P}")

# Nonlinear transformation f(x) = x[0] * x[1]
def f(x):
    return np.array([x[0] * x[1]])

2. Sigma Point Calculation

Sigma points are carefully chosen representative points that accurately capture the mean and covariance of the input random variable \(x\). They are computed using the following formulas:

\[ \sigma_0 = \bar{x} \tag{1} \]

\[ \sigma_i = \bar{x} + \left(\sqrt{(n+\lambda)P_x}\right)_i \quad \text{for } i=1, \dots, n \tag{2} \]

\[ \sigma_i = \bar{x} - \left(\sqrt{(n+\lambda)P_x}\right)_{i-n} \quad \text{for } i=n+1, \dots, 2n \tag{3} \]

Here, \( \left(\sqrt{(n+\lambda)P_x}\right)_i \) denotes the \(i\)-th column of a matrix square root of \((n+\lambda)P_x\) (computed here via Cholesky decomposition).
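For instance, `scipy.linalg.cholesky(A, lower=True)` returns a lower-triangular \(L\) with \(L L^T = A\), and its \(i\)-th column is `L[:, i]` (a minimal check, using an illustrative positive-definite matrix):

```python
import numpy as np
import scipy.linalg

A = np.array([[1.0, 1.0], [1.0, 4.0]])  # illustrative positive-definite matrix
L = scipy.linalg.cholesky(A, lower=True)

print(np.allclose(L @ L.T, A))  # True: L is a valid matrix square root of A
print(L[:, 0])                  # the first column, one sigma point offset direction
```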

The parameter \(\lambda\) is calculated as:

\[ \lambda = \alpha^2 (n + \kappa) - n \tag{4} \]
  • \(\alpha\): A scalar that determines the spread of the sigma points around the mean (a small positive value, typically \(0 < \alpha \le 1\)).
  • \(\kappa\): A secondary scaling parameter, usually set to 0.

# Parameter settings
alpha = 0.5
kappa = 0.0

# Eq. 4: Calculate lambda
lam = alpha**2 * (n + kappa) - n
print(f"lambda: {lam}")

# Array to store sigma points (2n+1 sigma points)
sigma_points = np.zeros((n, 2 * n + 1))

# Eq. 1: First sigma point is the mean
sigma_points[:, 0] = x_mean

# Compute matrix square root of (n+lambda) * Px using Cholesky decomposition
sqrt_term = scipy.linalg.cholesky((n + lam) * x_P, lower=True)

# Eq. 2, 3: Compute remaining sigma points
for i in range(n):
    sigma_points[:, i + 1] = x_mean + sqrt_term[:, i]
    sigma_points[:, i + n + 1] = x_mean - sqrt_term[:, i]

print("\nGenerated sigma points:")
for i in range(2 * n + 1):
    print(f"  Sigma point {i}: {sigma_points[:, i]}")
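As a sanity check, the weighted sigma points should exactly reproduce the mean and covariance of the input. The sketch below is self-contained (illustrative input statistics; the weights follow the standard UT formulas, which the next section presents):

```python
import numpy as np
import scipy.linalg

n = 2
alpha, kappa = 0.5, 0.0
lam = alpha**2 * (n + kappa) - n

# Illustrative input statistics (positive-definite covariance)
x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])

# Sigma points as in Eqs. 1-3
sqrt_term = scipy.linalg.cholesky((n + lam) * x_P, lower=True)
sigma_points = np.zeros((n, 2 * n + 1))
sigma_points[:, 0] = x_mean
for i in range(n):
    sigma_points[:, i + 1] = x_mean + sqrt_term[:, i]
    sigma_points[:, i + n + 1] = x_mean - sqrt_term[:, i]

# Standard UT weights (Eqs. 6-7)
w = np.full(2 * n + 1, 1 / (2 * (n + lam)))
w[0] = lam / (n + lam)

# The weighted sample statistics of the sigma points recover x_mean and x_P
rec_mean = sigma_points @ w
diff = sigma_points - rec_mean[:, None]
rec_P = (diff * w) @ diff.T
print(np.allclose(rec_mean, x_mean), np.allclose(rec_P, x_P))  # True True
```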

3. Transformation and Statistical Calculation

Each sigma point is transformed by the nonlinear function \(f\) to obtain \(y_{\sigma}\):

\[ y_{\sigma, i} = f(\sigma_i) \tag{5} \]

Next, the weights \(w_i\) for each sigma point are calculated:

\[ w_0 = \frac{\lambda}{n + \lambda} \tag{6} \]

\[ w_i = \frac{1}{2(n + \lambda)} \quad \text{for } i=1, \dots, 2n \tag{7} \]
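Note that the weights always sum to one: \( w_0 + 2n \cdot \frac{1}{2(n+\lambda)} = \frac{\lambda}{n+\lambda} + \frac{n}{n+\lambda} = 1 \). With the settings used here (\(\alpha = 0.5\), \(\kappa = 0\), \(n = 2\)), \(w_0\) is negative, which is permissible for these mean/covariance weights. A quick numerical check:

```python
n = 2
alpha, kappa = 0.5, 0.0
lam = alpha**2 * (n + kappa) - n  # -1.5

w0 = lam / (n + lam)       # Eq. 6: -3.0 (negative weights are allowed)
wi = 1 / (2 * (n + lam))   # Eq. 7: 1.0 for each of the 2n remaining points
print(w0 + 2 * n * wi)     # 1.0
```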

Finally, the mean \(\bar{y}\) and covariance \(P_y\) of \(y\) are computed using these weights and transformed sigma points:

\[ \bar{y} \approx \sum_{i=0}^{2n} w_i \, y_{\sigma, i} \tag{8} \]

\[ P_y \approx \sum_{i=0}^{2n} w_i (y_{\sigma, i} - \bar{y})(y_{\sigma, i} - \bar{y})^T \tag{9} \]

# Array to store transformed sigma points
sigma_y = np.zeros((m, 2 * n + 1))
# Eq. 5: Nonlinear transformation
for i in range(2 * n + 1):
    sigma_y[:, i] = f(sigma_points[:, i])

# Array to store weights
w = np.zeros(2 * n + 1)
# Eq. 6, 7: Weight calculation
w[0] = lam / (n + lam)
for i in range(1, 2 * n + 1):
    w[i] = 1 / (2 * (n + lam))

print(f"\nCalculated weights: {w}")

# Eq. 8: Calculate mean of y
y_mean = np.sum(w * sigma_y, axis=1)

# Eq. 9: Calculate covariance of y
y_P = np.zeros((m, m))
for i in range(2 * n + 1):
    diff = sigma_y[:, i] - y_mean
    y_P += w[i] * np.outer(diff, diff)  # Outer product gives an (m, m) matrix

print(f"\nEstimated mean of output y: {y_mean}")
print(f"Estimated covariance matrix of output y:\n{y_P}")

This implementation demonstrates how the mean and covariance of a nonlinearly transformed random variable can be efficiently estimated from a small number of sigma points. The Unscented Transformation is the foundation of the Unscented Kalman Filter (UKF).
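Finally, the steps above can be collected into one reusable function. This is a sketch, not a library API; the name `unscented_transform` and its signature are our own, and the usage example below uses illustrative input statistics with a positive-definite covariance:

```python
import numpy as np
import scipy.linalg

def unscented_transform(f, x_mean, x_P, alpha=0.5, kappa=0.0):
    """Estimate the mean and covariance of y = f(x) by the Unscented Transformation.

    f maps an (n,) vector to an (m,) vector; x_P must be positive definite.
    """
    n = len(x_mean)
    lam = alpha**2 * (n + kappa) - n                      # Eq. 4

    # Sigma points (Eqs. 1-3): the mean, plus/minus scaled sqrt-covariance columns
    sqrt_term = scipy.linalg.cholesky((n + lam) * x_P, lower=True)
    sigma = np.tile(x_mean[:, None], (1, 2 * n + 1))
    sigma[:, 1:n + 1] += sqrt_term
    sigma[:, n + 1:] -= sqrt_term

    # Weights (Eqs. 6-7)
    w = np.full(2 * n + 1, 1 / (2 * (n + lam)))
    w[0] = lam / (n + lam)

    # Transform each sigma point, then accumulate (Eqs. 5, 8, 9)
    sigma_y = np.column_stack([f(sigma[:, i]) for i in range(2 * n + 1)])
    y_mean = sigma_y @ w
    diff = sigma_y - y_mean[:, None]
    return y_mean, (diff * w) @ diff.T

# Usage with illustrative statistics; for f(x) = x1*x2 the exact mean is
# E[X1 X2] = Cov[X1, X2] + E[X1] E[X2] = 1, which the UT recovers here.
y_mean, y_P = unscented_transform(
    lambda x: np.array([x[0] * x[1]]),
    np.array([0.0, 1.0]),
    np.array([[1.0, 1.0], [1.0, 4.0]]),
)
print(y_mean, y_P)
```

The same function is the core building block of a UKF predict/update step, where `f` becomes the process or measurement model.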