Unscented Transformation: Algorithm and Python Implementation

A Python implementation of the Unscented Transformation for estimating the mean and covariance of nonlinearly transformed random variables using sigma points.

What is the Unscented Transformation?

The Unscented Transformation (UT) is a method for estimating the mean $\bar{y}$ and covariance $P_y$ of a random variable $y = f(x)$, where $x$ is transformed by a nonlinear function $f$, given the known mean $\bar{x}$ and covariance $P_x$ of $x$.

One approach is to estimate these quantities using Monte Carlo sampling:

$$ \bar{y}\simeq\frac{1}{N}\sum_{i=1}^Nf(x_i) $$

$$ P_y \simeq \frac{1}{N}\sum_{i=1}^N(f(x_i)-\bar{y})(f(x_i)-\bar{y})^T $$
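As a concrete illustration of the Monte Carlo approach, here is a minimal sketch for a product nonlinearity $f(x) = x_1 \cdot x_2$ with a jointly Gaussian input (the input statistics, the sample size `N`, and the variable names are illustrative choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-D Gaussian input: X1 ~ N(0, 1), X2 ~ N(1, 4), Cov[X1, X2] = 1
x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])

# Draw N samples and push each through f(x) = x1 * x2
N = 100_000
samples = rng.multivariate_normal(x_mean, x_P, size=N)
y_samples = samples[:, 0] * samples[:, 1]

y_mean = y_samples.mean()  # Monte Carlo estimate of E[f(X)]
y_var = y_samples.var()    # Monte Carlo estimate of Var[f(X)]
print(f"Monte Carlo mean: {y_mean:.3f}, variance: {y_var:.3f}")
```

Even for this simple bilinear $f$, a large $N$ is needed before the estimates settle, whereas the UT described next uses only $2n+1 = 5$ function evaluations.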

However, this requires a very large number of samples $N$ for accurate estimation, making it computationally expensive.

The UT solves this problem by using a small set of carefully chosen representative points (sigma points) to efficiently and accurately estimate the statistical properties of the transformed variable – without any linear approximation. This retains the benefits of Monte Carlo methods (handling nonlinearity) while drastically reducing computational cost.

Reference: What is UKF (Unscented Kalman Filter)?


Python Implementation

Here we consider transforming a 2-dimensional input $X=(X_1, X_2)$ to a 1-dimensional output $Y=f(X)=X_1 \cdot X_2$.

1. Input Random Variable Setup

Define the mean vector $\bar{x}$ and covariance matrix $P_x$ of the input $X$.

$$ \bar{x} = [E[X_1], E[X_2]] $$

$$ P_x = \begin{pmatrix} \text{Var}[X_1] & \text{Cov}[X_1, X_2] \\ \text{Cov}[X_2, X_1] & \text{Var}[X_2] \end{pmatrix} $$

As an example, assume $X_1$ has mean 0 and variance 1, and $X_2$ has mean 1 and variance 4, with covariance 1. (The covariance must satisfy $|\text{Cov}[X_1, X_2]| < \sqrt{\text{Var}[X_1]\,\text{Var}[X_2]} = 2$ so that $P_x$ is positive definite; otherwise the Cholesky decomposition used below fails.)

import numpy as np
import scipy.linalg

# Dimension of input x
n = 2
# Dimension of output y
m = 1

# Mean vector and covariance matrix of x
x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])

print(f"Input x mean: {x_mean}")
print(f"Input x covariance matrix:\n{x_P}")

# Nonlinear transformation f(x) = x[0] * x[1]
def f(x):
  return np.array([x[0] * x[1]])

2. Sigma Point Calculation

Sigma points are carefully chosen representative points that accurately capture the mean and covariance of the input random variable $x$. They are computed using the following formulas:

$$ \sigma_0 = \bar{x} \tag{1} $$ $$ \sigma_i = \bar{x} + (\sqrt{(n+\lambda)P_x})_i \quad \text{for } i=1, \dots, n \tag{2} $$ $$ \sigma_i = \bar{x} - (\sqrt{(n+\lambda)P_x})_i \quad \text{for } i=n+1, \dots, 2n \tag{3} $$

Here, $ (\sqrt{(n+\lambda)P_x})_i $ represents the $i$-th column of the matrix square root (e.g., via Cholesky decomposition).

The parameter $\lambda$ is calculated as: $$ \lambda = \alpha^2 (n + \kappa) - n \tag{4} $$

  • $\alpha$: A scalar that determines the spread of the sigma points around the mean (typically a small positive value, $10^{-4} \le \alpha \le 1$).
  • $\kappa$: A secondary scaling parameter, usually set to 0 (another common choice is $3 - n$).

# Parameter settings
alpha = 0.5
kappa = 0.0

# Eq. 4: Calculate lambda
lam = alpha**2 * (n + kappa) - n
print(f"lambda: {lam}")

# Array to store sigma points (2n+1 sigma points)
sigma_points = np.zeros((n, 2 * n + 1))

# Eq. 1: First sigma point is the mean
sigma_points[:, 0] = x_mean

# Compute matrix square root of (n+lambda) * Px using Cholesky decomposition
sqrt_term = scipy.linalg.cholesky((n + lam) * x_P, lower=True)

# Eq. 2, 3: Compute remaining sigma points
for i in range(n):
    sigma_points[:, i + 1] = x_mean + sqrt_term[:, i]
    sigma_points[:, i + n + 1] = x_mean - sqrt_term[:, i]

print("\nGenerated sigma points:")
for i in range(2 * n + 1):
  print(f"  Sigma point {i}: {sigma_points[:, i]}")
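A useful sanity check on Eqs. 1-3: combined with the weights of Eqs. 6-7 (introduced in the next section), the sigma points reproduce the input mean and covariance exactly. A self-contained sketch:

```python
import numpy as np
import scipy.linalg

# Same setup as above: n = 2, alpha = 0.5, kappa = 0
n = 2
alpha, kappa = 0.5, 0.0
lam = alpha**2 * (n + kappa) - n  # Eq. 4

x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])

# Sigma points (Eqs. 1-3)
sigma_points = np.zeros((n, 2 * n + 1))
sigma_points[:, 0] = x_mean
sqrt_term = scipy.linalg.cholesky((n + lam) * x_P, lower=True)
for i in range(n):
    sigma_points[:, i + 1] = x_mean + sqrt_term[:, i]
    sigma_points[:, i + n + 1] = x_mean - sqrt_term[:, i]

# Weights (Eqs. 6-7, introduced in the next section)
w = np.full(2 * n + 1, 1 / (2 * (n + lam)))
w[0] = lam / (n + lam)

# Weighted sigma points recover the input statistics exactly
rec_mean = sigma_points @ w
diff = sigma_points - rec_mean[:, None]
rec_P = (diff * w) @ diff.T
print(np.allclose(rec_mean, x_mean))  # True
print(np.allclose(rec_P, x_P))        # True
```

This exactness holds by construction: the $\pm$ pairs cancel around the mean, and the Cholesky columns rebuild $(n+\lambda)P_x$ up to the weight factor $\frac{1}{2(n+\lambda)}$.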

3. Transformation and Statistical Calculation

Each sigma point is transformed by the nonlinear function $f$ to obtain $y_{\sigma}$: $$ y_{\sigma, i} = f(\sigma_i) \tag{5} $$

Next, the weights $w_i$ for each sigma point are calculated: $$ w_0 = \frac{\lambda}{n + \lambda} \tag{6} $$ $$ w_i = \frac{1}{2(n + \lambda)} \quad \text{for } i=1, \dots, 2n \tag{7} $$ Note that the weights always sum to 1, and that $w_0$ is negative when $\lambda < 0$ (as in this example, where $\lambda = -1.5$), which is permissible in Eqs. 8 and 9.

Finally, the mean $\bar{y}$ and covariance $P_y$ of $y$ are computed using these weights and transformed sigma points: $$ \bar{y} \approx \sum_{i=0}^{2n} w_i y_{\sigma, i} \tag{8} $$ $$ P_y \approx \sum_{i=0}^{2n} w_i (y_{\sigma, i} - \bar{y})(y_{\sigma, i} - \bar{y})^T \tag{9} $$

# Array to store transformed sigma points
sigma_y = np.zeros((m, 2 * n + 1))
# Eq. 5: Nonlinear transformation
for i in range(2 * n + 1):
  sigma_y[:, i] = f(sigma_points[:, i])

# Array to store weights
w = np.zeros(2 * n + 1)
# Eq. 6, 7: Weight calculation
w[0] = lam / (n + lam)
for i in range(1, 2 * n + 1):
  w[i] = 1 / (2 * (n + lam))

print(f"\nCalculated weights: {w}")

# Eq. 8: Calculate mean of y
y_mean = np.sum(w * sigma_y, axis=1)

# Eq. 9: Calculate covariance of y
y_P = np.zeros((m, m))
for i in range(2 * n + 1):
  diff = sigma_y[:, i] - y_mean
  y_P += w[i] * np.outer(diff, diff) # Use outer product for (m,m) matrix

print(f"\nEstimated mean of output y: {y_mean}")
print(f"Estimated covariance matrix of output y:\n{y_P}")

This implementation demonstrates how the mean and covariance of a nonlinearly transformed random variable can be efficiently estimated from a small number of sigma points. The Unscented Transformation is the foundation of the Unscented Kalman Filter (UKF).
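For reuse, the steps above can be collected into a single function. The following is a sketch; the name `unscented_transform` and its interface are my own choices:

```python
import numpy as np
import scipy.linalg

def unscented_transform(f, x_mean, x_P, alpha=0.5, kappa=0.0):
    """Estimate the mean and covariance of y = f(x) via the UT."""
    n = x_mean.shape[0]
    lam = alpha**2 * (n + kappa) - n          # Eq. 4 (requires n + lam > 0)
    # Sigma points (Eqs. 1-3) from the Cholesky factor of (n + lam) * Px
    L = scipy.linalg.cholesky((n + lam) * x_P, lower=True)
    sigma = np.column_stack([x_mean]
                            + [x_mean + L[:, i] for i in range(n)]
                            + [x_mean - L[:, i] for i in range(n)])
    # Weights (Eqs. 6-7)
    w = np.full(2 * n + 1, 1 / (2 * (n + lam)))
    w[0] = lam / (n + lam)
    # Transform each sigma point (Eq. 5) and accumulate (Eqs. 8-9)
    sigma_y = np.column_stack([f(sigma[:, i]) for i in range(2 * n + 1)])
    y_mean = sigma_y @ w                       # Eq. 8
    diff = sigma_y - y_mean[:, None]
    y_P = (diff * w) @ diff.T                  # Eq. 9
    return y_mean, y_P

# Same example as above: f(x) = x1 * x2, X1 ~ N(0, 1), X2 ~ N(1, 4), Cov = 1
y_mean, y_P = unscented_transform(
    lambda x: np.array([x[0] * x[1]]),
    np.array([0.0, 1.0]),
    np.array([[1.0, 1.0], [1.0, 4.0]]),
)
print(y_mean, y_P)
```

For this example the estimated mean is exactly 1.0, matching the true mean $E[X_1 X_2] = \text{Cov}[X_1, X_2] + \bar{x}_1 \bar{x}_2 = 1$; the covariance estimate (here 0.5) is only approximate, since the UT is second-order accurate and the result varies with the choice of $\alpha$.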