What is the Unscented Transformation?
The Unscented Transformation (UT) is a method for estimating the mean $\bar{y}$ and covariance $P_y$ of a random variable $y = f(x)$, where $x$ is transformed by a nonlinear function $f$, given the known mean $\bar{x}$ and covariance $P_x$ of $x$.
One approach is to estimate these quantities using Monte Carlo sampling:
$$ \bar{y}\simeq\frac{1}{N}\sum_{i=1}^Nf(x_i) $$
$$ P_y \simeq \frac{1}{N}\sum_{i=1}^N(f(x_i)-\bar{y})(f(x_i)-\bar{y})^T $$
However, this requires a very large number of samples $N$ for accurate estimation, making it computationally expensive.
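The Monte Carlo approach can be sketched as follows. This is a minimal illustration, not part of the UT itself; the Gaussian input statistics, the product nonlinearity, and the sample count `N` are all arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative input statistics: X1 ~ N(0, 1), X2 ~ N(1, 4), Cov[X1, X2] = 1
x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])

# Example nonlinearity: y = f(x) = x1 * x2, estimated by brute-force sampling
N = 100_000
samples = rng.multivariate_normal(x_mean, x_P, size=N)
y = samples[:, 0] * samples[:, 1]

y_mean_mc = y.mean()
y_var_mc = y.var()

# For jointly Gaussian inputs, E[X1*X2] = mu1*mu2 + Cov = 1.0
print(y_mean_mc, y_var_mc)
```

Even this simple example needs on the order of 10^5 samples before the estimates settle, which is exactly the cost the UT avoids.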
The UT addresses this by propagating a small set of carefully chosen representative points (sigma points) through the nonlinearity, yielding accurate estimates of the transformed statistics without any linear approximation. This retains the main benefit of Monte Carlo sampling, namely the ability to handle nonlinearity directly, while drastically reducing the computational cost.
Reference: What is UKF (Unscented Kalman Filter)?
Python Implementation
Here we consider transforming a 2-dimensional input $X=(X_1, X_2)$ to a 1-dimensional output $Y=f(X)=X_1 \cdot X_2$.
1. Input Random Variable Setup
Define the mean vector $\bar{x}$ and covariance matrix $P_x$ of the input $X$.
$$ \bar{x} = [E[X_1], E[X_2]] $$ $$ P_x = \begin{pmatrix} \text{Var}[X_1] & \text{Cov}[X_1, X_2] \\ \text{Cov}[X_2, X_1] & \text{Var}[X_2] \end{pmatrix} $$
As an example, assume $X_1$ has mean 0 and variance 1, and $X_2$ has mean 1 and variance 4, with covariance 1. (The covariance must be chosen so that $P_x$ is positive definite; a covariance of 2 would make $P_x$ singular and the Cholesky decomposition used below would fail.)
import matplotlib.pyplot as plt
import numpy as np
import scipy.linalg
# Dimension of input x
n = 2
# Dimension of output y
m = 1
# Mean vector and covariance matrix of x
x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])
print(f"Input x mean: {x_mean}")
print(f"Input x covariance matrix:\n{x_P}")
# Nonlinear transformation f(x) = x[0] * x[1]
def f(x):
    return np.array([x[0] * x[1]])
2. Sigma Point Calculation
Sigma points are carefully chosen representative points that accurately capture the mean and covariance of the input random variable $x$. They are computed using the following formulas:
$$ \sigma_0 = \bar{x} \tag{1} $$ $$ \sigma_i = \bar{x} + (\sqrt{(n+\lambda)P_x})_i \quad \text{for } i=1, \dots, n \tag{2} $$ $$ \sigma_i = \bar{x} - (\sqrt{(n+\lambda)P_x})_i \quad \text{for } i=n+1, \dots, 2n \tag{3} $$
Here, $ (\sqrt{(n+\lambda)P_x})_i $ represents the $i$-th column of the matrix square root (e.g., via Cholesky decomposition).
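The defining property of this matrix square root can be sanity-checked by multiplying the Cholesky factor back together. A small sketch, using an arbitrary positive-definite example matrix:

```python
import numpy as np
import scipy.linalg

# Arbitrary positive-definite example matrix
A = np.array([[1.5, 1.0], [1.0, 6.0]])

# Lower-triangular factor L satisfying L @ L.T == A
L = scipy.linalg.cholesky(A, lower=True)

print(np.allclose(L @ L.T, A))  # True
```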
The parameter $\lambda$ is calculated as: $$ \lambda = \alpha^2 (n + \kappa) - n \tag{4} $$
- $\alpha$: A scalar that determines the spread of the sigma points around the mean (typically chosen in the range $10^{-4} \le \alpha \le 1$).
- $\kappa$: A secondary scaling parameter, usually set to 0 (another common choice is $3 - n$).
# Parameter settings
alpha = 0.5
kappa = 0.0
# Eq. 4: Calculate lambda
lam = alpha**2 * (n + kappa) - n
print(f"lambda: {lam}")
# Array to store sigma points (2n+1 sigma points)
sigma_points = np.zeros((n, 2 * n + 1))
# Eq. 1: First sigma point is the mean
sigma_points[:, 0] = x_mean
# Compute matrix square root of (n+lambda) * Px using Cholesky decomposition
sqrt_term = scipy.linalg.cholesky((n + lam) * x_P, lower=True)
# Eq. 2, 3: Compute remaining sigma points
for i in range(n):
    sigma_points[:, i + 1] = x_mean + sqrt_term[:, i]
    sigma_points[:, i + n + 1] = x_mean - sqrt_term[:, i]
print("\nGenerated sigma points:")
for i in range(2 * n + 1):
    print(f" Sigma point {i}: {sigma_points[:, i]}")
3. Transformation and Statistical Calculation
Each sigma point is transformed by the nonlinear function $f$ to obtain $y_{\sigma}$: $$ y_{\sigma, i} = f(\sigma_i) \tag{5} $$
Next, the weights $w_i$ for each sigma point are calculated: $$ w_0 = \frac{\lambda}{n + \lambda} \tag{6} $$ $$ w_i = \frac{1}{2(n + \lambda)} \quad \text{for } i=1, \dots, 2n \tag{7} $$
Finally, the mean $\bar{y}$ and covariance $P_y$ of $y$ are computed using these weights and transformed sigma points: $$ \bar{y} \approx \sum_{i=0}^{2n} w_i y_{\sigma, i} \tag{8} $$ $$ P_y \approx \sum_{i=0}^{2n} w_i (y_{\sigma, i} - \bar{y})(y_{\sigma, i} - \bar{y})^T \tag{9} $$
# Array to store transformed sigma points
sigma_y = np.zeros((m, 2 * n + 1))
# Eq. 5: Nonlinear transformation
for i in range(2 * n + 1):
    sigma_y[:, i] = f(sigma_points[:, i])
# Array to store weights
w = np.zeros(2 * n + 1)
# Eq. 6, 7: Weight calculation
w[0] = lam / (n + lam)
for i in range(1, 2 * n + 1):
    w[i] = 1 / (2 * (n + lam))
print(f"\nCalculated weights: {w}")
# Eq. 8: Calculate mean of y
y_mean = np.sum(w * sigma_y, axis=1)
# Eq. 9: Calculate covariance of y
y_P = np.zeros((m, m))
for i in range(2 * n + 1):
    diff = sigma_y[:, i] - y_mean
    y_P += w[i] * np.outer(diff, diff)  # Outer product gives an (m, m) matrix
print(f"\nEstimated mean of output y: {y_mean}")
print(f"Estimated covariance matrix of output y:\n{y_P}")
This implementation demonstrates how the mean and covariance of a nonlinearly transformed random variable can be efficiently estimated from a small number of sigma points. The Unscented Transformation is the foundation of the Unscented Kalman Filter (UKF).
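As a sanity check, the same steps can be compared against a large-sample Monte Carlo reference. The sketch below is self-contained; the covariance value, seed, and sample count are illustrative choices, with the covariance picked to keep $P_x$ positive definite. Note that the UT mean and covariance are accurate only to second order of the Taylor expansion, so for a strongly nonlinear function like this product the covariance estimate can differ noticeably from the Monte Carlo value even though the mean matches:

```python
import numpy as np
import scipy.linalg

n, m = 2, 1
x_mean = np.array([0.0, 1.0])
x_P = np.array([[1.0, 1.0], [1.0, 4.0]])  # positive-definite example covariance

def f(x):
    return np.array([x[0] * x[1]])

# --- Unscented Transformation (same steps as above) ---
alpha, kappa = 0.5, 0.0
lam = alpha**2 * (n + kappa) - n

sigma_points = np.zeros((n, 2 * n + 1))
sigma_points[:, 0] = x_mean
sqrt_term = scipy.linalg.cholesky((n + lam) * x_P, lower=True)
for i in range(n):
    sigma_points[:, i + 1] = x_mean + sqrt_term[:, i]
    sigma_points[:, i + n + 1] = x_mean - sqrt_term[:, i]

w = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
w[0] = lam / (n + lam)

sigma_y = np.array([f(sigma_points[:, i]) for i in range(2 * n + 1)]).T  # (m, 2n+1)
y_mean_ut = sigma_y @ w
diff = sigma_y - y_mean_ut[:, None]
y_P_ut = (w * diff) @ diff.T

# --- Monte Carlo reference ---
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(x_mean, x_P, size=200_000)
y_mc = samples[:, 0] * samples[:, 1]

# For jointly Gaussian inputs, E[X1*X2] = mu1*mu2 + Cov = 1.0;
# the UT recovers this mean exactly.
print("UT mean:", y_mean_ut, "  MC mean:", y_mc.mean())
print("UT var :", y_P_ut.ravel(), "  MC var :", y_mc.var())
```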