What is Filtering?
Filtering is a technique for removing noise components from observed time-series data and extracting the underlying signal.
State-Space Model
In filtering, we estimate the internal state $x_t$, which cannot be directly observed, from observations $z_t$. The system behavior is represented by two models:
- Process Model (System Model): Describes how the state $x_t$ evolves over time. $$ x_t = f(x_{t-1}, u_{t-1}) + q_{t-1} \tag{1} $$
- Observation Model: Describes how observations $z_t$ are generated from the state $x_t$. $$ z_t = h(x_t) + r_t \tag{2} $$
Here, $u_t$ is the control input, and $q_t$ and $r_t$ are the process noise and observation noise, respectively, which are generally assumed to follow zero-mean Gaussian distributions. $$ q_t \sim N(0, Q) $$ $$ r_t \sim N(0, R) $$ $Q$ and $R$ are the noise covariance matrices.
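To make equations (1) and (2) concrete, here is a minimal NumPy sketch that simulates a hypothetical 1D constant-velocity model. All matrices and values below are illustrative assumptions, not part of the formulation above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1D constant-velocity model: state x = [position, velocity].
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])          # linear process model f(x) = A x
H = np.array([[1.0, 0.0]])          # we observe position only: h(x) = H x
Q = 0.01 * np.eye(2)                # process noise covariance
R = np.array([[0.5]])               # observation noise covariance

x = np.array([0.0, 1.0])            # true initial state
xs, zs = [], []
for _ in range(100):
    # x_t = A x_{t-1} + q_{t-1},  q ~ N(0, Q)   (equation 1, no control input)
    x = A @ x + rng.multivariate_normal(np.zeros(2), Q)
    # z_t = H x_t + r_t,          r ~ N(0, R)   (equation 2)
    z = H @ x + rng.multivariate_normal(np.zeros(1), R)
    xs.append(x)
    zs.append(z)
```

Because both models here are linear with Gaussian noise, this is exactly the setting the Kalman filter below assumes.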
Kalman Filter (KF)
Overview
The KF is an optimal filter that accurately estimates the mean and covariance of the state under the assumptions that the system is linear and that the noise follows a Gaussian distribution. It sequentially estimates the state by repeating two steps: “prediction” and “update (filtering).”
Algorithm
The probability distribution of state $x$ is represented by its mean $\mu$ and covariance $\Sigma$.
1. Prediction Step: Predicts the current state from the previous state estimate.
- Prior state estimate: Predicts the current state $\hat{\mu}_t$ from the previous state $\mu_{t-1}$. $$ \hat{\mu}_t = A\mu_{t-1} + Bu_{t-1} \tag{3} $$
- Prior error covariance matrix: Computes the prediction uncertainty $\hat{\Sigma}_t$. $$ \hat{\Sigma}_t = A\Sigma_{t-1}A^T + Q \tag{4} $$
2. Update Step: Corrects the prediction with the observation $z_t$ to obtain a more accurate current state estimate (a NumPy sketch combining both steps follows this list).
- Kalman gain: A coefficient that determines how much to weight the prediction versus the observation. $$ K_t = \hat{\Sigma}_t H^T (H\hat{\Sigma}_t H^T + R)^{-1} \tag{5} $$
- Posterior state estimate: Corrects the predicted value $\hat{\mu}_t$ with the observation $z_t$. $$ \mu_t = \hat{\mu}_t + K_t(z_t - H\hat{\mu}_t) \tag{6} $$
- Posterior error covariance matrix: Computes the updated uncertainty $\Sigma_t$. $$ \Sigma_t = (I - K_tH)\hat{\Sigma}_t \tag{7} $$
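Putting equations (3) to (7) together, one prediction/update cycle might look as follows. This is a minimal sketch; the function names and the use of an explicit matrix inverse are illustrative choices:

```python
import numpy as np

def kf_predict(mu, Sigma, A, B, u, Q):
    """Prediction step: equations (3) and (4)."""
    mu_pred = A @ mu + B @ u                     # (3) prior state estimate
    Sigma_pred = A @ Sigma @ A.T + Q             # (4) prior error covariance
    return mu_pred, Sigma_pred

def kf_update(mu_pred, Sigma_pred, z, H, R):
    """Update step: equations (5) to (7)."""
    S = H @ Sigma_pred @ H.T + R                 # innovation covariance
    K = Sigma_pred @ H.T @ np.linalg.inv(S)      # (5) Kalman gain
    mu = mu_pred + K @ (z - H @ mu_pred)         # (6) posterior state estimate
    Sigma = (np.eye(len(mu_pred)) - K @ H) @ Sigma_pred  # (7) posterior covariance
    return mu, Sigma
```

In practice one would solve the linear system with `np.linalg.solve` rather than forming the inverse of the innovation covariance, and use the Joseph form of equation (7) for better numerical stability.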
Extended Kalman Filter (EKF)
Overview
The EKF extends the KF to handle nonlinear systems. It applies the KF framework by linearizing the nonlinear functions around the current state estimate using a first-order Taylor expansion.
Method
The process model $f$ and observation model $h$ are linearized using Jacobian matrices (partial derivatives). $$ F_t = \frac{\partial f}{\partial x} \bigg|_{x=\mu_{t-1}} $$ $$ H_t = \frac{\partial h}{\partial x} \bigg|_{x=\hat{\mu}_t} $$ These $F_t$ and $H_t$ are used in place of $A$ and $H$ in the linear Kalman filter equations, though the prediction step uses the nonlinear functions directly (see the sketch after the note below).
- Limitation: For highly nonlinear systems, the linearization error becomes large, potentially degrading estimation accuracy or causing divergence.
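A minimal sketch of one EKF cycle, assuming the user supplies the nonlinear models `f` and `h` along with callables `F_jac` and `H_jac` that evaluate their Jacobians (all four names are hypothetical placeholders):

```python
import numpy as np

def ekf_step(mu, Sigma, u, z, f, h, F_jac, H_jac, Q, R):
    """One EKF cycle with user-supplied models and Jacobians."""
    # Predict with the nonlinear process model; linearize for the covariance.
    F = F_jac(mu, u)                        # F_t = df/dx at x = mu_{t-1}
    mu_pred = f(mu, u)                      # nonlinear prediction
    Sigma_pred = F @ Sigma @ F.T + Q

    # Update: linearize the observation model around the prediction.
    H = H_jac(mu_pred)                      # H_t = dh/dx at x = mu_pred
    S = H @ Sigma_pred @ H.T + R
    K = Sigma_pred @ H.T @ np.linalg.inv(S)
    mu_new = mu_pred + K @ (z - h(mu_pred))
    Sigma_new = (np.eye(len(mu)) - K @ H) @ Sigma_pred
    return mu_new, Sigma_new
```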
Unscented Kalman Filter (UKF)
Overview
Like the EKF, the UKF handles nonlinear systems, but instead of directly linearizing the functions, it uses the Unscented Transform to propagate the probability distribution of the state.
Method
A small set of representative points (sigma points) that captures the current state distribution is sampled and passed through the nonlinear function. The mean and covariance are then recomputed from the transformed points as weighted averages, achieving higher accuracy than the EKF (a sketch of this transform appears after the note below).
- Advantage: No Jacobian computation is required, and it is more robust than the EKF for highly nonlinear systems.
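The core of the UKF is the Unscented Transform itself. Below is a sketch of sigma-point generation and propagation using the common Van der Merwe scaling scheme; the `alpha`, `beta`, `kappa` defaults are conventional choices, not values from this article:

```python
import numpy as np

def sigma_points(mu, Sigma, alpha=1e-3, beta=2.0, kappa=0.0):
    """Generate 2n+1 sigma points and their weights."""
    n = len(mu)
    lam = alpha**2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * Sigma)    # matrix square root
    pts = np.vstack([mu, mu + L.T, mu - L.T])    # rows are sigma points
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1 - alpha**2 + beta)
    return pts, wm, wc

def unscented_transform(pts, wm, wc, f, noise_cov):
    """Propagate sigma points through nonlinear f; recover mean and covariance."""
    Y = np.array([f(p) for p in pts])
    mean = wm @ Y
    diff = Y - mean
    cov = (wc[:, None] * diff).T @ diff + noise_cov
    return mean, cov
```

A full UKF step applies `unscented_transform` twice per cycle: once with the process model for prediction, and once with the observation model for the update.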
Particle Filter (PF)
Overview
The PF is a more general filtering method for nonlinear and non-Gaussian state-space models, based on Monte Carlo sampling.
Method
The probability distribution of the state is approximated by a set of many sample points called particles. Each particle represents a “hypothesis” about the state, and each is assigned a weight based on its likelihood.
The PF mainly consists of three steps: “prediction,” “update,” and “resampling,” as sketched in the code after this list.
- Prediction: All particles are propagated forward in time according to the process model.
- Update: When an observation is received, the likelihood of each particle is computed and the weights are updated.
- Resampling: Particles are resampled according to their weights. Low-likelihood (implausible) particles are eliminated, while high-likelihood particles are duplicated, enabling efficient estimation.
- Advantage: Highly versatile, as it can represent complex probability distributions beyond Gaussian.
- Limitation: As the state dimension increases, the number of particles needed to adequately represent the distribution grows exponentially, a problem known as the “curse of dimensionality.”
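A rough sketch of one PF cycle implementing the three steps above, assuming hypothetical user-supplied callables `propagate` (samples from the process model) and `likelihood` (evaluates $p(z_t \mid x_t)$ for each particle):

```python
import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, weights, u, z, propagate, likelihood):
    """One particle filter cycle over an (N, state_dim) array of particles."""
    # Prediction: move every particle forward through the process model.
    particles = propagate(particles, u)

    # Update: reweight each hypothesis by how well it explains z.
    weights = weights * likelihood(z, particles)
    weights /= weights.sum()

    # Resampling: duplicate likely particles, drop unlikely ones.
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights)
    particles = particles[idx]
    weights = np.full(n, 1.0 / n)
    return particles, weights
```

This sketch uses simple multinomial resampling at every step; practical implementations often resample only when the effective sample size drops below a threshold, and prefer systematic resampling for lower variance.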