Class Meeting 03: Robot State Estimation

Today's Class Meeting

Gaining familiarity with various approaches to robot state estimation (Bayes filters, Kalman filters, particle filters) - here's a link to today's slides
Completing a class exercise on state estimation

Robot State Estimation

Today, we're covering topics related to how a robot estimates its state ($x_t$) using measurement data ($z_t$) and the knowledge of the actions the robot takes in the environment ($u_t$). This problem can be represented as a hidden Markov model or dynamic Bayes network as depicted in the following diagram.

Bayes Filter Algorithm

The following is the Bayes filter algorithm as we discussed during class:

$\textrm{Bayes_Filter}( bel(x_{t-1}), u_t, z_t):$
$\qquad \textrm{for} \: \textrm{all} \: x_t \: \textrm{do} $
$\qquad \qquad \overline{bel}(x_t) = \int p(x_t | u_t, x_{t-1}) \: bel(x_{t-1}) \: dx_{t-1}$
$\qquad \qquad bel(x_t) = \eta \: p(z_t | x_t) \: \overline{bel}(x_t) $
$ \qquad \textrm{endfor}$
$ \qquad \textrm{return} \: bel(x_t) $

Some useful tips/notes:

$\eta$ is a normalizer that ensures the belief integrates to one: $\int bel(x_t) \: dx_t = 1$.
In the prediction step (the control update), $\overline{bel}(x_t) = \int p(x_t | u_t, x_{t-1}) \: bel(x_{t-1}) \: dx_{t-1}$, the integral becomes a finite sum when the state space is finite: $\overline{bel}(x_t) = \sum_{x_{t-1}} p(x_t | u_t, x_{t-1}) \: bel(x_{t-1}) $.
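For a finite state space, the algorithm above can be sketched in a few lines of Python. This is only a minimal illustration, not code from the course: the function name `bayes_filter_step`, the dictionary layout, the states `"a"` and `"b"`, and all of the numbers in the usage example are assumptions made up for this sketch.

```python
def bayes_filter_step(belief, transition, likelihood):
    """One step of a discrete Bayes filter (illustrative sketch).

    belief:     dict, previous state -> bel(x_{t-1})
    transition: dict, (next_state, prev_state) -> p(x_t | u_t, x_{t-1})
                for the action u_t that was executed
    likelihood: dict, state -> p(z_t | x_t) for the measurement z_t received
    """
    states = list(likelihood)

    # Prediction: bel_bar(x_t) = sum over x_{t-1} of p(x_t | u_t, x_{t-1}) * bel(x_{t-1})
    predicted = {
        x: sum(transition[(x, x_prev)] * belief[x_prev] for x_prev in belief)
        for x in states
    }

    # Measurement update, before normalization: p(z_t | x_t) * bel_bar(x_t)
    unnormalized = {x: likelihood[x] * predicted[x] for x in states}

    # eta rescales the result so that the belief sums to 1
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("measurement has zero probability under the predicted belief")
    eta = 1.0 / total
    return {x: eta * p for x, p in unnormalized.items()}


# Hypothetical two-state example (all values invented for illustration)
belief = {"a": 0.5, "b": 0.5}
transition = {
    ("a", "a"): 0.9, ("b", "a"): 0.1,  # from previous state "a"
    ("a", "b"): 0.3, ("b", "b"): 0.7,  # from previous state "b"
}
likelihood = {"a": 0.5, "b": 0.25}

posterior = bayes_filter_step(belief, transition, likelihood)
print(posterior)
```

With these made-up numbers, the predicted belief is 0.6 for `"a"` and 0.4 for `"b"`, and the posterior comes out to roughly 0.75 and 0.25 after normalization.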

Class Exercise: Estimating the State of a Door (Bayes Filter Algorithm)

We'll first work through the setup and the belief calculation for $t = 1$ together as a class.

We'll practice applying the Bayes filter to a situation in which a robot estimates the state of a door using a forward-facing camera. We assume that the door can be in one of two states: 1) open or 2) closed. We also assume that the robot initially doesn't know which state the door is in, so we assign equal prior probability to the two states: $$bel(X_0 = \textrm{is_open}) = 0.5$$ $$bel(X_0 = \textrm{is_closed}) = 0.5$$

Let's also assume that the robot's camera sensor is noisy and can be characterized by the following conditional probabilities: $$p(Z_t = \textrm{sense_open} \: | \: X_t = \textrm{is_open}) = 0.6$$ $$p(Z_t = \textrm{sense_closed} \: | \: X_t = \textrm{is_open}) = 0.4$$ $$p(Z_t = \textrm{sense_open} \: | \: X_t = \textrm{is_closed}) = 0.2$$ $$p(Z_t = \textrm{sense_closed} \: | \: X_t = \textrm{is_closed}) = 0.8$$ These probabilities tell us that the camera is more reliable when the door is closed (error probability 0.2) than when the door is open (error probability 0.4).

Our final set of assumptions concerns the robot's ability to influence the environment. Let's assume that the robot can use its arm to push the door, and that pushing a closed door opens it with probability 0.8 (a door that is already open stays open): $$p(X_t = \textrm{is_open} \: | \: U_t = \textrm{push}, X_{t-1} = \textrm{is_open}) = 1$$ $$p(X_t = \textrm{is_closed} \: | \: U_t = \textrm{push}, X_{t-1} = \textrm{is_open}) = 0$$ $$p(X_t = \textrm{is_open} \: | \: U_t = \textrm{push}, X_{t-1} = \textrm{is_closed}) = 0.8$$ $$p(X_t = \textrm{is_closed} \: | \: U_t = \textrm{push}, X_{t-1} = \textrm{is_closed}) = 0.2$$ If the robot does nothing, the state of the door does not change: $$p(X_t = \textrm{is_open} \: | \: U_t = \textrm{do_nothing}, X_{t-1} = \textrm{is_open}) = 1$$ $$p(X_t = \textrm{is_closed} \: | \: U_t = \textrm{do_nothing}, X_{t-1} = \textrm{is_open}) = 0$$ $$p(X_t = \textrm{is_open} \: | \: U_t = \textrm{do_nothing}, X_{t-1} = \textrm{is_closed}) = 0$$ $$p(X_t = \textrm{is_closed} \: | \: U_t = \textrm{do_nothing}, X_{t-1} = \textrm{is_closed}) = 1$$
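If you'd like to check your hand calculations with code, one possible way to write down the sensor and action models above is as nested Python dictionaries. The variable names (`STATES`, `SENSOR`, `MOTION`) and the helper `is_distribution` are hypothetical choices for this sketch; only the probabilities come from the exercise. The check at the end confirms that each conditional distribution sums to 1.

```python
STATES = ("is_open", "is_closed")

# Sensor model p(z_t | x_t), indexed as SENSOR[z][x]
SENSOR = {
    "sense_open":   {"is_open": 0.6, "is_closed": 0.2},
    "sense_closed": {"is_open": 0.4, "is_closed": 0.8},
}

# Action model p(x_t | u_t, x_{t-1}), indexed as MOTION[u][x_prev][x_next]
MOTION = {
    "push": {
        "is_open":   {"is_open": 1.0, "is_closed": 0.0},
        "is_closed": {"is_open": 0.8, "is_closed": 0.2},
    },
    "do_nothing": {
        "is_open":   {"is_open": 1.0, "is_closed": 0.0},
        "is_closed": {"is_open": 0.0, "is_closed": 1.0},
    },
}


def is_distribution(probs, tol=1e-9):
    """True if the values are non-negative and sum to 1 (within tol)."""
    return all(p >= 0 for p in probs) and abs(sum(probs) - 1.0) < tol


# For each true door state, the probabilities of the two readings must sum to 1
sensor_ok = all(
    is_distribution([SENSOR[z][x] for z in SENSOR]) for x in STATES
)

# For each action and previous state, the next-state probabilities must sum to 1
motion_ok = all(
    is_distribution(list(MOTION[u][x_prev].values()))
    for u in MOTION
    for x_prev in STATES
)

print(sensor_ok, motion_ok)
```

Indexing the action model by action first means that choosing $u_t$ picks out a single transition table, which mirrors how the prediction step conditions on the executed action.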

Belief calculation for $t = 1$

Now, we are going to assume that the robot executes the action $u_1 = \textrm{do_nothing}$ and receives the measurement $z_1 = \textrm{sense_open}$ from its camera. We'll walk through this time step together, and you will then calculate the belief for the next time step ($t = 2$) on your own.

$\overline{bel}(x_1) = \int p(x_1 | u_1, x_{0}) \: bel(x_{0}) \: dx_{0}$
$\qquad \quad = \sum_{x_0} p(x_1 | u_1, x_{0}) \: bel(x_{0}) $
$\qquad \quad = p(x_1 | U_1 = \textrm{do_nothing}, X_0 = \textrm{is_open}) \: bel(X_0 = \textrm{is_open}) + $ $\qquad \qquad \: p(x_1 | U_1 = \textrm{do_nothing}, X_0 = \textrm{is_closed}) \: bel(X_0 = \textrm{is_closed}) $

Now, we can calculate $\overline{bel}(x_1)$ for both $X_1 = \textrm{is_open}$ and $X_1 = \textrm{is_closed}$.

$\overline{bel}(X_1 = \textrm{is_open}) = p(X_1 = \textrm{is_open} \: | \: U_1 = \textrm{do_nothing}, X_0 = \textrm{is_open}) \: bel(X_0 = \textrm{is_open}) + $
$\qquad \qquad \qquad \qquad \quad \: p(X_1 = \textrm{is_open} \: | \: U_1 = \textrm{do_nothing}, X_0 = \textrm{is_closed}) \: bel(X_0 = \textrm{is_closed}) $
$\qquad \qquad \qquad \qquad = 1 \cdot 0.5 + 0 \cdot 0.5 = 0.5 $

$\overline{bel}(X_1 = \textrm{is_closed}) = p(X_1 = \textrm{is_closed} \: | \: U_1 = \textrm{do_nothing}, X_0 = \textrm{is_open}) \: bel(X_0 = \textrm{is_open}) + $
$\qquad \qquad \qquad \qquad \quad \: \: \: p(X_1 = \textrm{is_closed} \: | \: U_1 = \textrm{do_nothing}, X_0 = \textrm{is_closed}) \: bel(X_0 = \textrm{is_closed}) $
$\qquad \qquad \qquad \qquad \: \: = 0 \cdot 0.5 + 1 \cdot 0.5 = 0.5$
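The two sums above can be reproduced numerically. The snippet below is a small sketch with hypothetical variable names (`prior`, `do_nothing`, `predicted`); the probabilities are the ones given in the exercise.

```python
# Prior belief bel(x_0)
prior = {"is_open": 0.5, "is_closed": 0.5}

# p(x_1 | u_1 = do_nothing, x_0), indexed as do_nothing[x_0][x_1]
do_nothing = {
    "is_open":   {"is_open": 1.0, "is_closed": 0.0},
    "is_closed": {"is_open": 0.0, "is_closed": 1.0},
}

# Prediction step: bel_bar(x_1) = sum over x_0 of p(x_1 | u_1, x_0) * bel(x_0)
predicted = {
    x1: sum(do_nothing[x0][x1] * prior[x0] for x0 in prior)
    for x1 in prior
}

print(predicted)  # {'is_open': 0.5, 'is_closed': 0.5}
```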

It should not surprise us that $bel(X_0) = \overline{bel}(X_1) $, since the robot action $\textrm{do_nothing}$ does not influence the state of the world. Once we do the measurement update, however, our belief will change. Our belief update takes the form: $$bel(x_1) = \eta \: p(Z_1 = \textrm{sense_open} \: | \: x_1) \: \overline{bel}(x_1)$$

We have two possible resulting states, $X_1 = \textrm{is_open}$ and $X_1 = \textrm{is_closed}$:

$bel(X_1 = \textrm{is_open}) = \eta \: p(Z_1 = \textrm{sense_open} \: | \: X_1 = \textrm{is_open}) \: \overline{bel}(X_1 = \textrm{is_open}) $
$\qquad \qquad \qquad \quad \: \: \: \: = \eta \: 0.6 \cdot 0.5 = \eta \cdot 0.3 $

$bel(X_1 = \textrm{is_closed}) = \eta \: p(Z_1 = \textrm{sense_open} \: | \: X_1 = \textrm{is_closed}) \: \overline{bel}(X_1 = \textrm{is_closed}) $
$\qquad \qquad \qquad \qquad \: \: = \eta \: 0.2 \cdot 0.5 = \eta \cdot 0.1 $

We can now calculate the normalizer ($\eta$) so that $\sum_{x_1} bel(x_1) = 1$:

$ \eta = (0.3 + 0.1)^{-1} = 2.5$

Multiplying each unnormalized value by $\eta$, our belief after time step 1 is:

$bel(X_1 = \textrm{is_open}) = 2.5 \cdot 0.3 = 0.75 $
$bel(X_1 = \textrm{is_closed}) = 2.5 \cdot 0.1 = 0.25 $
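As a check on the arithmetic for the measurement update at $t = 1$, here is a short sketch that starts from the predicted belief and applies the sensor model for $z_1 = \textrm{sense_open}$. The variable names are hypothetical; the probabilities are the ones from the exercise. Values are rounded before printing to hide floating-point noise.

```python
# Predicted belief bel_bar(x_1) from the prediction step
predicted = {"is_open": 0.5, "is_closed": 0.5}

# Sensor model p(z_1 = sense_open | x_1)
sense_open = {"is_open": 0.6, "is_closed": 0.2}

# Unnormalized measurement update: p(z_1 | x_1) * bel_bar(x_1)
unnormalized = {x: sense_open[x] * predicted[x] for x in predicted}

# eta is the reciprocal of the sum of the unnormalized values
eta = 1.0 / sum(unnormalized.values())

# Normalized belief bel(x_1)
posterior = {x: eta * p for x, p in unnormalized.items()}

print(round(eta, 6))
print({x: round(p, 6) for x, p in posterior.items()})
```

The same pattern (predict, weight by the sensor model, normalize) applies at every later time step, with the previous posterior taking the place of the prior.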

Belief calculation for $t = 2$

For the belief calculation for $t = 2$, you'll need to work through and write out the math on your own. However, feel free to talk and collaborate with one another in groups of 2-3. You will likely find it helpful to use scratch paper or an equivalent.

For time step 2, we are going to assume that the robot executes the action $u_2 = \textrm{push}$ and receives the measurement $z_2 = \textrm{sense_open}$ from its camera. Your job is to calculate $\overline{bel}(x_2)$ and $bel(x_2)$.

Solution for belief calculation for $t = 2$

The solution will be posted here after Class Meeting 03 Homework is due.

Class Meeting 3 Homework

For your Class Meeting 3 Homework, submit the following via Canvas/Gradescope by 2:00pm on Wednesday, October 8th:

Your calculation of $\overline{bel}(x_2)$ and $bel(x_2)$, including all scratch work leading up to your final answer.

Acknowledgments

The content and exercises for today's class were informed by Probabilistic Robotics by Sebastian Thrun, Wolfram Burgard, and Dieter Fox.