37  Discussion 09: Particle Filtering and Naive Bayes (From Spring 2025)

37.1 Particle Filtering

Let’s use Particle Filtering to estimate the distribution \(P(W_2 \mid O_1 = a, O_2 = b)\). Here’s the HMM again; in the graphical model, \(O_1\) and \(O_2\) are shaded because they are observed.

\(W_1\)   \(P(W_1)\)
0         0.3
1         0.7

\(W_t\)   \(W_{t+1}\)   \(P(W_{t+1} \mid W_t)\)
0         0             0.4
0         1             0.6
1         0             0.8
1         1             0.2

\(W_t\)   \(O_t\)   \(P(O_t \mid W_t)\)
0         a         0.9
0         b         0.1
1         a         0.5
1         b         0.5

We start with two particles representing our distribution for \(W_1\):

  • \(P_1: W_1 = 0\)
  • \(P_2: W_1 = 1\)

Use the following random numbers to run particle filtering:

[0.22, 0.05, 0.33, 0.20, 0.84, 0.54, 0.79, 0.66, 0.14, 0.96]
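The answers below write \(\text{sample}(\text{dist}, u)\) for drawing a value from a discrete distribution using one of these random numbers. Here is a minimal sketch of that helper in Python, assuming the convention that values are walked in increasing order and the first value whose cumulative probability exceeds \(u\) is returned (an assumption, but it matches every draw below):

```python
def sample(dist, u):
    """Draw a value from a discrete distribution {value: probability}
    using a precomputed random number u in [0, 1). Assumed convention:
    walk the values in increasing order and return the first one whose
    cumulative probability exceeds u."""
    cumulative = 0.0
    for value, prob in sorted(dist.items()):
        cumulative += prob
        if u < cumulative:
            return value
    return value  # guard against floating-point round-off when u is near 1

# Example: the time update in part (c). P(W_{t+1} | W_t = 0) puts 0.4 on
# value 0 and 0.6 on value 1; since 0.33 < 0.4, the draw returns 0.
print(sample({0: 0.4, 1: 0.6}, 0.33))  # -> 0
```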

37.1.1 (a)

Observe: Compute the weights of the two particles after evidence \(O_1 = a\).

Answer

\[ w(P_1) = P(O_1 = a \mid W_1 = 0) = 0.9 \]

\[ w(P_2) = P(O_1 = a \mid W_1 = 1) = 0.5 \]

37.1.2 (b)

Resample: Using the random numbers, resample \(P_1\) and \(P_2\) based on the weights.

Answer

We resample from the normalized weight distribution: \(P(W_1 = 0) = \frac{0.9}{0.9 + 0.5} = \frac{9}{14} \approx 0.64\) and \(P(W_1 = 1) = \frac{0.5}{1.4} = \frac{5}{14} \approx 0.36\). Using the first two random numbers, each of which falls below 0.64:

\[ P_1 = \text{sample}(\text{weights}, 0.22) = 0 \]

\[ P_2 = \text{sample}(\text{weights}, 0.05) = 0 \]

37.1.3 (c)

Predict: Sample new values for \(P_1\) and \(P_2\) by applying the time update.

Answer

Both particles currently have value 0, so each new value is drawn from \(P(W_{t+1} \mid W_t = 0)\), which places probability 0.4 on 0. Both random numbers fall below 0.4:

\[ P_1 = \text{sample}(P(W_{t+1} \mid W_t = 0), 0.33) = 0 \]

\[ P_2 = \text{sample}(P(W_{t+1} \mid W_t = 0), 0.20) = 0 \]

37.1.4 (d)

Update: Compute the weights of the two particles after evidence \(O_2 = b\).

Answer

\[ w(P_1) = P(O_2 = b \mid W_2 = 0) = 0.1 \]

\[ w(P_2) = P(O_2 = b \mid W_2 = 0) = 0.1 \]

37.1.5 (e)

Resample: Using the random numbers, resample \(P_1\) and \(P_2\) based on the weights.

Answer

Because both particles have \(W_2 = 0\), all of the probability mass is on 0, so resampling leaves both particles unchanged (any random number maps to 0):

\[ P_1 = 0 \]

\[ P_2 = 0 \]

37.1.6 (f)

What is our estimated distribution for \(P(W_2 \mid O_1 = a, O_2 = b)\)?

Answer

\[ P(W_2 = 0 \mid O_1 = a, O_2 = b) = \frac{2}{2} = 1 \]

\[ P(W_2 = 1 \mid O_1 = a, O_2 = b) = \frac{0}{2} = 0 \]
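Putting the whole computation together, here is a sketch of the full run in Python. It assumes the particles are updated in the order \(P_1, P_2\) and that the random numbers are consumed left to right (both assumptions, though they match the draws above):

```python
# Particle filter run for the HMM above. Assumes particles are updated in
# the order P_1, P_2 and random numbers are consumed left to right.
TRANS = {0: {0: 0.4, 1: 0.6}, 1: {0: 0.8, 1: 0.2}}         # P(W_{t+1} | W_t)
EMIT = {0: {'a': 0.9, 'b': 0.1}, 1: {'a': 0.5, 'b': 0.5}}  # P(O_t | W_t)

def sample(dist, u):  # same helper as at the start of the section
    cumulative = 0.0
    for value, prob in sorted(dist.items()):
        cumulative += prob
        if u < cumulative:
            return value
    return value

randoms = iter([0.22, 0.05, 0.33, 0.20, 0.84, 0.54, 0.79, 0.66, 0.14, 0.96])
particles = [0, 1]  # P_1: W_1 = 0, P_2: W_1 = 1

for t, obs in enumerate(['a', 'b']):
    if t > 0:
        # Predict: push each particle through the transition model.
        particles = [sample(TRANS[p], next(randoms)) for p in particles]
    # Observe: weight each particle by the emission probability of obs.
    weights = [EMIT[p][obs] for p in particles]
    total = sum(weights)
    # Resample: draw new particles from the normalized weight distribution.
    dist = {}
    for p, w in zip(particles, weights):
        dist[p] = dist.get(p, 0.0) + w / total
    particles = [sample(dist, next(randoms)) for _ in particles]

# Estimate P(W_2 | O_1 = a, O_2 = b) by counting particles.
print({w: particles.count(w) / len(particles) for w in (0, 1)})  # {0: 1.0, 1: 0.0}
```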

37.2 Naive Bayes

In this question, we will train a Naive Bayes classifier to predict class labels \(Y\) as a function of input features \(A\) and \(B\). \(Y\), \(A\), and \(B\) are all binary variables, with domains 0 and 1. We are given 10 training points from which we will estimate our distribution.

\(A\)   1  1  1  1  0  1  0  1  1  1
\(B\)   1  0  0  1  1  1  1  0  1  1
\(Y\)   1  1  0  0  0  1  1  0  0  0


37.2.1 (a)

What are the maximum likelihood estimates for the tables \(P(Y)\), \(P(A|Y)\), and \(P(B|Y)\)?

\(Y\)   \(P(Y)\)
0
1

\(A\)   \(Y\)   \(P(A|Y)\)
0       0
1       0
0       1
1       1

\(B\)   \(Y\)   \(P(B|Y)\)
0       0
1       0
0       1
1       1
Answer

\(Y\)   \(P(Y)\)
0       3/5
1       2/5

\(A\)   \(Y\)   \(P(A|Y)\)
0       0       1/6
1       0       5/6
0       1       1/4
1       1       3/4

\(B\)   \(Y\)   \(P(B|Y)\)
0       0       1/3
1       0       2/3
0       1       1/4
1       1       3/4
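These tables can be checked by counting directly from the training data. A short sketch, where the `data` list transcribes the ten \((A, B, Y)\) columns above:

```python
from fractions import Fraction

# The ten training points, transcribed column by column as (A, B, Y).
data = [(1, 1, 1), (1, 0, 1), (1, 0, 0), (1, 1, 0), (0, 1, 0),
        (1, 1, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 0)]

# P(Y): fraction of training points with each label.
for y in (0, 1):
    n_y = sum(1 for a, b, lbl in data if lbl == y)
    print(f"P(Y={y}) = {Fraction(n_y, len(data))}")

# P(A|Y) and P(B|Y): counts within each label, divided by the label count.
for idx, name in ((0, 'A'), (1, 'B')):
    for y in (0, 1):
        n_y = sum(1 for point in data if point[2] == y)
        for v in (0, 1):
            n_v = sum(1 for point in data if point[2] == y and point[idx] == v)
            print(f"P({name}={v}|Y={y}) = {Fraction(n_v, n_y)}")
```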

37.2.2 (b)

Consider a new data point (\(A = 1\), \(B = 1\)). What label would this classifier assign to this sample?

Answer

\[ P(Y=0, A=1, B=1) = P(Y=0)P(A=1|Y=0)P(B=1|Y=0) = (3/5)(5/6)(2/3) = 1/3 \]

\[ P(Y=1, A=1, B=1) = P(Y=1)P(A=1|Y=1)P(B=1|Y=1) = (2/5)(3/4)(3/4) = 9/40 \]

Since \(1/3 \approx 0.333 > 9/40 = 0.225\), our classifier will predict label 0.
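The same comparison in code, a sketch that hard-codes the MLE tables from part (a) as exact fractions:

```python
from fractions import Fraction as F

P_Y = {0: F(3, 5), 1: F(2, 5)}
P_A = {(0, 0): F(1, 6), (1, 0): F(5, 6), (0, 1): F(1, 4), (1, 1): F(3, 4)}  # keyed (a, y)
P_B = {(0, 0): F(1, 3), (1, 0): F(2, 3), (0, 1): F(1, 4), (1, 1): F(3, 4)}  # keyed (b, y)

def joint(y, a, b):
    """Naive Bayes score: P(Y=y) * P(A=a|Y=y) * P(B=b|Y=y)."""
    return P_Y[y] * P_A[(a, y)] * P_B[(b, y)]

scores = {y: joint(y, a=1, b=1) for y in (0, 1)}
print(scores)                        # {0: Fraction(1, 3), 1: Fraction(9, 40)}
print(max(scores, key=scores.get))   # 0, since 1/3 > 9/40
```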

37.2.3 (c)

Let’s use Laplace Smoothing to smooth out our distribution. Compute the new distribution for \(P(A|Y)\) given Laplace Smoothing with \(k=2\).

\(A\)   \(Y\)   \(P(A|Y)\)
0       0
1       0
0       1
1       1
Answer

With Laplace smoothing, every count is incremented by \(k\), and each denominator grows by \(k\) times the number of values \(A\) can take: \(P(A = a \mid Y = y) = \frac{\text{count}(A = a, Y = y) + k}{\text{count}(Y = y) + k\,|A|}\). For example, \(P(A = 0 \mid Y = 0) = \frac{1 + 2}{6 + 2 \cdot 2} = \frac{3}{10}\).

\(A\)   \(Y\)   \(P(A|Y)\)
0       0       3/10
1       0       7/10
0       1       3/8
1       1       5/8
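A quick check of these values in code, a sketch that reuses the counting approach from part (a):

```python
from fractions import Fraction

data = [(1, 1, 1), (1, 0, 1), (1, 0, 0), (1, 1, 0), (0, 1, 0),
        (1, 1, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 0)]
k = 2            # Laplace smoothing strength
num_values = 2   # |A| = 2, since A is binary

for y in (0, 1):
    n_y = sum(1 for a, b, lbl in data if lbl == y)
    for v in (0, 1):
        n_v = sum(1 for a, b, lbl in data if lbl == y and a == v)
        # Smoothed estimate: (count + k) / (label count + k * |A|)
        print(f"P(A={v}|Y={y}) = {Fraction(n_v + k, n_y + k * num_values)}")
```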