Question

Consider a dataset for a binary classification problem with class labels [C_1, C_0]. The features are F_1, F_2 and F_3, and each feature takes one of two values, as shown in the table below. Apply the Naive Bayes classifier, computing the required probabilities, to classify the example: <F_1 = x_1, F_2 = y_2, F_3 = z_1>.

    \[ \begin{tabular}{|c|c|c|c|c|} \hline
    Sl no. & $F_1$ & $F_2$ & $F_3$ & Class \\ \hline
    1 & $x_1$ & $y_2$ & $z_1$ & $C_1$ \\ \hline
    2 & $x_2$ & $y_1$ & $z_2$ & $C_0$ \\ \hline
    3 & $x_1$ & $y_1$ & $z_2$ & $C_1$ \\ \hline
    4 & $x_2$ & $y_2$ & $z_1$ & $C_0$ \\ \hline
    5 & $x_2$ & $y_1$ & $z_1$ & $C_1$ \\ \hline
    6 & $x_1$ & $y_2$ & $z_1$ & $C_0$ \\ \hline
    7 & $x_1$ & $y_1$ & $z_2$ & $C_1$ \\ \hline
    8 & $x_1$ & $y_2$ & $z_2$ & $C_0$ \\ \hline
    \end{tabular} \]

Solution

==================================================================================================

Each class occurs in 4 of the 8 examples in the table, so the class priors are:

P(C_0) = 4/8 = 0.5, P(C_1) = 4/8 = 0.5
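
The priors can be checked by counting the class labels in the table. Here is a minimal Python sketch, assuming the Class column has been transcribed by hand into a list; the names `labels`, `counts` and `priors` are illustrative and not part of the original solution.

```python
from collections import Counter
from fractions import Fraction

# Class column of the training table, rows 1 to 8 (transcribed by hand).
labels = ["C_1", "C_0", "C_1", "C_0", "C_1", "C_0", "C_1", "C_0"]

# Prior of a class = (rows with that class) / (total rows).
counts = Counter(labels)
priors = {c: Fraction(n, len(labels)) for c, n in counts.items()}

for c in sorted(priors):
    print(f"P({c}) = {priors[c]} = {float(priors[c])}")
```

Using `Fraction` keeps the values exact, which makes them easy to compare with the hand-written fractions.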

Let’s find the conditional probabilities of F_i given that the class is C_0.

P(F_1 = x_1 | Y = C_0) = 2/4 = 0.5
P(F_2 = y_2 | Y = C_0) = 3/4 = 0.75
P(F_3 = z_1 | Y = C_0) = 2/4 = 0.5

Let’s find the conditional probabilities of F_i given that the class is C_1.

P(F_1 = x_1 | Y = C_1) = 3/4 = 0.75
P(F_2 = y_2 | Y = C_1) = 1/4 = 0.25
P(F_3 = z_1 | Y = C_1) = 2/4 = 0.5

According to the Naive Bayes classification rule:

    \begin{align*}  \boxed{Y \leftarrow \arg\max_{y_k} \ P(Y = y_k) \ \prod_i \ P(X_i | Y = y_k)}  \end{align*}

Here the X_i are the features F_1, F_2 and F_3. To classify the given example, we evaluate this product for each class and pick the class with the larger value.

Score of <F_1 = x_1, F_2 = y_2, F_3 = z_1> for each class:

    \begin{align*}
    \text{Class } 0 &= P(F_1 = x_1 | Y = C_0) \cdot P(F_2 = y_2 | Y = C_0) \cdot P(F_3 = z_1 | Y = C_0) \cdot P(Y = C_0) \\
    &= 0.5 \times 0.75 \times 0.5 \times 0.5 \approx 0.094
    \end{align*}

    \begin{align*}
    \text{Class } 1 &= P(F_1 = x_1 | Y = C_1) \cdot P(F_2 = y_2 | Y = C_1) \cdot P(F_3 = z_1 | Y = C_1) \cdot P(Y = C_1) \\
    &= 0.75 \times 0.25 \times 0.5 \times 0.5 \approx 0.047
    \end{align*}

Since 0.094 > 0.047, the Naive Bayes classifier assigns the given sample to class C_0. Normalizing the two scores gives a posterior of about 0.67 for C_0 and 0.33 for C_1.
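
The hand computation can be cross-checked by recounting directly from the table. The sketch below is one way to do that in plain Python. It assumes the table has been transcribed by hand into `rows`, and it uses raw relative frequencies with no smoothing, as in the calculation above. The names `rows`, `query`, `score`, `posteriors` and `prediction` are illustrative, not from the original solution.

```python
from fractions import Fraction

# Training table transcribed by hand: (F_1, F_2, F_3, Class).
rows = [
    ("x_1", "y_2", "z_1", "C_1"),
    ("x_2", "y_1", "z_2", "C_0"),
    ("x_1", "y_1", "z_2", "C_1"),
    ("x_2", "y_2", "z_1", "C_0"),
    ("x_2", "y_1", "z_1", "C_1"),
    ("x_1", "y_2", "z_1", "C_0"),
    ("x_1", "y_1", "z_2", "C_1"),
    ("x_1", "y_2", "z_2", "C_0"),
]

# The example to classify: <F_1 = x_1, F_2 = y_2, F_3 = z_1>.
query = ("x_1", "y_2", "z_1")


def score(cls):
    """Return P(cls) * prod_i P(F_i = query[i] | cls) from raw counts."""
    in_class = [r for r in rows if r[3] == cls]
    result = Fraction(len(in_class), len(rows))  # prior P(cls)
    for i, value in enumerate(query):
        matches = sum(1 for r in in_class if r[i] == value)
        result *= Fraction(matches, len(in_class))  # P(F_i = value | cls)
    return result


scores = {cls: score(cls) for cls in ("C_0", "C_1")}

# Normalize the scores so they sum to 1 (posterior under the naive assumption).
total = sum(scores.values())
posteriors = {cls: s / total for cls, s in scores.items()}

# Naive Bayes rule: pick the class with the largest score.
prediction = max(scores, key=scores.get)

for cls in scores:
    print(cls, "score:", float(scores[cls]), "posterior:", float(posteriors[cls]))
print("predicted class:", prediction)
```

Running it prints the unnormalized score and the normalized posterior for each class, followed by the predicted class. Because the counts come straight from `rows`, any transcription change in the table is reflected in the result.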
