Generate Confusion Matrix for Multinomial Regression

Question

Consider a three-class classification problem with a sample set of 15,000 records. On testing, the machine learning model behaves as follows:

The model confuses Class 1 with Class 2 and Class 3. The classification error is 20% with regard to C2 and 10% with regard to C3.
The model classifies Class 2 correctly.
The model confuses Class 3 with Class 2.

Create the confusion matrix for this multi-classifier problem.

Solution

==================================================================================================

The total sample size is N = 15000.

Let’s assume an 80-20 train-test split.

Train set size = 0.8 * N = 0.8 * 15000 = 12000
Test set size = 0.2 * N = 0.2 * 15000 = 3000

The confusion matrix is built from the model's predictions on unseen data, so only the test set needs to be considered.

Let us assume a balanced dataset, i.e. all classes have the same number of samples.

Thus, the number of test records belonging to each of C1, C2 and C3 = $\frac{3000}{3}$ = 1000.
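The arithmetic so far can be reproduced with a short sketch. The variable names are illustrative, and the 80-20 split and the balanced classes are the assumptions made above, not conditions given in the question.

```python
# Assumed setup from the worked solution: 15,000 records, 80-20 split, 3 balanced classes.
N = 15_000
TEST_PCT = 20       # assumed test share, in percent
NUM_CLASSES = 3

test_size = N * TEST_PCT // 100        # records held out for testing
train_size = N - test_size             # records used for training
per_class = test_size // NUM_CLASSES   # test records per class (balanced assumption)

print(train_size, test_size, per_class)  # 12000 3000 1000
```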

We know the below:

The model confuses C1 with C2 and C3: 20% of C1 records are classified as C2 and 10% as C3. These records add to the FN of C1 and to the FP of C2 and C3 respectively.
All records of C2 have been classified correctly, i.e. True Positives (TP) of C2 = Actual Positives of C2.
The model confuses C3 with C2. Since no particular number is given, assume a uniform/equal distribution: 50% of C3 records are classified as C2 and the other 50% correctly as C3.
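The share of each class that is classified correctly is whatever remains after its error rates are subtracted from 100%. A minimal sketch of that step, where the dictionary layout and names are illustrative and the 50% figure for C3 is the assumption stated above:

```python
# Error rates in percent: errors_pct[actual][predicted].
# The 50% for C3 -> C2 is an assumption; the question gives no number.
errors_pct = {
    "C1": {"C2": 20, "C3": 10},
    "C2": {},            # C2 is always classified correctly
    "C3": {"C2": 50},
}

# Correctly classified share per class = 100% minus the sum of its error rates.
correct_pct = {actual: 100 - sum(errs.values()) for actual, errs in errors_pct.items()}

print(correct_pct)  # {'C1': 70, 'C2': 100, 'C3': 50}
```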

Theoretical Confusion Matrix

$\begin{tabular}{|c|c|c|c|c|} \hline & \multicolumn{3}{c|}{Actual Class} & \\ \hline Predicted Class & C1 & C2 & C3 & Total' \\ \hline C1 & TP1 & FN2/FP1 & FN3/FP1 & P1' \\ \hline C2 & FN1/FP2 & TP2 & FN3/FP2 & P2' \\ \hline C3 & FN1/FP3 & FN2/FP3 & TP3 & P3' \\ \hline Total & P1 & P2 & P3 & P1+P2+P3\\ \hline \end{tabular}$
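The labels in this layout (rows are predicted classes, columns are actual classes) can be read off mechanically: for class k, TP is the diagonal cell, FP is the rest of row k, and FN is the rest of column k. The helper below is a hypothetical illustration of that reading, shown on a made-up 2x2 matrix rather than on the data from this problem.

```python
def per_class_counts(matrix):
    """Return TP/FP/FN per class.

    matrix[i][j] = number of records of actual class j predicted as class i
    (rows = predicted, columns = actual, as in the table above).
    """
    n = len(matrix)
    counts = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[k]) - tp                        # predicted as k but actually another class
        fn = sum(matrix[i][k] for i in range(n)) - tp   # actually k but predicted as another class
        counts.append({"TP": tp, "FP": fp, "FN": fn})
    return counts


# Made-up example values, only to show the mechanics.
toy = [[3, 1],
       [2, 4]]
print(per_class_counts(toy))
# [{'TP': 3, 'FP': 1, 'FN': 2}, {'TP': 4, 'FP': 2, 'FN': 1}]
```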

Ideal Confusion Matrix with No Errors

$\begin{tabular}{|c|c|c|c|c|} \hline & \multicolumn{3}{c|}{Actual Class} & \\ \hline Predicted Class & C1 & C2 & C3 & Total' \\ \hline C1 & 1000 & 0 & 0 & 1000 \\ \hline C2 & 0 & 1000 & 0 & 1000 \\ \hline C3 & 0 & 0 & 1000 & 1000 \\ \hline Total & 1000 & 1000 & 1000 & 3000\\ \hline \end{tabular}$

Actual Confusion Matrix with Given Errors

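TP1 = 70% of C1 (the remaining 100% - 20% - 10%) = 0.7 * 1000 = 700
C1 predicted as C2 (part of FN1 and of FP2) = 20% of C1 = 0.2 * 1000 = 200
C1 predicted as C3 (part of FN1 and of FP3) = 10% of C1 = 0.1 * 1000 = 100
TP2 = 100% of C2 = 1000
C3 predicted as C2 (FN3 and part of FP2) = 50% of C3 = 0.5 * 1000 = 500
TP3 = 50% of C3 = 0.5 * 1000 = 500

In total, FN1 = 200 + 100 = 300, FP2 = 200 + 500 = 700, FP3 = 100 and FN3 = 500.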

$\begin{tabular}{|c|c|c|c|c|} \hline & \multicolumn{3}{c|}{Actual Class} & \\ \hline Predicted Class & C1 & C2 & C3 & Total' \\ \hline C1 & 700 & 0 & 0 & 700 \\ \hline C2 & 200 & 1000 & 500 & 1700 \\ \hline C3 & 100 & 0 & 500 & 600 \\ \hline Total & 1000 & 1000 & 1000 & 3000\\ \hline \end{tabular}$
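As a cross-check, the same matrix can be rebuilt from the per-class rates and the 1000 test records per class. This is only a sketch under the assumptions used in this solution (balanced test set, 50/50 split for C3); the names are illustrative.

```python
classes = ["C1", "C2", "C3"]
per_class = 1000  # test records per class (balanced assumption)

# Percent of each actual class assigned to each predicted class: rates_pct[actual][predicted].
rates_pct = {
    "C1": {"C1": 70, "C2": 20, "C3": 10},
    "C2": {"C1": 0, "C2": 100, "C3": 0},
    "C3": {"C1": 0, "C2": 50, "C3": 50},  # assumed 50/50 split
}

# Rows = predicted class, columns = actual class, matching the tables in this post.
matrix = [
    [per_class * rates_pct[actual][pred] // 100 for actual in classes]
    for pred in classes
]

row_totals = [sum(row) for row in matrix]  # predicted totals (Total')
col_totals = [sum(matrix[i][j] for i in range(len(classes))) for j in range(len(classes))]

for pred, row, total in zip(classes, matrix, row_totals):
    print(pred, row, total)
print("Total", col_totals, sum(col_totals))
# C1 [700, 0, 0] 700
# C2 [200, 1000, 500] 1700
# C3 [100, 0, 500] 600
# Total [1000, 1000, 1000] 3000
```

Note that some libraries use the transposed convention (rows as actual classes, columns as predicted), so the orientation should be checked before comparing against this table.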

Thus, we have our desired confusion matrix.
