Question

Consider the following distance matrix for six data objects. The outlier score of an object is the inverse of the density around it. The density of an object is the number of other objects within a distance of 3 units from it.

    \[ \begin{tabular}{|c|c|c|c|c|c|c|} \hline
      & A  & J  & M & C & P & L  \\ \hline
    A & 0  & 12 & 3 & 4 & 1 & 2  \\ \hline
    J & 12 & 0  & 2 & 8 & 7 & 10 \\ \hline
    M & 3  & 2  & 0 & 9 & 6 & 5  \\ \hline
    C & 4  & 8  & 9 & 0 & 5 & 1  \\ \hline
    P & 1  & 7  & 6 & 5 & 0 & 2  \\ \hline
    L & 2  & 10 & 5 & 1 & 2 & 0  \\ \hline
    \end{tabular} \]

Identify the outlier(s) using the density-based outlier detection method.

Solution

==================================================================================================

The formulae to be applied as per the question are:

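    \begin{align*}  \boxed{Outlier \ Score, \ OS(x) = \frac{1}{D(x)} } \end{align*}

    \begin{align*}  \boxed{Density, \ D(x) = \left\lvert \left\{ O \mid O \neq x, \ d(x, O) \leq 3 \right\} \right\rvert } \end{align*}

Here d(x, O) is the distance between objects x and O, read from the matrix.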

The problem is solved in two steps.

Find the Density of the Objects:

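Let D(A) denote the density of point A, that is, the number of other points within 3 units of A. Each set below is read from the corresponding row of the distance matrix.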

  • D(A) = \lvert \{ M, P, L \} \rvert = 3
  • D(J) = \lvert \{ M \} \rvert = 1
  • D(M) = \lvert \{ A, J \} \rvert = 2
  • D(C) = \lvert \{ L \} \rvert = 1
  • D(P) = \lvert \{ A, L \} \rvert = 2
  • D(L) = \lvert \{ A, C, P \} \rvert = 3
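
The neighbour counts above can be checked with a short Python sketch. The names `labels`, `dist`, `RADIUS` and `neighbours` are illustrative choices, not part of the question; the sketch assumes "within 3 units" includes a distance of exactly 3 and that an object is not counted as its own neighbour.

```python
# Distance matrix from the question; rows and columns are in the order A, J, M, C, P, L.
labels = ["A", "J", "M", "C", "P", "L"]
dist = [
    [0, 12, 3, 4, 1, 2],
    [12, 0, 2, 8, 7, 10],
    [3, 2, 0, 9, 6, 5],
    [4, 8, 9, 0, 5, 1],
    [1, 7, 6, 5, 0, 2],
    [2, 10, 5, 1, 2, 0],
]
RADIUS = 3  # assumption: the boundary distance (exactly 3) counts as "within"


def neighbours(i):
    """Return the labels of all other objects within RADIUS of object i."""
    return [labels[j] for j, d in enumerate(dist[i]) if j != i and d <= RADIUS]


# Density of each object = size of its neighbour set.
density = {labels[i]: len(neighbours(i)) for i in range(len(labels))}

for i, name in enumerate(labels):
    print(f"D({name}) = |{neighbours(i)}| = {density[name]}")
```

Excluding the diagonal entry (`j != i`) matters here: counting each object as its own neighbour would raise every density by one and change the scores, though not the ranking.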

Find the Outlier Score of the Objects:

Let outlier score of point A be OS(A).

  • OS(A) = \frac{1}{3} \approx 0.33
  • OS(J) = \frac{1}{1} = 1
  • OS(M) = \frac{1}{2} = 0.5
  • OS(C) = \frac{1}{1} = 1
  • OS(P) = \frac{1}{2} = 0.5
  • OS(L) = \frac{1}{3} \approx 0.33

Conclusion

Points J and C have the highest outlier score (1), since each has only one neighbour within 3 units. They are therefore the outliers.
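
As a rough check of the scoring and selection step, the sketch below starts from the densities found in the solution, inverts them, and picks the objects with the largest score. The variable names are hypothetical, and treating "the outliers" as the objects tied for the maximum score is an assumption; the question does not state a threshold. The sketch also assumes every density is non-zero, which holds for this matrix.

```python
from fractions import Fraction

# Densities taken from the solution (number of neighbours within 3 units).
density = {"A": 3, "J": 1, "M": 2, "C": 1, "P": 2, "L": 3}

# Outlier score = 1 / density, kept as exact fractions to avoid rounding.
score = {name: Fraction(1, d) for name, d in density.items()}

# Assumption: the outliers are the objects tied for the highest score.
top = max(score.values())
outliers = [name for name, s in score.items() if s == top]

for name, s in score.items():
    print(f"OS({name}) = {s} = {float(s):.2f}")
print("Outliers:", outliers)
```

An isolated object with no neighbours at all would have density 0 and an undefined score, so a real implementation would need to handle that case separately.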
