Ehan Ghalib

Tag: Data Science

Finding Clusters Using DBSCAN Method

Posted 2 years ago by Ehan Ghalib

Question: Consider the points: , , , , , , , . a) Compute the distance matrix using the Euclidean distance measure. b) Identify the clusters that could be formed using

Categories: Data Mining, Solution Repository
Tags: Data Science, DBSCAN Algorithm, Distance Matrix, Euclidean Distance

Find Article Similarity with TF-IDF

Posted 3 years ago by Ehan Ghalib

Question: The term frequency matrix for the five articles (A1 to A5) is shown below. Answer the following questions: 1) What is the TF-IDF value for (A4, Corona)? 2)

Categories: Data Mining, Solution Repository
Tags: Article Similarity, Cosine Similarity, Data Science, Term frequency matrix, TF-IDF

Counting Results in a Competition With Ties

Posted 3 years ago by Ehan Ghalib

Question: In the National Games Championship, a high jump competition is being conducted along with other athletic events. Only n out of m athletes were able to meet the eligibility criteria

Categories: Counting Principles, Mathematics for Data Science, Solution Repository
Tags: Counting Principles, Data Science, Fubini Numbers, MFDS, Ordered Bell Numbers

TF-IDF Similarity in Organization with Million Documents

Posted 3 years ago by Ehan Ghalib

Question: An organization has million documents in its repository. A document X has the term ‘Mining’ occurring 4 times and the term ‘Discovery’ occurring 5 times. Other words occur less frequently.

Categories: Data Mining, Solution Repository
Tags: Article Similarity, Data Science, Document Similarity, TF-IDF

Comparing Relative Exam Performance in Entrance Exam Using Statistics

Posted 3 years ago by Ehan Ghalib

Question: Mike is trying to get into a medical college for post-graduation in India. Before applying to any college or university, he needs to take an exam for that particular college or university. Therefore,

Categories: Data Mining, Solution Repository
Tags: Data Science, Summary Statistics, z-score Transformation

Density-Based Outlier Detection Problem

Posted 3 years ago by Ehan Ghalib

Question: Consider the distance matrix for data objects. The outlier score of an object is the inverse of the density around the object. The density of an object is equal to

Categories: Data Mining, Solution Repository
Tags: Data Science, Density-based Outlier Detection, Distance Matrix, Outlier Detection, Outlier Score

Conduct Rule Analysis for Predictive Modeling

Posted 3 years ago by Ehan Ghalib

Question: An FMCG company's training set has 100 records for T (toothpaste) and 400 records for competitor C. P, Q, R denote subsets of attribute values in records which

Categories: Data Mining, Solution Repository
Tags: Data Mining, Data Science, FOIL Gain, Predictive Modeling, Rule Evaluation, Rule Interestingness, Rule-based Classification