- Course type
- Paid course
- 2 hours
- 18 lessons
- Available on completion
- Course author
- Bassam Almogahed
- Understand the underlying causes of the class imbalance problem
- Understand why it is a major challenge in the machine learning and data mining fields
- Learn the different characteristics of imbalanced datasets
- Learn the state-of-the-art techniques and algorithms
- Understand a couple of data-based undersampling techniques and apply them
- Understand a couple of data-based oversampling techniques and apply them
- Learn an algorithm-level technique
There is an unprecedented amount of data available. This has caused knowledge discovery to garner attention in recent years. However, many real-world datasets are imbalanced. Learning from imbalanced data poses major challenges and is recognized as needing significant attention.
The core problem is that learning algorithms perform poorly in the presence of underrepresented data and severely skewed class distributions. Models trained on imbalanced datasets tend to favor the majority class and largely ignore the minority class. The approaches introduced to date fall into two groups: data-based solutions and algorithmic solutions.
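To make the majority-class bias concrete, here is a minimal sketch that is not taken from the course material. It assumes a toy dataset of 95 majority and 5 minority labels, and the helper names (`majority_baseline`, `recall`, `random_undersample`) are hypothetical. It shows how a model that always predicts the majority class can score high accuracy while finding no minority examples, and how random undersampling, one simple data-based technique, evens out the class counts.

```python
import random
from collections import Counter


def majority_baseline(labels):
    """Return the most frequent label (a 'model' that always predicts it)."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` examples that were predicted as `positive`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return hits / total if total else 0.0


def random_undersample(samples, labels, seed=0):
    """Randomly drop examples so every class keeps as many as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        for x in rng.sample(xs, n_min):
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy data (assumed for illustration): 95 majority (0) and 5 minority (1) labels.
labels = [0] * 95 + [1] * 5
samples = list(range(100))

# Always predicting the majority class looks accurate but misses the minority.
pred = [majority_baseline(labels)] * len(labels)
acc = accuracy(labels, pred)
rec = recall(labels, pred, positive=1)
print(f"accuracy={acc:.2f}, minority recall={rec:.2f}")

# After undersampling, both classes have the same number of examples.
bal_x, bal_y = random_undersample(samples, labels)
print(Counter(bal_y))
```

Undersampling discards majority examples, so it trades information for balance; oversampling and algorithm-level methods make different trade-offs.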
The specific goals of this course are:
- Help students understand the underlying causes of this problem
- Discuss the different characteristics of an imbalanced dataset
- Highlight the severity and importance of this branch of data science
- Give a general idea of the two main state-of-the-art approaches that have been developed to handle this problem
- Go over two methods in detail to give an idea of some of the techniques used and, hopefully, motivate students to learn more