Course overview
- Provider: DataCamp
- Course type: Free trial available
- Deadline: Flexible
- Duration: 4 hours
- Certificate: Available on completion
- Course author: Jeroen Boeye
Description
Understand the concept of reducing dimensionality in your data, and master the techniques to do so in Python.
High-dimensional datasets can be overwhelming and leave you not knowing where to start. Typically, you’d explore a new dataset visually first, but when it has too many dimensions the classical approaches fall short. Fortunately, there are visualization techniques designed specifically for high-dimensional data, and this course introduces them. After exploring the data, you’ll often find that many features hold little information, either because they show no variance or because they duplicate other features. You’ll learn how to detect these features and drop them from the dataset so that you can focus on the informative ones. As a next step, you might want to build a model on the remaining features, and it may turn out that some have no effect on the thing you’re trying to predict. You’ll learn how to detect and drop these irrelevant features too, reducing dimensionality and thus complexity. Finally, you’ll learn how feature extraction techniques can reduce dimensionality for you by calculating uncorrelated principal components.
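To make the description more concrete, here is a minimal sketch of three of the steps it mentions: dropping a feature with no variance, dropping a duplicated feature, and extracting uncorrelated principal components. The listing does not say which libraries the course uses, so the choice of pandas and scikit-learn is an assumption, and the toy dataset, column names, and the 0.95 correlation cutoff are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold
from sklearn.preprocessing import StandardScaler

# Hypothetical toy dataset (not from the course).
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = rng.normal(size=n)
df = pd.DataFrame({
    "a": a,
    "b": b,
    "a_copy": a,                         # exact duplicate of "a"
    "constant": np.ones(n),              # zero variance
    "c": 0.5 * a + rng.normal(size=n),   # only moderately related to "a"
})

# Step 1: drop features that show no variance.
selector = VarianceThreshold(threshold=0.0)
selector.fit(df)
df_var = df.loc[:, selector.get_support()]

# Step 2: drop features that (nearly) duplicate an earlier feature.
# Only the upper triangle of the correlation matrix is inspected, so
# each pair is checked once and the first column of a pair is kept.
corr = df_var.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]  # assumed cutoff
df_sel = df_var.drop(columns=to_drop)

# Step 3: feature extraction with PCA on standardized features.
scaled = StandardScaler().fit_transform(df_sel)
pca = PCA(n_components=2)
components = pca.fit_transform(scaled)

print("kept features:", list(df_sel.columns))
print("explained variance ratio:", pca.explained_variance_ratio_)
```

Standardizing before PCA is a common choice so that features on larger scales do not dominate the components; whether it is appropriate depends on the dataset.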