Feature Engineering with PySpark

Course overview

Provider
DataCamp
Course type
Free trial available
Deadline
Flexible
Duration
4 hours
Certificate
Available on completion
Course author
John Hogue

Description

Learn the gritty details that data scientists spend 70-80% of their time on: data wrangling and feature engineering.
The real world is messy, and your job is to make sense of it. Toy datasets like mtcars and Iris are the result of careful curation and cleaning; even so, their data must be transformed before powerful machine learning algorithms can extract meaning, forecast, classify, or cluster. This course covers the gritty details that data scientists spend 70-80% of their time on: data wrangling and feature engineering. With datasets growing ever larger, let's use PySpark to cut this Big Data problem down to size!

Similar courses

Foundations: Data, Data, Everywhere
  • Flexible deadline
  • 20 hours
  • Certificate
Ask Questions to Make Data-Driven Decisions
  • Flexible deadline
  • 18 hours
  • Certificate
Introduction to Statistics
  • Flexible deadline
  • 15 hours
  • Certificate
  • English language
