Course overview
- Provider: DataCamp
- Course type: Free trial available
- Deadline: Flexible
- Duration: 4 hours
- Certificate: Available on completion
- Course author: Richie Cotton
Description
Learn how to analyze huge datasets with Apache Spark and R, using the sparklyr package.
R is optimized mainly for writing data analysis code quickly and readably, while Apache Spark is designed to analyze huge datasets quickly. The sparklyr package lets you write dplyr R code that runs on a Spark cluster, giving you the best of both worlds. This course teaches you how to manipulate Spark DataFrames using both the dplyr interface and the native interface to Spark, and how to apply machine learning techniques. Throughout the course, you'll explore the Million Song Dataset.