Real data science problems with Python


Course overview

Course type
Paid course
8 hours
31 lessons
Available on completion
Course author
Francisco Juretig
  • Work with many ML techniques on real problems such as classification, image processing, and regression
  • Build neural networks for classification and regression
  • Apply machine learning and data science to audio processing, image detection, real-time video, sentiment analysis, and more


Practice machine learning and data science with real problems

This course explores a variety of machine learning and data science techniques using real-life datasets, images, and audio collected from several sources. These realistic situations are much better than dummy examples because they force the student to think harder about the problem, pre-process the data more carefully, and evaluate the performance of the predictions in different ways.

The datasets used here come from different sources such as Kaggle, US, CrowdFlower, etc. Each lecture shows how to preprocess the data, model it using an appropriate technique, and compute how well that technique works on the specific problem. Certain lectures also cover multiple techniques, and we discuss which one outperforms the others. Naturally, all the code is shared here, and you can contact me if you have any questions. Every lecture can also be downloaded, so you can enjoy them while travelling.
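The workflow described above (preprocess the data, fit a model, measure how well it works, compare techniques) can be sketched roughly as follows. This is a minimal illustration and not code from the course: the dataset is synthetic, the names `make_data`, `MajorityBaseline`, `NearestCentroid`, and `compare_models` are hypothetical, and only the Python standard library is used so that the sketch runs anywhere. The lectures themselves rely on libraries such as Scikit-learn and Pandas.

```python
import random
import statistics


def make_data(n, seed=0):
    """Build a synthetic two-class dataset (an assumption, not course data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is shifted upwards; the two features live on very different scales.
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(200.0 * label, 100.0)])
        y.append(label)
    return X, y


def fit_scaler(X):
    """Learn per-column mean and standard deviation from the training set only."""
    columns = list(zip(*X))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def scale(X, scaler):
    """Standardize every row with the statistics learned by fit_scaler."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


class MajorityBaseline:
    """Always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroid:
    """Predicts the class whose mean feature vector is closest."""

    def fit(self, X, y):
        self.centroids_ = {}
        for label in set(y):
            members = [row for row, target in zip(X, y) if target == label]
            self.centroids_[label] = [statistics.fmean(col) for col in zip(*members)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda k: sq_dist(row, self.centroids_[k]))
            for row in X
        ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_models(n=400, seed=0):
    """Preprocess, fit two models, and score both on a held-out split."""
    X, y = make_data(n, seed)
    cut = int(0.75 * n)
    X_train, X_test, y_train, y_test = X[:cut], X[cut:], y[:cut], y[cut:]
    scaler = fit_scaler(X_train)
    X_train, X_test = scale(X_train, scaler), scale(X_test, scaler)
    models = {
        "majority_baseline": MajorityBaseline(),
        "nearest_centroid": NearestCentroid(),
    }
    return {
        name: accuracy(y_test, model.fit(X_train, y_train).predict(X_test))
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, score in compare_models().items():
        print(f"{name}: {score:.2f}")
```

Fitting the scaler on the training rows only, and scoring on rows the models never saw, keeps the comparison honest. Including a trivial baseline shows whether a technique actually learns something on the problem.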

The student should already be familiar with Python and some data science techniques. Each lecture discusses some technical details of the method used, but we do not spend much time on the underlying mathematical principles.

Some of the techniques presented here are: 

  • Pure image processing using OpenCV
  • Convolutional neural networks using Keras-Theano
  • Logistic and naive Bayes classifiers
  • AdaBoost, Support Vector Machines for regression and classification, Random Forests
  • Real-time video processing, Multilayer Perceptrons, Deep Neural Networks, etc.
  • Linear regression
  • Penalized estimators
  • Clustering
  • Principal components

The modules/libraries used here are:

  • Scikit-learn
  • Keras-Theano
  • Pandas
  • OpenCV

Some of the real examples used here:

  • Predicting the GDP based on socio-economic variables
  • Detecting human parts and gestures in images
  • Tracking objects in real time video
  • Machine learning on speech recognition
  • Detecting spam in SMS messages
  • Sentiment analysis using Twitter data
  • Counting objects in pictures and retrieving their position
  • Forecasting London property prices
  • Predicting whether people earn more than a 50K threshold based on US Census data
  • Predicting the nuclear output of US based reactors
  • Predicting the house prices for some US counties
  • And much more...
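As a rough idea of what one of these examples involves, the sketch below classifies SMS messages as spam or ham with a multinomial naive Bayes model and Laplace smoothing. It is only an illustration under stated assumptions: the six training messages are invented, the class `NaiveBayesText` and the helper `tokenize` are hypothetical names, and the code uses the Python standard library rather than the Scikit-learn classifiers used in the lectures.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesText:
    """Multinomial naive Bayes over word counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is add-one smoothing

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each class for one message."""
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy messages, standing in for a real labelled SMS dataset.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free cash claim your prize", "spam"),
    ("urgent call now to win cash", "spam"),
    ("are we still meeting for lunch", "ham"),
    ("call me when you get home", "ham"),
    ("see you at the meeting tomorrow", "ham"),
]

model = NaiveBayesText().fit([text for text, _ in TRAIN], [label for _, label in TRAIN])

if __name__ == "__main__":
    for message in ["free prize waiting claim now", "are you home for lunch"]:
        print(message, "->", model.predict(message))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the smoothing term keeps a single unseen word from forcing a class probability to zero.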

The motivation for this course is that many students who want to learn data science and machine learning are stuck with dummy datasets that are not challenging enough. This course aims to ease the transition between knowing machine learning and doing real machine learning in real situations.
