How to get started with ML
Contents
- Mathematics
- Introduction
- Interview Resources
- Artificial Intelligence
- Genetic Algorithms
- Statistics
- Useful Blogs
- Resources on Quora
- Resources on Kaggle
- Cheat Sheets
- Classification
- Linear Regression
- Logistic Regression
- Model Validation using Resampling
- Deep Learning
- Natural Language Processing
- Computer Vision
- Support Vector Machine
- Reinforcement Learning
- Decision Trees
- Random Forest / Bagging
- Boosting
- Ensembles
- Stacking Models
- VC Dimension
- Bayesian Machine Learning
- Semi Supervised Learning
- Optimization
- Other Tutorials
Math
Learn some math basics! Focus only on these topics, then come back later in case you need to learn more.
- Khan Academy - Multivariable Calculus
- Khan Academy - Differential Equations
- Khan Academy - Linear Algebra
- Khan Academy - Statistics and Probability
- Optional: 3Blue1Brown - Essence of Linear Algebra
- Dedicated section, if you really want to go deep into this.
Introduction
- Machine Learning Course by Andrew Ng (Stanford University)
- In-depth introduction to machine learning in 15 hours of expert videos
- An Introduction to Statistical Learning
- List of Machine Learning University Courses
- Machine Learning for Software Engineers
- A curated list of awesome Machine Learning frameworks, libraries and software
- A curated list of awesome data visualization libraries and resources
- An awesome Data Science repository to learn and apply for real world problems
- Machine Learning FAQs on Cross Validated
- Difference between Linearly Independent, Orthogonal, and Uncorrelated Variables
- Slides on Several Machine Learning Topics
- Comparison Supervised Learning Algorithms
- Twitter's Most Shared #machineLearning Content From The Past 7 Days
Interview Resources
- 41 Essential Machine Learning Interview Questions (with answers)
- What are the key skills of a data scientist?
- The Big List of DS/ML Interview Resources
Artificial Intelligence
- Awesome Artificial Intelligence (GitHub Repo)
- UC Berkeley CS188 Intro to AI, Lecture Videos, 2
- Programming Community Curated Resources for learning Artificial Intelligence
- MIT 6.034 Artificial Intelligence Lecture Videos, Complete Course
Genetic Algorithms
- Simple Implementation of Genetic Algorithms in Python (Part 1), Part 2
- Genetic Algorithms vs Artificial Neural Networks
- Genetic Algorithms Explained in Plain English
Statistics
- Stat Trek Website - A dedicated website to teach yourself Statistics
- Learn Statistics Using Python - Learn Statistics using an application-centric programming approach
- Statistics for Hackers | Slides | @jakevdp - Slides by Jake VanderPlas
- Online Statistics Book - An Interactive Multimedia Course for Studying Statistics
- Tutorials
- OpenIntro Statistics - Free PDF textbook
Useful Blogs
- Edwin Chen's Blog - A blog about Math, stats, ML, crowdsourcing, data science
- The Data School Blog - Data science for beginners!
- ML Wave - A blog for Learning Machine Learning
- Andrej Karpathy - A blog about Deep Learning and Data Science in general
- Colah's Blog - Awesome Neural Networks Blog
- Alex Minnaar's Blog - A blog about Machine Learning and Software Engineering
- Statistically Significant - Andrew Landgraf's Data Science Blog
- Simply Statistics - A blog by three biostatistics professors
- Yanir Seroussi's Blog - A blog about Data Science and beyond
- fastML - Machine learning made easy
- Trevor Stephens Blog - Trevor Stephens Personal Page
- no free hunch | kaggle - The Kaggle Blog about all things Data Science
- A Quantitative Journey | outlace - learning quantitative applications
- r4stats - analyzes the world of data science and helps people learn to use R
- Variance Explained - David Robinson's Blog
- AI Junkie - a blog about Artificial Intelligence
- Deep Learning Blog by Tim Dettmers - Making deep learning accessible
- J Alammar's Blog - Blog posts about Machine Learning and Neural Nets
- Adam Geitgey - Easiest Introduction to machine learning
- Ethen's Notebook Collection - Continuously updated machine learning documentation (mainly in Python3). Contents include educational implementations of machine learning algorithms from scratch and open-source library usage
Resources on Quora
Kaggle Competitions WriteUp
- Convolution Neural Networks for EEG detection
- How to Rank 10% in Your First Kaggle Competition
Cheat Sheets
Classification
- Does Balancing Classes Improve Classifier Performance?
- When to choose which machine learning classifier?
- What are the advantages of different classification algorithms?
- ROC and AUC Explained (related video)
- Simple guide to confusion matrix terminology
Linear Regression
- Assumptions of Linear Regression, Stack Exchange
- Linear Regression Comprehensive Resource
- Applying and Interpreting Linear Regression
- What does having constant variance in a linear regression model mean?
- Difference between linear regression on y with x and x with y
- Multicollinearity and VIF
- Elastic Net - Regularization and Variable Selection via the Elastic Net
Logistic Regression
- Geometric Intuition of Logistic Regression
- Obtaining predicted categories (choosing threshold)
- Difference between logit and probit models, Logistic Regression Wiki, Probit Model Wiki
- Pseudo R2 for Logistic Regression, How to calculate, Other Details
- Guide to an in-depth understanding of logistic regression
Model Validation using Resampling
- Cross Validation
- How to use cross-validation in predictive modeling
- Overfitting and Cross Validation
Deep Learning
- fast.ai - Practical Deep Learning For Coders
- fast.ai - Cutting Edge Deep Learning For Coders
- A curated list of awesome Deep Learning tutorials, projects and communities
- Interesting Deep Learning and NLP Projects (Stanford), Website
- Understanding Natural Language with Deep Neural Networks Using Torch
- Recent Reddit AMAs related to Deep Learning, Another AMA
- Introduction to Deep Learning Using Python (GitHub), Good Introduction Slides
- Video Lectures Oxford 2015, Video Lectures Summer School Montreal
- Top arxiv Deep Learning Papers explained
- Geoff Hinton Youtube Videos on Deep Learning
- Deep Learning Comprehensive Website, Software
- Train, Validation & Test in Artificial Neural Networks
- Deep Learning Tutorials on deeplearning.net
- Neural Networks and Deep Learning Online Book
- Neural Machine Translation
- Deep Learning Frameworks
  - Caffe
  - TensorFlow
- Feed Forward Networks
  - A Quick Introduction to Neural Networks
  - Implementing a Neural Network from scratch, Code
  - Speeding up your Neural Network with Theano and the gpu, Code
  - Choosing number of hidden layers and nodes, 2, 3
  - Regression and Classification with NNs (Slides)
- Recurrent and LSTM Networks
  - awesome-rnn: list of resources (GitHub Repo)
  - Recurrent Neural Net Tutorial Part 1, Part 2, Part 3, Code
  - The Unreasonable effectiveness of RNNs, Torch Code, Python Code
  - Intro to RNN, LSTM
  - Using RNN to create on-the-fly dialogue (Keras)
  - Long Short Term Memory (LSTM)
    - Implementing LSTM from scratch, Python/Theano code
    - Torch Code for character-level language models using LSTM
    - LSTM for Kaggle EEG Detection competition (Torch Code)
    - Deep Learning for Visual Q&A | LSTM | CNN, Code
    - Computer Responds to email using LSTM | Google
    - LSTM dramatically improves Google Voice Search, Another Article
    - Understanding Natural Language with LSTM Using Torch
    - Torch code for Visual Question Answering using a CNN+LSTM model
  - Gated Recurrent Units (GRU)
  - Time series forecasting with Sequence-to-Sequence (seq2seq) rnn models
- Restricted Boltzmann Machine
- Autoencoders: unsupervised (apply backpropagation with the target set equal to the input)
- Convolutional Neural Networks
  - An Intuitive Explanation of Convolutional Neural Networks
  - Awesome Deep Vision: List of Resources (GitHub)
  - Stanford Notes, Codes, GitHub
  - JavaScript Library (Browser Based) for CNNs
  - Deep learning to classify business photos at Yelp
- Network Representation Learning
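The autoencoder entry in the list above summarizes the idea in one line: train with backpropagation, using the input itself as the target. A minimal sketch of that idea follows, in pure Python. The toy data, the linear 2 -> 1 -> 2 architecture, the learning rate, and the names `train_autoencoder` and `reconstruct` are all illustrative assumptions, not taken from any of the linked resources.

```python
import random


def train_autoencoder(data, epochs=500, lr=0.05, seed=0):
    """Fit a linear 2 -> 1 -> 2 autoencoder by gradient descent.

    Every sample serves as its own training target.
    Returns (encoder_weights, decoder_weights, loss_history).
    """
    rng = random.Random(seed)
    enc = [rng.uniform(-0.5, 0.5) for _ in range(2)]  # input -> 1-D code
    dec = [rng.uniform(-0.5, 0.5) for _ in range(2)]  # code -> reconstruction
    history = []
    n = len(data)
    for _ in range(epochs):
        grad_enc = [0.0, 0.0]
        grad_dec = [0.0, 0.0]
        loss = 0.0
        for x in data:
            target = x  # the key step: target = input
            code = enc[0] * x[0] + enc[1] * x[1]
            recon = [dec[0] * code, dec[1] * code]
            err = [recon[i] - target[i] for i in range(2)]
            loss += err[0] ** 2 + err[1] ** 2
            # Backpropagate the squared error: decoder first, then encoder.
            d_code = 2 * (err[0] * dec[0] + err[1] * dec[1])
            for i in range(2):
                grad_dec[i] += 2 * err[i] * code
                grad_enc[i] += d_code * x[i]
        history.append(loss / n)  # mean squared reconstruction error
        for i in range(2):
            enc[i] -= lr * grad_enc[i] / n
            dec[i] -= lr * grad_dec[i] / n
    return enc, dec, history


def reconstruct(x, enc, dec):
    """Encode a 2-D point to one number, then decode it back to 2-D."""
    code = enc[0] * x[0] + enc[1] * x[1]
    return [dec[0] * code, dec[1] * code]


# Toy data (an assumption): points on the line y = 2x, which a 1-D code can capture.
data = [(t / 5, 2 * t / 5) for t in range(-5, 6)]
enc, dec, history = train_autoencoder(data)
print(f"loss: {history[0]:.4f} -> {history[-1]:.6f}")
print(reconstruct((0.5, 1.0), enc, dec))
```

No labels appear anywhere in the training loop, which is why the method counts as unsupervised. Real autoencoders add nonlinear layers and are usually built with a framework, but the target = input step stays the same.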
Natural Language Processing
- A curated list of speech and natural language processing resources
- Understanding Natural Language with Deep Neural Networks Using Torch
- Interesting Deep Learning NLP Projects Stanford, Website
- Graph Based Semi Supervised Learning for NLP
- Topic Modeling
  - Probabilistic Topic Models Princeton PDF
  - LDA Wikipedia, LSA Wikipedia, Probabilistic LSA Wikipedia
  - What is a good explanation of Latent Dirichlet Allocation (LDA)?
  - Introduction to LDA, Another good explanation
  - Your Guide to Latent Dirichlet Allocation (LDA)
  - Intuitive explanation of the Dirichlet distribution
  - topicmodels: An R Package for Fitting Topic Models
  - Online LDA, Online LDA with Spark
  - LDA in Scala, Part 2
  - Segmentation of Twitter Timelines via Topic Modeling
  - Multilingual Latent Dirichlet Allocation (LDA). (Tutorial here)
  - Gaussian LDA for Topic Models with Word Embeddings
  - Python
- word2vec
  - Skip Gram Model Tutorial, CBoW Model
  - Word Vectors Kaggle Tutorial Python, Part 2
  - Other Quora Resources, 2, 3
  - word2vec, DBN, RNTN for Sentiment Analysis
- Text Clustering
- Text Classification
- Named Entity Recognition
- Language learning with NLP and reinforcement learning
- Kaggle Tutorial Bag of Words and Word vectors, Part 2, Part 3
- What would Shakespeare say (NLP Tutorial)
Computer Vision
Support Vector Machine
- Highest Voted Questions about SVMs on Cross Validated
- Practical Guide to SVC, Slides
- Comparisons
- Optimization Algorithms in Support Vector Machines
- Software
- Kernels
- Probabilities post SVM
Reinforcement Learning
- Awesome Reinforcement Learning (GitHub)
- RL Tutorial Part 1, Part 2
Decision Trees
- Thorough Explanation and different algorithms
- What is entropy and information gain in the context of building decision trees?
- How do decision tree learning algorithms deal with missing values?
- Using Surrogates to Improve Datasets with Missing Values
- Are decision trees almost always binary trees?
- Pruning Decision Trees, Grafting of Decision Trees
- What is Deviance in context of Decision Trees?
- Discover structure behind data with decision trees - Grow and plot a decision tree to automatically figure out hidden rules in your data
- Comparison of Different Algorithms
- CART
- CTREE
- CHAID
- MARS
- Probabilistic Decision Trees
Random Forest / Bagging
- Measures of variable importance in random forests
- Compare R-squared from two different Random Forest models
- Evaluating Random Forests for Survival Analysis Using Prediction Error Curve
- Why doesn't Random Forest handle missing values in predictors?
- How to build random forests in R with missing (NA) values?
- FAQs about Random Forest, More FAQs
- Obtaining knowledge from a random forest
- Some Questions for R implementation, 2, 3
Boosting
- Introduction to Boosted Trees | Tianqi Chen
- Gradient Boosting Machine
- xgboost
- AdaBoost
- CatBoost
Ensembles
- Ensembling models with R, Ensembling Regression Models in R, Intro to Ensembles in R
- Good Resources | Kaggle Africa Soil Property Prediction
- Resources for learning how to implement ensemble methods
- How are classifications merged in an ensemble classifier?
Stacking Models
- Stacking, Blending and Stacked Generalization
- Stacked Generalization: when does it work?
Vapnik–Chervonenkis Dimension
Bayesian Machine Learning
- Bayesian Methods for Hackers (using pyMC)
- Should all Machine Learning be Bayesian?
- Tutorial on Bayesian Optimisation for Machine Learning
- Bayesian Reasoning and Deep Learning, Slides
Semi Supervised Learning
- Wikipedia article on Semi Supervised Learning
- Graph Based Semi Supervised Learning for NLP
- Unsupervised, Supervised and Semi Supervised learning
- Research Papers 1, 2, 3
Optimization
- Mean Variance Portfolio Optimization with R and Quadratic Programming
- Algorithms for Sparse Optimization and Machine Learning
- Optimization Algorithms in Machine Learning, Video Lecture
- Optimization Algorithms for Data Analysis
- Optimization Algorithms in Support Vector Machines
- The Interplay of Optimization and Machine Learning Research
- Hyperopt tutorial for Optimizing Neural Networks’ Hyperparameters
Other Tutorials
- For a collection of Data Science Tutorials using R, please refer to this list.
- For a collection of Data Science Tutorials using Python, please refer to this list.
Save Cheat Sheets!
- The best Cheat Sheets for Artificial Intelligence, Machine Learning, and Python.
- Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data - Stefan Kojouharov
- Machine Learning cheatsheets for Stanford's CS 229 - Afshine Amidi & Shervine Amidi
- Cheat Sheet of Machine Learning and Python (and Math) Cheat Sheets - Robbie Allen
- AI Expert Roadmap - Use it as a skillset checklist!
How to get started with ML
2. Learn Python
- 4h Beginner Course
- 6h Intermediate Python Programming Course
- Learn Python In X minutes
- Cheatsheet
3. Learn The ML Tech Stack:
- NumPy:
- Pandas:
- Matplotlib:
(Scikit-Learn and TensorFlow are taught in step 4. PyTorch is optional, maybe in step 7)
4. Machine Learning Courses
- Machine Learning Specialization Andrew Ng | Coursera (3 Courses)
- Optional: Machine Learning From Scratch
5. Hands-on Data Preparation
- Kaggle Intro to Machine Learning
- Kaggle Intermediate Machine Learning
Future [Specialize & Create Blog]
- Specialize in one field (e.g. Computer Vision, NLP, etc.)
- Look at requirements in corresponding job descriptions and learn those skills
- Tip: Create a blog and share tutorials and what you have learned!