Machine Learning

What is Machine Learning?

Machine learning sits at the intersection of computer science, statistics, and applied mathematics. The field is concerned with the question of how to write computer programs that learn and improve over time. Instead of giving machines a knowledge-based approach (hand-written rules) for solving a task, we give them a data-driven approach. Most machine learning algorithms are built on mathematics and come down to solving optimization problems.

The three most popular types of learning in machine learning are:

  1. Supervised Learning
  2. Unsupervised Learning
  3. Reinforcement Learning

In supervised learning we are given the input variables (x) and the corresponding target or output variables (y). We call it supervised learning because the known outputs (y) supervise the learning. We often opt for supervised learning techniques when solving classification and regression tasks. Some of the most popular supervised learning algorithms are as follows.

  1. Logistic Regression: Despite having "regression" in its name, logistic regression solves classification problems. It is used extensively in two-class classification problems such as spam detection, heart disease prediction, and breast cancer prediction. It also extends to multiclass problems: predicting the tags of a given code snippet on Stack Overflow, for example, can be handled with logistic regression in a multiclass, one-vs-rest setting.

  2. Linear Regression: Linear regression is used to solve regression problems only. It is used extensively in stock price prediction, wine consumption prediction, suicide rate prediction, house price prediction, taxi demand prediction, etc.

  3. Support Vector Machines (SVM): Support vector machines can be used for both classification and regression tasks. SVM is one of the most powerful machine learning techniques, and it can also be applied to multiclass classification problems. SVM is used extensively in credit card fraud detection, spam detection, face recognition systems, handwritten digit recognition, etc.

  4. Decision Tree: The decision tree is one of the most popular supervised learning techniques, and it can be used for both classification and regression problems. Decision trees are used extensively in heart disease prediction, student exam performance prediction, breast cancer prediction, etc. In general, a single decision tree works best on small datasets.

  5. Random Forest: Random forest can solve classification and regression problems, and it uses decision trees as its base learners. The base learners in a random forest are typically deep, fully grown trees, each trained on a bootstrap sample of the data; shallow trees, sometimes of depth one, are characteristic of boosting methods instead. Quora has used random forest to detect similar questions.

  6. K-Nearest Neighbors: K-nearest neighbors is one of the simplest yet most powerful machine learning techniques. It can solve both classification and regression tasks. It is a neighborhood-based technique, and it can be useful when the classes in the dataset overlap heavily. It works very well on the Haberman dataset, for example, and in general it is best suited to data in a low-dimensional space.

  7. Naive Bayes: Naive Bayes is a simple probabilistic model built on Bayes' theorem. It was one of the very first models used to detect spam. Naive Bayes is used in sentiment analysis, spam detection systems, etc., and it works well on text data.
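
To make the supervised setting concrete, here is a minimal sketch of a K-nearest neighbors classifier written with only the Python standard library. The function name knn_predict and the toy data are assumptions made up for illustration; a real project would more likely use an established library.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the labelled training points by Euclidean distance to the query.
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data (made up): inputs x are 2-D points, outputs y are class labels.
train_x = [(1.0, 1.2), (0.8, 0.9), (1.1, 0.7), (4.0, 4.2), (4.3, 3.9), (3.8, 4.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_x, train_y, (1.0, 1.0)))
print(knn_predict(train_x, train_y, (4.0, 4.0)))
```

The labels in train_y are what "supervise" the prediction: the query point simply inherits the majority label of the labelled points closest to it.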

In unsupervised learning we do not provide the output variable to the model; instead, we want the model to group similar points together and, in effect, do the labelling itself. Because no labels are needed, unsupervised techniques can be applied to problems where labelled data is not available. Unsupervised models find and cluster the data points that follow the same pattern. We also often use unsupervised techniques for data pre-processing. Neural networks can be used for unsupervised learning as well (autoencoders, for example), although they are just as common in supervised learning. Some of the most popular unsupervised learning algorithms are:

  1. Hierarchical Clustering
  2. K-means Clustering
  3. DBSCAN
  4. Neural Networks (for example, autoencoders)
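
As an illustration of clustering without labels, the following is a rough sketch of K-means (Lloyd's algorithm) in plain Python. The function name kmeans, its parameters, and the sample points are assumptions chosen for this example, not a reference implementation.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Cluster `points` into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters

# Toy data (made up): two visibly separate groups, with no labels given.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Note that the algorithm never sees a target variable; the two groups emerge only from the distances between the points.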

Reinforcement learning (RL), on the other hand, is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize some notion of cumulative reward. Reinforcement learning is one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
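
The idea of an agent learning to maximize cumulative reward can be sketched with tabular Q-learning, one common RL algorithm. Everything in the example below is an assumption for illustration: the environment is a made-up five-cell corridor where the agent starts in cell 0 and receives a reward of 1 on reaching cell 4, and the constants are arbitrary.

```python
import random

N_STATES = 5                            # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)                      # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    """Environment: move along the corridor; reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)   # explore
            else:                              # exploit, breaking ties at random
                action = max(ACTIONS, key=lambda a: (q[(state, a)], rng.random()))
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q = train()
# Greedy policy: the best-valued action in each non-goal cell.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

After training, the greedy policy picks +1 (move right) in every cell, since that is the shortest route to the reward and the discount factor makes later rewards worth less.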

Use cases of machine learning:

1) Facebook friend recommendation system:

Facebook uses graph mining to recommend friends, and machine learning is used extensively in graph mining. In such a system every Facebook user is represented as a vertex of a graph, and an edge between two vertices represents a friendship. Facebook uses machine learning to predict whether an edge should exist between two vertices.
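
A simple baseline for this kind of link prediction scores a pair of users by how many friends they share. The sketch below uses the Jaccard similarity of friend sets on a made-up graph; the user names and the helper functions are hypothetical, and this is not a description of Facebook's actual system, where such scores would more likely serve as input features to a learned model.

```python
# Hypothetical friendship graph: each user maps to the set of their friends.
friends = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dev"},
    "cara": {"ana", "ben", "dev"},
    "dev": {"ben", "cara", "eli"},
    "eli": {"dev"},
}

def jaccard(u, v):
    """Fraction of the two users' combined friends that they have in common."""
    union = friends[u] | friends[v]
    return len(friends[u] & friends[v]) / len(union) if union else 0.0

def recommend(user, top_n=2):
    """Rank users who are not yet friends with `user` by friend-set similarity."""
    candidates = [v for v in friends if v != user and v not in friends[user]]
    return sorted(candidates, key=lambda v: jaccard(user, v), reverse=True)[:top_n]

print(recommend("ana"))
```

Here "dev" ranks first for "ana" because they share two friends ("ben" and "cara"), which makes a missing edge between them plausible.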

2) Stack Overflow tag prediction:

Given a snippet of code, the model can determine which programming language the programmer used to write it. Stack Overflow has a feature that can determine the tag when none is given, and it used machine learning for this problem.
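
One way to approach this kind of task is a Naive Bayes text classifier over the tokens of a snippet. The following is a minimal sketch under stated assumptions: the class name NaiveBayesTagger, the tokenizer, and the six training snippets are all made up for illustration, and nothing here reflects how Stack Overflow actually implements tag prediction.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(code):
    """Split a snippet into lowercase word tokens and single punctuation marks."""
    return re.findall(r"[A-Za-z_]+|[^\sA-Za-z_0-9]", code.lower())

class NaiveBayesTagger:
    def fit(self, snippets, tags):
        self.token_counts = defaultdict(Counter)  # tag -> token frequencies
        self.tag_counts = Counter(tags)           # tag -> number of snippets
        for snippet, tag in zip(snippets, tags):
            self.token_counts[tag].update(tokenize(snippet))
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict(self, snippet):
        tokens = tokenize(snippet)

        def log_score(tag):
            counts = self.token_counts[tag]
            # Laplace (add-one) smoothing so unseen tokens do not zero the score.
            total = sum(counts.values()) + len(self.vocab)
            prior = math.log(self.tag_counts[tag] / sum(self.tag_counts.values()))
            return prior + sum(math.log((counts[t] + 1) / total) for t in tokens)

        return max(self.tag_counts, key=log_score)

# Tiny made-up training set: three snippets per language.
snippets = [
    "def add(a, b): return a + b",
    "import os\nprint(os.getcwd())",
    "for i in range(10): print(i)",
    "public static void main(String[] args) { }",
    "System.out.println(x);",
    "public class Foo { private int x; }",
]
tags = ["python", "python", "python", "java", "java", "java"]

tagger = NaiveBayesTagger().fit(snippets, tags)
print(tagger.predict("def greet(name): print(name)"))
print(tagger.predict("public int size() { return n; }"))
```

With so little training data the model is only a toy, but it shows the mechanism: tokens such as "def" and ":" pull the score toward one tag, while "public", "{" and ";" pull it toward the other.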

3) Uber drunk passenger detection:

Uber is working on a feature that can detect a drunk rider from their activity. Uber uses some advanced deep learning techniques for this. Deep learning is currently one of the most active areas of machine learning.

4) Ad click prediction:

E-commerce websites use machine learning extensively to advertise appropriate products that the user might buy. The algorithm recommends similar products based on the user's buying history.
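
A very simple version of recommending from buying history counts which products tend to be bought together. The sketch below is an assumption-laden illustration: the purchase data, product names, and the recommend function are hypothetical, and production systems typically use learned models rather than raw co-occurrence counts.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase histories: user -> set of products bought.
purchases = {
    "u1": {"laptop", "mouse", "keyboard"},
    "u2": {"laptop", "mouse"},
    "u3": {"mouse", "keyboard", "monitor"},
    "u4": {"phone", "charger"},
}

# Count how often each ordered pair of products appears in the same history.
co_bought = Counter()
for basket in purchases.values():
    co_bought.update(permutations(basket, 2))

def recommend(user, top_n=2):
    """Score products the user lacks by how often they co-occur with owned ones."""
    owned = purchases[user]
    scores = Counter()
    for (a, b), count in co_bought.items():
        if a in owned and b not in owned:
            scores[b] += count
    return [product for product, _ in scores.most_common(top_n)]

print(recommend("u2"))
```

For user "u2", who bought a laptop and a mouse, the keyboard scores highest because other users bought it together with both of those items.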

5) Fraud detection:

Machine learning is very useful in fraud detection, where advanced cascaded model systems are often used. Given the details of a credit card transaction, the model can predict whether the transaction was made by an impostor.

6) Handwritten digit recognition:

Well-engineered machine learning or deep learning models are used extensively to recognise handwritten digits. Google's keyboard has this feature built in, and it is very useful.

7) Self-driving car system:

Machine learning is one of the core components of a self-driving car system. Deep learning and computer vision are mainly used here, and the system can improve as it gathers experience. The self-driving car is one of the products of great machine learning engineering and research.
