Machine learning is one of the most exciting recent technologies. Every time you search on Google, or Facebook suggests a photo tag, machine learning comes into the picture. When an application is too hard to program by hand, say a program that makes a helicopter fly, we can use machine learning and let the computer learn how to fly the helicopter by itself. Most of natural language processing and most of computer vision today is applied machine learning.
So how do we define Machine Learning?
Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, defined it as:
“the field of study that gives computers the ability to learn without being explicitly programmed”
Here is a slightly more recent definition of a learning algorithm by Tom Mitchell, who at the time of writing is the Chair of the Machine Learning Department at Carnegie Mellon University:
“a computer program is said to learn from experience E, with respect to some task T, and some performance measure P, if its performance on T as measured by P improves with experience E”
Take a spam filter as an example. Its task T is to categorize each mail as spam or not spam; its experience E is the set of mails that you have marked as spam or not spam; and its performance measure P is the fraction of mails that it classifies correctly. If the filter is really learning, it should make more and more correct choices as it sees more of your labels.
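To make T, E and P concrete, here is a minimal sketch in Python. It is not how a real spam filter works: the `TinySpamFilter` class, its word-counting rule and the handful of mails are all made up for illustration. The point is only to show performance P (accuracy on mails the filter has not seen) improving once the filter has some experience E.

```python
from collections import Counter


class TinySpamFilter:
    """Toy filter (hypothetical): counts the words seen in spam and in normal mail."""

    def __init__(self):
        self.spam_words = Counter()
        self.ham_words = Counter()

    def learn(self, text, is_spam):
        # Experience E: one mail together with the label the user gave it.
        target = self.spam_words if is_spam else self.ham_words
        target.update(text.lower().split())

    def predict(self, text):
        # Task T: decide spam (True) or not spam (False).
        words = text.lower().split()
        spam_score = sum(self.spam_words[w] for w in words)
        ham_score = sum(self.ham_words[w] for w in words)
        return spam_score > ham_score


def accuracy(spam_filter, labelled_mails):
    # Performance P: fraction of mails classified correctly.
    correct = sum(spam_filter.predict(text) == is_spam
                  for text, is_spam in labelled_mails)
    return correct / len(labelled_mails)


# Made-up mails, purely for illustration.
training = [
    ("win a free prize now", True),
    ("meeting agenda for monday", False),
    ("free money click now", True),
    ("lunch on monday with the team", False),
]
held_out = [
    ("claim your free prize", True),
    ("agenda for the team meeting", False),
    ("click now to win money", True),
    ("monday lunch plans", False),
]

spam_filter = TinySpamFilter()
before = accuracy(spam_filter, held_out)   # no experience yet
for text, is_spam in training:
    spam_filter.learn(text, is_spam)
after = accuracy(spam_filter, held_out)    # after seeing labelled mails

print("accuracy before learning:", before)
print("accuracy after learning:", after)
```

On this toy data the accuracy goes from 0.5 to 1.0. With no experience the filter calls everything "not spam", so it is right only on the normal mails.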
There are several different types of learning algorithms. The two main types are supervised learning and unsupervised learning.
In supervised learning we give the algorithm a data set and also tell it the correct answer for each example. The algorithm's job is to find a pattern in those examples so that it can predict the outcome for new inputs. When the value to predict is continuous, such as a price, this is called a regression problem; when it is a discrete category, such as spam or not spam, it is called a classification problem. To explain this I will use the example given by Prof. Andrew Ng in class.
Suppose you have a data set with the prices of houses of different sizes, and you want your program to predict the price of a house given its size. First you give the algorithm the known prices for the different sizes. It then makes a prediction for each size, and you tell it how far off the prediction was, that is, the difference between the predicted price and the correct price. Using this error, the algorithm corrects itself step by step and learns a line that passes as close as possible to the known data points, so that its predictions for new sizes get steadily better.
So for each example in supervised learning, we are told explicitly what the so-called right answer is.
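The housing example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the sizes and prices are invented, and the `fit_line` helper, its learning rate and its step count are choices I made for the sketch, not something from the class. It fits a straight line `price = w * size + b` by repeatedly nudging `w` and `b` in the direction that shrinks the prediction error (gradient descent).

```python
def fit_line(sizes, prices, learning_rate=0.1, steps=5000):
    """Fit price = w * size + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    m = len(sizes)
    for _ in range(steps):
        # How far off is each prediction from the known right answer?
        errors = [(w * x + b) - y for x, y in zip(sizes, prices)]
        # Average gradient of the squared error with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, sizes)) / m
        grad_b = sum(errors) / m
        # Correct the line a little in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Invented data: size in thousands of square feet, price in thousands of dollars.
sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [200.0, 290.0, 410.0, 500.0, 600.0]

w, b = fit_line(sizes, prices)


def predict_price(size):
    return w * size + b


print("slope:", round(w, 2), "intercept:", round(b, 2))
print("predicted price for size 2.2:", round(predict_price(2.2), 1))
```

Note that the learned line does not pass exactly through every data point. It is the line with the smallest overall error, which is what lets it give a sensible price for a size it has never seen.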
In unsupervised learning, we’re given data that doesn’t have any labels. We’re given the data set, but we’re not told what to do with it or what each data point is. Instead we’re just told: here’s a bunch of data. I don’t know what’s in this data. I don’t know who belongs to which type. I don’t even know what the different types of people are. Can you automatically find some structure in the data? An example of this is Google News, where an algorithm automatically groups similar news headlines into clusters by topic, without being told what the topics are.
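To give a feel for "finding structure without labels", here is a small sketch of k-means, one common clustering algorithm. I am not claiming this is what Google News uses; the algorithm choice, the `kmeans` helper, the starting centroids and the six 2-D points are all assumptions made for illustration. The algorithm is never told which group a point belongs to. It only alternates between assigning each point to its nearest centre and moving each centre to the average of its points.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: returns a cluster index per point and the final centroids."""
    labels = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Step 2: move every centroid to the mean of the points assigned to it.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(old)  # keep a centroid that got no points
        centroids = new_centroids
    return labels, centroids


# Invented, unlabelled 2-D points that happen to form two groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

# Starting guesses for the two cluster centres (an arbitrary choice).
labels, centroids = kmeans(points, centroids=[points[0], points[3]])

print("cluster of each point:", labels)
print("cluster centres:", centroids)
```

The first three points end up in one cluster and the last three in the other, even though nothing in the input said which points belong together.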
In my next post I will start to talk about more specific learning algorithms, how these algorithms work and how we can go about implementing them.