Month: July 2014

#2 – Linear Regression with One Variable

In my last post I talked about two types of machine learning algorithms: supervised and unsupervised. The linear regression model falls under the first category. A regression model is used to predict a real-valued output. Suppose you have a data-set that lists the prices of houses of different sizes, and you want to predict the price of a house whose size is not present in the data-set.

Let's assume that the data-set looks like the one below, and you wish to predict the price for a house of size 1250 sq. ft (I am using the lecture example here).

vlcsnap-2014-07-24-21h22m53s219

The simplest approach would be to draw a straight line, preferably through the middle of all the points on the graph. We could do this easily by eye if there were only a few points on the graph, perhaps four or five. But for a larger number of data points, finding the best line by trial and error becomes too difficult.

Thankfully, there is an algorithm that can do this efficiently for us: “Gradient Descent”. But before I go into its details, let me introduce another concept, the “Cost Function”, which will be used in the gradient descent discussion.

The Cost Function lets us figure out how to fit the best possible straight line to our data. To draw a straight line you need two parameters: the intercept on the y-axis and the slope of the line. So let's denote our straight line with

h = t0 + t*x                                                         equation 1

where t0 is the y-intercept and t is the slope of the line (t stands for theta and h stands for hypothesis). Now we take each point on the x-axis and calculate the corresponding value of the hypothesis ‘h’ using the above equation. These are the values predicted for one particular choice of t0 and t. But they may not give us a line that correctly represents our data.
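As a minimal sketch of equation 1, the hypothesis can be written as a small Python function. The house sizes and the parameter values below are made up for illustration; they are not the lecture data.

```python
def hypothesis(t0, t, x):
    """Value of h on the line with y-intercept t0 and slope t (equation 1)."""
    return t0 + t * x


# Hypothetical house sizes in sq. ft (illustrative values only).
sizes = [1000, 1500, 2000]

# Predictions for one particular guess of the parameters t0 and t.
predictions = [hypothesis(50.0, 0.25, x) for x in sizes]
print(predictions)
```

A different guess for t0 and t gives a different line, and therefore a different list of predictions for the same sizes.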

We solve this problem by using the concept of “Cost Function”. We denote our cost function by

J = 1/(2m) * sum[(h(i) – y(i))^2]  over all points            equation 2

where h(i) is the value of h for the i-th x point and y(i) is the actual y value of the point corresponding to the i-th x value. m is the total number of points on the graph, i.e. the number of training examples.

This cost function tells us how good our line is at representing the data-set. If the points predicted by our line are very far away from the actual data-set values, then the cost function will be very high and we will have to vary the values of t0 and t to get a new line. In the end we want to find the values of t0 and t that minimize our cost function. This is where gradient descent comes into the picture.
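Here is a small sketch of the cost function in Python. It assumes the usual squared-error form: each difference between prediction and actual value is squared, the squares are summed, and the sum is divided by 2m. The three data points are invented so that the result is easy to check by hand.

```python
def cost(t0, t, xs, ys):
    """Squared-error cost J for the line h = t0 + t*x over all points."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        h = t0 + t * x          # prediction for this point
        total += (h - y) ** 2   # squared error for this point
    return total / (2 * m)


# Illustrative points that lie exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

perfect = cost(0.0, 2.0, xs, ys)  # the line passes through every point
poor = cost(0.0, 0.0, xs, ys)     # a flat line at zero misses every point
print(perfect, poor)
```

The line that passes through every point has zero cost, while the flat line has a much larger one, which is the behaviour described above.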

Gradient Descent is our algorithm for minimizing the cost function J.

It does this by starting from an assumed value of t0 and t and then repeatedly changing them until we hopefully reach a minimum of the cost. The equation to calculate the subsequent values of t0 and t is,

gradient_des
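For reference, here is the standard gradient descent update for the squared-error cost, written out in LaTeX. This is my assumption about what the image shows. The symbols follow equations 1 and 2, alpha is the step size, and both parameters are updated at the same time using the old values.

```latex
\begin{aligned}
t_0 &:= t_0 - \alpha \frac{\partial J}{\partial t_0}
     = t_0 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \bigl( h^{(i)} - y^{(i)} \bigr) \\
t   &:= t - \alpha \frac{\partial J}{\partial t}
     = t - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \bigl( h^{(i)} - y^{(i)} \bigr) \, x^{(i)}
\end{aligned}
```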

The alpha in the above equation controls how big a step you take downwards. If alpha is too small, the algorithm needs many iterations and takes a long time to compute; if alpha is too large, the algorithm may overshoot the minimum and fail to converge, or may even diverge. A suitable value for alpha therefore has to be found by trial. Usually it becomes clear whether a value works after running the algorithm and checking whether the cost decreases from one iteration to the next.
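The sketch below shows one way to write gradient descent for the straight-line model in Python. It assumes the standard squared-error cost, whose derivatives give the two gradient terms in the code. The data points, the starting values and the two alpha values are all made up for illustration.

```python
def gradient_descent(xs, ys, alpha, iterations, t0=0.0, t=0.0):
    """Repeatedly update t0 and t to reduce the squared-error cost."""
    m = len(xs)
    for _ in range(iterations):
        # Error of the current line at every point: h(i) - y(i).
        errors = [(t0 + t * x) - y for x, y in zip(xs, ys)]
        grad_t0 = sum(errors) / m
        grad_t = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients use the old t0 and t.
        t0, t = t0 - alpha * grad_t0, t - alpha * grad_t
    return t0, t


# Illustrative points that lie exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# A moderate alpha converges towards t0 = 1 and t = 2.
t0, t = gradient_descent(xs, ys, alpha=0.1, iterations=2000)
print(t0, t)

# An alpha that is too large for this data overshoots further on every step.
t0_bad, t_bad = gradient_descent(xs, ys, alpha=0.5, iterations=20)
print(t0_bad, t_bad)
```

With the smaller alpha the parameters settle very close to the true intercept and slope. With the larger alpha they move further from the minimum on every iteration, which is the divergence described above.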

As an example (the one used in lecture), this is the model when random values of t0 and t are chosen,

initial

And this is the model after four iterations of varying t0 and t. Each of the circular contour lines joins points that have the same cost.

four iter

And this is the final result. The red crosses in the right-hand figures show the path followed by gradient descent to reach the minimum.

final_iter

In this way we have a final prediction model for our data-set, and we can now use it to predict the price for any given house size.

This was Linear Regression with one variable. In my next post I will talk about Linear Regression with Multiple Variables.

Stay tuned….

 

 


Create a simple program using NI LabVIEW

The first program that my mentor told me to develop here at NI was a simple calculator that could perform basic mathematical operations. The aim was to help new interns get an idea of the LabVIEW environment and its various features. Developing programs in LV is rather different from the traditional method of writing lines of code and compiling them. LV uses an interactive graphical method to develop complex software. You drag and drop blocks that perform specific functions and connect them together with wires to get the desired output.

Now there may be a lot of tutorials and VIs (programs in LV are saved with a *.vi extension and are known as Virtual Instruments) on the internet that will let you do this. But I am sure that the calculator that I am going to show you is different from them. At least I was not able to find any similar program on the internet.

So let's begin!

The first thing I did was to create a while loop with a nested event structure. All the button clicks and the inputs that the user enters are handled by this event structure. I went for a simple minimalist interface with the following components on the screen:

  1. A numeric control that is used for taking the input from the user.
  2. A numeric indicator that displays the result.
  3. A string indicator that displays the sequence of operations.
  4. Seven buttons for performing basic mathematical operations.
  5. An Enter button to perform calculations.
  6. A Refresh button to clear the previous operations.
  7. An Exit button that will end the program.

To perform the operations I used a case structure. All the operation buttons were connected to a ‘Build Array’ function, which was connected to a ‘Boolean Array To Number’ function, which was finally connected to the Case Selector. In this way I could control the case structure with the Boolean values of the buttons. Below is an image which shows what I mean.
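LabVIEW code is graphical, so here is only a rough text sketch of the same idea in Python. The button order, the four operations and the default case are hypothetical; the actual VI has seven operation buttons. The bit order follows LabVIEW's Boolean Array To Number, where the first array element is the least significant bit.

```python
def boolean_array_to_number(bits):
    """Read a list of booleans as a binary number, first element = lowest bit."""
    return sum(1 << i for i, pressed in enumerate(bits) if pressed)


# Hypothetical button order: add, subtract, multiply, divide.
# Each key is the selector value when exactly that one button is pressed.
OPERATIONS = {
    1: lambda a, b: a + b,
    2: lambda a, b: a - b,
    4: lambda a, b: a * b,
    8: lambda a, b: a / b,
}


def calculate(buttons, a, b):
    """Pick a case from the button states, like wiring to the Case Selector."""
    selector = boolean_array_to_number(buttons)
    operation = OPERATIONS.get(selector)
    if operation is None:
        # Default case: no button (or an unsupported combination) pressed.
        return a
    return operation(a, b)


# Third button (multiply) pressed.
result = calculate([False, False, True, False], 6.0, 7.0)
print(result)
```

Because each button maps to its own bit, every single-button press produces a distinct selector value (1, 2, 4, 8, and so on), which is what lets one case structure handle all the buttons.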

caseStruc

To implement the Refresh button I simply used a Select function which took the output of the Refresh button as the selection criterion (s). Its implementation is shown below.

Refresh

This means that whenever I pressed the Refresh button, zeros were taken as input and the output window would show 0. I displayed the Sequence of Operations in a similar way. The only difference was that since it was a String indicator, it had to select between the space character and the output of a Concatenate Strings function, which kept appending each successive operation to the present string. I used shift registers to keep track of the previous values.

I also used a shift register to feed my result back into the case structure as the second operand. The bottom-most yellow wire in the last image is actually coming from a shift register on the left. Here is the image to bring some clarity:
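The Select function and the shift registers can be imitated in text form too. The Python sketch below is only an analogy: the event tuples and the two supported operators are hypothetical, and the real VI is driven by button events rather than a list. The loop variables `result` and `history` play the role of the shift registers.

```python
def select(s, t_value, f_value):
    """Mimic LabVIEW's Select: return t_value when s is True, else f_value."""
    return t_value if s else f_value


def run_calculator(events):
    """events: list of (refresh_pressed, operator_symbol, operand) tuples."""
    result = 0.0    # shift register holding the previous result
    history = " "   # shift register holding the sequence of operations
    for refresh, symbol, operand in events:
        # The previous result is fed back in as the second operand.
        if symbol == "+":
            computed = operand + result
        elif symbol == "*":
            computed = operand * result
        else:
            computed = result
        appended = history + f"{symbol}{operand} "
        # Refresh selects the cleared values instead of the computed ones.
        result = select(refresh, 0.0, computed)
        history = select(refresh, " ", appended)
    return result, history


events = [
    (False, "+", 5.0),
    (False, "*", 3.0),
    (True, "", 0.0),    # Refresh pressed: result and history are cleared
    (False, "+", 2.0),
]
result, history = run_calculator(events)
print(result, repr(history))
```

Each pass through the loop reads the values stored by the previous pass, which is what a shift register does on a LabVIEW while loop.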

shift

This is pretty much all there is to the operations part.

Next I decided to add a nice touch to my calculator by adding a custom wallpaper. To do this you right click on the front panel scroll bar and select the Properties option. Then in the background tab you can select a custom image as your wallpaper.

 

This was finished quite quickly. So to explore even more features I decided to create a standalone application that can run without first starting LV on your system. Note that this part requires the LV Run-Time Engine to be installed on your system.

To do this, select the ‘Build Application (EXE) from VI’ option under the Tools menu. I could go through this step by step, but luckily I found a great video on YouTube that explains it. Here is the link to it:

By the end of this video you can easily create a working standalone application.

Here is how my calculator finally turned out:

Final

 

And here is the block diagram:

block

I also made use of some custom buttons for my Exit button which I downloaded from ni.com (https://decibel.ni.com/content/docs/DOC-16709). You can look around on the internet for some more options and features.

You can access and download my program from https://decibel.ni.com/content/docs/DOC-38860

Stay tuned for more…

 

How to teach a metal box?

Machine learning is one of the most exciting recent technologies. Every time you Google something or Facebook recommends a photo tag, machine learning comes into the picture. If there is an application which you just cannot program by hand, say writing a computer program to make a helicopter fly, we use machine learning to let the computer learn how to fly the helicopter by itself. Most of natural language processing and most of computer vision today is applied machine learning.

So how do we define Machine Learning?

Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, defines it as:

“the field of study that gives computers the ability to learn without being explicitly programmed”

Here is a slightly more recent definition by Tom Mitchell, who is currently the Chair of the Machine Learning Department at Carnegie Mellon University:

“a computer program is said to learn from experience E, with respect to some task T, and some performance measure P, if its performance on T as measured by P improves with experience E”

For example, suppose you have a spam filter. Its task T would be to categorize each mail as spam or not spam; the experience E would be the mails that you yourself mark as spam or not spam; and its performance P would be the number of mails that it classifies correctly. If you are using an effective algorithm, your filter should make more and more correct choices over time.

There are several different types of learning algorithms. The two main types are supervised learning and unsupervised learning.

In supervised learning we give the algorithm a data set and also tell it what the correct answers are. All the algorithm has to do is find a pattern in the given values so that it can predict the outcome for the next set of inputs. When the value to be predicted is continuous, such as a price, this is known as a regression problem. To explain this I will use the example given by Prof. Andrew Ng in class.

vlcsnap-2014-07-20-17h10m57s84

So suppose you have a data set with the prices of houses of different sizes. You want your program to predict the price of a house when it is given the size. To do this you first tell the algorithm the prices for the different sizes and then let it predict the price for any other input size. You then tell the algorithm how far its prediction was from the correct price. This way, the algorithm will try to correct itself, learn to draw a line that fits the known values as closely as possible, and eventually give good predictions.

So for each example in Supervised Learning, we are told explicitly what the so-called right answer is.

In Unsupervised Learning, we’re given data that doesn’t have any labels. So we’re given the data set and we’re not told what to do with it and we’re not told what each data point is. Instead we’re just told, here’s a bunch of data. I don’t know what’s in this data. I don’t know who is in what type. I don’t even know what the different types of people are, but can you automatically find some structure in the data? An example of this is Google News. The algorithm automatically groups similar news headlines on various topics into clusters by itself.

In my next post I will start to talk about more specific learning algorithms, how these algorithms work and how we can go about implementing them.

Hello World – customary first post

So I recently started to delve into the interesting topic of Machine Learning. I have started the Machine Learning course on Coursera, which is taught by Prof. Andrew Ng from Stanford University. Pretty soon I realized that the topic is vast, and because the material is released weekly, recalling old lectures and concepts was becoming more and more difficult.

So here it is! My first blog ever.

In this blog I will try to keep a record not only of what is being taught in the Machine Learning class but also of all the other courses that I am taking through various MOOC platforms like edX, Udacity and of course, Coursera. I will also try to put up some guides on other topics like LabVIEW, since I am also an intern at National Instruments – R&D, Bangalore. There might even be some posts and updates on topics like Statistics and R Programming and maybe some posts on my life (well, it's my blog after all :P).

Stay tuned for more …