Video Description: Herb Gross illustrates how to invert a square matrix and its relevance to Calculus of Several Variables.
Instructor/speaker: Prof. Herbert Gross
Lecture 3: Inverting a Matrix
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR: Hi. Today's lesson will complete our first excursion into the subject of linear algebra for the sake of linear algebra. And next time, we will turn our attention to applications of linear algebra and linear equations to functions of several variables in the cases where the functions are not necessarily linear.
Now, you may recall that we concluded our last lesson on the note of talking about inverse matrices, pointing out that an inverse matrix existed provided that the determinant of the entries making up the matrix was not 0-- otherwise, the inverse matrix didn't exist-- but that we then asked, once we know this, how do we actually construct the inverse matrix? And what I hope to do in today's lesson is to show you a rather elegant manner for inverting a matrix, and at the same time, to point out several other consequences that will be helpful in our study of calculus of several variables in other contexts as we proceed.
At any rate, today's lesson is called quite simply Inverting a Matrix. And to start with a specific example in front of us, let's construct the inverse matrix A inverse, where A is the matrix whose entries are 1, 1, 1, 2, 3, 4, 3, 4, 6. Keep in mind, among the various ways of motivating matrix algebra, we chose as our particular illustration the coding system of the matrix representing the coefficients in a system of m equations with n unknowns. In this particular case of a 3 by 3 matrix, we view the matrix A as coding the system of three linear equations and three unknowns, given by system 1. I call it equation 1. We have this system of three equations with three unknowns.
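For reference, here are the matrix and the system it codes, written out (reconstructed from the coefficients read in the lecture):

```latex
A = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ 3 & 4 & 6 \end{pmatrix},
\qquad
\text{(1)}\;
\begin{cases}
y_1 = x_1 + x_2 + x_3 \\
y_2 = 2x_1 + 3x_2 + 4x_3 \\
y_3 = 3x_1 + 4x_2 + 6x_3
\end{cases}
```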
By the way, observe that this looks very much like a problem that we tackled in high school, except for one very small change. In high school, you may remember that we tackled equations of the form like this, provided that the y's were replaced by constants. In other words, we weren't used to thinking of linear systems of equations in terms of functions this way, but we were used to thinking of things like x1 plus x2 plus x3 equals 2, 2x1 plus 3x2 plus 4x3 equals 9, 3x1 plus 4x2 plus 6x3 equals 13. And then the question was to solve specifically for x1, x2, and x3. All right?
Now, the only difference we're making now is that we're not going to specify what y1, y2, and y3 are. We'll mention the relationship between our problem and the high school one later in the lecture. But what we are saying is this. We would like, in general, given these three equations and three unknowns, this system of linear equations, we would like to be able to solve for the x's in terms of the y's. And that is what I'm going to show you is identified with inverting the matrix.
What I hope to show you as we go along is that if the matrix A represents these coefficients, assuming that we find the technique whereby we can express uniquely for the x's in terms of the y's, the coefficients that express the x's as linear combinations of the y's-- those coefficients involving the y's-- will turn out to be A inverse. We'll talk about that more later. I simply wanted to tell you that now so that you'd have a head start into what my plan of attack is going to be.
First of all, let me mention a method which you may have seen previously in school-- a very effective method for solving systems of linear equations in a thoroughly systematic way. The idea is something like this.
We know that if we replace one equation-- well, let's put it this way. If we add equals to equals, the results are equal. If we multiply equals by equals, the results are equal, et cetera. Consequently, given a system of equations like this, we know that if we replace one of the equations by that equation plus a multiple of any other equation-- if we add two equations together, things of this type-- then while we change the system, we don't change the solutions to the system. In other words, there are devices whereby we can systematically replace the given system of equations, hopefully, by simpler systems of equations, where the simpler system of equations has the same set of solutions as the original system.
And the method that I'm thinking of is called diagonalization. For example, what we often say is something like this. We say, look, wouldn't it be nice to get a system in which x1 appears only in the first equation? And one way of doing that, notice, is we observe that in the second equation, we have 2x1. In the first equation, we have x1. If we were to subtract twice the first equation from the second equation and use that for our new second equation, notice that the new second equation would have no x1 term in it, because 2x1 minus 2x1 is 0.
Similarly, since the leading term in the third equation is 3x1, we observe that if we were to subtract three times the first equation from the third, the resulting third equation would have no x1 term in it so that the new system would have x1 in the first equation and no place else. Similarly, we could move over now to our new second equation, and then eliminate x2 from every equation but the second by subtracting off the appropriate multiple of the second equation.
Now, what I'm going to do here is that, since it gets kind of messy working with the y1's, y2's, and y3's and it stretches out all over the place, let me invent a new type of coding system. I will take these three equations and three unknowns and represent it by a 3 by 6 matrix as follows. I will start up with the following coding system. I will use place value. The first three columns will represent x1, x2, x3, respectively. The next three columns will represent y1, y2, and y3, respectively.
I'll assume an equal sign divides this. And this will simply be a coding device for reading what? x1 plus x2 plus x3 equals y1. 2x1 plus 3x2 plus 4x3 equals y2. 3x1 plus 4x2 plus 6x3 equals y3. Notice why I say y3. It's 0y1 plus 0y2 plus 1y3, which, of course, is y3. Notice also that the left side of this 3 by 6 matrix is precisely the matrix A that we're investigating back here. I just want that for future reference.
Now, let me go through this procedure. What I will now do is subtract twice the first equation from the second and make that my new second one. And I will subtract three times the first equation from the third to give me my new third equation, which means, in terms of the matrix notation, that I am replacing the second row of this matrix by the second row minus twice the first row. I'm replacing the third row by the third row minus three times the first row. I now wind up with this particular 3 by 6 matrix.
Again, just by brief review, what this matrix system tells me, for example, is, among other things, that x2 plus 2x3 is minus 2y1 plus y2. You see, not only do I have a coding system here, but the right hand side allows me to check how I got the resulting matrix on the left hand side. It tells me how the y's had to be combined to give me this combination of the x's.
At any rate, I now want to eliminate the 1 every place in the second column except in the second row. So looking at this matrix now, what I will do is I will replace the first row by the first minus the second-- see, 1 minus 1 is 0. So my new first row will have a 0 here. I will then replace the third row by the third minus the second, again, meaning what? That my new third row will have a 0 here. Leaving the details for you to check for yourself, the resulting 3 by 6 matrix now looks like this.
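The row operations just described are mechanical enough to automate. Below is a minimal Python sketch of the procedure, not the lecture's notation: the function name `invert_by_row_reduction` is mine, exact fractions are used to avoid rounding, and a row swap is added for safety even though this example never needs one.

```python
# A minimal sketch of the augmented-matrix method: form [A | I],
# then row reduce until the left half becomes the identity.
from fractions import Fraction

def invert_by_row_reduction(A):
    """Return the inverse of A, or None if a row of zeros appears
    on the left half (in which case no inverse exists)."""
    n = len(A)
    # Build the n by 2n augmented matrix [A | I] with exact fractions.
    M = [[Fraction(A[i][j]) for j in range(n)]
         + [Fraction(1 if j == i else 0) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry here.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # the left half cannot reduce to the identity
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the diagonal entry becomes 1.
        p = M[col][col]
        M[col] = [entry / p for entry in M[col]]
        # Subtract multiples of the pivot row to clear the column elsewhere.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [M[r][j] - factor * M[col][j] for j in range(2 * n)]
    # The left half is now the identity; the right half is A inverse.
    return [row[n:] for row in M]

A = [[1, 1, 1], [2, 3, 4], [3, 4, 6]]
inverse = invert_by_row_reduction(A)
print([[int(v) for v in row] for row in inverse])
# [[2, -2, 1], [0, 3, -2], [-1, -1, 1]]
```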
By the way, again notice what this tells me. Among other things, this tells me that x3 is minus y1 minus y2 plus y3, which somehow tells me that to find x3, I essentially have to do what? I can subtract the sum of the first and second equation from the third equation, and that will give me x3.
By the way, just as a quick look over here, just to show you how nice this technique is, maybe somebody who had been very quick could have looked at this system of equations and said, you know, if I add the first two equations, this will give me a 3x1, this will give me a 4x2, and therefore, if I subtract that from the third equation, the x1 term and the x2 terms will drop out, et cetera. The thing I want to point out is that this matrix system makes geniuses out of all of us, that we do not have to be able to see these intricacies to be able to get down to a stage like this.
By the way, again, what this tells me is I now have a simpler method for solving this original system of equations. Namely, I can now solve this system of equations-- the one that's coded by this-- and the solutions of this equation will be the same set as the solutions to my original equation.
At any rate, continuing this diagonalization method still more, what I now do is I try to replace what? In the third column, I try to get 0's everywhere except in the third row. And to put that in still other words, I guess, in the language of matrices, what I'm going to try to do is get what? The three by three identity matrix to make up the first half over here-- that's in terms of the matrix language.
Now again, to show you quickly what I'm trying to do here, I'm simply going to do what over here? I'm going to replace the first row by the first plus the third. I'm going to replace the second row by the second row minus twice the third. I now wind up with this particular matrix.
I claim that the left half of this matrix is the identity matrix, and I claim that the right half is A inverse. That's what my claim is. And by the way, let's see why I say that. What system of equations does A inverse code? Just read what this says. It says x1 is equal to 2y1 minus 2y2 plus y3. x2 is equal to 3y2 minus 2y3. And x3 is minus y1 minus y2 plus y3.
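In symbols, the claim is that the row reduction has produced

```latex
A^{-1} = \begin{pmatrix} 2 & -2 & 1 \\ 0 & 3 & -2 \\ -1 & -1 & 1 \end{pmatrix},
\qquad
\text{(2)}\;
\begin{cases}
x_1 = 2y_1 - 2y_2 + y_3 \\
x_2 = 3y_2 - 2y_3 \\
x_3 = -y_1 - y_2 + y_3
\end{cases}
```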
In other words, A inverse codes this particular system. Let's call that system 2. How are system 2 and what we call system 1 related? The relationship was that system 1 expressed the y's in terms of the x's. System 2 shows what it would look like if the x's were expressed in terms of the y's. And so what we have done, in the language of-- whatever you want to call it-- but what we've done is we've inverted the role of the variables, that we started with the y's given in terms of the x's. We now have the x's expressed in terms of the y's.
Now, since you may want to see this more in matrix language, let me show you another way of seeing the same result that uses the word inverse in the way we're used to seeing inverse matrix defined in terms of our present course. The matrix algebra interpretation is simply this. Let's rewrite system 1 so that you can see it here.
And the interesting point is the following. Think of these y's as forming a column vector or column matrix-- whichever way you want to read this thing. In other words, I am going to view this as being the matrix which has three rows and one column, namely the matrix y1, y2, y3.
I am now going to take the matrix of coefficients here, which, as we recall, is the matrix A. But that's what? 1, 1, 1, 2, 3, 4, 3, 4, 6. And I am now going to write x1, x2, x3 as a column vector rather than as a row vector. And in turn, I can think of that as being a column matrix-- again, 3 by 1 matrix-- three rows, one column.
Now, what I claim is that this system of three equations and three unknowns, so to speak, is equivalent to the single matrix equation given by this. In fact, to make this easier to read, let's let capital Y denote the column matrix y1, y2, y3. Let's let capital X denote the column matrix x1, x2, x3, and capital A, as before, the original matrix.
And my claim is that this system of three equations and three unknowns is represented by the single matrix equation capital Y equals A times capital X. And just to show you by way of a very quick review why this is the case, remember how we multiply two matrices. To find the term in the first row, first column, we multiply the first row of the first by the first column of the second. And since that's supposed to equal this matrix, and since matrices are equal only if they're equal entry by entry, it means that this product must equal y1.
Notice what that says. It says x1 plus x2 plus x3 equals y1. Similarly, 2 times x1 plus 3 times x2 plus 4 times x3 is going to have to equal y2. And similarly, 3 times x1 plus 4 times x2 plus 6 times x3 is going to have to equal y3. And that's precisely the original system of equations that we began with.
Now, notice that this expresses the matrix Y as something times the matrix X. If A inverse exists, multiply both sides of this matrix equation by A inverse. On the left hand side, we get A inverse Y. On the right hand side, A inverse times AX is, by associativity, A inverse A times X. A inverse times A is the identity matrix. The identity matrix doesn't change the given matrix that it's multiplying. So we have that A inverse Y is just X.
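Written out, the chain of steps is

```latex
A^{-1}Y = A^{-1}(AX) = (A^{-1}A)X = IX = X .
```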
And notice that the role of A inverse is, again, just as mentioned above, that what we've done is we have started with Y given as a matrix times X. We now have X expressed as a matrix times Y. In fact, if you now take the matrix that we called A inverse originally and carry out this operation, we will see that this is precisely what does happen. This is what the identification is.
In other words, the system of equations is identifiable with a single matrix equation. And that matrix equation can be inverted, meaning we can solve for X in terms of Y if and only if A inverse exists.
There is still a third interpretation, an interpretation that goes back to the idea of mappings. And that is we can think of the original system 1 as a mapping that carries three dimensional space into three dimensional space. How does it carry three dimensional space into three dimensional space? Well, we view it as carrying the 3-tuple x1, x2, x3 into the 3-tuple y1, y2, y3 by the system of equations defined by system 1.
And to show you what I mean by that in more detail, suppose I pick x1 to be 1, x2 to be 1, x3 to be 1. Just for the sake of argument, suppose I picked that. x1, x2, x3 are all 1. What this means is-- look. Come back to the original system, replace x1, x2, and x3 by 1. If we do that, we see that y1 is 3, y2 is 9, and y3 is 13. Therefore, we would say what? That f bar maps 1, 1, 1 into 3, 9, 13.
By the way, you're going to notice this sooner or later. This bothered me when I first did this. This is just a little aside. You may have noticed that when I was arbitrarily writing down numbers to pick for y1, y2, and y3 when we first started at the beginning of the lecture, I said why don't we let y1 be 2, y2 be 9, and y3 be 13. And it looks almost like I made a mistake and meant to get 2, 9, 13 back again. I got 3, 9, 13.
Notice that to get 2, 9, 13, I couldn't pick x1, x2, and x3 all to be 1, but I could pick x1 to be minus 1, x2 to be 1, x3 to be 2. And then what that says is if I let x1 be minus 1, x2 be plus 1, and x3 be 2, it says that, under those conditions, y1 would have been 2, y2 would have been 9, and y3 would have been 13.
Of course, the question that you now may ask, which I hope gives you some sort of a cross reference to go by, is how did I know? See, starting with 1, 1, 1, it seemed very easy to find the image 3, 9, 13. Now, I say, starting with the image, how do I find that this is what mapped into 2, 9, 13? How did I know? See, here it was easy. I started with the input x bar. But here, I'm starting with the output and I'm trying to find the input. How did I find minus 1, 1, 2 that fast?
And the answer is that's precisely where the role of the inverse matrix comes in. You see, going back to what we did just a little while ago, when we inverted system 1 to get system 2, what did we do? We showed how to find x1, x2, x3 once y1, y2, and y3 were given. So all I had to do here was, knowing that I wanted y1 to be 2, y2 to be 9, and y3 to be 13, I just shoved those values in here and saw what x1, x2, and x3 were.
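As a quick numerical check of both directions, here is a short sketch, assuming NumPy is available (the variable names are mine):

```python
import numpy as np

A = np.array([[1, 1, 1], [2, 3, 4], [3, 4, 6]])
A_inv = np.array([[2, -2, 1], [0, 3, -2], [-1, -1, 1]])

# Forward direction: the input (1, 1, 1) maps to the output (3, 9, 13).
print(A @ np.array([1, 1, 1]))       # [ 3  9 13]
# Inverse direction: the output (2, 9, 13) came from the input (-1, 1, 2).
print(A_inv @ np.array([2, 9, 13]))  # [-1  1  2]
```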
In fact, this is why working with the y's is an improvement over what the high school system was of picking the y's to be constant. You see, once I've done this, I can find what x1, x2, x3 look like in terms of the y's all in one shot, regardless of what values of y somebody is going to give me to play around with later on.
But returning now to our immediate problem, notice all I'm saying is that when I'm given a system of n linear equations and n unknowns-- or even m equations and n unknowns; in this case we had three linear equations and three unknowns-- I can view that in a very natural way as a mapping from three dimensional space into three dimensional space.
Notice that the matrix of coefficients A and the mapping f bar-- and I'll coin a word here, because I don't know exactly how I want you to see this-- are identifiable. They're essentially equivalent. Notice that, given x bar, I want to apply f bar to that to find y bar. Notice that, in terms of our matrix language, we showed that, after all, the matrix capital Y, which had entries y1, y2, y3, is just a different numeral for writing this vector.
This is a 3-tuple. This is the same 3-tuple written as a column rather than as a row. x bar is a 3-tuple, but capital X is that same 3-tuple written as a column vector rather than as a row vector. But notice the identification: saying that y bar is f bar of x bar conveys the same information as saying that capital Y is the matrix A times capital X.
These convey equivalent pieces of information. In still other words, notice that, in finding the images of f bar here and inverting the images, I use precisely the same device that I used in the system without giving the mapping interpretation. In other words, these do convey the same information, so that from a mapping point of view, we can identify the matrix A with the mapping f bar.
And you see, what this means is that if we can identify A with f bar, we should be able to identify A inverse with the inverse function. See, that's the other way we've used the word "inverse"-- as an inverse function. The existence of A inverse is equivalent to the existence of f bar inverse.
And by the way, we showed that. You see, using the system of equations that we started with, we showed that if x1 is minus 1, x2 is 1, and x3 is 2, then y1 would be 2, y2 would be 9, and y3 would be 13. We also showed that if we start with y1 equals 2, y2 equals 9, and y3 equals 13, then the x's had to be given by minus 1, 1, and 2.
In terms of the mapping, what we're saying is not only does minus 1, 1, 2 map into 2, 9, 13, but 2, 9, 13 can only be obtained from the 3-tuple minus 1, 1, 2. You see, in essence, in this particular case, the existence of A inverse showed that f bar was onto. We could get any 3-tuple we wanted. And it was also 1 to 1.
The important point in terms of our theory of matrices is that, since f bar inverse need not exist for a given f, A inverse need not exist for a given matrix A. In other words, if we're identifying the matrix with the function, if the inverse function doesn't exist, the inverse matrix doesn't exist. In other words, not all matrices are invertible. And let's emphasize that.
We knew that from before, but let's emphasize this now in terms of our structure. Namely, let's take a new example. Let's try to invert a matrix which doesn't have an inverse. Obviously, if I try to do that, I should fail dismally. And let's see how that does work out.
Let me take the matrix A to be 1, 1, 1, 2, 3, 4, 3, 4, 5. I deliberately called this A again, even though we've used A before. I've deliberately made this A look very much like the A we worked with before. If you compare this with our original A, the only difference is I now have a 5 here rather than a 6. I simply wanted to show you how a subtle change can affect the existence of an inverse.
Again, by way of very quick review, before we go any further, notice that this matrix codes what system of equations? This particular system of equations. And so that you don't see any hanky panky going on here and wonder why I knew that something was going to go wrong, let me show you in advance why I can't invert this, then show you how our matrix coding system gives us the same information, even if we weren't bright enough to see this.
Just for the sake of argument, suppose I had a keen enough eye to observe right away that if I added the first two equations, I would get that y1 plus y2 is what? 3x1 plus 4x2 plus 5x3. And now I look at this result, compare that with my third equation where the right hand sides are identical, and conclude from this that this is not really three equations with three unknowns. This is really two equations with three unknowns, because this equation here is either redundant or incompatible, meaning notice that these two facts tell me that y1 plus y2 has to equal y3.
To show you what I mean by that, let me pick specific values for y1, y2, and y3, in which y3 is not equal to y1 plus y2. And to keep this as close to the problem that we were working with originally, let me pick y1 to be 2, y2 to be 9, and y3 to be 13. Notice that this system of equations cannot possibly have a solution now. This system of equations here cannot possibly have a solution, because if it had a solution, it would imply what? That if I add these two equations, it would say that 3x1 plus 4x2 plus 5x3 equals 11-- if I add these two equations.
That would say it also has to equal 13. And since 11 is equal to 13-- and I have to correct a mistake I made in an earlier lecture-- it's for large values of 11 that 11 is equal to 13. But other than that, 11 is unequal to 13. This means that it's impossible to find solutions to this equation.
Well, we'll come back to that in a little while and we'll also emphasize this in the exercises. But the main point now is to show how we could have obtained this information by our matrix coding system. And we do exactly the same thing as we did before. We take this matrix and we augment it by the 3 by 3 identity matrix.
Remember again, what our coding system says is that the first three columns represent the x's, the next three the y's. For example, this says x1 plus x2 plus x3 equals y1, et cetera. We now go through what I call the row reducing operations, which means what? We're going to replace rows by the rows plus or minus a suitable multiple of another row so that we do what? We wind up with a 1 eventually only here and 0's elsewhere, a 1 here, 0's here, a 1 here and 0's here-- if we can do that.
Let's see what happens here. What I'm going to do, of course, is I'm going to replace the second row here by the second minus twice the first. I'll replace the third row by the third minus 3 times the first. I now wind up with this equivalent matrix, that this matrix codes the same system of equations as the original matrix.
Now, I'm going to replace the first row by the first row minus the second. I will replace the third row by the third row minus the second. And since I already have a 1 in here, I will leave the second row intact. And if I do that, I now wind up with this 3 by 6 matrix.
And now a very interesting thing has happened-- maybe alarming, maybe unhappy, but nonetheless interesting. I observe that on the left hand side of my 3 by 6 matrix here, I have a row consisting entirely of 0's, that in row reducing this, I have lost any chance of getting a 1 over here because this whole row has dropped out. See, one of two things had to happen when I row reduced. Either when I finished, I would have had the identity matrix here, or else I would have had, in this left hand half, at least one row of 0's. In this case, I got the one row of 0's.
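Running the earlier `invert_by_row_reduction` sketch on this matrix detects exactly that failure:

```python
A_singular = [[1, 1, 1], [2, 3, 4], [3, 4, 5]]
# The left half develops a row of zeros, so the sketch reports no inverse.
print(invert_by_row_reduction(A_singular))  # None
```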
Let's translate what this thing says. It says x1 minus x3 is 3y1 minus y2, and x2 plus 2x3 is minus 2y1 plus y2. And it also says what? That 0x1 plus 0x2 plus 0x3-- in other words, 0-- is equal to minus y1 minus y2 plus y3. And let's write that down to emphasize this.
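In accentuated form, the reduced matrix codes

```latex
\begin{aligned}
x_1 - x_3 &= 3y_1 - y_2 \\
x_2 + 2x_3 &= -2y_1 + y_2 \\
0 &= -y_1 - y_2 + y_3
\end{aligned}
```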
You see, what we really have from the first two equations is this system here-- two equations but in three unknowns-- x1, x2, x3. We're assuming that the y's are knowns now. The third piece of information I've written in an accentuated form to emphasize the fact that the third equation tells us under what conditions we're in trouble as far as the choice of the y's is concerned. Namely, what this tells us is that minus y1 minus y2 plus y3 has to be 0 for us to be able to solve the system at all.
That's the same as saying, of course, that y3 is y1 plus y2. So the first thing we know is that, unless y3 equals y1 plus y2, we cannot express the x's in terms of the y's. Why can't we express them? The system is incompatible. That's exactly, by the way, what we observe over here. Only now, we don't have to be that bright to notice it.
See, here we have to astutely observe that the third equation was the sum of the first two. Our matrix coding system told us right away that what went wrong here was that y3 equals y1 plus y2. So unless y3 equals y1 plus y2, we cannot express the x's in terms of the y's.
The second thing that goes wrong is that even if y3 equals y1 plus y2, we still can't express the x's uniquely in terms of the y's, because-- I say "for example" here, because I could have picked any of the x's as far as that's concerned-- but for example, x3 is independent of y1 and y2. I add "and y3" in parentheses here, because notice in the second case y3 is not independent of y1 and y2. y3 is equal to y1 plus y2, so it's automatically specified as soon as I know y1 and y2.
But here's what I'm saying. If I now take this result here and come back to this system of equations, assuming that y1 plus y2 is y3 so that I now have this constraint here met, notice that I can pick x3 completely at random, independently of what the values of y1 and y2 are, and solve for x1 and x2. In other words, I have a degree of freedom here. The x's are not completely determined by the y's. One of the x's can be chosen at random. It floats around.
OK, so you see what goes wrong here from the point of view of inverting the equations? If the constraint-- meaning y3 equals y1 plus y2-- is not met, you can't invert at all. The system is incompatible. If the constraint is met, the x's aren't completely determined by the y's. You can pick one of the x's at random, and then solve for the remaining two.
Now, what does this mean in terms of our function interpretation? And I think you may enjoy this, because I think it'll show things very nicely. What it means, for example, is this. Suppose I now look at the mapping that's defined by the system of equations that we were just talking about. Notice that what we're saying is that, unless y3 equals y1 plus y2, there's no hope of finding x's that solve that particular equation.
In other words, if it turns out, given this mapping, that y bar is 1, 1, 3, then there is no x bar such that f bar of x bar is equal to y bar, because 3 is unequal to 1 plus 1. You see, again, what we're saying, in terms of the mapping here, is to get a solution, the y's have to be no longer arbitrary but chosen to satisfy a specific constraint. And in the system of equations 3 that we dealt with in this case, that constraint was y3 equals y1 plus y2.
In other words, what this means in terms of the mapping is that f bar is not onto. See, notice, the equation wasn't invertible. A inverse didn't exist. Therefore, f bar inverse shouldn't exist, and for f bar inverse not to exist, it's sufficient that f bar be either not onto, or else not 1 to 1. And we have now shown that f bar is not onto, because nothing maps into 1, 1, 3.
We can go one step further and show that even if the constraint is met-- suppose we pick a y bar where y3 is equal to y1 plus y2, for example, 1, 1, 2. See, 2 is equal to 1 plus 1. Then the system of equations that we had before applies. What was that system of equations that we had before? It was x1 minus x3 equals 3y1 minus y2. x2 plus 2x3 equals minus 2y1 plus y2. The third equation wasn't an equation. It was the constraint that required that y3 equals y1 plus y2. So that's been met.
So we're now down to what? Two equations and three unknowns. First of all, let's take a look and see what happens over here. In this case, y1 is 1. y2 is 1. So x1 minus x3 is simply 2. x2 plus 2x3 is simply minus 1. I can now pick x3 at random and solve for x1 and x2 in terms of that, namely, what? x1 is simply x3 plus 2, and x2 is minus 2x3 minus 1.
In other words, what this tells me is that a whole bunch of vectors are mapped into 1, 1, 2. A whole bunch of 3-tuples-- if you want to use the word "3-tuple" rather than vector-- are mapped into 1, 1, 2 by f bar. How are those vectors chosen? Well, you can pick the third component at random, in which case the first component, x1, is simply x3 plus 2, and the second component, x2, is simply minus 2x3 minus 1, where x3 is the value that you chose here.
Since there are infinitely many ways of choosing x3, there are infinitely many 3-tuples that map into 1, 1, 2. In particular, f bar is not 1 to 1 either, which means, in terms of a picture-- and in fact, let's see how we can work this very comfortably. Why not pick x3 to be 0 for a start? If we pick x3 to be 0, notice that this 3-tuple becomes 2, minus 1, 0. So 2, minus 1, 0-- 2, minus 1, 0 maps into 1, 1, 2.
If we pick x3 to be say, 1, this becomes 3, minus 3, 1. So for example 3, minus 3, 1 also maps into 1, 1, 2. So that's the demonstration that f bar is not 1 to 1. And the fact that nothing maps into 1, 1, 3 is the indication that f bar is not onto.
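The one-parameter family of preimages is easy to exhibit numerically; here is a small sketch, again assuming NumPy (the parameter name `t` follows the lecture's suggestion):

```python
import numpy as np

A_singular = np.array([[1, 1, 1], [2, 3, 4], [3, 4, 5]])

# Pick x3 = t freely; then x1 = t + 2 and x2 = -2t - 1.
for t in [0, 1, 5]:
    x = np.array([t + 2, -2 * t - 1, t])
    print(x, "->", A_singular @ x)  # every choice maps to [1 1 2]
```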
By the way, by choosing a three dimensional space, we can interpret this result geometrically. Now, what I'm going to do here is really optional from your point of view. If you don't like studies of three dimensional space, ignore what I'm saying, otherwise-- I don't mean ignore it. Take it with a grain of salt. Look it over. Read it. It is not crucial to understanding abstractly what's going on. But I thought a visual example like this might be helpful.
Notice that when you're mapping E3 into E3, you're mapping three dimensional space into three dimensional space. By the way, notice the n-tuple notation here. I prefer to write the x1, x2, x3 space rather than the traditional x, y, z space. In other words, f bar is mapping x1, x2, x3 space into y1, y2, y3 space.
And what we have shown so far is that, with our particular choice of function under consideration now, the only way f bar maps something into anything is if that anything has a certain property to it. What does the point y1, y2, y3 have to satisfy in order that we can be sure that something from the x1, x2, x3 space maps into it? And the constraint was what? That y3 equals y1 plus y2.
Now, this is a good time to review our equation of planes in three dimensional space. Sure, we're used to talking about z equals x plus y. But don't worry about that part notationally. y3 equals y1 plus y2 is the equation of a plane. In fact, what plane is it? Instead of going into a long harangue about this, let me just find three points that are on this plane.
In particular, 0, 0, 0 satisfies this plane. So does 2, 2, 4-- see, 2 plus 2 is 4. And 1 plus 3 is 4. So three quick points I could pick off are the origin; 2, 2, 4; and 1, 3, 4. These are three points not on the same straight line. They determine the plane. The image of f bar is this particular plane. In other words, f bar does not use up all of the space. In particular, it uses up just this one plane.
In particular, let's look at our point 1, 1, 2. What did we see before mapped into 1, 1, 2? We saw that any 3-tuple-- x1, x2, x3-- where we picked x3 at random, whereupon x1 had to be x3 plus 2, and x2 had to be minus 2x3 minus 1. Notice that this system of equations defines a line. You see, think of x3 as being a parameter here. If it bothers you, think of x3 as being t.
And what you have is that x1 is t plus 2, x2 is minus 2t minus 1, and x3 is t. That is the parametric form of an equation for a straight line. You see, notice you have only one degree of freedom t, and t appears only linearly.
Now, look. To find the actual straight line geometrically, all I have to do is find two points on that line. To relate this with what we were doing just previous to this, notice that two points that we know are on this line are 2, minus 1, 0 and 3, minus 3, 1.
Let's look at this graphically. Let's locate the points 2, minus 1, 0 and 3, minus 3, 1 in the x1, x2, x3 space. Draw the line that joins those two points. That line l has the property that every single one of the points on l maps into the point 1, 1, 2 in the y1, y2, y3 space.
In other words, our mapping in this case is neither 1 to 1 nor onto. The entire image of this three dimensional space is a single plane. And given a point not in that plane, therefore, nothing maps into that, but given a point in that plane, an entire line from the domain maps in there.
And I simply point this out so that those of you who would like to have a picture of what's happening can see it-- I think this is a very nice way of seeing it. The reason I don't want to emphasize the picture more is simply because we will deal in the notes and the exercises with systems of more than three equations and three unknowns. We will deal with linear mappings, say, from E5 into E5, in which case we can't draw the picture. So the picture is only a nice mental aid in setting things up in the 2 by 2 case or the 3 by 3 case. But other than that, one has to rely on the equivalent of the matrix system for reducing these equations.
So let me summarize, then, what our main aim was and what we have accomplished. Leaving out the motivations of everything for the time being, let me just summarize how we invert a matrix, provided that the matrix inverse exists. Given the n by n matrix A, we form the n by 2n matrix A augmented by the n by n identity matrix. In other words, what matrix is it? The left half is the matrix A itself, with first row a11 up to a1n, down to last row an1 up to ann. And the right half is just the n by n identity matrix.
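Schematically (reconstructed in the notation of the summary), the whole procedure is

```latex
\left(\begin{array}{ccc|ccc}
a_{11} & \cdots & a_{1n} & 1 & & 0 \\
\vdots & & \vdots & & \ddots & \\
a_{n1} & \cdots & a_{nn} & 0 & & 1
\end{array}\right)
\;\xrightarrow{\text{row reduce}}\;
\left(\begin{array}{ccc|ccc}
1 & & 0 & & & \\
 & \ddots & & & A^{-1} & \\
0 & & 1 & & &
\end{array}\right)
```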
We then row reduce this. Remember what we mean by row reduce-- replacing rows by the rows plus or minus an appropriate multiple of another row, hoping to get the identity matrix to come over on the left hand side over here. We row reduce this matrix. And then what we saw was what?
We did it in the 3 by 3 case. We're now doing it in the n by n case. By the way, notice that if this throws you off abstractly, in the 3 by 3 case, the augmented matrix that we formed was a 3 by 6. All we're saying is that in the n by n case, you're tacking two n by n matrices side by side, which makes the resulting matrix n by 2n.
At any rate, what we do is we reduce this n by 2n matrix until one of two things happens. And one of the two must happen. Namely, either the left hand half-- this half-- contains at least one row of 0's, in which case A inverse doesn't exist. The other alternative is that when we row reduce, the left hand side-- the left half, in other words-- does become the identity matrix, in which case, the right half is A inverse.
That's all there is to this mechanically. You row reduce the n by 2n matrix. All right. If the left hand side contains a row of 0's, there is no inverse to the given matrix. If it reduces to the identity matrix, then the right half is the inverse matrix. Part 2 of our summary-- namely, what can we conclude if A inverse exists? In terms of matrix algebra, if A inverse exists, we can solve. In other words, we can invert the equation Y equals AX to conclude that X equals A inverse Y.
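In practice, one rarely forms A inverse explicitly just to solve Y equals AX; a library routine does the elimination directly. A hedged NumPy sketch of part 2, using the numbers from earlier in the lecture:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0], [2.0, 3.0, 4.0], [3.0, 4.0, 6.0]])
y = np.array([2.0, 9.0, 13.0])

# Solve Y = AX for X directly, without constructing A inverse.
x = np.linalg.solve(A, y)
print(x)  # approximately [-1.  1.  2.]
```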
And finally, in terms of mappings, if we define f bar from E sub n into E sub n-- f bar is a mapping from n dimensional space into n dimensional space-- defined by this, then f bar inverse exists if and only if A inverse exists, where A is the matrix of coefficients over here. And if A inverse doesn't exist, then f bar inverse not only doesn't exist, but f bar itself is neither 1 to 1 nor onto.
Now, admittedly, today's lecture was a little bit on the long side. It was, though, in a sense a single concept, hopefully with many interpretations. It is a very, very crucial thing in many of our investigations to do at least one of two things. One thing is going to be we are going to want to know whether the inverse of a matrix exists.
In certain applications, we will not care to know what that inverse matrix is. All we're going to want to know is: does it exist? In that case, it's sufficient to compute the determinant of the matrix. Namely, if the determinant is 0, the inverse doesn't exist. If the determinant is not 0, the inverse matrix does exist.
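A quick determinant check on the two matrices from this lecture, assuming NumPy; note that floating-point determinants of singular matrices come out near zero rather than exactly zero, so compare against a tolerance:

```python
import numpy as np

for M in ([[1, 1, 1], [2, 3, 4], [3, 4, 6]],   # determinant 1: invertible
          [[1, 1, 1], [2, 3, 4], [3, 4, 5]]):  # determinant 0: not invertible
    d = np.linalg.det(np.array(M, dtype=float))
    print(round(d, 6), "invertible" if abs(d) > 1e-9 else "not invertible")
```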
On other occasions, it will not be enough to know that the inverse exists. We will also require that we be able to construct A inverse from the given matrix A. And that, in essence, was the lesson that we were trying to cover today. Starting with our next lesson, we will apply these results to systems of functions of several real variables. And until that time, good bye.
Funding for the publication of this video was provided by the Gabriella and Paul Rosenbaum Foundation. Help OCW continue to provide free and open access to MIT courses by making a donation at ocw.mit.edu/donate.
Study Guide for Lecture 3: Inverting a Matrix
- Chalkboard Photos, Reading Assignments, and Exercises (PDF)
- Solutions (PDF - 2.9MB)
To complete the reading assignments, see the Supplementary Notes in the Study Materials section.