Instructor: Prof. Gilbert Strang
Lecture 32: Convolution (pa...
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation, or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR STRANG: So, thanks for coming today. This is a key lecture in the application of Fourier, you could say. So convolution is the big word. And a major application of convolution is filtering, signal processing. So we'll develop that application. But it's nothing but convolution. So the key idea is these convolution rules, where they come from. And what these new symbols, that's the symbol for the convolution of two functions. They could be functions here. Here they're long vectors of coefficients, and so these are the rules. So it's just a little bit of algebra. But it just is so central to all this subject. Signal processing is certainly the most important little thing to know. That if I multiply two functions, so that's where we started last time. If I have a function f with Fourier coefficients c, and a function g with coefficients d, then, oh, wrong. I convolve the coefficients, right. So f has coefficients c, g has coefficients d. So if I multiply the functions I definitely do not just multiply each c times the d. I do this convolution operation, which we have to remember. That's our main first step is to remember what that was about. And then this is the other direction. So we didn't see this before. That I mean, there's always this fantastic symmetry between physical space and frequency space. And convolution in one is multiplication in the other. It's so easy to remember. If I multiply in one space, I do a convolution in the other space. If I do a convolution of functions, I do a multiplication of coefficients. So that's the rule to know. And now if we expand on it, by sort of seeing again what it means. So let me do a quick repeat of this first step to remember what this symbol, convolution symbol, means.
And then another thing I have to do. Here, I'm talking about the infinite case, functions, and with a whole infinite sequence of coefficients. I've got to do the cyclic case, too. Which goes with the discrete transform. OK, but let's start with the infinite case as we did last time. Maybe, here's something. Here's a suggestion. We had f(x). We started with f(x), as the sum -- and remember, to have nice formulas, we're doing the complex version -- of c_k*e^(ikx). Let me suggest something. Let me write z for e^(ix). So I'm going to just write that as the sum of c_k*z^k. So z is e^(ix). This is like a good point to make anyway. When we have this e^(ikx), it's natural to think of that as a complex number on the unit circle. Right? That's always the message, e to the i real is on the unit circle. Absolute value one. And this is great for periodic functions. Because if these functions have period 2pi, and if we think of our function as being on the circle, it obviously has period 2pi. I mean, the picture says, yes. If I go 2pi, I come back. So x is the angle, right? x is the-- In this picture, a little bit unusual maybe, x there would be the angle. And it's just a little bit easier to write. And maybe it even has an official name, the Z-transform. How do you like that? You learn a transform in just fourteen seconds. Z-transform. It'll make it easier to multiply by g. So g is going to be the sum of d_k*z^k. So just think of these as long, well I'm tempted to say long polynomials. I mean, very long, because they can be infinite series. Infinite negative powers and positive powers. But just think of it as a bunch of powers of z times a bunch of powers of z. And if you multiply a couple of polynomials, you're doing convolution. I guess my message is, you've been doing convolution since the second grade. That's the real message.
Here, let me show you. This is convolution too. Suppose I have to multiply a 123 times 456? OK, so what do you do? Remember back, it's a long way back but we can do this. One, two, three times four, five, six. So I multiply the three, oh there's a little point. Where that second-grade teacher's going to panic. I'm going to write that as 18. And that's 15, and that's 12. Sorry about that, yeah. And 12, 10 and 8, right? Four, five and six. So you see the nine multiplications that you have to do? Nine multiplications, three times three. OK, right now imagine those were, they could have been longer. But they were finite length filters, we could say. And now, what does convolution do? Just what you did in multiplication. When I add 12, 10 and 6, what am I doing? I'm putting together the three times the four, the two times the five, and the one times the six. That's what convolution, those are the things that convolution puts together. So we got an 18, a 27, 28. 13, and 4. So I guess I'm saying that, alright, here's what I really want to say. I want to say that the convolution of those, with these, four, five, six, is this sequence here-- Oh. Yeah, that's right. 4, 13, 28. 27, and 18. If you just look at that. Where did that 13 come from? Let's just remember, where does that 13 come from? That came from, this was z^0. This 13 is 13 z to the first power, right? We're just checking all the powers here. There's 13 z to the first power. Where do we get a first power? We get a z^0 times 5z^1. So that's 5z^1. And we also have 2z^1, times 4z^0. Right? Two times four gave the eight. So the eight from there and the five from there produce that 13. And that's just what you did in multiplication. Right?
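A minimal sketch of that collecting of powers, in Python. The function name `convolve` and the variable names are illustrative choices, not something from the lecture. It multiplies every entry of one list by every entry of the other and adds each product into the slot for its power of z.

```python
def convolve(c, d):
    """Non-cyclic convolution: h[l] collects every c[k] * d[n] with k + n == l."""
    h = [0] * (len(c) + len(d) - 1)
    for k, ck in enumerate(c):
        for n, dn in enumerate(d):
            h[k + n] += ck * dn   # z^k times z^n lands in the z^(k+n) slot
    return h

h = convolve([1, 2, 3], [4, 5, 6])
print(h)  # [4, 13, 28, 27, 18]

# Doing the carrying afterwards (reading the entries as base-10 digits,
# highest power of ten first) recovers the ordinary product.
value = 0
for coeff in h:
    value = 10 * value + coeff
print(value)  # 56088, which is 123 * 456
```

The nine multiplications are the nine passes through the inner loop; the carrying is a separate step that convolution itself never does.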
So that's multiplication of two series. That's not cyclic. This is definitely not yet cyclic, but we'll make it cyclic in a minute. This is the infinite one, except that we had all zeroes beyond. OK, so that if you, in non-cyclic convolution like this, if I have length m and length n, then I get length m+n, maybe m+n-1. OK, so that's convolution. Without carrying numbers. Without doing it right. OK, so and what does that correspond to? Let me just, so you see it every way. That corresponds to 1+2z+3z^2, multiplying 4+5z+6z^2. And it gave-- That's times. And it gave this thing up to 18z^4. Just, exactly the multiplication that you've always done. OK, so that's the idea. Over on that board I'm going to put a formula for this convolution operation. But my point on this board is, you've done it always. When you multiply a couple of polynomials, you collect powers. And that's all convolution is doing, collecting each power separately. OK, let's do it. So then f(x)g(x) is, when I multiply that polynomial or series by that polynomial, I get some polynomial in, with coefficients, oh, I've changed to l, just to have a different symbol there. And what was the formula for l? For h_l? What is the coefficient of z^l? If I multiply that by that, do you remember the story there? When I multiply that by that and I looked for the terms that gave me z^l? OK, well that means that this power times this power is going to be z^l. So that the index, do you remember what happened? It was a lot of different, just the way-- This h_l is here. Here's h_2 or something. I've got to do an addition. Because a bunch of c's come in with different d's, and what's the deal then? c_k comes in with which d? This is the magic number there. What's the subscript that if I look at the coefficient of z^l, I look at each of these. And then I pick out the one of these that will give me a z^l. And which one is it?
d_(l-k). It's that magic quantity that the eye spots perfectly. k and l-k, adding to l simply because z^k z^(l-k) multiplies to z^l. Same thing. Right? OK, this is, now I'll use that notation. This is c convolved with the d. c convolved with d is h. c convolved with d is h. So this is the l-th component. c convolved with d is my symbol. This h is the convolution. And that's the convolution rule. It's just whatever operation you have to do to get the right answer. The right answer when you multiply. So that one. Ready for the discrete case? The finite case? The case when, you have power, when z becomes w. The discrete case. The cyclic case, sorry. Maybe emphasize the cyclic case, meaning it circles around, is the case when z becomes this very special z, on the unit circle that we know as w. OK, and we have to say, and of course it's w_N, I have to tell you in the cyclic case how long the cycle is. So this would be a case.
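Written out in symbols, the infinite (non-cyclic) rule just described reads as follows. The index n on the second series is a labeling choice made here, so that the two sums can be told apart.

```latex
\[
f(x)\,g(x) \;=\; \Big(\sum_{k} c_k z^k\Big)\Big(\sum_{n} d_n z^n\Big)
\;=\; \sum_{l} h_l z^l ,
\qquad
h_l \;=\; (c * d)_l \;=\; \sum_{k=-\infty}^{\infty} c_k\, d_{l-k},
\qquad z = e^{ix}.
\]
```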
Watch what you do here to make this cyclic. OK, I have three inputs, so N is three here. Now I'm going to do the cyclic. So instead of z's, I should be putting w's. I will. Just to emphasize, it's good to think of it with the w there, because the w has this special property that's critical to everything, OK so now I'm going to, I think of this as 1w^0, 2w^1, and 3 w squared. I'm thinking of the same multiplication here, but now w's. OK, so I'll end up with 18w^4, and four was the constant and 13 w's and so on. All these numbers. What's the difference? Ready for the key point? Now, what's happened in this cyclic case? Well, the difference is what is w^4? If we're in the cyclic case, N is three now, our guys have length three, our circle is, w now is 1/3 of the way around. So that's my w, here's my w squared. Here is my one. But here it is also w cubed. So w is the same as w^4, w^2 is the same as w^5. So, what's the difference? What can I do now? If I'm in this discrete case, then my inputs are a vector of length N, three. A vector of length N, three. And I want to get out to a vector of length three. I'm not happy with that in the cyclic case, because I'm not happy with w^4. So now tell me again that last, when I do the multiplication and I just do it, there's no difference except in how I write the answer. 18w^4 is the same as? w. w^4 is the same as w, when N is three. So that 18w^4 cycles back in, with this 13. So now if I do the, can I show you the symbol? I'll just do a little circle there. To say this is now the cyclic convolution.
Then this isn't the answer any more. The answer now for cyclic convolution, I only want three numbers. And if you can tell me what those three numbers are, we've got it. Move those over a little to make room for the three numbers. So there's the answer, the space for the answer. What do I write in? How many, what's the constant term? It's 31, right. It's 31. Where did 31 come from? It came from, you could say, cycling that 27w^3 back with the four. Because there's no, w^3 is the same as one. So I've gone around the circle when I come around to w^3. So that 27 and 4 combined into that 31. And let's see. So what multiplications am I doing here? One times four gives me the constant. Two times six, that's two w's and six w squareds, that's 12w^3, that's 12. And then 3 w squareds and five w's is 15w^3. But that's the same as 15. So that's why we get 31. We got 4, we've got 12 and we've got 15. OK, now what's the second component, the w component of the cyclic convolution? Tell me what number do I write in, in that middle position? How many w's do I have? 31 again. This 28, this 18 is coming back by three. To 13. You see, I could have done that multiplication. So coming back to 31, am I going to get another 31? No, what's the w^2 guy? 28. Yeah. 28, because there's no w to the fifth to come back. So 28 uses three multiplications. The 31 there used the four and these two. This came back over to here. And this 31 used, this 18 came back. I could have put the 18 here. You know, I could have lined it up just three. So I'll write a formula for it. So that's the answer. 31, 31, 28.
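The same example as a short Python sketch, assuming both inputs have the same length N. The name `cyclic_convolve` is an illustrative choice, not from the lecture. The only change from ordinary polynomial multiplication is that the power k + n is reduced using w^N = 1.

```python
def cyclic_convolve(c, d):
    """Cyclic convolution of two length-N lists, using w^N = 1."""
    N = len(c)
    if len(d) != N:
        raise ValueError("both inputs need the same length N")
    h = [0] * N
    for k in range(N):
        for n in range(N):
            # w^(k+n) wraps around to w^((k+n) mod N)
            h[(k + n) % N] += c[k] * d[n]
    return h

print(cyclic_convolve([1, 2, 3], [4, 5, 6]))  # [31, 31, 28]
```

The constant term collects 1 times 4, 2 times 6 and 3 times 5, which is the 4 + 12 + 15 = 31 worked out above.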
Could I just suggest a little check on that? Just to check on the numbers. I think that if I add up these numbers, I get six. And if I add up those numbers I get 15. And if I multiply that, I get 90. And if I add those numbers I get 90, right? So those add to 90. So I'm just saying the miracle check is add these, multiply by the sum of those. And you get the sum of those. Why is that? Somehow seems right, doesn't it? Because somehow I've taken all nine products here, and so when I add all the results, I'll have the sum of all nine of these possible products. So I'll have 6 times 15. Actually, here's a good way to look at it. In doing this, I just set w=1. I just set w=1 in the polynomial. When I set w to one, this becomes six, this becomes 15 and the answer becomes 90, when w is one. Yeah. So that's another way to see. And actually, this multiplication, the second grade version, had w, well, I'm almost going to say w=10, but not quite that. Because it's written in the opposite order, right? If w was ten, this would be one. Well, anyway, w can't be ten, it's got to stay on the unit circle, so. So somewhere in the non-cyclic case, it's something like w=10, or w=1/10, maybe. Whatever. OK. Could you take the convolution now, let me give you just another example. Do it mentally. What's the convolution of [0, 1, 0, 0], let me make a little longer.
So I take the, first of all, the non-cyclic convolution of [0, 0, 1, 0]. OK. What's the ordinary-- how long is the answer now? This is just practice. How many components am I going to have in the ordinary convolution of those two guys of length four? I think it's seven. I think it'll be seven. Because we'll have here one, we have no, we have z^0, z^1, z^2, z^3. And here we'll have again the same, it would go up to z^6, but remember there's a z^0, so that's why we have seven. And what will it be? What will it be? I guess, actually, this wasn't a brilliant example, was it? But let's finish it. So I'm just multiplying z by z squared, so what do you get for an answer? I think the one shows up in the z cubed. Is that right? Yeah. z to the first power here, z squared here, z cubed here, and we always have to remember everything in Chapter 4 starts at zero. Zero, z^0's the first one. OK, that's not too great an example, because what happens if I do the circular cyclic convolution? What would be the cyclic convolution? Now I'm expecting four guys only, right? The cyclic keeps the same length. And what would be the answer? Well, there's nobody to fold back, so it would be just [0, 0, 0, 1]. So let me update this a little bit with a one here. OK, just to practice. So suppose I do that convolution. Un-cyclic, first. What do I change here now? I've now got a z squared and I've also got a z cubed, but I only have a single one there. Let's make it a little more interesting.
OK, make it like so. Alright, z+z^2 is what we're looking at, z+z^2 is multiplying z^2+z^3. And in the long form, what do I get? Let me make space, tell me what numbers to put in. If I multiply z+z^2 times z^2+z^3, I get what? 1z^3, how many z^4? Two of them. How many z^5? One, and nobody there. OK. And now the cyclic version would be what? What's my answer now for the cyclic version? Let me take those out. So the cyclic version would bring the two back to the zero. Would bring that one back so there. That zero will still be zero. And I checked that I haven't missed anything by adding those up to get four, and adding this up to get two times two. Yeah. OK, so that's the rule. And it's a lot cleaner to see these answers than to see this formula. And I need, actually of course, now mentioning that formula, I need a cyclic formula. So can I write above it the cyclic formula? What do I get when I'm, instead of this sum, which went from k equal minus infinity to infinity, in the cyclic case, h_k is just going to be a sum from zero to N-1, and there'll be a c_k, and a d something. And now this is the cyclic case, so I guess this makes us, I think what our situation now is we understand the cyclic case from examples. And now we just have the job of how do I put it into algebra. How do I put it into symbols? What's the point? c_k d_n, let's say. But oh no, I'm looking for h_l, yeah. So what's the deal? Here l was k. That one is the sum of that one and that one. So here, l is the sum of k and n, but. What's the but? I mean somehow I've got some wraparound to do, right? When I'm doing the cyclic multiplication and I'm doing the wraparound because w^N, the wraparound comes from that, right? That's why I never get as high as N, because when I get to N I go back to the zeroth power.
OK, so what's the relation of k and n and l here? We just need the right word to express it. What's the word? Mod. So that's the word I'm looking for. k+n is l. With wraparound and wraparound means, the nice notation that people use is mod N. Let's practice. What is two plus two mod seven? Four. Two plus two is four, even in 18.085. Right, OK. But two plus two mod three is, two plus two mod three is? One. Everybody sees it? I'm taking z^2 times z^2, z^4, but I'm doing with N=3, so z^3 is one, so that z^4 is really just z to the first power. So this is the little nifty notation that says make it cyclic. Bring it back so that l only has the values here, zero up to N-1. And then stops. OK. OK, so we'll have more practice with examples when we do some filtering. Have you got that fundamental-- So we've talked about this rule one here. f times g goes to those coefficients. And if it's the cyclic case then I put a circle around that star. And I do the wraparound. But it's just, it's the Z-transform, it's polynomials in z or polynomials in w, and when it's polynomials in w, you use that special property that w^N is one. Yeah, OK.
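The wraparound rule, h_l collecting every c_k d_n with k + n equal to l mod N, can be sketched on the four-component example above. This is a minimal Python sketch; the helper names `convolve` and `fold` are illustrative assumptions, not names from the lecture.

```python
def convolve(c, d):
    """Non-cyclic convolution: the coefficient list of a polynomial product."""
    h = [0] * (len(c) + len(d) - 1)
    for k, ck in enumerate(c):
        for n, dn in enumerate(d):
            h[k + n] += ck * dn
    return h

def fold(h, N):
    """Wrap a non-cyclic result around a circle of length N (z^N = 1)."""
    out = [0] * N
    for l, hl in enumerate(h):
        out[l % N] += hl   # power l comes back to power l mod N
    return out

c = [0, 1, 1, 0]   # z + z^2
d = [0, 0, 1, 1]   # z^2 + z^3
long_form = convolve(c, d)
print(long_form)                  # [0, 0, 0, 1, 2, 1, 0]
print(fold(long_form, 4))         # [2, 1, 0, 1]
print((2 + 2) % 7, (2 + 2) % 3)   # 4 1
```

Both answers add up to four, matching the check of two times two in the lecture.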
Now, I see I've written another line, there. That I could convolve functions. Let me do a couple more examples. Couple of examples. First, before I go to that line, OK. So I'm up to this line. A couple of examples here. Let's see, what example would I want to do? Let's see, OK, I want to do one example with a delta function. One example with a delta function. One example with the delta vector. Yeah, let me take the function g(x) identically one. OK, constant function. In this rule. I want to see what happens with the rule. OK, then f(x) g(x) is the same as f(x), right? Because this function g(x) is so simple, it's just one. Now, what about the coefficient? So I have the coefficients of c, what are the Fourier coefficients, what are the d's? Ah yes, what are the d's? So I'm testing my rule on a really really simple case, g(x) identically one. What, you have to tell me, in order to check the right side of the rule you have to tell me the Fourier coefficients for that very special function. What would be the Fourier coefficients? If I expand the function one in a Fourier series, what do I see? I see a one. Yeah, that's it, I see one. So what are its coefficients? d_0, right, is one? And the other d's are all zero, right? So my vector of d, my vector of d's is a whole lot of zeroes on the negative side. A one right there in the center, and then a lot of zeroes. And now I want to convolve that with c. I'm practicing the convolution rule on a case that's so simple it's confusing, right? I mean, it's a big mess, this multiplication.
What do I get out of this? If d is this vector, if d has this property that d_0 is one and others are zero, all others are zero, so this is my little example, what does this sum boil down to? Well, I only get something when l=k, right? I only got something when l=k, because then I have d_0 and that's the only d that's around. So in this sum, something happens only when k and l are the same. And then what happens? Then I have a one, I have c_l, and that's h_l, so that's all I'm concluding then. That this h is the same as c. I'm sorry, it's so dumb. My point is that in convolution, this is the thing that acts like one. Because in multiplication, that's the thing, that's the function that acts like one. That's the function that is one. So this is the one in-- Oh, would you allow me to do this? I'm going to create a matrix with these d's. There's another way to see convolution. Yeah, there's another way to see convolution and discrete convolution. Maybe the discrete one's the better. Yeah can you stand one more way to write the formula? One more way to write, now I'm going to do, I'm going to do discrete convolution. Discrete cyclic. So how am I going to write it? I'm going to write it by a matrix multiplication. Because you know that in this course a matrix was going to show up. So it's going to be a matrix multiplication. So I just have to tell you the matrix, so this is going to be some matrix. Let me take N to be four. So then I have, you watch. So I have four d's, and the output is the four h's. And the rule I'm following is this rule. Is this, the same old rule but with the cyclic part.
And now I want to show you the matrix that'll just do this. Look, I've put the c's in the first column. And then I go, yeah, here's another. So it's a cyclic matrix. So let me finish it up. It's going to be four by four, it's going to be cyclic. So I have a c_0, c_0, c_0, c_0 on the diagonal. That's fine, that's because z to the zero is multiplying all the d's and leaving them in place. And then I have c_1's, and then I think I come around again here for a c_1. And I have c_2's, see, c_3, c_2. And I come around again, to a c_2 and a c_2. And c_3 comes around to a c_3, a c_3 and a c_3. Well, can you, I hope you can see, this is a cyclic matrix. It's only got one, it starts with a vector c, and those are on the diagonal and the diagonals wrap around. That's the other word that you often see when you see the word cyclic, wraparound. It's because you think of a circle. If you go a second time around, it's wrapped around the first time. OK, just can you look and see that this is the right formula for h_0? h_0 is c_0*d_0. Where does that come from? Remember, h_0 is the coefficient of z^0 in the answer. So it comes from c_0*d_0, the zeroth powers in the input. And then why is there a c_3*d_1? Why is there a c_3*d_1, and then a c_2*d_2 and then a c_1*d_3 all piling up into h_0? Tell me now, why is there a c_3*d_1? Because we're doing mod four is one way to say it. Three and one add to four. Because c_3 is the w cubed guy, and d_1 is the coefficient of w^1 and w^3 times w^1 piles back into the constant. And you see the pattern of that matrix? So these matrices are very important. So they circle around.
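A small Python sketch of that cyclic matrix, with made-up numbers standing in for the c's and d's. The function names `circulant` and `matvec` and the sample values are assumptions for illustration, not from the lecture. Entry (i, j) of the matrix is c with index i - j taken mod N, which puts c in the first column and makes every diagonal wrap around.

```python
def circulant(c):
    """N-by-N matrix with c in the first column and wraparound diagonals."""
    N = len(c)
    return [[c[(i - j) % N] for j in range(N)] for i in range(N)]

def matvec(C, d):
    """Plain matrix-vector product."""
    return [sum(row[j] * d[j] for j in range(len(d))) for row in C]

c = [1, 2, 3, 4]   # stands in for c_0, c_1, c_2, c_3
d = [5, 6, 7, 8]   # stands in for d_0, d_1, d_2, d_3
C = circulant(c)
for row in C:
    print(row)
# [1, 4, 3, 2]
# [2, 1, 4, 3]
# [3, 2, 1, 4]
# [4, 3, 2, 1]
print(matvec(C, d))              # [66, 68, 66, 60]
print(matvec(C, [1, 0, 0, 0]))   # [1, 2, 3, 4]
```

The first row reads c_0, c_3, c_2, c_1, so h_0 = c_0 d_0 + c_3 d_1 + c_2 d_2 + c_1 d_3 as in the lecture. The last line is the delta-vector check: a d with a one in position zero and zeroes elsewhere just returns c.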
Oh, we've actually met a matrix of this type, the first day of 18.085. What was that matrix? It was one of our four great matrices. And now here it is back again. Which one was it? Well, you remember the letter for it. Which isn't going to change. And do you remember the particular matrix? Well, everybody does remember that matrix, right? Twos were on the diagonal, minus ones were on this diagonal, and the diagonal of course continued. Minus one was on this diagonal and that continued, and zeroes were on this diagonal. So this is cyclic convolution, the circulant matrix, cyclic convolution by c, what's the c that produces that convolution matrix? It's just, it's got-- well there it is. The first column is it. Right, right. And somehow I would say that that's an even vector. It's sort of, I associate it with cosine. It's an even vector. Here is the zero term, and then these are the same, not to worry about that part. Do you see that we've seen that matrix before? And the cyclic convolution means you take its second differences, of course. We're taking second differences, but everything in our world is cyclic. So the result, the x_4 is x_0. So we're taking second differences -- well, maybe I should say d -- we're taking second differences d_i, 2d_i's, minus d_(i-1) minus d_(i+1). I don't know if this is-- So there's a minus one, two, minus one. And we're cycling around so that d_0 is d_4. And d_1 is d_5, and d_(-1) is d_3, whatever. OK, I'm just reminding you, we've seen these before.
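As a quick check of that remark, here is a Python sketch for N = 4, taking c = [2, -1, 0, -1] as the first column. The sample vector d and both function names are illustrative assumptions. Cyclic convolution by this c should agree with taking cyclic second differences directly.

```python
def cyclic_convolve(c, d):
    """h[l] = sum over k of c[k] * d[(l - k) mod N]."""
    N = len(c)
    return [sum(c[k] * d[(l - k) % N] for k in range(N)) for l in range(N)]

def second_difference(d):
    """2*d[i] - d[i-1] - d[i+1], with indices wrapping around the circle."""
    N = len(d)
    return [2 * d[i] - d[(i - 1) % N] - d[(i + 1) % N] for i in range(N)]

c = [2, -1, 0, -1]   # first column of the circulant second-difference matrix
d = [1, 2, 3, 4]     # arbitrary sample vector
print(cyclic_convolve(c, d))   # [-4, 0, 0, 4]
print(second_difference(d))    # [-4, 0, 0, 4]
```

The entries of c add to zero, so the entries of the answer add to zero as well.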
OK, so this is another way to remember the formula. OK now, can I ask you a practical question? A practical question. Let me bring back this second grade multiplication. Well, I have a granddaughter named Elizabeth, I'll have to admit I didn't think about mentioning Elizabeth. She's six. And she delights in sending me long multiplications. I mean, really long. And then every time I talk to her on the phone, she says have you done that one yet? And I say, I'm working on it. I've got MATLAB at work. Because they're ridiculous and I haven't figured out how to tell her. I mean, she just writes page after page. Times 100, plus three, minus seven, just whatever she thinks of. OK, now I need help from the convolution rule here, actually. So let's suppose that Elizabeth has given me a multiplication in which I have a thousand digits times a thousand, right. Which Mathematica is prepared to do exactly, right? MATLAB will mess up, but Mathematica and Maple and symbolic packages will do exact computations. So what would be the right way, well, let me make it 1,024. 1,024 digits times 1,024 digits. Let's do the cyclic version first. Elizabeth doesn't know about cyclic. Maybe I could teach her that. That'll keep her busy while I'm doing the multiplication. OK, right. Only, her older brother would explain it to her, that's the trouble. OK, so how am I going to do, or how are you going to do on the quiz, multiplication of 1,024 digits times 1,024? And I'll make it easy by making it cyclic, so I just want 1,024 digits in the answer. OK. How would you do it? Well, before today, you would have just multiplied, right? You would have written down 1,024, two lines of 1,024, done an addition. And you would have had a million multiplications to do. But how would you do it now? Apart from giving it to Mathematica. What's a faster way to do it? What's a faster way to do a convolution? The fast way to do a convolution is to use the convolution rule, go this way.
So take these numbers, these 1,024 numbers, in c and these 1,024 numbers in d, and, well what do I have to do? I want to use the convolution rule, because multiplying is fast. Now I've got functions. But I'm in the cyclic case. So I'm in the cyclic case, so what should I do? How can I change this to be the cyclic case? This is like f_j g_j. So multiplication of components of things in function space is convolution of coefficients. So now, this is the cyclic. So let me make it cyclic. So again, what's your problem? The problem is to do this cyclic multiplication. What's the idea? The idea is to transform c back to f, to transform d back to g. Do the multiplications, now I have only 1,024 multiplications. Not 1,024 squared. That's the point. And if I do this directly, I've got 1,024 squared multiplications to do. Much better. Transform back to here, do just 1,024-- what's the MATLAB command for that, when you're multiplying each component by itself? It's not the dot product, notice. It's not the dot product because I'm not summing. Do you know the MATLAB command, if I have a sequence of numbers, a vector f of length 1,024, and I want to get that result? What's the result? It's a vector of length 1,024 that takes each f times its g. But doesn't do any adds. That's what's there. What's the MATLAB command for that? Dot, yeah. Dot star, right. So that dot says component by component.
OK, so what's the plan here? I do c's back to f. By the Fourier matrix. d back to g, by the Fourier matrix, then I do a very quick multiplication. And then what? Then I mustn't forget. That I'm in function space, and what do I have to do? I've got to get back into coefficient space. So I do the inverse transform of-- Here's the formula, then. I'm doing the inverse transform of, so the transform of c dot star, the transform of d. To get c convolved with d. Is that right? So I took c, and I got back into the function. I took d, and got back to its function, with the Fourier matrix. OK, I'm in the Fourier and now I'm in this space. I've added up coefficients to get in this space. Now I do the dot star, the fast one. And then I transform back. So why is that faster? Than just doing it? Because what's the cost of F times c? And how am I going to do that? I'm going to do with the fast Fourier transform, right. That's the point. I can multiply by F, or by F inverse, faster. So I have three of these transforms. I've got to get two guys into the other space, and the answer back out. So I have sort of three of these N log N's but that will easily beat N squared. Right? So if you have a convolution to do, and it's possible to do this, get into the other space where it's just an element by element multiplication. And that would apply in either direction. Because the rule goes both ways. If I have this convolution to do, I would find the coefficients here, the c's, and the coefficients of g, the d's. I would do this one by one multiplication, and then I have the Fourier coefficients of the convolution. Right, OK?
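The plan above, sketched in Python with a small hand-written radix-2 transform so that it runs with the standard library alone. This is a sketch under stated assumptions, not the lecture's own code: the function names are hypothetical, the length must be a power of two, and the forward transform uses a minus sign in the exponent. The convolution rule holds with either sign, as long as the inverse uses the opposite one and divides by N.

```python
import cmath

def fft(x, sign=-1):
    """Recursive radix-2 transform; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2], sign)
    odd = fft(x[1::2], sign)
    out = [0] * N
    for k in range(N // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

def ifft(X):
    """Inverse transform: opposite sign in the exponent, then divide by N."""
    N = len(X)
    return [v / N for v in fft(X, sign=+1)]

def fast_cyclic_convolve(c, d):
    """Transform both inputs, multiply component by component, transform back."""
    C, D = fft(c), fft(d)
    product = [a * b for a, b in zip(C, D)]   # the "dot star" step, no adds
    # Inputs are integers in this example, so round away floating-point noise.
    return [round(v.real) for v in ifft(product)]

c = [1, 2, 3, 0]
d = [4, 5, 6, 0]
print(fast_cyclic_convolve(c, d))   # [22, 13, 28, 27]
```

With N = 4 the 18 from the z^4 term wraps back onto the 4, giving 22 in the first slot. For N = 1,024 the direct method needs 1,024 squared, about a million, multiplications, while each of the three transforms costs on the order of N log N, with log base two of 1,024 equal to 10.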
Do I have a moment? Well, hardly. Just, can I write down what the formula for a convolution of two functions would look like? Sorry, f(x) convolved with g(x). Let me make it cyclic. Just to see what it would look like. What am I expecting for that convolution? I'm expecting a function, and somehow there's going to be an integral instead of a sum, where I had sums, but now I have integrals. And here's the point. I'll have f(t) times g(x-t) dt. All I'm asking you to look at is the fact that the way here I had k, and l-k. For functions, your eye sees that right away as a t, and an (x-t), dt. These add to the answer. These add to the result, that x. That would be the cyclical one, yeah. So I could go zero-- These are all periodic functions, so all 2pi periods are the same. The book will do that properly. OK, we've got the filtering to discuss on Monday. You can see that this convolution stuff just takes a little new thinking, but it comes out nicely.
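The cyclic convolution of two 2pi-periodic functions described above, written as a formula. Any normalizing constant in front of the integral (conventions differ) is left out here, as it is in the lecture, and the circled-star symbol assumes the amssymb package.

```latex
\[
(f \circledast g)(x) \;=\; \int_{0}^{2\pi} f(t)\, g(x - t)\, dt
\]
```

The arguments t and x - t add to x, just as the indices k and l - k add to l in the sum for h_l.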