Lecture 21: Convolution Formula

Topics covered: Convolution Formula: Proof, Connection with Laplace Transform, Application to Physical Problems

Instructor/speaker: Prof. Arthur Mattuck

Today is going to be one of the more difficult lectures of the term. So, put on your thinking caps, as they would say in elementary school. The topic is going to be what's called a convolution. The convolution is something very peculiar that you do to two functions to get a third function. It has its own special symbol. f of t asterisk g of t is the universal symbol that's used for that. So, this is a new function of t, which bears very little resemblance to the ones, f of t and g of t, that you started with. I'm going to give you the formula for it, but first, there are two ways of motivating it, and both are important.
There is a formal motivation, which is why it's tucked into the section on the Laplace transform. And, the formal motivation is the following. Suppose we start with the Laplace transforms of those two functions. Now, the most natural question to ask, since Laplace transforms are really a pain to calculate, is: from old Laplace transforms, is it easy to get new ones? And, the first thing, of course, summing functions is easy. That gives you the sum of the transforms. But, a more natural question would be, suppose I want to multiply f of t and g of t. Is there, hopefully, some neat formula? If I take the product of these two, is there some neat formula for the Laplace transform of that product? That would simplify life greatly. And, the answer is, there is no such formula. And there never will be.
Well, we will not give up entirely. Suppose we ask the other question. Suppose instead I multiply the Laplace transforms. Could that be related to something I cook up out of f of t and g of t? Could it be the transform of something I cook up out of f of t and g of t? And, that's what the convolution is for. The answer is that F of s times G of s turns out to be the Laplace transform of the convolution.
The convolution, and that's one way of defining it, is the function of t you should put there in order that its Laplace transform turn out to be the product of F of s times G of s. Now, I'll give you, in a moment, the formula for it. But, I'll give you one and a quarter minutes, well, two minutes of motivation as to why there should be such a formula. Now, I won't calculate this out to the end because I don't have time. But, here's the reason why you might suspect there should be such a formula, and why it would therefore be worth looking for. It's because, remember, I told you where the Laplace transform came from, that the Laplace transform was the continuous analog of a power series. So, when you ask a general question like that, the place to look is, if you know an analogous idea, does something like that work there?
So, here I have a power series, summation (a)n x to the n. Remember, you can write this in computer notation as a of n to make it look like f of n, f of t. And, in the analog, n is turned into t when you turn a power series into the Laplace transform, and x gets turned into e to the negative s, and one formula just turns into the other. Okay, so, there's a formula for F of x. This is the analog of the Laplace transform. And, similarly, G of x here is summation (b)n x to the n. Now, again, the naive question would be, well, suppose I multiply the two corresponding coefficients together, and add up that power series, summation (a)n (b)n times x to the n. Is that sum somehow related to F and G? And, of course, everybody knows the answer to that is no. It has no relation whatever. But, suppose instead I multiply these two guys. In that case, I'll get a new power series. I don't know what its coefficients are, but let's write them down. Let's just call them (c)n's. So, what I'm asking is, this corresponds to the product of the two Laplace transforms.
And, what I want to know is, is there a formula which says that (c)n is equal to something that can be calculated out of the (a)i and the (b)j? Now, the answer to that is, yes, there is. And, the formula for (c)n is called the convolution. Now, you could figure out this formula yourself. You figure it out. Anyone who's smart enough to be interested in the question in the first place is smart enough to figure out what that formula is. And, it will give you great pleasure to see that it's just like the formula for the convolution I'm going to give you now.
So, what is that formula for the convolution? Okay, hang on. Now, you are not going to like it. But, you didn't like the formula for the Laplace transform, either. You felt wiser, grown-up getting it. But it's a mouthful to swallow. It's something you get used to slowly. And, you will get used to the convolution equally slowly. So, what is the convolution of f of t and g of t? It's a function calculated according to the following formula. It's a function of t. It is the integral from zero to t of f of u, u is a dummy variable because it's going to be integrated out when I do the integration, times g of (t minus u) du. That's it. I didn't make it up. I'm just bearing the bad news.
Well, what do you do when you see a formula? Well, the first thing to do, of course, is try calculating, just to get some feeling for what kind of a thing it is. Let's try some examples. Let's see, what would be a modest beginning? Let's calculate the convolution of t with itself. Or, better yet, just so that you can tell the two functions apart, let's calculate the convolution of t with t squared, or t squared with t, to make it a little easier. By the way, the convolution is symmetric. f star g is the same thing as g star f. Let's put that down explicitly. I forgot to last period. So, tell all the guys who came to the one o'clock lecture that you know something that they don't. Now, that's a theorem. It's commutative.
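For readers who want to check their own answer to the power-series question posed above, here is a minimal sketch. It assumes the standard rule for multiplying power series, c_n = a_0 b_n + a_1 b_(n-1) + ... + a_n b_0; the function names `product_coefficients` and `evaluate` and the sample coefficients are made up for illustration.

```python
def product_coefficients(a, b):
    """Return c where c[n] = sum over i = 0..n of a[i] * b[n - i].

    a and b are coefficient lists of two polynomials (truncated power
    series); coefficients beyond the end of a list are treated as zero.
    """
    def coeff(seq, k):
        return seq[k] if 0 <= k < len(seq) else 0

    size = len(a) + len(b) - 1
    return [
        sum(coeff(a, i) * coeff(b, n - i) for i in range(n + 1))
        for n in range(size)
    ]


def evaluate(coeffs, x):
    """Evaluate sum of coeffs[n] * x**n."""
    return sum(c * x**n for n, c in enumerate(coeffs))


a = [1, 2, 3]  # 1 + 2x + 3x^2 (illustrative values)
b = [4, 5]     # 4 + 5x
c = product_coefficients(a, b)

# The product of the two series, evaluated at a point, should agree
# with the series built from the c[n].
x = 0.5
assert abs(evaluate(a, x) * evaluate(b, x) - evaluate(c, x)) < 1e-12
```

The index pattern, i in one factor and n minus i in the other, is the discrete counterpart of u and t minus u in the integral formula.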
This operation is commutative, in other words. Now, that has to be a theorem because the formula is not symmetric. The formula does not treat f and g equally. And therefore, this is not obvious. It's at least not obvious if you look at the formula, but it is obvious if you look at it by way of the Laplace transform. Why? In other words, f star g is the guy whose Laplace transform is F of s times G of s. Well, what would g star f be? That would be the guy whose Laplace transform is G times F. But capital F times capital G is the same as capital G times capital F. So, it's because multiplying the Laplace transforms is commutative. Ordinary multiplication is commutative. It follows that this has to be commutative, too. So, I'll write that down, since F times G is equal to G times F. And, you have to understand that here, I mean that these are the Laplace transforms of those guys. But, it's not obvious from the formula.
Okay, let's calculate the Laplace transform of, sorry, the convolution of t squared star t. Let's do it by the formula. All right, by the formula, I calculate the integral from zero to t. Now, I take the first function, but I change its variable to the dummy variable, u. So, that's u squared. I take the second function and replace its variable by u minus t, sorry, t minus u. So, this is times t minus u. Okay, do you see that to calculate this, this is what I have to write down? That's what the formula becomes. Anything wrong? Oh, sorry, the du, the integration's with respect to u, of course. Thanks very much. Okay, let's do it. So, the integral of u squared t is, remember, it's integrated with respect to u, so it's u cubed over three times t. The rest of it is minus the integral of u cubed, which is u to the fourth over four. All this is to be evaluated between zero and t. At the upper limit, I put u equal t, and I get t to the fourth over three minus t to the fourth over four. Of course, at the lower limit, u is zero. So, both of these terms are zero. There's nothing there.
And, the answer is, therefore, t to the fourth divided by, a third minus a quarter is a twelfth, so divided by 12. So, that's doing it from the formula. But, of course, there is an easier way to do it. We can cheat and use the Laplace transform instead. If I Laplace transform it, the Laplace transform of t squared is what? It's two factorial divided by s cubed. The Laplace transform of t is one divided by s squared. And so, because this is the convolution of these, it should correspond to the product of the Laplace transforms, which is two over s to the fifth power. Well, is that the same as this? In other words, what's the inverse Laplace transform of two over s to the fifth? Well, the inverse Laplace transform of four factorial over s to the fifth is how much? That's t to the fourth, right? Now, how does this differ? Well, to turn that into that, I should divide by four times three. So, this should be one twelfth t to the fourth, one over four times three, because this is 24, and that's two, so divide by 12 to get the right constant. So, it works, at least in that case.
But now, notice that this is not an ordinary product. The convolution of t squared and t is not something like t cubed. It's something like t to the fourth, and there's a funny constant in there, too, very unpredictable. Let's take another example of the convolution. Let's do something really humble, just to assure you that even in the simplest example, this is not trivial. Let's take the convolution of f of t with one. Can you take, yeah, one is a function just like any function. But, do you get something out of the convolution? Yes, yes. Let's just write down the formula. Now, I can't use the Laplace transform here because you won't know what to do with it. You don't have that formula yet. It's a secret one that only I know. So, let's do it. Let's calculate it out the way it was supposed to be.
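The hand computation of t squared star t can be checked numerically. The sketch below approximates the convolution integral with a midpoint rule; the helper name `convolve`, the step count, and the sample value t = 2 are assumptions made for illustration, not part of the lecture.

```python
def convolve(f, g, t, steps=2000):
    """Midpoint-rule approximation of the integral from 0 to t of f(u) * g(t - u) du."""
    du = t / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du  # midpoint of the i-th subinterval
        total += f(u) * g(t - u)
    return total * du


def square(x):
    return x * x


def identity(x):
    return x


t = 2.0
direct = convolve(square, identity, t)   # t^2 * t
swapped = convolve(identity, square, t)  # t * t^2
exact = t**4 / 12                        # the value found by hand

assert abs(direct - exact) < 1e-5
assert abs(direct - swapped) < 1e-9      # commutativity, up to rounding
```

Swapping the two functions gives the same number, in line with the commutativity theorem, even though the formula itself does not look symmetric.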
So, it's the integral from zero to t of f of u, and now, what do I do with that one? I'm supposed to take, one is the function g of t, and wherever I see a t, I'm supposed to plug in t minus u. Well, I don't see any t there. But that's something for rejoicing. There's nothing to do to make the substitution. It's just one, times du. So, the answer is, it's this curious thing. The convolution of a function with one, you integrate it from zero to t. Well, as they said in Alice in Wonderland, things are getting curiouser and curiouser. I mean, what is going on with this crazy function, and where are we supposed to start with it?
Well, I'm going to prove this for you, mostly because the proof is easy. In other words, I'm going to prove that F of s times G of s is the Laplace transform of the convolution. And, as I give the proof, you'll see where the convolution is coming from. That's number one. And, number two, the real reason I'm giving you the proof: because it's a marvelous exercise in changing the variables in a double integral. Now, that's something you all know how to do, even the ones who are taking 18.02 concurrently, and I didn't advise you to do that. But, I've arranged the course so it's possible to do. But, I knew that by the time we got to this, you would already know how to change variables in a double integral. So, and in fact, you will have the advantage of remembering how to do it because you just had it about a week or two ago, whereas for all the other guys, it's something dim in the distance. So, I'm reviewing how to change variables in a double integral. I'm showing you it's good for something. So, what we are out to try to prove is this formula. Let's put that down, so you understand. Okay, let's do it. Now, we'll use the desert island method. So, you have as much time as you want. You're on a desert island. In fact, I'm going to even do it the opposite way.
I'm going to start with, you've got a lot of time on your hands and say, gee, I wonder if I take the product of the Laplace transforms, I wonder if there's some crazy function I could put in there, which would make things work. You've never heard of the convolution. You're going to discover it all by yourself. Okay, so how do you begin? So, we'll start with the left hand side. We're looking for some nice way of calculating that as the Laplace transform of a single function. So, the way to begin is by writing out the definitions. We couldn't use anything else since we don't have anything else to use. Now, looking ahead, I'm going to not use t. I'm going to use two neutral variables when I calculate. After all, the t is just a dummy variable anyway. You will see in a minute the wisdom of doing this. So, it's the integral from zero to infinity of e to the negative s u times f of u, du, times the integral which gives the Laplace transform of g. So, that's the integral from zero to infinity of e to the negative s v, let's say, times g of v, dv. Okay, everybody can get that far.
But now we have to start looking. Well, this is a single integral, an 18.01 integral involving u, and this is an 18.01 integral involving v. But when you take the product of two integrals like that, remember when you evaluate a double integral, there's an easy case where it's much easier than any other case. If you are integrating over a rectangle, for example, and you can write the integrand as a product of a function just of u and a function just of v, then the integral is very easy to evaluate. You can forget all the rules. You just take all the u part out, all the v part out, and integrate them separately, a to b, c to d. That's the easy case of evaluating a double integral. It's what everybody tries to do, even when it's not appropriate. Now, here it is appropriate, except I'm going to use it backwards. This is the result of having done that. If this is the result of having done it, what was the step just before it?
Well, I must have been trying to evaluate a double integral as u runs from zero to infinity and v runs from zero to infinity, of what? Well, of the product of these two functions. Now, what is that? e to the minus s u times e to the minus s v. Well, I must surely want to combine those. That's e to the minus s times (u plus v). And, what's left? Well, what gets dragged along: f of u times g of v, du dv. This is the same as that because of that law I just gave you: this is the product of a function just of u, and a function just of v. And therefore, it's okay to separate the two integrals out that way because I'm integrating over sort of a rectangle that goes to infinity that way and infinity that way. What I'm integrating over is, in other words, this region of the plane as u and v each go from zero to infinity.
Now, let's take a look. What are we looking for? Well, we would be very happy if u plus v were t. Let's make it t. In other words, I'm introducing a new variable, t equals u plus v, and it's suggested by the form in which I'm looking for the answer. Now, of course, we need another variable. We could keep either u or v. Let's keep u. That means v, we just played musical chairs. v got dropped out. Well, we can't have three variables. We only have room for two. But, we will remember it. Rest in peace, v was equal to t minus u in case we ever need him again. Okay, let's now put in the integrand, the rest of the change of variable. So, I'm now changing it to these new variables, t and u, so it's e to the negative s t. Well, f of u I don't have to do anything to. But, g of v, I'm not allowed to keep v, so v has to be changed to t minus u. Amazing things are happening. Now, I want to change this to an integral du dt. Now, for that, you have to be a little careful. We have two things to do: figure out what goes with the du dt, and we have to put in the limits, also.
Now, those are the two nontrivial operations when you change variables in a double integral. So, let's be really careful. Let's do the easier of the two first. I want to change from du dv to du dt. And now, to do that, you have to put in the Jacobian matrix, the Jacobian determinant. Ah-ha! How many of you forgot that? I won't even bother asking. Oh, come on, you only lose two points. It doesn't matter if you leave out the Jacobian. If you're going to forget something, you will lose less credit for forgetting that than anything else. So, it's the Jacobian of u and v with respect to u and t. So, to calculate that, you write u equals u, v equals t minus u, and then the Jacobian is the determinant of the matrix of partial derivatives. So, it's the determinant whose entries are the partial of u with respect to u, which is one, and the partial of u with respect to t, but these are independent variables, so that's zero. The partial of v with respect to u is negative one. The partial of v with respect to t is one. So, the Jacobian is one. So, if you forgot it, no harm. So, the Jacobian is one.
Now, more serious, and in some ways, I think, for most of you, the most difficult part of the operation, is putting in the new limits. Now, for that, you look at the region over which you're integrating. I think I'd better do that carefully. I need a bigger picture. That's really what I'm trying to say. So, here are the (u, v) coordinates. And, I want to change these to (u, t) coordinates. The integration is over the first quadrant. So, what you do is, when you do the integral, the first step is u is varying, and t is held fixed. So, in the first integration, u varies. t is held fixed. Now, what does holding t fixed in this picture mean? Well, t is equal to u plus v. So, u plus v is fixed, is a constant, in other words. Now, where are the curves along which u plus v is a constant? Well, they are these lines.
These are the lines along which u plus v equals a constant, or t is a constant. The reason I'm holding t constant is because the first integration only allows u to change. t is held fixed. Okay, you let u increase. As u increases, and t is held fixed, I'm traversing these lines in this direction. That's the direction in which u is increasing. I integrate from the u value where they enter the region to the u value where they leave the region. And, what's the u value where they enter the region? u is equal to zero. Everybody would know that. Not so many people would be able to figure out what to put for where it leaves the region. What's the value of u when it leaves the region? Well, this is the curve, v equals zero. But, v equals zero is, in another language, t minus u equals zero, or u equals t. In other words, they enter the region where u equals zero, and they leave where u has the value t.
And, how about the other guys? For which t's do I want to do this? Well, I want to do it for all these t values. Well, now, the t value here, that's the starting one. Here, t is zero, and here t is not zero. And, if I go out and cover the whole first quadrant, I'll be letting t increase to infinity. The sum of u and v, I will be letting increase to infinity. So, it's zero to infinity. So, all this is an exercise in taking this double integral in (u, v) coordinates, and changing it to this double integral, an equivalent double integral over the same region, but now in (u, t) coordinates. And now, that's the answer. Somewhere here is the answer because, look, since the first integration is with respect to u, this guy, e to the negative s t, can migrate outside because it doesn't involve u. That only involves t, and t is only caught by the second integration. So, I can put this outside. And, what do I end up with? The integral from zero to infinity of e to the negative s t times, what's left? A funny expression, but you're on your desert island and found it.
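Written out in symbols, the steps described above form the following chain. This is a reconstruction from the spoken description, assuming the standard definitions of F(s) and G(s) as Laplace transforms of f and g; the board notation itself is not in the transcript.

```latex
\begin{align*}
F(s)\,G(s)
  &= \int_0^\infty e^{-su} f(u)\,du \;\cdot\; \int_0^\infty e^{-sv} g(v)\,dv \\
  &= \int_0^\infty\!\!\int_0^\infty e^{-s(u+v)}\, f(u)\, g(v)\,du\,dv \\
  &= \int_0^\infty\!\!\int_0^{t} e^{-st}\, f(u)\, g(t-u)\,du\,dt
     \qquad \Bigl(t = u + v,\quad
       \frac{\partial(u,v)}{\partial(u,t)}
       = \begin{vmatrix} 1 & 0 \\ -1 & 1 \end{vmatrix} = 1 \Bigr) \\
  &= \int_0^\infty e^{-st} \left[\, \int_0^{t} f(u)\, g(t-u)\,du \right] dt .
\end{align*}
```

The inner limits 0 and t come from the lines u + v = t, which enter the first quadrant at u = 0 and leave it at v = 0, that is, at u = t.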
This funny expression, integral from zero to t, f of u, g of t minus u du, in short, the convolution, exactly the convolution. So, all you have to do is get the idea that there might be a formula, sit down, change variables in a double integral, and, ergo, you've got your formula. Well, I would like to spend much of the rest of the period, in other words, that's how it relates to the Laplace transform. That's how it comes out of the Laplace transform. Here's how to use it: calculate it either with the Laplace transform or directly from the integral. And, of course, you will solve problems, Laplace transform problems, differential equations, using the convolution.
But, I have to tell you that convolution is very important, and most people who use it don't use it in connection with the Laplace transform. They use it for its own sake. The first place I learned that outside of MIT people used a convolution was actually from my daughter. She's an environmental engineer, an environmental consultant. She does risk assessment, and stuff like that. But anyway, she had this paper on acid rain she was trying to read for a client, and she said something about calculating acid rain falling on soil. And then, from there, the stuff leaches into a river. But, things happen to it on the way. Soil combines in various ways, reduces the acidity, and things happen. Chemical reactions take place, blah, blah, blah, blah. Anyway, she said, well, then they calculated in the end how much the river gets polluted. But, she said it's a convolution. She said, what's the convolution? So, I told her she was too young to learn about the convolution. And she knows that means I thought I'd better look it up first. I mean, I, of course, knew what the convolution was, but I was a little puzzled at that application of it. So, I read the paper. It was interesting.
And, in thinking about it, other people have come to me, some guy with a problem about, they drilled ice cores in the North Pole, and from the radioactive carbon and so on, deducing various things about the climate 60 billion years ago, and it was all convolution. He asked me if I could explain that to him. So, let me give you sort of an all-purpose thing, a simple all-purpose model, which can be adapted, and which is a very good way of thinking of the convolution, in my opinion. It's a problem of radioactive dumping. It's in the notes, by the way. So, if you want to take a chance, and just listen to what I'm saying rather than just scribbling everything down, maybe you'll be able to figure it out from the notes, also.
So, the problem is we have some factory, or a nuclear plant, or something like that, which is producing radioactive waste, not always at the same rate. And then, it carts it away and dumps it on a pile somewhere. So, radioactive waste is dumped, and there's a dumping function. I'll call that f of t, the dump rate. That's the dumping rate. Let's say t is in years. You like to have units, and quantity, kilograms, I don't know, whatever you want. Now, what does the dumping rate mean? The dumping rate means that if I have two times that are close together, for example, midnight on two successive days, then there's a time interval between them. I'll call that delta t. To say the dumping rate is f of t means that the amount dumped in this time interval, in the time interval from t1 to t1 plus delta t, is approximately, not exactly, because the dumping rate isn't even constant within this time interval, but it's approximately the dumping rate, f of t1, times the time over which the dumping is taking place, delta t. That's what I mean by the dump rate. And, it gets more and more accurate, the smaller the time interval you take. Okay, now here's my problem. The problem is, you start dumping at time t equals zero. At time t equals t, how much radioactive waste is in the pile?
Now, what makes that problem slightly complicated is radioactive waste decays. If I put some there on a certain day, and then go back several months later and nothing's happened in between, I don't have the same amount that I dumped because a fraction of it decayed. I have less. And, our answer to the problem must take account of, for each piece of waste, how long it has been in the pile because that takes account of how long it had to decay, and what it ends up as. So, the essential part of the calculation will be that if you have an initial amount of this substance, and it decays for a time, t, then the initial amount times e to the minus k t is the amount left at time t. This is the law of radioactive decay. You knew that coming into 18.03, although, it's, of course, a simple differential equation which produces it, but I'll assume you simply know the answer. k depends on the material, so I'm going to assume that the nuclear plant dumps the same radioactive substance each time. It's only one substance I'm calculating with. So, assume the k is fixed. I don't have to change from a k for one material to a k for another because it's mixing up the stuff; it's just one material.
Okay, and now let's calculate it. Here's the idea. I'll take the t-axis, but now I'm going to change its name to the u-axis. You will see why in just a second. It starts at zero. I'm interested in what's happening at the time, t. How much is left at time t? So, I'm going to divide up the interval from zero to t on this time axis into, well, here's u0, the starting point, u1, u2, let's make this u1. Oh, curses! u1, u2, u3, and so on. Let's call this (u)n. So there are n plus one of them, not that it matters. It doesn't matter. Okay, now, what I'm going to do is look at the amount, take the time interval from u(i) to u(i plus one). This is a time interval, delta u. I've divided it up into equal time intervals.
So, the amount dumped in the time interval from u(i) to u(i plus one) is equal to approximately f of u(i), the dumping function there, times delta u. We calculated that before. That's what the meaning of the dumping rate is. By time t, how much has it decayed to? It has decayed. How much is left, in other words? Well, this is the starting amount. So, the answer is going to be, it's f of u(i) times delta u times this factor, which tells how much it decays over time. So, this is the starting amount at time u(i). That's when it was first dumped, and this is the amount that was dumped. Multiply that by e to the minus k times, now, what should I put up in there? I have to put the length of time that it had to decay. What is the length of time that it had to decay? It was dumped at u(i). I'm looking at time t. It decayed for time length t minus u(i), the length of time it had on the pile. So, the stuff that was dumped in this time interval, at time t when I come to look at it, this is how much of it is left.
And now, all I have to do is add up that quantity: the stuff that was dumped in this time interval plus the stuff dumped in the next, and so on, all the way up to the stuff that was dumped yesterday. And, the answer will be the total amount left at time t. The amount that has not yet decayed will be approximately, you add up the amount coming from the first time interval plus the amount coming from the second, and so on. So, it will be f of u(i), I'll save the delta u for the end, times e to the minus k times (t minus u(i)), times delta u. So, these two parts represent the amount dumped, and this is the decay factor. And, I add those up as i runs from, well, where did I start? From one to n, let's say. And now, let delta u go to zero, in other words, make this more accurate by taking finer and finer subdivisions.
In other words, instead of looking every month to see how much was dumped, let's look every week, every day, and so on, to make this calculation more accurate. And, the answer is, this approaches the exact amount, which will be the integral. This sum is a Riemann sum. It approaches the integral from zero to, well, I'm adding it up from u1 equals zero to un equals t, the final value. So, it will be the integral from the starting point to the ending point of f of u times e to the minus k times (t minus u), du. That's the answer to the problem. It's given by this rather funny looking integral. But, from this point of view, it's entirely natural. It's a combination of the dumping function, and this doesn't care what the material was. It only wants to know how much was put on every day. And, this other part, which doesn't care how much was put on each day, is just an intrinsic property of the material involving its decay rate. And, this total thing represents the total amount. And that is, what is it? It's the convolution of f of t with what function? e to the minus k t. It's the convolution of the dumping function and the decay function. And, the convolution is exactly the operation that you have to have to do that.
Okay, so, I think this is the most intuitive physical approach to the meaning of the convolution. In this particular case, you can say, well, that's very special. Okay, so it tells you what the meaning of the convolution with an exponential is. But, what about the convolution with all the other functions we're going to have to use in this course? They can all be interpreted just by being a little flexible in your approach. I'll give you two examples of this, well, three. First of all, in the problem set I ask you about a bank account. That's not something any of you are interested in. Okay, so, suppose instead I dumped garbage, undecaying. So, something that doesn't decay at all, what's the answer going to be?
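The passage from the Riemann sum to the integral in the radioactive dumping model can be watched numerically. The sketch below assumes a constant dump rate of 3 kilograms per year and k = 0.5 per year, both hypothetical values chosen only because the integral then has the simple closed form c(1 - e^(-kt))/k; the function name `amount_left` is likewise made up.

```python
import math


def amount_left(dump_rate, k, t, n):
    """Riemann sum of f(u_i) * exp(-k * (t - u_i)) * delta_u over n equal intervals."""
    delta_u = t / n
    total = 0.0
    for i in range(n):
        u_i = i * delta_u                  # time at which this batch was dumped
        dumped = dump_rate(u_i) * delta_u  # approximate amount dumped in the interval
        decay = math.exp(-k * (t - u_i))   # fraction of it still left at time t
        total += dumped * decay
    return total


def constant_rate(u):
    return 3.0  # hypothetical: 3 kg per year, independent of u


k = 0.5  # hypothetical decay constant, per year
t = 4.0  # years since dumping started

# Exact value of the integral from 0 to t of 3 * exp(-k (t - u)) du.
exact = 3.0 * (1.0 - math.exp(-k * t)) / k

coarse = amount_left(constant_rate, k, t, 12)     # roughly "looking every few months"
fine = amount_left(constant_rate, k, t, 12000)    # much finer subdivision

assert abs(fine - exact) < abs(coarse - exact)
assert abs(fine - exact) < 1e-3
```

The finer subdivision lands closer to the exact integral, which is the sense in which the sum approaches the convolution of the dumping function with e to the minus k t.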
Well, the calculation will be exactly the same. It will be the convolution of the dumping function with something. The only difference is that now the garbage isn't going to decay. So, no matter how long it's left, the same amount is going to be left at the end. In other words, I don't want the exponential decay function. I want the function one, the constant function one, because once I stick it on the pile, nothing happens to it. It just stays there. So, it's going to be the convolution of this with one because this is constant. It's undecaying, by the identical reasoning. And so, what's the answer going to be? It's going to be the integral from zero to t of f of u du. Now, that's an 18.01 problem. If I dump with a dumping rate, f of u, and I dump from time zero to time t, how much is on the pile? They don't give it. They always give velocity problems, and problems of how to slice up bread loaves, and stuff like that. But, this is a real life problem. If that's the dumping rate, and you dump for t days from zero to time t, how much do you have left at the end? Answer: the integral of f of u du from zero to t.
I'll give you another example. Suppose I wanted to interpret something which grows like t, for instance. All I want is a physical interpretation. Well, I have to think, I'm making a pile of something, a metaphorical pile, we don't actually have to make a physical pile. And, the thing should be growing like t. Well, what grows like t? Not bacteria, they grow exponentially. Before the lecture, I was trying to think of something. So, I came up with chickens on a chicken farm. Little baby chickens grow linearly. All little animals, anyway, I've observed that babies grow linearly, at least for a while, thank God. After a while, they taper off. But, at the beginning, they eat every four hours or whatever. And they eat the same amount, pretty much. And, that adds up.
So, let's suppose this represents the linear growth of chickens, of baby chicks. That makes them sound cuter, less offensive. Okay, so, a farmer, chicken farmer, whatever they call them, is starting a new brood. So anyway, the hens lay at a certain rate, and each of those eggs is incubated. And after a while, little baby chicks come out. So, this will be the production rate for new chickens. Okay, and it will be the convolution which will tell you, at time t, the number of kilograms. We'd better do this in kilograms, I'm afraid. Now, that's not as heartless as it seems. The number of kilograms of chickens at time t. [LAUGHTER] It really isn't heartless because, after all, why would the farmer want to know that? Well, because a certain number of pounds of chicken eat a certain number of pounds of chicken feed, and that's how much he has to give them every day. That's how he calculates his expenses. So, he will have to know what the convolution is, or better yet, he will hire you, who knows what the convolution is. And you'll be able to tell him. Okay, why don't we stop there and go to recitation tomorrow. I'll be doing important things.
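The three piles discussed in this lecture (decaying waste, undecaying garbage, and linearly growing chicks) differ only in the second function of the convolution. The following sketch makes that concrete under loudly stated assumptions: a constant input rate of 2 units per unit time, a decay constant of 0.5, a growth law of exactly t, and a midpoint-rule helper named `convolve`, all chosen for illustration and none taken from the lecture.

```python
import math


def convolve(f, g, t, steps=2000):
    """Midpoint-rule approximation of the integral from 0 to t of f(u) * g(t - u) du."""
    du = t / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += f(u) * g(t - u)
    return total * du


def input_rate(u):
    return 2.0  # hypothetical constant dumping / production rate


def decaying(age):
    return math.exp(-0.5 * age)  # radioactive waste, hypothetical k = 0.5


def undecaying(age):
    return 1.0  # garbage: what was dumped simply stays


def linear_growth(age):
    return age  # chicks: weight taken to be proportional to age


t = 3.0
radioactive = convolve(input_rate, decaying, t)
garbage = convolve(input_rate, undecaying, t)
chickens = convolve(input_rate, linear_growth, t)

# Closed forms for a constant rate c = 2 and t = 3:
expected_radioactive = 2.0 * (1.0 - math.exp(-0.5 * t)) / 0.5  # c (1 - e^{-kt}) / k
expected_garbage = 2.0 * t                                     # c t
expected_chickens = 2.0 * t**2 / 2.0                           # c t^2 / 2

assert abs(radioactive - expected_radioactive) < 1e-5
assert abs(garbage - expected_garbage) < 1e-9
assert abs(chickens - expected_chickens) < 1e-9
```

In each case the first function says how fast material arrives, and the second says what one unit of it has become after sitting for a given length of time.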