Lecture 13: Classical Statistical Mechanics Part 2

Description: This is the second of three lectures on Classical Statistical Mechanics.

Instructor: Mehran Kardar

The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: So we've been wondering how a gas, such as the one in this room, with particles following Newtonian equations of motion, comes to equilibrium. We decided to explain that by relying on the Boltzmann equation. Essentially, we said, let's look at the density, or probability, that we will find particles with momentum p at location q at time t, and we found that within some approximations, we could represent the evolution of this through a linear operator on the left-hand side equal to some collision second order operator on the right-hand side.

The linear operator is a bunch of derivatives. There is the time derivative, then the coordinate moves according to the velocity, which is P over m. And summation over the index alpha running from 1 to 3, or x, y, z, is assumed. And at this stage, there is symmetry within the representation in terms of coordinates and momenta. There is also a force that causes changes of momentum.

Actually, for most of today, we are going to be interested in something like the gas in this room far away from the walls, where essentially there is no external potential. And for the interest of thinking about modes such as sound, et cetera, we can remove that.

The second order operator-- the collision operator-- however, explicitly because we were thinking about collisions among particles that are taking place when they are in close contact, breaks the symmetry between coordinates and momenta that was present here. So this collision operator was an integral over the momentum of another particle that would come at some impact parameter with relative velocity, and then you had a subtraction due to collisions, and then addition due to reverse collisions.
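For reference, the equation described in words above can be written as follows. This is a sketch in standard notation; the symbols for the linear operator, the collision operator, the impact parameter b, and the primed (post-collision) momenta are notational assumptions that the transcript does not fix.

```latex
\begin{align*}
\mathcal{L} f &\equiv \left[ \partial_t + \frac{p_\alpha}{m}\,\partial_\alpha
   + F_\alpha \frac{\partial}{\partial p_\alpha} \right] f(\vec p_1,\vec q,t) = C[f,f], \\
C[f,f] &= -\int d^3\vec p_2 \, d^2\vec b \; |\vec v_1 - \vec v_2|
   \left[ f(\vec p_1)\, f(\vec p_2) - f(\vec p_1\,')\, f(\vec p_2\,') \right].
\end{align*}
```

The left-hand side treats q and p on the same footing, while the collision term integrates over momenta only, at a single position q.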

Now, what we said is because of the way that this symmetry is broken, we can distinguish between averages in position and in momentum. For example, typically we are interested in variations of various quantities in space. And so what we can do is we can define the density, let's say at some particular point, by integrating over momentum. So again, I said I don't really need too much about the dependence on momentum, but I really am interested in how things vary from position to position.

Once we have defined density, we could define various averages, where we would multiply this integral by some function of P and q, and the average was defined in this fashion. What we found was that really what happens through this collision operator-- that typically is much more important, as far as the inverse time scales are concerned, than the operators on the left, it has a much bigger magnitude-- is that momenta are very rapidly exchanged and randomized. And the things that are randomized most slowly are quantities that are conserved in collision.
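In symbols, the local density and the local averages just described would read as follows (a sketch; the generic function is called O here, which is a label chosen for illustration):

```latex
\begin{align*}
n(\vec q,t) &= \int d^3\vec p \; f(\vec p,\vec q,t), \\
\langle \mathcal{O} \rangle(\vec q,t) &= \frac{1}{n(\vec q,t)} \int d^3\vec p \; f(\vec p,\vec q,t)\, \mathcal{O}(\vec p,\vec q,t).
\end{align*}
```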

And so we focused on quantities that were collision conserved, and we found that for each one of these quantities, we could write down some kind of a hydrodynamic equation. In particular, if we looked at number conservation-- two particles come in, two particles go out-- we found that the equation that described that was that the time derivative of the density plus moving along the streamline had a special form.

And here, I defined the quantity u alpha, which is simply the average of P alpha over m, defined in the way that averages are defined over here. And this is the number density of particles, and how it changes if I move along the streamline. And variations of this, if the flow is compressible, come from the divergence of your flow velocity. And actually, this operator we call the total, or streamline, derivative.

Now, the next thing that we said is that OK, number is conserved, but momentum is also conserved. So we can see what equation I get if I look at momentum. But in fact, we put the quantity that was momentum divided by mass to make it like a velocity, and how much you deviate from the average that we just calculated. And this quantity we also call c. And when we looked at the equation that corresponded to the conservation of this quantity in collisions, we found that we had something like mass times acceleration along the streamline. So basically this operator multiplied by mass acting on this was external force.

Well, currently we've set this external force to zero-- so when we are inside the box. Here there was an additional force that came from variations of pressure in the gas. And so we had a term here that was 1 over n d alpha of P alpha beta, and we needed to define a pressure tensor P alpha beta, which was nm times the expectation value of c alpha c beta.

Finally, in collisions there is another quantity that is conserved, which is the kinetic energy. So here the third quantity could be the kinetic energy-- or actually, we chose the combination mc squared over 2, which is the additional kinetic energy on top of this. And it's very easy to check that if kinetic energy is conserved, this quantity is also conserved. We call the average of this quantity in the way that we have defined above as epsilon.

And then the hydrodynamic equation is that as you move along the streamline-- so you have this derivative acting on this quantity epsilon-- what we are going to get is something like minus 1 over n d alpha of a new vector, h alpha. h alpha, basically, tells me how this quantity is transported.

So all I need to do is to have something like mc squared over 2 transported along direction alpha. And then there was another term, which was P alpha beta-- the P alpha beta that we defined above-- times u alpha beta. U alpha beta was simply the derivative of this quantity, symmetrized.
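Collected in one place, the three hydrodynamic equations described above can be sketched as below. The notation D_t for the streamline derivative is an assumption, and the overall minus signs and the factor 1/n on the last term are not spoken explicitly in the transcript; they are filled in from the standard form and from dimensional consistency.

```latex
\begin{align*}
D_t &\equiv \partial_t + u_\beta\,\partial_\beta, \qquad
u_\alpha = \left\langle \frac{p_\alpha}{m} \right\rangle, \qquad
c_\alpha = \frac{p_\alpha}{m} - u_\alpha, \\
D_t\, n &= -\,n\,\partial_\alpha u_\alpha, \\
m\, D_t\, u_\beta &= F_\beta - \frac{1}{n}\,\partial_\alpha P_{\alpha\beta},
   \qquad P_{\alpha\beta} = n m \langle c_\alpha c_\beta \rangle, \\
D_t\, \varepsilon &= -\frac{1}{n}\,\partial_\alpha h_\alpha - \frac{1}{n}\,P_{\alpha\beta}\,u_{\alpha\beta},
   \qquad \varepsilon = \left\langle \frac{m c^2}{2} \right\rangle, \quad
   h_\alpha = \frac{n m}{2} \langle c_\alpha c^2 \rangle, \quad
   u_{\alpha\beta} = \frac{1}{2}\left( \partial_\alpha u_\beta + \partial_\beta u_\alpha \right).
\end{align*}
```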

So the statement is that something like a gas-- or any other fluid, in fact-- we can describe through these quantities that are varying from one location to another location. There is something like a density, how dense it is at this location. How fast particles are streaming from one location to another location. And the energy content that is ultimately related to something like the temperature, how hot it is locally.

So you have these equations. Solving these equations, presumably, is equivalent, in some sense, to solving the Boltzmann equation. The Boltzmann equation, we know, ultimately reaches an equilibrium, so we should be able to figure out how a system, such as the gas in this room, if disturbed, comes to equilibrium, if we follow the density, velocity, and temperature.

Now, the problem with the equations as I have written is that they are not closed in terms of these three quantities, because I need to evaluate the pressure, I need to evaluate the heat transfer vector. And to calculate these quantities, I need to be able to evaluate these averages. In order to evaluate these averages, I need to know f. So how did we proceed? We said well, let's try to find approximate solutions for f. So the next task is to find maybe some f, which is a function of p, q, and t, and we can substitute over there.

Now, the first thing that we said was, OK, maybe what I can do is I can look at the equation itself. Notice that this part of the equation is order of 1 over the time it takes for particles to move in the gas and find another particle to collide with, whereas the left-hand side is presumably something that is related to how far I go before I see some variation due to the external box.

And we are really thinking about cases where the gas particles are not that dilute, in the sense that along the way to go from one side of the room to another side of the room, you encounter many, many collisions. So the term on the right-hand side is much larger.

If that is the case, we said that maybe it's justifiable to just solve this equation on the right-hand side as a zeroth order. And to solve that, I really have to set, for example, the right-hand side that I'm integrating here, to 0. And I know how to do that. If log f involves collision conserved quantities, then ff before is the same as ff after.

And the solution that I get by doing that has the form of exponential involving conserved quantities, which are the quantities that I have indicated over here-- let's say, such as c. And so log of that would be something that involves-- I can write as mc squared over 2 with some coefficient that, in principle, varies from location to another location.

I want to integrate this and come up with the density, so I put the density out here, and I normalize the Gaussian. And so this is a reasonable solution. Indeed, this is the zeroth order solution for f-- I'll call that f0.
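Written out, the zeroth order solution is a local Gaussian in the momentum. A sketch, identifying the position-dependent coefficient with a local temperature T as is done later in the lecture:

```latex
\begin{equation*}
f^0(\vec p,\vec q,t) = \frac{n(\vec q,t)}{\left( 2\pi m k_B T(\vec q,t) \right)^{3/2}}
   \exp\left[ -\frac{\left( \vec p - m\,\vec u(\vec q,t) \right)^2}{2 m k_B T(\vec q,t)} \right]
   = \frac{n}{\left( 2\pi m k_B T \right)^{3/2}} \exp\left[ -\frac{m c^2}{2 k_B T} \right].
\end{equation*}
```

The prefactor is fixed by requiring that the integral over momenta gives back the density n.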

So once you have the zeroth order solution, from that you can calculate these two quantities. For example, because the zeroth order solution is even in c, the heat vector will be 0, because it is symmetric in the different components. The pressure tensor will be proportional to delta alpha beta. OK? We started with that.

Put those over here, and we found that we could get some results that were interesting. For example, we could see that the gas can have sound modes. We could calculate the speed of sound. But these sound modes were not damped. And there were other modes, such as shear modes, that existed forever, confounding our expectation that these equations should eventually come to an equilibrium.

So we said, OK, this was a good attempt. But it was not good enough to give us complete equilibrium, so let's try to find a better solution. So how did we find the better solution? We said that the better solution-- let's assume that the solution is like this, but is slightly changed by a correction. And the correction comes because of the effect of the left-hand side, which we had ignored so far. And since the left-hand side is smaller than the right-hand side by a factor involving tau x, presumably the correction will involve this tau x. OK. Good?

We said that in order to see that as a correction, what I need to do is to essentially linearize this expression. So what we did was we replaced these f's with f0 times 1 plus g, 1 plus g, and so forth. The zeroth order term, by construction, is 0. And so if we ignore terms that are order of g squared, we get something that is linear in g. OK?

Now, it's still an integral operator, but we said let's approximate that integration and let's do a linearized, single collision time approximation. What that approximation amounted to was that, independent of what this deviation is, we relax to the zeroth order solution over a time scale that is the same, and that time scale we'll call tau x. So essentially we wrote this as minus f0 g, which is the difference between f and f0, divided by tau x. f0, this was g.

No. I guess we don't have this. We wrote it in this fashion. It's just writing the way that I wrote before. We wrote this as g over tau x, where g is the correction that I have to write here. And you can see that the correction is obtained by multiplying minus tau x with l acting on f0 divided by f0, which is l acting on log of f0. So I would have here 1 minus tau x-- let's make this curly bracket-- and then in the bracket over here, I have to put the action of l on the log of f0.

So I have to-- this is my f0. I take its log. So the log will have minus mc squared over 2kT and the log of this combination, and I do d by dt plus P alpha over m d alpha acting on this log, and then a bunch of algebra will lead you to the following answer. Not surprisingly, you're going to get factors of m and kT. So you will get m over kT. You get c alpha c beta minus delta alpha beta c squared over 3 multiplying this rate of strain tensor that we have defined over here.

And then there are derivatives that will act on temperature-- because temperature is allowed to vary from position to position-- so there will be a term that will involve the derivative of temperature. In fact, it will come in the form d alpha T over T times c alpha, multiplying another combination which is mc squared over 2kT minus 5/2.

Let me see if I got all of the factors of 1/2 correct. Yeah. OK. So this is the first order term. Presumably, there will be higher order corrections, but this is the improved solution to the Boltzmann equation beyond the zeroth order approximation. It's a solution that involves both sides of the equation now. We relied heavily on the right-hand side to calculate f0, and we used the left-hand side-- through this l acting on log of f0-- to get the correction that is order of tau x that is coming to this equation.
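The steps described above, from the single collision time approximation to the first order solution, can be summarized as follows. This is a sketch: the collision time is written with a cross subscript (spoken as "tau x"), and the script L is the linear operator on the left-hand side of the Boltzmann equation.

```latex
\begin{align*}
f &= f^0 (1 + g), \qquad
C[f,f] \approx -\frac{f - f^0}{\tau_\times} = -\frac{f^0 g}{\tau_\times}, \\
\mathcal{L} f^0 &\approx -\frac{f^0 g}{\tau_\times}
   \quad\Longrightarrow\quad
   g = -\tau_\times \frac{\mathcal{L} f^0}{f^0} = -\tau_\times\, \mathcal{L} \ln f^0, \\
f^1 &= f^0 \left\{ 1 - \tau_\times \left[
   \frac{m}{k_B T} \left( c_\alpha c_\beta - \frac{\delta_{\alpha\beta}}{3}\, c^2 \right) u_{\alpha\beta}
   + \left( \frac{m c^2}{2 k_B T} - \frac{5}{2} \right) \frac{c_\alpha}{T}\, \partial_\alpha T
   \right] \right\}.
\end{align*}
```

The first term in the square bracket is even in c and the second is odd, which is what decides which of them contributes to the pressure tensor and which to the heat flux.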

So now, with this improved solution, we can go back and re-check some of the conclusions that we had before. So let's, for example, start by calculating this pressure tensor P alpha beta, and see how it was made different. So what I need to do is to calculate this average. How do I calculate that average? I essentially multiply this f, as I have indicated over there, by c alpha c beta, and then I integrate over all momenta.

Essentially, I have to do integrations with this Gaussian weight. So when I do the average of c alpha c beta with this Gaussian weight, not surprisingly, I will get the delta alpha beta, and I will get from here kT over m. Multiplying by nm will give me nkT. So this is the diagonal form, where the diagonal elements are our familiar pressures.

So that was the zeroth order term. Essentially, that's the 1 over here, multiplying c alpha c beta before I integrate. So that's the 1 in this bracket. But there will be order of tau x corrections, because I will have to multiply two more factors of c with the c's that I have over here.

Now, none of these terms are important, because these are, again, odd terms. So when I multiply two c's with three or one c the average will be zero. So all of the averages are really coming from these terms. Now, these terms involve four factors of c's. Right? There's the two c's that I put out here-- and actually, I really have to think of these as different indices, let's say mu nu, mu nu. Summation convention again assumed. And then when I multiply by c alpha c beta, I will have, essentially, four indices to play with-- c alpha c beta, c mu c nu. But it's all done with the Gaussian weight out here.

And we showed and discussed how there was this nice Wick's theorem that enabled you to rapidly calculate these Gaussian averages with four factors of c, or more factors of c-- doesn't really matter. And so in principle you know how to do that, and I'll skip the corresponding algebra and write the answer. It is proportional to minus tau x. And what it gives you, once you do these calculations, is a factor of u alpha beta minus delta alpha beta over 3 u gamma gamma. And again, let me make sure I did not miss out anything. And apparently I missed out the factor of 2.
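The algebra that is skipped here can be sketched as follows. The subscript 0 marks averages taken with the zeroth order Gaussian weight, a notation assumed for this block; the four-point average is the standard pairing (Wick) result for a Gaussian with variance kT/m per component.

```latex
\begin{align*}
\langle c_\alpha c_\beta c_\mu c_\nu \rangle_0 &= \left( \frac{k_B T}{m} \right)^2
   \left( \delta_{\alpha\beta}\delta_{\mu\nu} + \delta_{\alpha\mu}\delta_{\beta\nu} + \delta_{\alpha\nu}\delta_{\beta\mu} \right), \\
P_{\alpha\beta} &= n m \langle c_\alpha c_\beta \rangle
   = n k_B T\, \delta_{\alpha\beta}
   - n m\, \tau_\times \frac{m}{k_B T}\, u_{\mu\nu}
     \left\langle c_\alpha c_\beta \left( c_\mu c_\nu - \frac{\delta_{\mu\nu}}{3}\, c^2 \right) \right\rangle_0 \\
&= n k_B T\, \delta_{\alpha\beta}
   - n k_B T\, \tau_\times \left[ \left( \delta_{\alpha\beta}\, u_{\gamma\gamma} + 2 u_{\alpha\beta} \right)
   - \frac{5}{3}\, \delta_{\alpha\beta}\, u_{\gamma\gamma} \right] \\
&= n k_B T \left[ \delta_{\alpha\beta} - 2 \tau_\times \left( u_{\alpha\beta} - \frac{\delta_{\alpha\beta}}{3}\, u_{\gamma\gamma} \right) \right].
\end{align*}
```

The temperature-gradient term of the first order solution does not appear because it is odd in c.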

So really, the only thing that's happened, once we went and included this correction, we added this term. But the nice thing about this term is that potentially, it has off-diagonal terms. This matrix initially was completely diagonal. The corrections that we have calculated are potentially off-diagonal coming from this term that corrected the original Gaussian weight.

So where is that useful? Well, one of the problems that we discussed was that I can imagine a configuration of velocities, let's say close to a wall-- but it does not have to be a wall, but something where I have a profile of velocities which exist with components only along the x direction, but vary along the y direction. So this ux is different from this ux, because they correspond to different y's. So essentially, this is a pattern of shearing a gas, if you like.

And the question is, well, hopefully, this will come to relax and eventually give us, let's say, uniform velocity, or zero velocity, even better-- if there's a wall, and the wall velocity is zero. So how does that happen? Well, the equation that I have to satisfy is this one that involves u. So I have m.

Well, what do I have to have? I have du by dt plus something that acts on ux. ux only has variations along the y direction. So this term, if it was there, had to involve a derivative along the y direction multiplying uy, but it's not present, because there's no uy.

On the right-hand side of the equation, I have to put minus 1 over n. Again, the only variations that I'm allowed to have are along the y direction. So I have, when I do the summation over alpha, the only alpha that contributes is y. And so I need the pressure y, but I'm looking for velocities along the x direction, so the other index better be x. OK? So the time course of the velocity profile that I set up over here is determined by the y derivative of the yx component of the pressure tensor.

Now, previously our problem was that we stopped at the zeroth order, and at the zeroth order, the pressure tensor was diagonal. It didn't have a yx component. So this profile would stay forever. But now we do have a yx component, and so what do I get? I will get here minus 1 over n d by dy.

The yx component will come from minus 2 tau x multiplying nkT and then multiplying-- well, this term, again, is diagonal. I can forget about that term. So it comes from the uxy. What is uxy? It is 1/2 of the x derivative of uy that doesn't exist, and the y derivative of ux that does exist. OK?

So what we have, once we divide by m, is that the time derivative of ux is given by a bunch of coefficients. The n I can cancel if it does not vary. What I have is the 2's cancel. The n's cancel. I will have tau x kT over m. tau x kT over m, that's fine. And then the second derivative along the y direction of ux.
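The chain of steps for the shear profile can be written compactly. This is a sketch assuming, as in the discussion above, uniform n and T, no external force, and a velocity field with only an x component that depends only on y; the symbol nu for the resulting coefficient is introduced here for convenience.

```latex
\begin{align*}
u_{xy} &= \frac{1}{2}\left( \partial_x u_y + \partial_y u_x \right) = \frac{1}{2}\,\partial_y u_x, \\
P_{yx} &= -2 \tau_\times\, n k_B T\, u_{xy} = -\tau_\times\, n k_B T\, \partial_y u_x, \\
m\, \partial_t u_x &= -\frac{1}{n}\,\partial_y P_{yx} = \tau_\times\, k_B T\, \partial_y^2 u_x
   \quad\Longrightarrow\quad
   \partial_t u_x = \nu\, \partial_y^2 u_x, \qquad \nu = \frac{\tau_\times\, k_B T}{m}.
\end{align*}
```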

So suddenly, I have a different equation. Rather than having a profile that does not change in time, I find that the time derivative of ux is proportional to its Laplacian. It's a diffusion equation. And we know the solution to the diffusion equation, how it looks qualitatively if I have a profile such as this. Because of diffusion, eventually it will become more and more uniform in time.

And the characteristic time over which it does so, if I assume that, let's say, in the y direction, I have a pattern that has some characteristic size lambda, then the characteristic relaxation time for diffusion will be proportional to lambda squared. There's a proportionality here. The constant of proportionality is simply the inverse of this diffusion coefficient. So it is m over kT tau x.

Actually, we want to think about it further. kT over m is roughly the square of the thermal velocities of the particles. So lambda squared divided by v squared is roughly the square of the time that you would have ballistically traveled over this length scale of the variation. The square of that time has to be divided by the characteristic collision time, and that tells you the time scale over which this kind of relaxation occurs. Yes?
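As a rough numerical illustration of this estimate, here is a minimal sketch in Python. The function names and the sample inputs (a nitrogen-like molecular mass, room temperature, a collision time of 1e-10 s) are assumptions chosen for illustration, not numbers from the lecture. Order-one factors, such as the 4 pi squared that appears for a sinusoidal profile, are dropped, as in the estimate above.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K


def kinematic_viscosity(tau_x, temperature, mass):
    """Diffusion coefficient of the shear profile, nu = tau_x * k_B * T / m."""
    return tau_x * K_B * temperature / mass


def shear_relaxation_time(length, tau_x, temperature, mass):
    """Order-of-magnitude relaxation time, lambda^2 / nu."""
    return length ** 2 / kinematic_viscosity(tau_x, temperature, mass)


if __name__ == "__main__":
    # Illustrative (assumed) inputs: N2-like mass in kg, 300 K, tau_x = 1e-10 s.
    mass, temperature, tau_x = 4.65e-26, 300.0, 1e-10
    for length in (1e-3, 1e-2, 1e-1):
        t = shear_relaxation_time(length, tau_x, temperature, mass)
        print(f"lambda = {length:g} m  ->  relaxation time ~ {t:.3g} s")
```

Because the time scales as lambda squared, doubling the size of the pattern makes it take four times as long to relax, which is the signature of diffusion as opposed to ballistic motion.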

AUDIENCE: So if x depends linearly, why? Would still get 0 on the-- why can't--

PROFESSOR: OK. So what you are setting up is a variation where there is some kind of a shear velocity that exists forever. So indeed, this kind of pattern will persist, unless you say that there's actually another wall at the other end. So then you will have some kind of a pattern that you would relax. So unless you're willing to send this all the way to infinity, it will eventually relax. Yes?

AUDIENCE: Can you just say again the last term on the second line, the 1/2 term. Where did that come from? Can you just explain that?

PROFESSOR: This term?

AUDIENCE: Yeah. The last half of it.

PROFESSOR: The last half of it. So what did we have here? So what I have is that over here, in calculating the pressure, when I'm looking at the xy component, I better find some element here that is off diagonal. What's the off diagonal element here? It is u xy. What is u xy? u xy is this, calculated for x and y. So it is 1/2 of dx uy, which is what I don't have, because my uy is 0. And the other half of it, from the symmetrization, is 1/2 of dy ux. Yes?

AUDIENCE: [INAUDIBLE] derivative [INAUDIBLE] for ux. You're starting out the second term uy uy. Also, shouldn't there be a term ux dx?

PROFESSOR: Yes. Yeah. So this is a-- yes. There is a term that is ux dx, but there is no variation along the x direction. I said that I set up a configuration where the only non-zero derivatives are along the y direction. But in general, yes. You are right. This is like a divergence. It has three terms, but the way that I set it up, only one term is non-zero.

AUDIENCE: Also, if you have [INAUDIBLE] one layer moving fast at some point, the other layer moving slower than that. And you potentially can create some kind of curl? But as far as I understand, this would be an even higher order effect? Like turbulence.

PROFESSOR: You said two things. One of them was you started saying viscosity. And indeed, what we've calculated here, this thing is the coefficient of viscosity. So this is really the viscosity of the material. So actually, once I include this term, I have the full Navier-Stokes equations with viscosity. So all of the vortices, et cetera, should also be present, and can be discussed in whatever way you want to do it with the Navier-Stokes equations, within these equations.

AUDIENCE: So components of speed which are not just x components, but other components which are initially zero, will change because of this, right?

PROFESSOR: Yes, that's right. That's right.

AUDIENCE: You just haven't introduced them in the equation?

PROFESSOR: Yes. So here I'm looking at the initial profile, in principle. I think here I have set up something that because of symmetry will always maintain this equation. But if you put in a little bump, if you make some perturbation, you will certainly generate other components of velocity. OK?

And so this resolved one of the modes that was not relaxing. There was another mode that we were looking at where I set up a situation where temperature and density were changing across the system, but their product-- that is, the pressure-- was uniform. So you had a system that was always there in the zeroth order, despite having different temperatures at two different points. Well, that was partly because the heat transport vector was zero.

Now, if I want to calculate this with this more complicated equation, well, what I need-- the problem before with the zeroth order was that I had three factors of c, and that was odd. But now, I have a term in the equation that is also odd. So from here, I will get a term, and I will in fact find eventually that the flow of heat is proportional to gradient of temperature. And one can compute this coefficient, again, in terms of mass, density, et cetera, just like we calculated this coefficient over here. This will give you relaxation.
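The heat-conduction coefficient mentioned here can be worked out along the same lines as the pressure tensor. The following is a sketch using the temperature-gradient term of the first order solution and the moments of the zeroth order Gaussian; the symbol K for the conductivity and the subscript 0 for zeroth order averages are notational assumptions.

```latex
\begin{align*}
h_\alpha &= \frac{n m}{2} \langle c_\alpha c^2 \rangle
   = -\frac{n m}{2}\, \tau_\times\, \frac{\partial_\beta T}{T}
     \left\langle c_\alpha c_\beta\, c^2 \left( \frac{m c^2}{2 k_B T} - \frac{5}{2} \right) \right\rangle_0, \\
\langle c_\alpha c_\beta\, c^2 \rangle_0 &= 5 \left( \frac{k_B T}{m} \right)^2 \delta_{\alpha\beta}, \qquad
\langle c_\alpha c_\beta\, c^4 \rangle_0 = 35 \left( \frac{k_B T}{m} \right)^3 \delta_{\alpha\beta}, \\
\left\langle c_\alpha c_\beta\, c^2 \left( \frac{m c^2}{2 k_B T} - \frac{5}{2} \right) \right\rangle_0
   &= \left( \frac{35}{2} - \frac{25}{2} \right) \left( \frac{k_B T}{m} \right)^2 \delta_{\alpha\beta}
   = 5 \left( \frac{k_B T}{m} \right)^2 \delta_{\alpha\beta}, \\
h_\alpha &= -K\, \partial_\alpha T, \qquad K = \frac{5\, n\, k_B^2\, T\, \tau_\times}{2 m}.
\end{align*}
```

The shear term of the first order solution is even in c, so it drops out of this average, just as the temperature-gradient term dropped out of the pressure tensor.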

We can also look at the sound modes including this, and you find that the wave equation that I had before for the sound modes will also get a second derivative term, and that will lead to damping of the modes of sound. So everything at this level, now we have some way of seeing how the gas will eventually come to equilibrium. And given some knowledge of rough parameters of the gas, like-- and most importantly-- what's the time between collisions, we can compute the typical relaxation time, and the relaxation manner of the gas.

So we are now going to change directions, and forget about time dependence. So if you have questions about this section, now may be a good time. Yes?

AUDIENCE: [INAUDIBLE] what is lambda?

PROFESSOR: OK. So I assume that what I have to do is to solve the equation for some initial condition. Let's imagine that that initial condition, let's say, is a periodic pattern of some wavelength lambda. Or it could be any other shape that has some characteristic dimension.

The important thing is that the diffusion constant has units of length squared over time. So eventually, you'll find that all times are proportional to some length squared times the inverse of the diffusion constant. And so you have to look at your initial system that you want to relax, identify the longest length scale that is involved, and then your relaxation time would be roughly of that order.

OK. So now we get to the fourth section of our course, that eventually has to do with statistical mechanics. And at the very, very first lecture, I wrote the definition for you that I will write again, that statistical mechanics is a probabilistic approach to equilibrium macroscopic properties of large numbers of degrees of freedom.

So what we did in thermodynamics was to identify what equilibrium macroscopic properties are. They are things such as identifying the energy, volume, number of particles of the gas. There could be other things, such as temperature, pressure, number, or other collections of variables that are independently sufficient to describe a macrostate. And what we're going to do is to indicate that macroscopic set of parameters that thermodynamically characterize the equilibrium system by big M. OK?

Clearly, for a probabilistic approach, you want to know something about the probability. And we saw that probabilities involving large numbers-- and actually various things involving large numbers-- had some simplified character that we are going to exploit. And lastly, in the last section, we have been thinking about microscopic description.

So these large number of degrees of freedom we said identify some point, let's say, for particles in a 6N-dimensional phase space. So this would for gas particles be this collection p and q. And that these quantities we know are in fact subject to dynamics that is governed by some Hamiltonian. And we saw that if we sort of look at the entirety of the probability in this 6N-dimensional phase space, that it is subject to Liouville's equation that said that d rho by dt is the Poisson bracket of H and rho.

And if the only thing that we take from equilibrium is that things should not change as a function of time, then requiring this probability in phase space to be independent of time would then require us to have a rho which is a function of H, which is defined in p and q, and potentially of other conserved quantities.
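The stationarity argument can be sketched in symbols as follows. The phase-space probability density is written as rho below to keep it distinct from the momenta p, and the sign convention chosen for the Poisson bracket is an assumption of this block.

```latex
\begin{align*}
\{A, B\} &\equiv \sum_{i} \left( \frac{\partial A}{\partial q_i} \frac{\partial B}{\partial p_i}
   - \frac{\partial A}{\partial p_i} \frac{\partial B}{\partial q_i} \right), \qquad
\frac{\partial \rho}{\partial t} = -\{\rho, H\} = \{H, \rho\}, \\
\rho_{\rm eq} &= \rho\big( H(\mathbf{p}, \mathbf{q}) \big)
   \quad\Longrightarrow\quad
   \{H, \rho_{\rm eq}\} = \rho'(H)\, \{H, H\} = 0
   \quad\Longrightarrow\quad
   \frac{\partial \rho_{\rm eq}}{\partial t} = 0.
\end{align*}
```

The same argument goes through if the density also depends on any other quantity whose Poisson bracket with H vanishes.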

So what we are going to do in statistical mechanics is to forget about how things eventually reach equilibrium. We spent a lot of time and energy thinking about how a gas reaches equilibrium. Having established what it requires to devise that solution in one particular case, and one of few cases where you can actually get far, we're going to ignore that.

We say that somehow my system reached this state that is time independent and is equilibrium, and therefore the probability should somehow have this character. And it depends, however, on what choice I make for the macrostate. So what I need here-- this was a probability of a microstate. So what I want to do is to have a statement about the probability of a microstate given some specification of the macrostate that I'm interested in. So that's the task.

And we're going to do that first in the ensemble-- and I'll tell you again what ensemble means shortly-- that's called microcanonical.

And if you recall, when we were constructing our approach to thermodynamics, one of the first things that we did was we said, there's a whole bunch of things that we don't know, so let's imagine that our box is as simple as possible. So the system that we are looking is completely isolated from the rest of the universe. It's a box. There's lots of things in the box.

But the box has no contact with the rest of the universe. So that was the case where essentially there was no heat, no work that was going into the system and was being exchanged, and so clearly this system has a constant energy. And so you can certainly prescribe a particular energy content to whatever is in the box.

And that's the chief identity of the microcanonical ensemble. It's basically a collection of boxes that represent the same equilibrium. So if there is essentially a gas, the volume is fixed, so that there would be no work that will be done. The number of particles is fixed, so that there is no chemical work that is being done.

So essentially, in general, all of the quantities that we identified with displacements are held fixed, as well as the energy in this ensemble. So the microcanonical ensemble would be essentially E, x, and N would be the quantities that parametrize the equilibrium state. Of course, there's a whole collection of different microstates that would correspond to the same macrostate here.

So presumably, this bunch of particles that are inside this explore a huge multidimensional space of microstates. And what I want to do is to assign a probability that, given that I have fixed E, x, and N, a particular microstate occurs. OK? How do I do that?

Well, I say that OK, if the energy of that microstate that I can calculate is not equal to the energy that I know I have in the box, then that's not one of the microstates that should be allowed in the box. But presumably there's a whole bunch of micro states whose energy is compatible with the energy that I put in the box.

And then I say, OK, if there is no other conserved quantity-- and let's assume that there isn't-- I have no way a priori of distinguishing between them. So they are just like the faces of the dice, and I say they're all equally likely. For the dice, I would give 1/6 here. There's presumably some kind of a normalization that depends on E, x, and n, that I have to put in.

And again, note that this is kind of like a delta function that I have. It's like a delta of H minus E, and it's therefore consistent with this Liouville equation. It's one of these functions that is related through H to the probability on the microstate.
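The assignment just described can be written in one line. In this sketch, mu labels a microstate (a point in phase space), H(mu) is its energy, and Omega is the normalization; the symbols mu and Omega are labels assumed for this block.

```latex
\begin{equation*}
p_{(E,\mathbf{x},N)}(\mu) = \frac{1}{\Omega(E,\mathbf{x},N)} \times
\begin{cases}
1 & \text{if } H(\mu) = E, \\
0 & \text{otherwise}.
\end{cases}
\end{equation*}
```

Since this depends on the microstate only through H, it is a stationary solution of the Liouville equation.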

Now, this is an assumption. It is a way of assigning probabilities. It's called assumption of equal a priori probabilities. It's like the subjective assignment of probabilities and like the faces of the dice, it's essentially the best that you can do without any other information. Now, the statement is that once I have made this assumption, I can derive the three laws of thermodynamics. No, sorry. I can derive two of the three laws of thermodynamics. Actually, three of the four laws of thermodynamics, since we had the zeroth law. So let's proceed. OK?

So we want to have a proof of thermodynamics. So the zeroth law had something to do with putting two systems in contact, and when they were in equilibrium, there was some empirical temperature from one that was the same as what you had for the other one. So basically, let's pick our two systems and put a wall between them that allows the exchange of energy. And so this is my system one, and it has some energy E1. This is the part that is two, it has energy E2.

So I start with an initial state where when I look at E1 and E2, I have some initial value of E1,0, let's say, and E2,0. So my initial state is here, in these two. Now, as I proceed in time, because the two systems can exchange energy, E1 and E2 can change. But certainly, what I have is that E1 plus E2 is something E total that is E1,0 plus E2,0. Which means that I'm always exploring the line that corresponds to E1 plus E2 is a constant, which runs at 45 degrees in this space.

So once I remove the constraint that E1 is fixed and E2 is fixed, they can exchange. They explore a whole bunch of other states that are available to them. And I would again say that the probability of the microstates is the same, up to some normalization. So now the entirety, 1 plus 2, is a microcanonical ensemble but with energy E total.

So there is a corresponding omega that is associated with the combined system. And to obtain that, all I need to do is to sum or integrate over the energy, let's say, that I have in the first one. The number of states that I would have, or the volume of phase space that I would have if I was at E1, and then simultaneously multiplying by how many states the second part can have at the energy that corresponds to E total minus E1.
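Written in symbols, the counting just described is a convolution of the two numbers of states. This is only a sketch: the labels Ω_1 and Ω_2 for the two parts are notation assumed here, and the energy is treated as a continuous variable.

```latex
\Omega(E_{\text{total}}) = \int dE_1 \, \Omega_1(E_1) \, \Omega_2(E_{\text{total}} - E_1)
```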

So what I need to do is to essentially multiply the number of states that I would encounter going along this axis, and the number of states that I would encounter going along the other axis. So let's try to sort of indicate those things with some kind of a color density. So let's say that the density is kind of low here, it becomes kind of high here, and then goes low here. If I move along this axis, let's say that along this axis, I maybe become high here, and stay high, and then become low later. Some kind of thing.

So all I need to do is to multiply these two colors and generate the color along this-- there is this direction, which I will plot going, hopefully, coming out of the board. And maybe that product looks something like this. Stating that somewhere there is a most probable state, and I can indicate the most probable state by E1 star and E2 star, let's say. So you may say, OK, it actually could be something, who says it should have one maximum. It could have multiple maxima, or things like that. You could certainly allow all kinds of things.

Now, my claim is that when I go and rely on this particular limit, I can state that if I explore all of these states, I will find my system in the vicinity of these energies with a probability that, in the N goes to infinity limit, becomes 1. And that kind of relies on the fact that each one of these quantities is really an exponentially large quantity.

Before I do that, I forgot to do something that I wanted to do over here. I have promised that as we go through the course, at each stage we will define for each section its own definition of entropy. Well, almost, but not quite that. Once I have a probability, I had told you how to define an entropy associated with a probability.

So here, I can say that when I have a probability p, I can identify minus the average of log p, the sum of minus p log p, to be the entropy of that probability. Kind of linked to what we had for mixing entropy, except that in thermodynamics, entropy had some particular units. It was related to heat or temperature. So we multiply by a quantity kb that has the right units of energy divided by degrees Kelvin.

Now, for the choice of the probability that we have, it is kind of like a step function in energy. The probability is either 0 or 1 over omega. So when you're at 0, this p log p will give you a 0. When you're over here, p log p will give you log of omega. So this is going to give you kb log of omega.
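As a worked version of this step (a sketch, with the index μ assumed here as a label for microstates and the states treated as discrete for the moment), only the Ω allowed states contribute, each with probability 1/Ω:

```latex
S = -k_B \sum_{\mu} p_\mu \ln p_\mu
  = -k_B \sum_{\mu = 1}^{\Omega} \frac{1}{\Omega} \ln \frac{1}{\Omega}
  = k_B \ln \Omega
```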

So we can identify in the microcanonical ensemble, once we've stated what E, x, and N are, what the analog of the six that we have for the throwing of the dice is: the number of microstates that are compatible with the energy. We will have to do a little bit of massaging to understand what that means in the continuum limit. We'll fix that. But once we know that number, essentially the log of that number up to a factor would give something that would be the entropy of that probability that eventually we're going to identify with the thermodynamic entropy that corresponds to this system.

But having done that definition, I can rewrite this quantity as the integral over E1 of e to the 1 over kb S1 plus 1 over kb S2. This one is evaluated at E1. This one is evaluated at E2, which is E total minus E1.

Now, the statement that we are going to gradually build upon is that these omegas are these types of quantities that I mentioned that depend exponentially on the number of particles. So let's say, if you just think about volume, one particle then, can be anywhere in this. So that's a factor of V. Two particles, V squared. Three particles, V cubed. N particles, V to the N. So these omegas have buried in them an exponential dependence on N. When you take the log, these are quantities that are extensive. They're proportional to this number N that becomes very large.

So this is one of those examples where to calculate this omega in total, I have to evaluate an integral where the quantities are exponentially large. And we saw that when that happens, I could replace this integral essentially, with its largest value. So I would have e to the 1 over kb times S1 of E1 star plus S2 of E2 star.
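A small numerical check can make the replacement of the integral by its largest value concrete. The sketch below is not from the lecture: it assumes, purely for illustration, that each part has S_i/kb = (3 N_i/2) ln E_i (an ideal-gas-like energy dependence), and the function name log_omega_total and the grid size are hypothetical choices. It compares the log of the integral over E1 with the largest exponent.

```python
import math

def log_omega_total(n1, n2, e_total=1.0, steps=20000):
    """Return (log of the integral over E1, largest exponent, E1 at the maximum).

    Assumed toy model: S_i / k_B = (3 N_i / 2) * ln(E_i) for each part.
    """
    a1, a2 = 1.5 * n1, 1.5 * n2
    de = e_total / steps
    # Interior grid points only, since the logarithm diverges at the endpoints.
    e1_values = [k * de for k in range(1, steps)]
    # Exponent (S1 + S2) / k_B evaluated with E2 = E_total - E1.
    exponents = [a1 * math.log(e1) + a2 * math.log(e_total - e1) for e1 in e1_values]
    peak = max(exponents)
    e1_star = e1_values[exponents.index(peak)]
    # Log of the Riemann sum, with the peak factored out to avoid underflow.
    log_integral = peak + math.log(sum(math.exp(x - peak) for x in exponents) * de)
    return log_integral, peak, e1_star

for n in (10, 100, 1000, 10000):
    log_int, peak, e1_star = log_omega_total(n, 2 * n)
    # Relative difference between log(integral) and the largest exponent.
    print(n, e1_star, abs(log_int - peak) / abs(peak))
```

In this toy model the relative difference between the two logarithms shrinks as the particle numbers grow, and the maximum sits near E1 = E total / 3 when the second part has twice as many particles.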

Now, how do I identify where E1 star and E2 star are? Well, given that I scan along this axis or along that axis, essentially I want to find locations where the exponent is the largest. So how do I find those locations? I essentially set the derivative to 0. So if I take the derivative of this with respect to E1, what do I get? I get dS1 by dE1.

And from here, I get dS2 with respect to an argument that is in fact has a minus E1, so I would have minus dS2 with respect to its own energy. With respect to its own energy argument, but evaluated with an energy argument that goes opposite way with E1. And all of these are calculated at conditions where the corresponding x's and n's are fixed. And this has to be 0, which means that at the maxima, essentially, I would have this condition.
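In symbols, the condition described here can be sketched as follows, with E_2 = E_total - E_1 and the subscripts on x and N assumed as labels for the quantities held fixed in each part:

```latex
\frac{\partial}{\partial E_1} \left[ S_1(E_1) + S_2(E_{\text{total}} - E_1) \right]
  = \left. \frac{\partial S_1}{\partial E_1} \right|_{x_1, N_1}
  - \left. \frac{\partial S_2}{\partial E_2} \right|_{x_2, N_2} = 0
\quad \Longrightarrow \quad
\left. \frac{\partial S_1}{\partial E_1} \right|_{x_1, N_1}
  = \left. \frac{\partial S_2}{\partial E_2} \right|_{x_2, N_2}
```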

So again, because of the exponential character, there could be multiple of these maxima. But if one of them is slightly larger than the other in absolute terms, in terms of the intensive quantities, once I multiply by these n's, it will be exponentially larger than the others. We sort of discussed that when we were doing the saddle point approximation, how essentially the best maximum is exponentially larger than all the others. And so in that sense, it is exponentially much more likely that you would be here as opposed to here and anywhere else.

So the statement, again, is just a matter of probabilities. I don't say what's the dynamics by which the energies can explore these two axes. But I imagine like shuffling cards, the red and the black cards have been mixed sufficiently, and then you ask a question about the typical configuration.

And in typical configurations, you don't expect a run of ten black cards, or whatever, because they're exponentially unlikely. So this is the same statement, that once you allow this system to equilibrate its energy, after a while you look at it, and with probability 1 you will find it at a location where these derivatives are the same.

Now, each one of these derivatives is completely something that pertains to its own system, so one of them could be a gas. The other could be a spring. It could be anything. And these derivatives could be very different quantities. But this equality would hold. And that is what we have for the zeroth law.

That when systems come into equilibrium, there is a function of parameters of one that has to be the same as the function of the parameters of the other, which we call the empirical temperature. So in principle, we could define any function of temperature, and in practice, for consistency, that function is one over the temperature.

So the zeroth law established that there exists this empirical function. This choice of 1 over T is so that we are aligned with everything else that we have done so far, as we will see shortly. OK?

The next thing is the first law, which had something to do with the change in the energy of the system: when we go from one state to another state, it had to be made up by a combination of heat and work. So for this, let's expand our system to allow some kind of work. So let's imagine that I have something-- let's say it is a gas-- and the quantity that would change if there's work done on the gas is the volume.

So let's imagine that there is a piston that can slide. And this piston exerts on the gas a pressure, or in general, whatever the conjugate variable is to the displacement that we now allow to change. So there's potentially a change in the displacement.

So what we are doing is there's work that is done that corresponds to j delta x. So what happens if this j delta x amount of work is done on the system? Then the system goes from one configuration that is characterized by x to another configuration that is characterized by x plus delta x.

And let's see what the change in entropy is when that happens. Change in entropy, or the log of the number of states that we have defined. And so this is essentially the change starting from E and x to going to the case where x changed by an amount dx. But through this process I did work, and so the amount of energy that is inside the system increases by an amount that is j delta x.

So if all of these quantities are infinitesimal, I have two arguments of S now have changed infinitesimally, and I can make the corresponding expansion in derivatives. There is a dS with respect to dE at constant x. The corresponding change in E is j delta x. But then there's also a dS by dx at constant E with the same delta x, so I factored out the delta x between the two of them.

Now, dS by dE at constant x is related to the empirical temperature, which we now set to be 1 over T. Now, the claim is that if you have a situation such as this, where you have a state that is sitting in equilibrium-- let's say a gas with a piston-- then by definition of equilibrium, the system does not spontaneously change its volume.

But it would change its volume if the number of states that is available to it increased, that is, if S increased. So in order to make sure that you don't go to a state that is more probable, because there are more possibilities, there are more microstates, I better make sure that this first derivative is 0. Otherwise, depending on whether this is plus or minus, I could make a corresponding change in delta x that would increase S.

So what happens here if I require this to be 0 is that I can identify now that the derivative of this quantity dS by dx at constant E has to be minus j over T. Currently, I have only these two variables, and quite generically, I can say that dS is dS by dE at constant x dE plus dS by dx at constant E dx. And now I have identified these two derivatives. dS by dE is 1 over T, so I have dE over T. dS by dx is minus j over T, so I have minus j dx over T. And I can rearrange this, and I see that dE is T dS plus j dx.
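The chain of steps leading to this result can be collected in one place. This is a sketch that uses only the quantities named above (S, E, x, J, T) and adds nothing new:

```latex
\delta S = S(E + J \, \delta x, \; x + \delta x) - S(E, x)
  = \left( \left. \frac{\partial S}{\partial E} \right|_x J
         + \left. \frac{\partial S}{\partial x} \right|_E \right) \delta x = 0
\quad \Longrightarrow \quad
\left. \frac{\partial S}{\partial x} \right|_E = -\frac{J}{T}

dS = \left. \frac{\partial S}{\partial E} \right|_x dE
   + \left. \frac{\partial S}{\partial x} \right|_E dx
   = \frac{dE}{T} - \frac{J \, dx}{T}
\quad \Longrightarrow \quad
dE = T \, dS + J \, dx
```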

Now, the j dx I recognize as before, it's the mechanical work. So I have identified that generically, when you make a transformation, in addition to mechanical work, there's a component that changes the energy that is the one that we can identify with the heat. Yes?

AUDIENCE: Could you explain why you set delta S to 0?

PROFESSOR: Why did I set delta S to 0? So I have a box, and this box has a certain volume, and there's a bunch of particles here, but the piston that is holding this is allowed to slide to go up and down. OK? Now, I can ask what happens if the volume goes up and down.

If the volume goes up and down, how many states are available? So basically, the statement has been that for each configuration, there is a number of states that is available. And if I were to change that configuration, the number of states will change. And hence the log of it, which is the entropy, will change.

So how much does it change? Well, let's see what arguments changed. So certainly, the volume changed. So x went to x plus dx. And because I did some amount of work, P dV, the amount of energy that was in the box also changed. OK? So after this transformation with this thing going up or down, what is the new logarithm of the number of states? How much has it changed? And how much of the change is given by this?

Now, suppose that I make this change, and I find that suddenly I have a thousand more states available to me. It's a thousand times more likely that I will accept this change. So in order for me to be penalized-- or actually, not penalized, or not gain-- because I make this transformation, this change better be 0. If it is not 0, if this quantity is, let's say, positive, then I will choose a delta v that is positive, and delta S will become positive. If this quantity happens to be negative, then I will choose a delta v that is negative, and then delta S will be positive again.

So the statement that this thing originally was sitting there by itself in equilibrium and did not spontaneously go up or down is the statement that this derivative is 0.

AUDIENCE: That's only 0 when the system is in equilibrium?

PROFESSOR: Yes. Yes. So indeed, this is then a relationship that involves parameters of the system in equilibrium. Yeah? So there is thermodynamically, we said that once I specify what my energy and volume and number of particles are in equilibrium, there is a particular S. And what I want to know is if I go from one state to another state in equilibrium, what is the change in dS? That's what this statement is. Then I can rearrange it and see that when I go from one equilibrium state to another equilibrium state, I have to change internal energy, which I can do either by doing work or by doing heat. OK?

Now, the second law is actually obvious. I have stated that I start from this configuration and go to that configuration simply because of probability. The probabilities are inverse of this omega, if you like. So I start with some location-- like the pack of cards, all of the black on one side, all of the red on the other side-- and I do some dynamics, and I will end up with some other state.

I will, because why? Because that state has a much larger volume of possibilities. There is a single state that we identify as all-black and all-red, and there's a myriad of states that we say are all randomly mixed, and so we're much more likely to find the system in that subset of states.
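The card picture can be made quantitative with a short count. This sketch assumes a standard 52-card deck with 26 red and 26 black cards, tracks only the colors, and treats every color sequence as equally likely after a thorough shuffle; the variable names are illustrative.

```python
import math

# Number of distinct red/black color sequences for 26 red and 26 black cards.
total_sequences = math.comb(52, 26)

# Fully separated sequences: all black then all red, or all red then all black.
separated_sequences = 2

# Probability that a well-shuffled deck is found fully separated by color.
probability_separated = separated_sequences / total_sequences

print(total_sequences)
print(probability_separated)
```

The number of color sequences is of order 10^14, so the fully separated arrangements are a negligible fraction of what the shuffled deck can explore.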

So I have certainly stated that S1 evaluated at E1 star, plus S2 evaluated at E2 star, essentially the peak of that object, is much larger than S1 evaluated at E1,0 plus S2 [INAUDIBLE].

And more specifically, if we sort of follow the course of the system as it starts from here, you will find that it will basically go in the direction always such that the derivatives that are related to temperature are such that the energy will flow from the hotter to the colder body, consistent with what we expect from thermodynamics.
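One way to see the direction of the flow is the following sketch. It assumes a small transfer of energy δE_1 into system one (so δE_2 = -δE_1) and uses the identification of dS by dE with 1 over T for each part:

```latex
\delta S = \left( \frac{\partial S_1}{\partial E_1} - \frac{\partial S_2}{\partial E_2} \right) \delta E_1
  = \left( \frac{1}{T_1} - \frac{1}{T_2} \right) \delta E_1 \geq 0
```

If T_1 is larger than T_2, the bracket is negative, so the total entropy can only increase when δE_1 is negative, that is, when energy leaves the hotter body.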

The one law of thermodynamics that I cannot prove from what I have given you so far is the third law. There is no reason why the entropy should go to 0 as you go to 0 temperature within this perspective.

So it's good to look at our canonical example, which is the ideal gas. Again, for the ideal gas, if I am in a microcanonical ensemble, it means that I have told you what the energy of the box is, what the volume of the box is, and how many particles are in it. So clearly the microstate that corresponds to that is the collection of the 6N coordinates and momenta of the particles in the box.

And the energy of the system is made up by the sum of the energies of all of the particles, and basically, ideal gas means that the energy that we write is the sum of the contributions that you would have from individual particles. So individual particles have a kinetic energy and some potential energy, and since we are stating that we have a box of volume v, this u essentially represents this box of volume. It's a potential that is 0 inside this volume and infinity outside of it. Fine.

So you ask what's the probability of some particular microstate, given that I specified E, V, and N. I would say, well, OK. This is, by the construction that we had, 0 or 1 over some omega, depending on whether the energy is right. And since for the box, the energy is made up of the kinetic energy, it is 0 if sum over i p i squared over 2m is not equal to E, or any particle's q is outside the box. And it is 1 over omega if sum over i p i squared over 2m is equal to E and the q i's are all inside the box.
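Written as a formula, this assignment can be sketched as follows, with μ assumed here as a label for the microstate, that is, the full set of momenta and coordinates:

```latex
p(\mu \mid E, V, N) = \frac{1}{\Omega(E, V, N)} \times
\begin{cases}
1 & \text{if } \sum_i \frac{\vec{p}_i^{\,2}}{2m} = E \text{ and every } \vec{q}_i \text{ is inside the box}, \\
0 & \text{otherwise}
\end{cases}
```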

So a note about normalization. I kind of skipped on this over the last one. p is a probability in this 6N dimensional phase space. So the normalization is that the integral over all of the p i's and qi's should be equal to 1.

And so this omega, more precisely, when we are not talking about discrete numbers, is the quantity that I have to put so that when I integrate this over phase space, I will get 1. So basically, if I write this in the form that I have written, you can see that omega is obtained by integrating over the p i's and q i's that correspond to these accessible states.

It's kind of like a delta function. Basically, there's huge portions of phase space that are 0 probability, and then there's a surface that has the probability that is 1 over omega and it is the area of that surface that I have to ensure is appearing here, so that when I integrate over that area of one over that area, I will get one.

Now, the part that corresponds to the q coordinates is actually very simple, because the q coordinates have to be inside the box. Each one of them has a volume V, so the coordinate part of the integral gives me V to the N. OK? Now, the momentum part, the momenta are constrained by something like this. I can rewrite that as sum over i p i squared equals 2mE.

And if I regard this 2mE as something like r squared, like a radius squared, you can see that this is like the equation that you would have for the generalization of the sphere in 3N dimensions. Because x squared plus y squared equals r squared is a circle, x squared plus y squared plus z squared equals r squared is a sphere, so this is the hypersphere in 3N dimensions.

So when I do the integrations over p, I have to integrate over the surface of a 3N dimensional hypersphere of radius square root of 2mE. OK? And there is a simple formula for that. This surface area, in general, is going to be proportional to r raised to the power of dimension minus 1.

And there's the generalization of 2 pi r or 4 pi r squared that you would have in two or three dimensions, which we will discuss next time. It is 2 pi to the power of 3N over 2 divided by 3N over 2 minus 1 factorial. So we have the formula for this omega. We will re-derive it and discuss it next time.
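The surface-area formula can be checked against the familiar two- and three-dimensional cases. The sketch below uses the standard result that a sphere of radius r in d dimensions has surface area 2 pi^(d/2) r^(d-1) / Gamma(d/2), where Gamma(d/2) equals (d/2 - 1) factorial. The function names are hypothetical, and log_omega_ideal_gas simply multiplies V to the N by that area with d = 3N and r = square root of 2mE, ignoring the shell thickness and units that are left for next time.

```python
import math

def sphere_surface_area(d, r=1.0):
    """Surface area of a sphere of radius r in d dimensions."""
    return 2.0 * math.pi ** (d / 2) * r ** (d - 1) / math.gamma(d / 2)

def log_omega_ideal_gas(n, volume, energy, mass=1.0):
    """Log of V^N times the area of the 3N-dimensional momentum sphere.

    Assumption: shell thickness and units of phase space are ignored.
    """
    d = 3 * n
    radius_squared = 2.0 * mass * energy
    log_area = (math.log(2.0) + (d / 2) * math.log(math.pi)
                + ((d - 1) / 2) * math.log(radius_squared)
                - math.lgamma(d / 2))
    return n * math.log(volume) + log_area

# Sanity checks against the circle (2 pi r) and the ordinary sphere (4 pi r^2).
print(sphere_surface_area(2, 1.5), 2 * math.pi * 1.5)
print(sphere_surface_area(3, 1.5), 4 * math.pi * 1.5 ** 2)
```

Working with the logarithm (via lgamma) avoids overflow, since the factorial and the powers of V and 2mE are all exponentially large in N.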