Friday, October 9, 2015

thermodynamics - Boltzmann distribution derivation from maximum entropy principle


I'm stuck halfway through a derivation of the Boltzmann distribution using the principle of maximum entropy.


Let us consider a particle that may occupy any discrete energy level $E_i$. The probability of occupying this level is $p_i$. I wish to now maximize the entropy $H = -\sum_i p_i \log(p_i)$, subject to the constraints $\sum_i p_i = 1$ and $\sum_i p_i E_i = \mu$. That is, the average energy is known.


I write the Lagrangian $L = -\sum_i p_i \log(p_i) + \eta\left(\sum_i p_i - 1\right) + \lambda\left(\sum_i p_i E_i - \mu\right)$. With the method of Lagrange multipliers, I can set $\frac{\partial L}{\partial p_j} = 0$, $\frac{\partial L}{\partial \eta} = 0$ and $\frac{\partial L}{\partial \lambda} = 0$.
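
Explicitly, the derivative with respect to $p_j$ reads
$$\frac{\partial L}{\partial p_j} = -\log(p_j) - 1 + \eta + \lambda E_j = 0 ,$$
while the other two derivatives simply reproduce the two constraints.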


From the derivatives, I get


$$p_i = e^{\eta - 1}\, e^{\lambda E_i}$$


$$\sum_i e^{\eta - 1}\, e^{\lambda E_i}\, E_i = \mu$$


$$\sum_i e^{\eta - 1}\, e^{\lambda E_i} = 1$$
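
Dividing the energy constraint by the normalization condition eliminates the common factor $e^{\eta - 1}$ and leaves a single implicit equation for $\lambda$ alone,
$$\frac{\sum_i E_i\, e^{\lambda E_i}}{\sum_i e^{\lambda E_i}} = \mu ,$$
which in general has no closed-form solution.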



1) I'm not sure how to complete the argument and find $\lambda$ and $\eta$. I seem to get a sign error or something else is wrong. How do I get to $\lambda = 1/\mu$?


2) Can someone show how to extend this argument to continuous energies?



Answer



Your target formula $\lambda = 1/\mu$, where $\mu$ is the average energy, is wrong.


(NB throughout this answer, I'm following your usage of $\mu$ as the specified "mean value" of the energy; any casual readers should note that it is not the chemical potential).


This derivation goes back to two papers of Jaynes: Phys. Rev. 106, 620 (1957) and Phys. Rev. 108, 171 (1957). I've also seen copies of these papers made available online; perhaps you can search for them. It is also fairly standard in textbooks.


As with many Lagrange multiplier questions, direct solution of the equations to get values of the multipliers themselves is avoided. For the "normalization" multiplier, $\eta$ in your notation, it is more convenient to replace it by the partition function $Z$, writing
$$p_i = \frac{\exp(-\lambda \mathcal{E}_i)}{Z}, \quad\text{where}\quad Z = \sum_i \exp(-\lambda \mathcal{E}_i) .$$
The remaining multiplier $\lambda$ is eventually related to the derivative of the maximized entropy with respect to the average energy $\mu$.


In the language of those two papers, given constraints on the average values of several functions $\langle f_1(x)\rangle$, $\langle f_2(x)\rangle$, etc., where $x$ takes discrete values $x_i$ characterizing each state $i$, and $\langle f(x)\rangle$ is short for $\sum_i p_i \, f(x_i)$, the probability distribution which maximizes the entropy can be written
$$p_i = \exp[-(\lambda_0+\lambda_1 f_1(x_i)+\lambda_2 f_2(x_i)+\ldots)] ,$$
where $\lambda_0=\ln Z$ and the other $\lambda_i$ are the remaining Lagrange multipliers. Noting from this that $-\ln p_i = \lambda_0+\lambda_1 f_1(x_i)+\lambda_2 f_2(x_i)+\ldots$ and substituting into $S=-\sum_i p_i \ln p_i$ (your function $H$), it follows that the maximum value of the entropy can be expressed as
$$S_{\text{max}} = \lambda_0 + \lambda_1 \langle f_1(x)\rangle + \lambda_2 \langle f_2(x)\rangle + \ldots$$
From this,
$$\lambda_i = \frac{\partial S_{\text{max}}}{\partial \langle f_i(x)\rangle} .$$


In your case,
$$S_{\text{max}} = \ln Z + \lambda \langle E\rangle = \ln Z + \lambda \mu ,$$
and this eventually leads to
$$\lambda =\frac{\partial S_{\text{max}}}{\partial \mu} =\frac{\partial S_{\text{max}}}{\partial \langle E\rangle} = \frac{1}{T} ,$$
thereby defining the temperature, as it is used in standard thermodynamics (all in units of $k_B=1$).
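
If it helps to make these relations concrete, here is a minimal numerical sketch (not part of the original argument; the energy levels and the value of $\mu$ are made-up illustrative numbers). Using the convention $p_i = e^{-\lambda \mathcal{E}_i}/Z$ above, it root-finds $\lambda$ from the mean-energy constraint and then checks the identity $S_{\text{max}} = \ln Z + \lambda\mu$ numerically.

    # Minimal numerical sketch (illustrative values, not from the original post):
    # pick some energy levels and a target mean energy mu, root-find the multiplier
    # lambda in p_i = exp(-lambda*E_i)/Z, and check S_max = ln Z + lambda*mu.
    import numpy as np
    from scipy.optimize import brentq

    E = np.array([0.0, 1.0, 2.0, 3.5])   # hypothetical energy levels
    mu = 1.2                              # specified average energy, between min(E) and max(E)

    def mean_energy(lam):
        """Average energy under p_i = exp(-lam*E_i)/Z."""
        w = np.exp(-lam * E)
        return np.dot(E, w) / w.sum()

    # Solve <E>(lambda) = mu for lambda; this bracket is wide enough for these values.
    lam = brentq(lambda l: mean_energy(l) - mu, -10.0, 10.0)

    p = np.exp(-lam * E)
    Z = p.sum()
    p /= Z
    S_max = -np.sum(p * np.log(p))

    # The identity S_max = ln Z + lambda*mu holds, with dS_max/dmu = lambda = 1/T.
    print(lam, S_max, np.log(Z) + lam * mu)

Repeating the calculation with a slightly shifted $\mu$ gives a finite-difference check that $\partial S_{\text{max}}/\partial \mu = \lambda$.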


Generalizing to the continuum case is discussed on the relevant Wikipedia page; it involves the calculus of variations, with summations replaced by integrals and Lagrange multipliers playing their usual role. It is necessary, though, to introduce a measure for the integration. You might formulate the problem as an integral over coordinates $x$, over coordinates and momenta $(x,p)$, or over energies. In any case, you need to consider the density of states (per unit of whatever the integration variable is).
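
As a sketch of one common way to set this up (the measure $g(x)$ is whatever density of states suits the chosen integration variable $x$), one maximizes
$$S[p] = -\int p(x)\,\ln\!\frac{p(x)}{g(x)}\,\mathrm{d}x \qquad\text{subject to}\qquad \int p(x)\,\mathrm{d}x = 1, \quad \int p(x)\,E(x)\,\mathrm{d}x = \mu ,$$
and the stationarity condition of the constrained functional gives
$$p(x) = \frac{g(x)\,e^{-\lambda E(x)}}{Z}, \qquad Z = \int g(x)\,e^{-\lambda E(x)}\,\mathrm{d}x ,$$
with $\lambda$ again fixed implicitly by the mean-energy constraint.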


