Saturday, January 11, 2020

thermodynamics - Information entropy and physics correlation



I'm starting to study information theory, and the concept of entropy is not clear to me; I have already checked other posts in this category, but none of them seems to resolve my doubts, so I have some questions:





  1. Is the Shannon entropy equivalent to the thermodynamic entropy? I have read both answers, yes and no, and still don't understand; many people say "yes", but Shannon entropy is about symbols, while thermodynamic entropy is about micro-states, which in turn are tied to temperature (and Shannon entropy couldn't care less about temperature). However, the Gibbs entropy formula is exactly the same as the Shannon entropy formula, so I could not settle the matter in my head.




  2. What exactly is the difference between the Boltzmann, Shannon, Gibbs and von Neumann concepts of entropy? I have read in another topic here that Shannon entropy measures the "minimum number of yes/no questions needed to fully specify a system", but how could a physical system obey this? For example, if the entropy of a volume of gas is $x$, what questions could I ask to "fully specify" this gas?




  3. If these entropies are related, how could I convert J/K (the thermodynamic unit) to bits (the Shannon unit)? And if one uses $\ln$ instead of $\log_{2}$, the unit would be nats; I understand that information is a way to measure differences between things, and it is clear to me that a bit is the minimum amount of information, since it distinguishes between 2 things, but what would a nat measure in this case? If a bit distinguishes between 2 things, a nat would distinguish between 2.718 things (I can't make sense of that).





I've already searched many books and sites, and asked my professor, but I still don't have a clue about these topics, so any hint will be much appreciated...



Answer



I hope that my answers below will all be helpful.




  1. There is more than one way to think about this, but the one I find most helpful is to think of thermodynamic entropy as a specific instance of Shannon entropy. Shannon entropy is defined by the formula $$ H = -\sum_i p_i \log p_i, $$ but this formula has many different applications, and the symbols $p_i$ have different meanings depending on what the formula is used for. Shannon thought of them as the probabilities of different messages or symbols being sent over a communication channel, but Shannon's formula has since found plenty of other applications as well. One specific thing you can apply it to is the set of microscopic states of a physical system. If the probabilities $p_i$ represent the equilibrium probabilities for a thermodynamic system to be in microscopic state $i$, then you have the thermodynamic entropy. (Very often it is multiplied by Boltzmann's constant in this case, to put it into units of $\mathrm{J\,K^{-1}}$ --- see below.) If they represent something else (such as, for example, a non-equilibrium ensemble) then you just have a different instance of the Shannon entropy. So in short, the thermodynamic entropy is a Shannon entropy, but not necessarily vice versa.


    (One should note, though, that this isn't the way it developed historically --- the formula was in use in physics before Shannon realised that it could be generalised, and the entropy was a known quantity before that formula was invented. For a very good overview of the historical development of information theory and physics, see Jaynes' paper "Where do we stand on maximum entropy?" It is very long, and quite old, but well worth the effort.)
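    To make this concrete, here is a minimal numerical sketch of the point that the thermodynamic entropy is just Shannon's formula evaluated on equilibrium probabilities. (The two-level system, its energy gap and the temperature are invented purely for illustration; only $k_B$ and the formula itself come from the answer above.)

    ```python
    import numpy as np

    k_B = 1.380649e-23  # Boltzmann constant, J/K

    def shannon_entropy(p, base=np.e):
        """Shannon entropy H = -sum_i p_i log(p_i) of a discrete distribution."""
        p = np.asarray(p, dtype=float)
        p = p[p > 0]                                  # treat 0*log(0) as 0
        return -np.sum(p * np.log(p)) / np.log(base)

    # Illustrative equilibrium distribution: a two-level system with an
    # (invented) energy gap dE at an (invented) temperature T.
    dE, T = 1.0e-21, 300.0                            # J, K
    weights = np.array([1.0, np.exp(-dE / (k_B * T))])  # Boltzmann factors
    p = weights / weights.sum()                       # equilibrium probabilities

    H_nats = shannon_entropy(p)       # Shannon entropy of this ensemble, in nats
    S_thermo = k_B * H_nats           # the same quantity as a thermodynamic entropy, J/K
    print(H_nats, S_thermo)
    ```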




  2. The paper linked above will also help with this. Essentially, the Shannon entropy is the formula quoted above; the Gibbs entropy is that same formula applied to the microscopic states of a physical system (which is why it is sometimes called the Gibbs-Shannon entropy); the Boltzmann entropy is $\log W$, where $W$ is the number of accessible microstates --- a special case of the Gibbs-Shannon entropy that was historically discovered first; and the von Neumann entropy is the quantum version of the Gibbs-Shannon entropy.
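    (A worked step, not in the original answer, to make the "special case" claim explicit: if all $W$ microstates are equally likely, so that $p_i = 1/W$, the Gibbs-Shannon formula gives $$ H = -\sum_{i=1}^{W} \frac{1}{W}\log\frac{1}{W} = \log W, $$ which is Boltzmann's expression; multiplying by $k_B$ gives the familiar $S = k_B \ln W$.)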





  3. This is straightforward. The physical definition of the entropy is $$ S = -k_B \sum_i p_i \log p_i, $$ where the logarithms have base $e$, and $k_B \approx 1.38\times 10^{-23}\,\mathrm{J\,K^{-1}}$ is Boltzmann's constant. Physicists generally consider $\log p_i$ to be unitless (rather than having units of nats), so the expression has units of $\mathrm{J\,K^{-1}}$ overall. Comparing this to the definition of $H$ above (with units of nats) we have $$ 1\,\mathrm{nat} = k_B\;\mathrm{J\,K^{-1}}, $$ i.e. the conversion factor is just Boltzmann's constant.


    If we want to express $H$ in bits then we have to change the base of the logarithm from $e$ to 2, which we do by dividing by $\ln 2$: $$ H_\text{bit} = -\sum_i p_i \log_2 p_i = -\sum_i p_i \frac{\ln p_i}{\ln 2} = \frac{H_\text{nat}}{\ln 2}. $$ So we have $$ 1\,\mathrm{bit} = \ln 2\,\,\mathrm{nat}, $$ and therefore $$ 1\,\mathrm{bit} = k_B\ln 2\;\mathrm{J\,K^{-1}} \approx 9.57\times 10^{-24}\,\mathrm{J\,K^{-1}}. $$
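    As a quick numerical check of these conversion factors (a small sketch; the CODATA value of $k_B$ is the only input):

    ```python
    import math

    k_B = 1.380649e-23                    # Boltzmann constant, J/K

    nat_in_J_per_K = k_B                  # 1 nat of entropy corresponds to k_B in J/K
    bit_in_J_per_K = k_B * math.log(2)    # 1 bit = ln 2 nats

    print(f"1 nat ~ {nat_in_J_per_K:.3e} J/K")
    print(f"1 bit ~ {bit_in_J_per_K:.3e} J/K")   # ~9.57e-24 J/K, as quoted above
    ```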


    You will see this conversion factor, for example, in Landauer's principle, according to which erasing one bit requires at least $k_B T \ln 2$ joules of energy. This is really just saying that deleting a bit (and therefore lowering the entropy by one bit) requires raising the entropy of the heat bath by one bit, or $k_B \ln 2\;\mathrm{J\,K^{-1}}$. For a heat bath at temperature $T$ this can be done by raising its energy by $k_B T \ln 2$ joules.
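    For concreteness, here is the Landauer bound evaluated numerically at room temperature (the choice $T = 300\,\mathrm{K}$ is illustrative, not from the answer above):

    ```python
    import math

    k_B = 1.380649e-23    # Boltzmann constant, J/K
    T = 300.0             # K; an illustrative room-temperature value

    E_min = k_B * T * math.log(2)    # minimum energy dissipated to erase one bit
    print(f"Erasing one bit at {T:.0f} K costs at least {E_min:.2e} J")   # ~2.9e-21 J
    ```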


    As for the intuitive interpretation of nats, this is indeed a little tricky. The reason nats are used is that they're mathematically more convenient. (If you take the derivative you won't get factors of $\ln 2$ appearing all the time.) But it doesn't make nice intuitive sense to think of distinguishing between 2.718 things, so it's probably better just to think of a nat as $\frac{1}{\ln 2} \approx 1.44$ bits, and remember that it's defined that way for mathematical convenience.



