I have been trying to understand general relativity from a first-principles perspective in my spare time, and I have been unable to find a convincing derivation of the Einstein equations. The most complete one I can find is the one on Wikipedia, but it has a big mathematical gap that I can't figure out. Namely, when computing the variation of the Riemann curvature tensor, the author assumes that the variation operator is a derivation, i.e. satisfies the product rule for derivatives. This seems to be false, because the variation in question is not itself an ordinary derivative, but rather the Euler-Lagrange "derivative", whose definition for a function of the (inverse) metric and its first two partials (like the Riemann tensor) is
$$ \frac{\delta \mathcal{L}(g^{ij}, \partial_k g^{ij}, \partial_l \partial_k g^{ij})}{\delta g^{ij}} = \frac{\partial \mathcal{L}}{\partial g^{ij}} - \partial_k \frac{\partial \mathcal{L}}{\partial(\partial_k g^{ij})} + \partial_l \partial_k \frac{\partial \mathcal{L}}{\partial(\partial_l \partial_k g^{ij})}. $$
The second and third terms do not satisfy the product rule. In the linked derivation the author almost appears to be taking plain partial derivatives with respect to the inverse metric, which would be entirely wrong. And yet that derivation cites Carroll's textbook, so it must have some credibility. I don't have the textbook, so I can't check whether it explains this logic more completely. Therefore I turn to Physics.SE. What's going on here?
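As a sanity check of the claim that the Euler-Lagrange operator fails the product rule, here is a minimal SymPy sketch in a one-dimensional toy setting. It assumes a single field $u(x)$ in place of the inverse metric and a Lagrangian with no explicit $x$-dependence. The symbols `u0`, `u1`, `u2`, ... stand for $u$, $u'$, $u''$, ..., and the helper names `total_derivative` and `euler_lagrange` are my own, not part of any library.

```python
import sympy as sp

# Jet coordinates (illustrative assumption): U[k] is the k-th x-derivative of u(x),
# treated as an independent symbol when taking partial derivatives.
N = 5
U = sp.symbols(f"u0:{N}")

def total_derivative(F):
    """Total x-derivative via the chain rule; valid while F does not involve U[N-1]."""
    return sum(sp.diff(F, U[k]) * U[k + 1] for k in range(N - 1))

def euler_lagrange(L):
    """dL/du - D(dL/du') + D(D(dL/du'')) for L depending on u, u', u''."""
    first = total_derivative(sp.diff(L, U[1]))
    second = total_derivative(total_derivative(sp.diff(L, U[2])))
    return sp.expand(sp.diff(L, U[0]) - first + second)

F = U[1]  # u'
G = U[1]  # u'

lhs = euler_lagrange(F * G)                                     # operator applied to the product
rhs = sp.expand(F * euler_lagrange(G) + G * euler_lagrange(F))  # what a product rule would give

print(lhs)  # -2*u2
print(rhs)  # 0
```

Already for $\mathcal{L} = (u')^2$ the two sides disagree, so at least in this toy case the operator is not a derivation.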
Answer
After pondering Michael Seifert's answer, I have realized what the full resolution of my problem is. The issue is that the expression $\delta \mathcal{L}$, which is defined to be
$$ \delta \mathcal{L} = \frac{\partial \mathcal{L}}{\partial g^{ij}} \delta g^{ij} + \frac{\partial \mathcal{L}}{\partial (\partial_k g^{ij})} \partial_k (\delta g^{ij}) + \frac{\partial \mathcal{L}}{\partial (\partial_l \partial_k g^{ij})} \partial_l \partial_k (\delta g^{ij}),$$
must not be confused with the functional derivative $\frac{\delta\mathcal{L}}{\delta g^{ij}}$, even though the analogous identification is harmless for ordinary differentials. This is because we don't have the linear approximation
$$ \delta \mathcal{L} = \frac{\delta\mathcal{L}}{\delta g^{ij}} \delta g^{ij} $$
that holds for ordinary differentials, but rather
$$ \delta \mathcal{L} = \frac{\delta\mathcal{L}}{\delta g^{ij}} \delta g^{ij} + \partial_i f^i, $$
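for some vector $f^i$ built from $\delta g^{ij}$ and its derivatives (the terms left over after integrating by parts). This difference is what prevents $\frac{\delta \mathcal{L}}{\delta g^{ij}}$ from being a derivation, whereas $\delta$ itself, which only involves ordinary partial derivatives of $\mathcal{L}$, does obey the product rule. Doing the whole computation with the operator $\delta$ rather than the functional derivative $\frac{\delta}{\delta g^{ij}}$ works out just fine. This is actually what is depicted on the Wikipedia page; I simply assumed that the $\delta$-differential notation was a shorthand for the functional derivative.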
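For completeness, here is a sketch of where the divergence term comes from. The abbreviations $B^k_{ij} = \frac{\partial \mathcal{L}}{\partial (\partial_k g^{ij})}$ and $C^{lk}_{ij} = \frac{\partial \mathcal{L}}{\partial (\partial_l \partial_k g^{ij})}$ are introduced here only for brevity. Applying the ordinary product rule for $\partial_k$ to the last two terms of $\delta \mathcal{L}$ gives
$$ B^k_{ij}\, \partial_k (\delta g^{ij}) = \partial_k \left( B^k_{ij}\, \delta g^{ij} \right) - \left( \partial_k B^k_{ij} \right) \delta g^{ij}, $$
$$ C^{lk}_{ij}\, \partial_l \partial_k (\delta g^{ij}) = \partial_l \left( C^{lk}_{ij}\, \partial_k (\delta g^{ij}) - \left( \partial_k C^{kl}_{ij} \right) \delta g^{ij} \right) + \left( \partial_l \partial_k C^{lk}_{ij} \right) \delta g^{ij}. $$
Collecting the terms proportional to $\delta g^{ij}$ reproduces the Euler-Lagrange expression, and what remains is a total divergence. Writing the divergence index as $l$ rather than $i$ to avoid a clash with the indices of $g^{ij}$, one possible choice is
$$ f^l = B^l_{ij}\, \delta g^{ij} + C^{lk}_{ij}\, \partial_k (\delta g^{ij}) - \left( \partial_k C^{kl}_{ij} \right) \delta g^{ij}. $$
This $f^l$ is only determined up to the addition of a divergence-free vector, so it should be read as one representative rather than the unique answer.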