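As far as I understand, in order for the scalar product between two vectors to be invariant under Lorentz transformations $x^\mu \to x^{\mu'} = \Lambda^{\mu'}{}_{\alpha}x^{\alpha}$, we require that the metric $\eta_{\mu\nu}$ transform as $\eta_{\mu\nu} \to \eta_{\mu'\nu'} = \Lambda^{\alpha}{}_{\mu'}\eta_{\alpha\beta}\Lambda^{\beta}{}_{\nu'}$.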
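[Since we require that $x^{\mu'}x_{\mu'} = x^{\alpha}x_{\alpha} \Rightarrow x^{\mu'}x_{\mu'} = \eta_{\mu'\nu'}x^{\mu'}x^{\nu'} = \eta_{\mu'\nu'}\Lambda^{\mu'}{}_{\alpha}\Lambda^{\nu'}{}_{\beta}x^{\alpha}x^{\beta} = x^{\alpha}x_{\alpha} = \eta_{\alpha\beta}x^{\alpha}x^{\beta}$].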
What confuses me is that I've been reading up on the cosmological constant problem, and in several sets of notes it is claimed that the contribution of the vacuum energy density to the energy-momentum tensor should be of the form $T^{\text{vac}}_{\mu\nu} = -\rho_{\text{vac}}\,g_{\mu\nu}$.
I don't see how this is the case by looking at $\eta_{\mu'\nu'} = \Lambda^{\alpha}{}_{\mu'}\eta_{\alpha\beta}\Lambda^{\beta}{}_{\nu'}$. How is it obvious that this is Lorentz invariant? Shouldn't it be something like $\eta_{\mu'\nu'} = \eta_{\mu\nu}$?
Apologies if this is a stupid question, but I'm just having a mental block over it.
Answer
I believe it can be useful to define the following concepts (I won't be very formal here for pedagogical reasons):
Any event can be described through four real numbers, which we take to be the moment in time it happens and the position in space where it takes place. We call these four numbers the coordinates of the event. We collect them in a tuple, which we call $x \equiv (t, \boldsymbol r)$. These numbers depend, of course, on which reference frame we are using: we could, for example, use a different origin for $t$ or a different orientation for $\boldsymbol r$. This means that for $x$ to make sense, we must pick a certain reference frame. Call it $S$, for example.
Had we chosen a different frame, say $S'$, the coordinates of the same event would be $x'$, i.e., four real numbers, in principle different from those before. We declare that the new reference frame is inertial if and only if $x'$ and $x$ are related through
$$x' = \Lambda x \tag{1}$$
for some $4\times 4$ matrix $\Lambda$.
We define a vector to be any set of four real numbers such that, if its components in $S$ are $v = (v^0, \boldsymbol v)$, then in $S'$ its components must be
$$v' = \Lambda v \tag{2}$$
For example, the coordinates $x$ of an event are, by definition, a vector, because of (1). Other examples of vectors in physics include the electromagnetic potential, the current density, and the momentum of a particle.
It turns out that it is really useful to define the following operation for vectors: if $u, v$ are two vectors, then we define
$$u\cdot v \equiv u^0v^0 - \boldsymbol u\cdot\boldsymbol v \tag{3}$$
We define the operation $\cdot$ through the components of the vectors, but we know these components are frame-dependent, so if $\cdot$ is to be a well-defined operation, we must have
$$u\cdot v = u'\cdot v' \tag{4}$$
Relation (4) won't be true in general, but only for some matrices $\Lambda$. Thus, we declare that the matrices $\Lambda$ can only be those which make (4) true. This is a restriction on $\Lambda$: only some matrices will represent changes of reference frame. Note that in pure mathematics, any invertible matrix defines a change of basis. In physics, only a subset of matrices are acceptable changes of basis.
So, what are the possible $\Lambda$'s that satisfy (4)? Well, the easiest way to study this is to rewrite (3) using a different notation: define
$$\eta = \begin{pmatrix} 1 & & & \\ & -1 & & \\ & & -1 & \\ & & & -1 \end{pmatrix} \tag{5}$$
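This is just a matrix that will simplify our discussion. We should not try to find a deep meaning for $\eta$ (it turns out there is a lot of geometry behind $\eta$, but this is not important right now). Using $\eta$, it's easy to check that (3) can be written as
$$u\cdot v = u^T\eta\,v \tag{6}$$
Plugging $u' = \Lambda u$ and $v' = \Lambda v$ into (4) then gives $u^T\Lambda^T\eta\Lambda v = u^T\eta\,v$ for all vectors $u, v$, which means we must have
$$\Lambda^T\eta\,\Lambda = \eta \tag{7}$$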
Relation (7) defines $\Lambda$: any possible change of reference frame must be such that (7) is satisfied. If it is not, that $\Lambda$ cannot relate two different frames. This relation is not in fact a statement of how $\eta$ transforms (as you say in the OP), but actually a restriction on $\Lambda$. It is customary to say that $\eta$ transforms as (7), which will be explained in a moment. For now, just think of (7) as the condition that singles out the possible matrices $\Lambda$.
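If you want to see this restriction at work, here is a minimal numerical sketch in plain Python. It assumes $\eta = \mathrm{diag}(1,-1,-1,-1)$ and the condition $\Lambda^T\eta\Lambda = \eta$; the helper names (`boost_x`, `satisfies_constraint`, and so on), the speed $0.6$ (in units with $c = 1$) and the sample vectors are my own choices for illustration. It checks that a boost along $x$ passes the condition and leaves $u\cdot v$ unchanged, while a matrix that merely stretches the time axis fails it.

```python
import math

# eta = diag(1, -1, -1, -1), stored as a nested list.
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def apply(lam, v):
    """Components v' = Lambda v of a vector in the new frame."""
    return [sum(lam[i][k] * v[k] for k in range(4)) for i in range(4)]


def dot(u, v):
    """Scalar product u.v = u^T eta v."""
    return sum(u[i] * ETA[i][j] * v[j] for i in range(4) for j in range(4))


def boost_x(beta):
    """Boost along x with speed beta (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0],
            [-g * beta, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def satisfies_constraint(lam, tol=1e-12):
    """True if Lambda^T eta Lambda equals eta up to rounding."""
    lhs = matmul(transpose(lam), matmul(ETA, lam))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))


boost = boost_x(0.6)
stretch = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

u = [3.0, 1.0, 2.0, 0.5]
v = [1.0, -1.0, 0.0, 2.0]

print(satisfies_constraint(boost))    # the boost is an allowed change of frame
print(satisfies_constraint(stretch))  # the stretch is not
print(dot(u, v), dot(apply(boost, u), apply(boost, v)))
```

The two scalar products on the last line should agree up to rounding, even though the individual components of $u$ and $v$ change under the boost.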
At this point, it is useful to introduce index notation. If $v$ is a vector, we call its components $v^\mu$, with $\mu = 0,1,2,3$. On the other hand, we write the components of changes of frames as $\Lambda^\mu{}_\nu$. With this notation, (2) can be written as
$$v'^\mu = \Lambda^\mu{}_\nu v^\nu \tag{8}$$
Also, using index notation, the product of two vectors can be written as
$$u\cdot v = \eta_{\mu\nu}u^\mu v^\nu \tag{9}$$
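Index notation is useful because it allows us to define the following concept: a tensor is an object with several indices, e.g. $A^{\mu\nu}$. But not every object with indices is a tensor: the components of a tensor must change between frames of reference in such a way that they are related through
$$\begin{aligned}
A'^{\mu\nu} &= \Lambda^\mu{}_\rho\,\Lambda^\nu{}_\sigma\, A^{\rho\sigma}\\
B'^{\mu}{}_{\nu} &= \Lambda^\mu{}_\rho\,(\Lambda^T)_\nu{}^\sigma\, B^{\rho}{}_{\sigma}\\
C'^{\mu\nu}{}_{\pi}{}^{\tau} &= \Lambda^\mu{}_\rho\,\Lambda^\nu{}_\sigma\,(\Lambda^T)_\pi{}^\psi\,\Lambda^\tau{}_\omega\, C^{\rho\sigma}{}_{\psi}{}^{\omega}
\end{aligned} \tag{10}$$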
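As a concrete instance of the two-upper-index rule $A'^{\mu\nu} = \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma A^{\rho\sigma}$, here is a small sketch, under the assumption that we build $A^{\mu\nu} = u^\mu v^\nu$ out of two vectors. The helper names (`outer`, `transform_upper2`, `boost_x`) and the numbers are hypothetical, chosen only for illustration. Transforming $A$ with the rule should give the same components as first transforming $u$ and $v$ and then multiplying them.

```python
import math


def boost_x(beta):
    """Boost along x with speed beta (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0],
            [-g * beta, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def apply(lam, v):
    """v'^mu = Lambda^mu_nu v^nu."""
    return [sum(lam[m][n] * v[n] for n in range(4)) for m in range(4)]


def outer(u, v):
    """A^{mu nu} = u^mu v^nu."""
    return [[u[m] * v[n] for n in range(4)] for m in range(4)]


def transform_upper2(lam, a):
    """A'^{mu nu} = Lambda^mu_rho Lambda^nu_sigma A^{rho sigma}."""
    return [[sum(lam[m][r] * lam[n][s] * a[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)]
            for m in range(4)]


def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(4) for j in range(4))


lam = boost_x(0.6)
u = [3.0, 1.0, 2.0, 0.5]
v = [1.0, -1.0, 0.0, 2.0]

a = outer(u, v)
a_from_rule = transform_upper2(lam, a)                 # transform the tensor
a_from_vectors = outer(apply(lam, u), apply(lam, v))   # transform the vectors

print(close(a_from_rule, a_from_vectors))  # both routes agree
print(close(a, a_from_rule))               # the components themselves did change
```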
I don't like to use index notation too much: $v' = \Lambda v$ is easier than $v'^\mu = \Lambda^\mu{}_\nu v^\nu$, don't you think? But sometimes we have to use index notation, because matrix notation is not possible: when using tensors with three or more indices, matrices cannot be used. Tensors with one index are just vectors. You'll sometimes hear that matrices are tensors with two indices, which is not quite true: if you remember from your course on linear algebra, when you make a change of basis, matrices transform like $M \to C^T M C$, which is like (10) in the case of one upper/one lower index. Therefore, matrices are like tensors with one upper/one lower index. This is the reason we wrote $\Lambda$ as $\Lambda^\mu{}_\nu$. This is a matrix, but it is also a tensor.
Also, (7) looks pretty much like (10), right? This is the reason people say (7) expresses the transformation properties of $\eta$. While not false, I recommend you not take this too seriously: formally, it is right, but in principle $\eta$ is just a set of numbers that simplifies our notation for scalar products. It turns out you can think of it as a tensor, but only a posteriori. In principle, it is not defined as a tensor, but it turns out it is. Actually, it is a trivial tensor (the only one!) whose components are the same in every frame of reference (by definition). If you were to calculate the components of $\eta$ in another frame of reference using (10), you'd find that they are the same. This is stated as "the metric is invariant". We actually define it to be invariant: we define what a change of reference frame is through the requirement that $\eta$ be invariant. It doesn't make sense to try to prove $\eta$ is invariant, as this is a definition. (7) doesn't really prove $\eta$ is invariant, but actually defines what a change of reference frame is.
For completeness I'd like to make the following definitions:
We say an object is invariant if it takes the same value in any frame of reference. You can check that if $v$ is a vector, then $v\cdot v$ takes the same value in any frame, i.e., $v^2$ is invariant.
We say an object is covariant if it doesn't take the same value in every frame of reference, but the different values are related in a well-defined way: the components of a covariant object must satisfy (10). This means tensors are covariant by definition.
For example, a vector is not invariant because its components are frame-dependent. But as vectors are tensors, they are covariant. We really like invariant objects because they simplify a lot of problems. We also like covariant objects because, even though these objects are frame-dependent, they transform in a well-defined way, making them easy to work with. You'll understand this better after you solve many problems in SR and GR: in the end you will be thankful for covariant objects.
So, what does it mean for $\eta$ to be invariant? It means its components are the same in every (inertial) frame of reference. How do we prove this? We actually can't, because we define this to be true. How can we prove $\eta$ is the only invariant tensor? We can't, because it is not actually true. The most general invariant tensor is proportional to the metric. Proof: let $N_{\mu\nu}$ be an invariant tensor. Then, as it is a tensor, we have
$$N' = \Lambda^T N \Lambda$$
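But we must also have $N' = N$ for it to be invariant. This means $\Lambda^T N \Lambda = N$. Multiply on the right by $\eta\Lambda^T\eta$ and use (7), which implies $\Lambda\,\eta\Lambda^T\eta = 1$, to get $\Lambda^T N = N\eta\Lambda^T\eta$, i.e. $[N\eta, \Lambda^T] = 0$. By Schur's Lemma, $N\eta$ must be proportional to the identity, so $N$ is proportional to $\eta$. QED.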
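This is not a substitute for the proof, but the claim is easy to probe numerically. The sketch below assumes $\eta = \mathrm{diag}(1,-1,-1,-1)$ and tests the condition $\Lambda^T N \Lambda = N$ for a boost and a rotation; the helper names (`is_invariant`, `boost_x`, `rotation_z`) and the particular speed, angle and factor $2.5$ are my own illustrative choices. A multiple of $\eta$ passes for both matrices, while the identity matrix survives the rotation but fails as soon as a boost is included.

```python
import math

ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def boost_x(beta):
    """Boost along x with speed beta (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0],
            [-g * beta, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def rotation_z(theta):
    """Spatial rotation by angle theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0],
            [0, c, -s, 0],
            [0, s, c, 0],
            [0, 0, 0, 1]]


def is_invariant(n, lams, tol=1e-12):
    """True if Lambda^T N Lambda equals N for every Lambda in lams."""
    for lam in lams:
        n_prime = matmul(transpose(lam), matmul(n, lam))
        if any(abs(n_prime[i][j] - n[i][j]) > tol
               for i in range(4) for j in range(4)):
            return False
    return True


frames = [boost_x(0.6), rotation_z(0.3)]
scaled_metric = [[2.5 * x for x in row] for row in ETA]

print(is_invariant(scaled_metric, frames))         # a multiple of eta passes
print(is_invariant(IDENTITY, [rotation_z(0.3)]))   # rotations alone allow more
print(is_invariant(IDENTITY, frames))              # the boost rules it out
```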
And what about the Levi-Civita symbol? We are usually told that it is also an invariant tensor, which is not actually true: it is invariant, but it is not a tensor; it is a pseudo-tensor. In SR it doesn't satisfy (10) for every $\Lambda$, but only for a certain subset of matrices $\Lambda$ (check Proper Orthochronous Lorentz Group), and in GR it is a tensor density (discussed in many posts on SE).
The proof of the covariance of the LC symbol is usually stated as follows (you'll have to fill in the details): the definition of the determinant of a matrix can be stated as $\det(A)\,\varepsilon^{\mu\nu\rho\sigma} = \varepsilon^{abcd}A^\mu{}_a A^\nu{}_b A^\rho{}_c A^\sigma{}_d$. The matrices of the Proper Orthochronous Lorentz Group have unit determinant, i.e., $\det(\Lambda) = 1$. If you use this together with the definition of $\det$, you get $\varepsilon^{\mu\nu\rho\sigma} = \varepsilon^{abcd}\Lambda^\mu{}_a\Lambda^\nu{}_b\Lambda^\rho{}_c\Lambda^\sigma{}_d$, which is the same as (10) for the object $\varepsilon^{\mu\nu\rho\sigma}$. This proves that, when restricted to this subset of the Lorentz Group, the Levi-Civita symbol is a tensor.
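If you'd rather check the determinant identity than fill in the details by hand, the following sketch contracts the Levi-Civita symbol with four copies of a matrix and compares the result with a multiple of the symbol itself. The function names (`permutation_sign`, `transformed_eps`, `scale_factor`) are hypothetical, and the boost speed $0.6$ and the parity matrix $\mathrm{diag}(1,-1,-1,-1)$ are just sample inputs. For the boost the factor should be $1$ up to rounding, so the symbol keeps its components; for the parity matrix it is $-1$, which is the sign flip that makes it a pseudo-tensor.

```python
import math
from itertools import permutations, product


def permutation_sign(p):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


# Nonzero components of the Levi-Civita symbol, keyed by index tuple.
EPS = {p: permutation_sign(p) for p in permutations(range(4))}


def eps(idx):
    return EPS.get(tuple(idx), 0)


def transformed_eps(lam, idx):
    """eps^{abcd} L^mu_a L^nu_b L^rho_c L^sigma_d for idx = (mu, nu, rho, sigma)."""
    m, n, r, s = idx
    return sum(sign * lam[m][a] * lam[n][b] * lam[r][c] * lam[s][d]
               for (a, b, c, d), sign in EPS.items())


def scale_factor(lam, tol=1e-12):
    """Return k if the contracted symbol equals k * eps for all indices, else None."""
    k = transformed_eps(lam, (0, 1, 2, 3))  # this component is det(lam)
    for idx in product(range(4), repeat=4):
        if abs(transformed_eps(lam, idx) - k * eps(idx)) > tol:
            return None
    return k


def boost_x(beta):
    """Boost along x with speed beta (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0],
            [-g * beta, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


parity_flip = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

print(scale_factor(boost_x(0.6)))  # close to 1: components unchanged
print(scale_factor(parity_flip))   # -1: components change sign
```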
Raising and lowering indices: this is something that is usually made out to be more important than it really is. IMHO, we can fully formulate SR and GR without even mentioning raising and lowering indices. If you define an object with its indices raised, you should keep its indices where they are. In general there is no good reason why someone would want to move an index. That being said, I'll explain what these operations are, just for completeness.
The first step is to define the inverse of the metric. Using matrix notation, the metric is its own inverse: $\eta\eta = 1$. But we want to use index notation, so we define another object, call it $\zeta$, with components $\zeta^{\mu\nu} = \eta_{\mu\nu}$. With this, you can check that $\eta\eta = 1$ can be written as $\eta_{\mu\nu}\zeta^{\nu\rho} = \delta_\mu^\rho$, where $\delta$ is the Kronecker symbol. For now, $\delta$ is just a symbol that simplifies the notation. Note that $\zeta$ is not standard notation, but we will keep it for the next few paragraphs.
(People usually use the same letter for both $\eta$ and $\zeta$, and write $\eta^{\mu\nu} = \eta_{\mu\nu}$; we'll discuss why in a moment. For now, note that these are different objects, with different index structure: $\eta$ has lower indices and $\zeta$ has upper indices.)
We can use η and ζ to raise and lower indices, which we now define.
Let's say you have a certain tensor $A^{\mu\nu}{}_{\rho}$. We want to define what it means to raise the index $\rho$: it means to define a new object $\bar A$ with components
$$\bar A^{\mu\nu\rho} \equiv \zeta^{\rho\sigma}A^{\mu\nu}{}_{\sigma}$$
Using (10) you can prove that this new object is actually a tensor. We usually drop the bar on $\bar A$ and write $A^{\mu\nu\rho}$. Strictly speaking, we shouldn't do this: these objects are different. But we can tell them apart from the index placement, so we relax the notation by not writing the bar. In this post, we'll keep the bar for pedagogical reasons.
In an analogous way, we can lower an index, for example the $\mu$ index: we define another object $\tilde A$, with components
$$\tilde A_{\mu}{}^{\nu}{}_{\rho} \equiv \eta_{\mu\sigma}A^{\sigma\nu}{}_{\rho}$$
This new object is also a tensor. The three objects $A, \bar A, \tilde A$ are actually different, but we can tell them apart through the index placement, so we can drop the tildes and bars. For now, we won't.
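We'll discuss the usefulness of these operations in a moment. For now, note that if you raise both indices of the metric, you get
$$\bar{\bar\eta}^{\mu\nu} \equiv \zeta^{\mu\rho}\zeta^{\nu\sigma}\eta_{\rho\sigma} = \zeta^{\mu\rho}\delta^\nu_\rho = \zeta^{\mu\nu}$$
so $\zeta$ is just the metric with both indices raised. Since we usually drop the bars, this is why people use the same letter $\eta$ for both objects.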
With this in mind, we get the following important result:
$$\eta_{\mu\nu}\eta^{\nu\rho} = \delta^\rho_\mu$$
So, what is the use of these operations? For example, what do we get if we lower the index of a vector $v$? Well, we get a new tensor, but it is not a vector (you can check that (2) is not satisfied), so we call it a covector. This is not really important in SR, but in other branches of physics vectors and covectors are really different.
So, what is the covector associated to $v$? Call this covector $\bar v$. Its components will be $\bar v_\mu = \eta_{\mu\nu}v^\nu$ by definition. Why is this useful? Well, one reason is that by lowering an index, the scalar product $\cdot$ turns into the standard matrix product, with $\bar u$ written as a row and $v$ as a column:
$$u\cdot v = \bar u\,v$$
The following fact is rather interesting: we know that if we raise both indices of the metric we get the metric again. But what do we get if we raise only one index of the metric? That is, what is $\bar\eta$, or, put another way, what is $\eta^\mu{}_\nu$? Well, according to the definition, it is
$$\eta^\mu{}_\nu = \eta_{\nu\rho}\eta^{\mu\rho} = \delta^\mu_\nu$$
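The raising and lowering operations are simple enough to spell out in a few lines of plain Python. This is only a sketch: the names `lower`, `raise_index`, `ETA_LOWER` and `ETA_UPPER` are mine, and the sample vectors are arbitrary. It lowers the index of $u$, checks that the ordinary row-times-column product $\bar u\,v$ reproduces $u\cdot v$, and builds the metric with one index raised to confirm that it is the Kronecker delta.

```python
# eta_{mu nu} and its inverse with upper indices; numerically the same matrix.
ETA_LOWER = [[1, 0, 0, 0],
             [0, -1, 0, 0],
             [0, 0, -1, 0],
             [0, 0, 0, -1]]
ETA_UPPER = [row[:] for row in ETA_LOWER]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def lower(v):
    """Covector components v_mu = eta_{mu nu} v^nu."""
    return [sum(ETA_LOWER[m][n] * v[n] for n in range(4)) for m in range(4)]


def raise_index(w):
    """Vector components w^mu = eta^{mu nu} w_nu."""
    return [sum(ETA_UPPER[m][n] * w[n] for n in range(4)) for m in range(4)]


def dot(u, v):
    """Scalar product u.v = u0 v0 - (spatial dot product)."""
    return u[0] * v[0] - sum(u[i] * v[i] for i in range(1, 4))


u = [3.0, 1.0, 2.0, 0.5]
v = [1.0, -1.0, 0.0, 2.0]

u_bar = lower(u)
# Plain row-times-column product of the covector with the vector.
row_times_column = sum(u_bar[m] * v[m] for m in range(4))

# Metric with one index raised: eta^{mu rho} eta_{rho nu}.
mixed_metric = [[sum(ETA_UPPER[m][r] * ETA_LOWER[r][n] for r in range(4))
                 for n in range(4)]
                for m in range(4)]

print(row_times_column, dot(u, v))   # the two numbers coincide
print(mixed_metric == IDENTITY)      # the mixed metric is the Kronecker delta
print(raise_index(lower(v)) == v)    # raising undoes lowering
```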
As a side note: you (like many people) write prime marks on the indices, while I (like many others) write the primes on the tensors. IMHO the latter convention is the best, because it is the tensor that is changing, not the indices. For example, what you wrote as $\eta_{\mu'\nu'} = \eta_{\mu\nu}$ looks better when written $\eta'_{\mu\nu} = \eta_{\mu\nu}$, because it is the $\mu\nu$ components of both objects that are equal; it is not that the $\mu'$ component is equal to the $\mu$ component (which actually makes no sense and makes the indices mismatched).