
7: Eigenvalues and Eigenvectors - Mathematics


In this chapter we study linear operators $T : V \to V$ on a finite-dimensional vector space $V$. For example, quantum mechanics is largely based upon the study of eigenvalues and eigenvectors of operators on finite- and infinite-dimensional vector spaces.


Eigenvalues and Eigenvectors

We review here the basics of computing eigenvalues and eigenvectors. Eigenvalues and eigenvectors play a prominent role in the study of ordinary differential equations and in many applications in the physical sciences. Expect to see them come up in a variety of contexts!

Definitions

Let $A$ be an $n \times n$ matrix. The number $\lambda$ is an eigenvalue of $A$ if there exists a non-zero vector $\mathbf{v}$ such that $ A\mathbf{v} = \lambda\mathbf{v}. $ In this case, the vector $\mathbf{v}$ is called an eigenvector of $A$ corresponding to $\lambda$.

Computing Eigenvalues and Eigenvectors

We can rewrite the condition $A\mathbf{v} = \lambda\mathbf{v}$ as $ (A - \lambda I)\mathbf{v} = \mathbf{0}, $ where $I$ is the $n \times n$ identity matrix. Now, in order for a non-zero vector $\mathbf{v}$ to satisfy this equation, $A - \lambda I$ must not be invertible.

Otherwise, if $A - \lambda I$ has an inverse, then
$$\begin{aligned} (A - \lambda I)^{-1}(A - \lambda I)\mathbf{v} &= (A - \lambda I)^{-1}\mathbf{0} \\ \mathbf{v} &= \mathbf{0}. \end{aligned}$$
But we are looking for a non-zero vector $\mathbf{v}$. That is, the determinant of $A - \lambda I$ must equal 0. We call $p(\lambda) = \det(A - \lambda I)$ the characteristic polynomial of $A$. The eigenvalues of $A$ are simply the roots of the characteristic polynomial of $A$.
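
The eigenvalues-as-roots characterization is easy to check numerically. Below is a minimal sketch using NumPy (an illustration added here, not part of the original text); it uses the 2 × 2 matrix from the example that follows and confirms that $\det(A - \lambda I)$ vanishes at each eigenvalue.

    import numpy as np

    # The 2x2 matrix used in the example below.
    A = np.array([[ 2.0, -4.0],
                  [-1.0, -1.0]])

    # The eigenvalues are the roots of the characteristic polynomial det(A - lambda*I).
    for lam in np.linalg.eigvals(A):
        d = np.linalg.det(A - lam * np.eye(2))
        print(f"lambda = {lam:+.4f}   det(A - lambda*I) = {d:+.2e}")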

Example

Let $A = \left[\begin{matrix} 2 & -4 \\ -1 & -1 \end{matrix}\right]$. Then
$$\begin{aligned} p(\lambda) &= \det\left[\begin{matrix} 2-\lambda & -4 \\ -1 & -1-\lambda \end{matrix}\right] \\ &= (2-\lambda)(-1-\lambda) - (-4)(-1) \\ &= \lambda^{2} - \lambda - 6 \\ &= (\lambda - 3)(\lambda + 2). \end{aligned}$$
Thus, $\lambda_1 = 3$ and $\lambda_2 = -2$ are the eigenvalues of $A$.

To find eigenvectors $\mathbf{v} = \left[\begin{matrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{matrix}\right]$ corresponding to an eigenvalue $\lambda$, we simply solve the system of linear equations given by $ (A - \lambda I)\mathbf{v} = \mathbf{0}. $

Example

The matrix $A = \left[\begin{matrix} 2 & -4 \\ -1 & -1 \end{matrix}\right]$ of the previous example has eigenvalues $\lambda_1 = 3$ and $\lambda_2 = -2$. Let's find the eigenvectors corresponding to $\lambda_1 = 3$. Let $\mathbf{v} = \left[\begin{matrix} v_1 \\ v_2 \end{matrix}\right]$. Then $(A - 3I)\mathbf{v} = \mathbf{0}$ gives us
$$\left[\begin{matrix} 2-3 & -4 \\ -1 & -1-3 \end{matrix}\right]\left[\begin{matrix} v_1 \\ v_2 \end{matrix}\right] = \left[\begin{matrix} 0 \\ 0 \end{matrix}\right],$$
from which we obtain the duplicate equations
$$\begin{aligned} -v_1 - 4v_2 &= 0 \\ -v_1 - 4v_2 &= 0. \end{aligned}$$
If we let $v_2 = t$, then $v_1 = -4t$. All eigenvectors corresponding to $\lambda_1 = 3$ are multiples of $\left[\begin{matrix} -4 \\ 1 \end{matrix}\right]$ and thus the eigenspace corresponding to $\lambda_1 = 3$ is given by the span of $\left[\begin{matrix} -4 \\ 1 \end{matrix}\right]$. That is, $\left\{\left[\begin{matrix} -4 \\ 1 \end{matrix}\right]\right\}$ is a basis of the eigenspace corresponding to $\lambda_1 = 3$.

Repeating this process with $\lambda_2 = -2$, we find that
$$\begin{aligned} 4v_1 - 4v_2 &= 0 \\ -v_1 + v_2 &= 0. \end{aligned}$$
If we let $v_2 = t$ then $v_1 = t$ as well. Thus, an eigenvector corresponding to $\lambda_2 = -2$ is $\left[\begin{matrix} 1 \\ 1 \end{matrix}\right]$ and the eigenspace corresponding to $\lambda_2 = -2$ is given by the span of $\left[\begin{matrix} 1 \\ 1 \end{matrix}\right]$. $\left\{\left[\begin{matrix} 1 \\ 1 \end{matrix}\right]\right\}$ is a basis for the eigenspace corresponding to $\lambda_2 = -2$.
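
To double-check the two eigenpairs just found, here is a small NumPy verification (a sketch added for illustration); it simply tests $A\mathbf{v} = \lambda\mathbf{v}$ for each pair.

    import numpy as np

    A = np.array([[ 2.0, -4.0],
                  [-1.0, -1.0]])

    pairs = [( 3.0, np.array([-4.0, 1.0])),   # eigenvalue 3 with eigenvector (-4, 1)
             (-2.0, np.array([ 1.0, 1.0]))]   # eigenvalue -2 with eigenvector (1, 1)

    for lam, v in pairs:
        # For a genuine eigenpair, A @ v equals lam * v.
        print(lam, A @ v, np.allclose(A @ v, lam * v))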

In the following example, we see a two-dimensional eigenspace.

Example

Let $A = \left[\begin{matrix} 5 & 8 & 16 \\ 4 & 1 & 8 \\ -4 & -4 & -11 \end{matrix}\right]$. Then
$$p(\lambda) = \det\left[\begin{matrix} 5-\lambda & 8 & 16 \\ 4 & 1-\lambda & 8 \\ -4 & -4 & -11-\lambda \end{matrix}\right] = (1-\lambda)(\lambda+3)^{2}$$
after some algebra! Thus, $\lambda_1 = 1$ and $\lambda_2 = -3$ are the eigenvalues of $A$. Eigenvectors $\mathbf{v} = \left[\begin{matrix} v_1 \\ v_2 \\ v_3 \end{matrix}\right]$ corresponding to $\lambda_1 = 1$ must satisfy
$$\begin{aligned} 4v_1 + 8v_2 + 16v_3 &= 0 \\ 4v_1 + 8v_3 &= 0 \\ -4v_1 - 4v_2 - 12v_3 &= 0. \end{aligned}$$

Letting $v_3 = t$, we find from the second equation that $v_1 = -2t$, and then $v_2 = -t$. All eigenvectors corresponding to $\lambda_1 = 1$ are multiples of $\left[\begin{matrix} -2 \\ -1 \\ 1 \end{matrix}\right]$, and so the eigenspace corresponding to $\lambda_1 = 1$ is given by the span of $\left[\begin{matrix} -2 \\ -1 \\ 1 \end{matrix}\right]$. $\left\{\left[\begin{matrix} -2 \\ -1 \\ 1 \end{matrix}\right]\right\}$ is a basis for the eigenspace corresponding to $\lambda_1 = 1$.

Eigenvectors corresponding to $\lambda_2 = -3$ must satisfy
$$\begin{aligned} 8v_1 + 8v_2 + 16v_3 &= 0 \\ 4v_1 + 4v_2 + 8v_3 &= 0 \\ -4v_1 - 4v_2 - 8v_3 &= 0. \end{aligned}$$

The equations here are just multiples of each other! If we let $v_3 = t$ and $v_2 = s$, then $v_1 = -s - 2t$. Eigenvectors corresponding to $\lambda_2 = -3$ have the form
$$\left[\begin{matrix} -1 \\ 1 \\ 0 \end{matrix}\right]s + \left[\begin{matrix} -2 \\ 0 \\ 1 \end{matrix}\right]t.$$
Thus, the eigenspace corresponding to $\lambda_2 = -3$ is two-dimensional and is spanned by $\left[\begin{matrix} -1 \\ 1 \\ 0 \end{matrix}\right]$ and $\left[\begin{matrix} -2 \\ 0 \\ 1 \end{matrix}\right]$. $\left\{\left[\begin{matrix} -1 \\ 1 \\ 0 \end{matrix}\right], \left[\begin{matrix} -2 \\ 0 \\ 1 \end{matrix}\right]\right\}$ is a basis for the eigenspace corresponding to $\lambda_2 = -3$.
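
The same kind of check confirms the two-dimensional eigenspace. The sketch below (added for illustration) verifies that the eigenvector for $\lambda_1 = 1$ and both spanning vectors for $\lambda_2 = -3$ satisfy $A\mathbf{v} = \lambda\mathbf{v}$.

    import numpy as np

    A = np.array([[ 5.0,  8.0,  16.0],
                  [ 4.0,  1.0,   8.0],
                  [-4.0, -4.0, -11.0]])

    checks = [( 1.0, np.array([-2.0, -1.0, 1.0])),   # eigenvalue 1
              (-3.0, np.array([-1.0,  1.0, 0.0])),   # first basis vector for lambda = -3
              (-3.0, np.array([-2.0,  0.0, 1.0]))]   # second basis vector for lambda = -3

    for lam, v in checks:
        assert np.allclose(A @ v, lam * v)
    print("all three eigenpairs verified")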

Notes

  • Eigenvalues and eigenvectors can be complex-valued as well as real-valued.
  • The dimension of the eigenspace corresponding to an eigenvalue is less than or equal to the multiplicity of that eigenvalue.
  • The techniques used here are practical for $2 \times 2$ and $3 \times 3$ matrices. Eigenvalues and eigenvectors of larger matrices are often found using other techniques, such as iterative methods (a small sketch of one such method follows below).
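
As a taste of those iterative techniques, here is a minimal power-iteration sketch (an illustration only, assuming the matrix has a single dominant eigenvalue; it is not a production algorithm):

    import numpy as np

    def power_method(A, num_iters=200, seed=0):
        """Estimate the dominant eigenvalue and eigenvector of A by repeated multiplication."""
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(A.shape[0])
        for _ in range(num_iters):
            x = A @ x
            x = x / np.linalg.norm(x)      # re-normalize to avoid overflow/underflow
        lam = x @ A @ x                    # Rayleigh quotient estimate of the eigenvalue
        return lam, x

    A = np.array([[ 2.0, -4.0],
                  [-1.0, -1.0]])
    lam, x = power_method(A)
    print(lam, x)   # approaches the dominant eigenvalue 3 of the earlier example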

Key Concepts

Let $A$ be an $n \times n$ matrix. The eigenvalues of $A$ are the roots of the characteristic polynomial $ p(\lambda) = \det(A - \lambda I). $ For each eigenvalue $\lambda$, we find eigenvectors $\mathbf{v} = \left[\begin{matrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{matrix}\right]$ by solving the linear system $ (A - \lambda I)\mathbf{v} = \mathbf{0}. $ The set of all vectors $\mathbf{v}$ satisfying $A\mathbf{v} = \lambda\mathbf{v}$ is called the eigenspace of $A$ corresponding to $\lambda$.


7: Eigenvalues and Eigenvectors - Mathematics

If you get nothing out of this quick review of linear algebra you must get this section. Without this section you will not be able to do any of the differential equations work that is in this chapter.

So, let's start with the following. If we multiply an \(n \times n\) matrix by an \(n \times 1\) vector we will get a new \(n \times 1\) vector back. In other words,
\[A\vec \eta = \vec y\]

What we want to know is if it is possible for the following to happen. Instead of just getting a brand new vector out of the multiplication is it possible instead to get the following,
\[A\vec \eta = \lambda \vec \eta \tag{1}\]

In other words, is it possible, at least for certain \(\lambda\) and \(\vec \eta\), to have matrix multiplication be the same as just multiplying the vector by a constant? Of course, we probably wouldn't be talking about this if the answer was no. So, it is possible for this to happen, however, it won't happen for just any value of \(\lambda\) or \(\vec \eta\). If we do happen to have a \(\lambda\) and \(\vec \eta\) for which this works (and they will always come in pairs) then we call \(\lambda\) an eigenvalue of \(A\) and \(\vec \eta\) an eigenvector of \(A\).

So, how do we go about finding the eigenvalues and eigenvectors for a matrix? Well first notice that if \(\vec \eta = \vec 0\) then equation (1) is going to be true for any value of \(\lambda\) and so we are going to make the assumption that \(\vec \eta \ne \vec 0\). With that out of the way let's rewrite equation (1) a little.

\[\begin{aligned} A\vec \eta - \lambda \vec \eta &= \vec 0 \\ A\vec \eta - \lambda I\vec \eta &= \vec 0 \\ \left( A - \lambda I \right)\vec \eta &= \vec 0 \end{aligned}\]

Notice that before we factored out the (vec eta ) we added in the appropriately sized identity matrix. This is equivalent to multiplying things by a one and so doesn’t change the value of anything. We needed to do this because without it we would have had the difference of a matrix, (A), and a constant, (lambda ), and this can’t be done. We now have the difference of two matrices of the same size which can be done.

So, with this rewrite we see that
\[\left( A - \lambda I \right)\vec \eta = \vec 0 \tag{2}\]
is equivalent to equation (1). In order to find the eigenvectors for a matrix we will need to solve a homogeneous system. Recall the fact from the previous section: we will either have exactly one solution (\(\vec \eta = \vec 0\)) or we will have infinitely many nonzero solutions. Since we've already said that we don't want \(\vec \eta = \vec 0\) this means that we want the second case.

Knowing this will allow us to find the eigenvalues for a matrix. Recall from this fact that we will get the second case only if the matrix in the system is singular. Therefore, we will need to determine the values of \(\lambda\) for which we get,
\[\det\left( A - \lambda I \right) = 0\]

Once we have the eigenvalues we can then go back and determine the eigenvectors for each eigenvalue. Let’s take a look at a couple of quick facts about eigenvalues and eigenvectors.

If \(A\) is an \(n \times n\) matrix then \(\det\left( A - \lambda I \right) = 0\) is an \(n^{\text{th}}\) degree polynomial. This polynomial is called the characteristic polynomial.

To find eigenvalues of a matrix all we need to do is solve a polynomial. That's generally not too bad provided we keep \(n\) small. Likewise this fact also tells us that for an \(n \times n\) matrix, \(A\), we will have \(n\) eigenvalues if we include all repeated eigenvalues.

If \(\lambda_1, \lambda_2, \ldots, \lambda_n\) is the complete list of eigenvalues for \(A\) (including all repeated eigenvalues) then,

    If \(\lambda\) occurs only once in the list then we call \(\lambda\) simple.

    If \(\lambda\) occurs \(k > 1\) times in the list then we say that \(\lambda\) has multiplicity \(k\). An eigenvalue of multiplicity \(k\) will have anywhere from 1 to \(k\) linearly independent eigenvectors.

    Eigenvectors corresponding to distinct (simple) eigenvalues are linearly independent.

The usefulness of these facts will become apparent when we get back into differential equations since in that work we will want linearly independent solutions.

Let’s work a couple of examples now to see how we actually go about finding eigenvalues and eigenvectors.

The first thing that we need to do is find the eigenvalues. That means we need the following matrix,

In particular we need to determine where the determinant of this matrix is zero.

So, it looks like we will have two simple eigenvalues for this matrix, \(\lambda_1 = -5\) and \(\lambda_2 = 1\). We will now need to find the eigenvectors for each of these. Also note that according to the fact above, the two eigenvectors should be linearly independent.

To find the eigenvectors we simply plug each eigenvalue into equation (2) and solve. So, let's do that.

\(\lambda_1 = -5\):
In this case we need to solve the following system.

Recall that officially to solve this system we use the following augmented matrix.

Upon reducing down we see that we get a single equation

that will yield an infinite number of solutions. This is expected behavior. Recall that we picked the eigenvalues so that the matrix would be singular and so we would get infinitely many solutions.

Notice as well that we could have identified this from the original system. This won't always be the case, but in the \(2 \times 2\) case we can see from the system that one row will be a multiple of the other and so we will get infinite solutions. From this point on we won't be actually solving systems in these cases. We will just go straight to the equation and we can use either of the two rows for this equation.

Now, let’s get back to the eigenvector, since that is what we were after. In general then the eigenvector will be any vector that satisfies the following,

To get this we used the solution to the equation that we found above.

We really don't want a general eigenvector however so we will pick a value for the free variable to get a specific eigenvector. We can choose anything (except zero), so pick something that will make the eigenvector "nice". Note as well that since we've already assumed that the eigenvector is not zero we must choose a value that will not give us zero, which is why we want to avoid choosing zero for the free variable in this case. Here's the eigenvector for this eigenvalue.

Now we get to do this all over again for the second eigenvalue.

\(\lambda_2 = 1\):
We’ll do much less work with this part than we did with the previous part. We will need to solve the following system.

Clearly both rows are multiples of each other and so we will get infinitely many solutions. We can choose to work with either row. We'll run with the first to avoid having too many minus signs floating around. Doing this gives us,

Note that we can solve this for either of the two variables. However, with an eye towards working with these later on let’s try to avoid as many fractions as possible. The eigenvector is then,

Note that the two eigenvectors are linearly independent as predicted.

This matrix has fractions in it. That’s life so don’t get excited about it. First, we need the eigenvalues.

So, it looks like we’ve got an eigenvalue of multiplicity 2 here. Remember that the power on the term will be the multiplicity.

Now, let’s find the eigenvector(s). This one is going to be a little different from the first example. There is only one eigenvalue so let’s do the work for that one. We will need to solve the following system,

So, the rows are multiples of each other. We’ll work with the first equation in this example to find the eigenvector.

Recall in the last example we decided that we wanted to make these as “nice” as possible and so should avoid fractions if we can. Sometimes, as in this case, we simply can’t so we’ll have to deal with it. In this case the eigenvector will be,

Note that by careful choice of the variable in this case we were able to get rid of the fraction that we had. In general it doesn't much matter whether we do this or not. However, when we get back to differential equations it will be easier on us if we don't have any fractions so we will usually try to eliminate them at this step.

Also, in this case we are only going to get a single (linearly independent) eigenvector. We can get other eigenvectors by choosing different values of the free variable. However, each of these will be linearly dependent with the first eigenvector. If you're not convinced of this try it. Pick some values for the free variable, get a different vector, and check to see if the two are linearly dependent.

Recall from the fact above that an eigenvalue of multiplicity \(k\) will have anywhere from 1 to \(k\) linearly independent eigenvectors. In this case we got one. For most of the \(2 \times 2\) matrices that we'll be working with this will be the case, although it doesn't have to be. We can, on occasion, get two.
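
One concrete way to see the "anywhere from 1 to \(k\)" behavior is to compare a defective matrix with a non-defective one. The following sketch is illustrative only (the matrices are made up for this purpose, not taken from the examples above):

    import numpy as np

    def geometric_multiplicity(A, lam):
        """Number of linearly independent eigenvectors for lam: the nullity of (A - lam*I)."""
        n = A.shape[0]
        return n - np.linalg.matrix_rank(A - lam * np.eye(n))

    # Both matrices have the single eigenvalue 7 with algebraic multiplicity 2.
    defective     = np.array([[7.0, 1.0],
                              [0.0, 7.0]])   # only one independent eigenvector
    non_defective = np.array([[7.0, 0.0],
                              [0.0, 7.0]])   # two independent eigenvectors

    print(geometric_multiplicity(defective, 7.0))      # 1
    print(geometric_multiplicity(non_defective, 7.0))  # 2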

So, we’ll start with the eigenvalues.

This doesn’t factor, so upon using the quadratic formula we arrive at,

In this case we get complex eigenvalues which are definitely a fact of life with eigenvalue/eigenvector problems so get used to them.

Finding eigenvectors for complex eigenvalues is identical to the previous two examples, but it will be somewhat messier. So, let’s do that.

\(\lambda_1 = -1 + 5i\):
The system that we need to solve this time is

Now, it’s not super clear that the rows are multiples of each other, but they are. In this case we have,

This is not something that you need to worry about, we just wanted to make the point. For the work that we’ll be doing later on with differential equations we will just assume that we’ve done everything correctly and we’ve got two rows that are multiples of each other. Therefore, all that we need to do here is pick one of the rows and work with it.

We’ll work with the second row this time.

Now we can solve for either of the two variables. However, again looking forward to differential equations, we are going to need the \(i\) in the numerator, so solve the equation in such a way that this will happen. Doing this gives,

So, the eigenvector in this case is

As with the previous example we choose the value of the variable to clear out the fraction.

Now, the work for the second eigenvector is almost identical and so we’ll not dwell on that too much.

\(\lambda_2 = -1 - 5i\):
The system that we need to solve here is

Working with the second row again gives,

The eigenvector in this case is

There is a nice fact that we can use to simplify the work when we get complex eigenvalues. We need a bit of terminology first however.

If we start with a complex number,
\[z = a + bi\]
then the complex conjugate of \(z\) is
\[\bar z = a - bi\]

To compute the complex conjugate of a complex number we simply change the sign on the term that contains the \(i\). The complex conjugate of a vector is just the conjugate of each of the vector's components.

We now have the following fact about complex eigenvalues and eigenvectors.

If \(A\) is an \(n \times n\) matrix with only real entries and \(\lambda_1 = a + bi\) is an eigenvalue with eigenvector \(\vec \eta\), then \(\lambda_2 = \overline{\lambda_1} = a - bi\) is also an eigenvalue and its eigenvector is the conjugate of \(\vec \eta\).

This fact is something that you should feel free to use as you need to in our work.
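
The conjugate-pair fact is easy to observe numerically. A small sketch (with an illustrative matrix, not the one from the example above):

    import numpy as np

    # A real matrix with no real eigenvalues (a 90-degree rotation).
    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])

    w, V = np.linalg.eig(A)
    print(w)   # eigenvalues i and -i: a complex conjugate pair

    # Conjugating an eigenpair of a real matrix gives another eigenpair.
    lam, v = w[0], V[:, 0]
    assert np.allclose(A @ np.conj(v), np.conj(lam) * np.conj(v))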

Now, we need to work one final eigenvalue/eigenvector problem. To this point we've only worked with \(2 \times 2\) matrices and we should work at least one that isn't \(2 \times 2\). Also, we need to work one in which we get an eigenvalue of multiplicity greater than one that has more than one linearly independent eigenvector.

Despite the fact that this is a \(3 \times 3\) matrix, it still works the same as the \(2 \times 2\) matrices that we've been working with. So, start with the eigenvalues.

So, we've got a simple eigenvalue and an eigenvalue of multiplicity 2. Note that we used the same method of computing the determinant of a \(3 \times 3\) matrix that we used in the previous section. We just didn't show the work.

Let's now get the eigenvectors. We'll start with the eigenvector for the simple eigenvalue.

This time, unlike the \(2 \times 2\) cases we worked earlier, we actually need to solve the system. So let's do that.

Going back to equations gives,

So, again we get infinitely many solutions as we should for eigenvectors. The eigenvector is then,

Now, let’s do the other eigenvalue.

Okay, in this case it is clear that all three rows are the same and so there isn't any reason to actually solve the system since we can clear out the bottom two rows to all zeroes in one step. The equation that we get then is,

So, in this case we get to pick two of the values for free and will still get infinitely many solutions. Here is the general eigenvector for this case,

Notice the restriction this time. Recall that we only require that the eigenvector not be the zero vector. This means that we can allow one or the other of the two variables to be zero, we just can’t allow both of them to be zero at the same time!

What this means for us is that we are going to get two linearly independent eigenvectors this time. Here they are.

Now when we talked about linearly independent vectors in the last section we only looked at \(n\) vectors each with \(n\) components. We can still talk about linear independence in this case however. Recall back when we did linear independence for functions we saw at the time that if two functions were linearly dependent then they were multiples of each other. Well the same thing holds true for vectors. Two vectors will be linearly dependent if they are multiples of each other. In this case there is no way to get one of these two eigenvectors by multiplying the other by a constant. Therefore, these two vectors must be linearly independent.

So, summarizing up, here are the eigenvalues and eigenvectors for this matrix


Here is the most important definition in this text.

Definition

Let A be a square matrix. An eigenvector of A is a nonzero vector v such that Av = λv for some scalar λ; the scalar λ is the corresponding eigenvalue.

The German prefix "eigen" roughly translates to "self" or "own". An eigenvector of A is a vector that is taken to a multiple of itself by the matrix transformation x ↦ Ax, which perhaps explains the terminology. On the other hand, "eigen" is often translated as "characteristic"; we may think of an eigenvector as describing an intrinsic, or characteristic, property of A.

Eigenvalues and eigenvectors are only for square matrices.

Eigenvectors are by definition nonzero. Eigenvalues may be equal to zero.

We do not consider the zero vector to be an eigenvector: since A0 = λ0 = 0 for every scalar λ, the associated eigenvalue would be undefined.
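
For instance (a small illustrative check, not from the original text), a singular matrix has 0 as an eigenvalue, and any nonzero vector in its null space is a corresponding eigenvector:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 1.0]])        # singular: its determinant is 0

    v = np.array([1.0, -1.0])         # nonzero vector with A v = 0 = 0 * v
    print(A @ v)                      # [0. 0.]  -> eigenvector with eigenvalue 0
    print(np.linalg.eigvals(A))       # eigenvalues 0 and 2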

If someone hands you a matrix A and a vector v, it is easy to check whether v is an eigenvector of A: simply multiply and check whether Av is a scalar multiple of v. On the other hand, given just the matrix A, it is not obvious at all how to find the eigenvectors. We will learn how to do this in Section 5.2.

Example (Verifying eigenvectors)
Example (An eigenvector with eigenvalue

Geometrically, the eigenvector condition means that v and Av are collinear with the origin. So, an eigenvector of A is a nonzero vector v such that v and Av lie on the same line through the origin; in this case, Av is a scalar multiple of v, and the eigenvalue is the scaling factor.

For matrices that arise as the standard matrix of a linear transformation, it is often best to draw a picture, then find the eigenvectors and eigenvalues geometrically by studying which vectors are not moved off of their line. For a transformation that is defined geometrically, it is not necessary even to compute its matrix to find the eigenvectors and eigenvalues.



If T is a linear transformation from a vector space V over a field F into itself and v is a nonzero vector in V, then v is an eigenvector of T if T(v) is a scalar multiple of v. This can be written as

T(v) = λv,

where λ is a scalar in F , known as the eigenvalue, characteristic value, or characteristic root associated with v .

There is a direct correspondence between n-by-n square matrices and linear transformations from an n-dimensional vector space into itself, given any basis of the vector space. Hence, in a finite-dimensional vector space, it is equivalent to define eigenvalues and eigenvectors using either the language of matrices, or the language of linear transformations. [3] [4]

If V is finite-dimensional, the above equation is equivalent to [5]

Au = λu,

where A is the matrix representation of T and u is the coordinate vector of v .

Eigenvalues and eigenvectors feature prominently in the analysis of linear transformations. The prefix eigen- is adopted from the German word eigen (cognate with the English word own) for "proper", "characteristic", "own". [6] [7] Originally used to study principal axes of the rotational motion of rigid bodies, eigenvalues and eigenvectors have a wide range of applications, for example in stability analysis, vibration analysis, atomic orbitals, facial recognition, and matrix diagonalization.

In essence, an eigenvector v of a linear transformation T is a nonzero vector that, when T is applied to it, does not change direction. Applying T to the eigenvector only scales the eigenvector by the scalar value λ, called an eigenvalue. This condition can be written as the equation

T(v) = λv,

referred to as the eigenvalue equation or eigenequation. In general, λ may be any scalar. For example, λ may be negative, in which case the eigenvector reverses direction as part of the scaling, or it may be zero or complex.

The Mona Lisa example provides a simple illustration. Each point on the painting can be represented as a vector pointing from the center of the painting to that point. The linear transformation in this example is called a shear mapping. Points in the top half are moved to the right, and points in the bottom half are moved to the left, proportional to how far they are from the horizontal axis that goes through the middle of the painting. The vectors pointing to each point in the original image are therefore tilted right or left, and made longer or shorter by the transformation. Points along the horizontal axis do not move at all when this transformation is applied. Therefore, any vector that points directly to the right or left with no vertical component is an eigenvector of this transformation, because the mapping does not change its direction. Moreover, these eigenvectors all have an eigenvalue equal to one, because the mapping does not change their length either.

Linear transformations can take many different forms, mapping vectors in a variety of vector spaces, so the eigenvectors can also take many forms. For example, the linear transformation could be a differential operator like d/dx, in which case the eigenvectors are functions called eigenfunctions that are scaled by that differential operator, such as the exponential function e^{λx}, which satisfies (d/dx) e^{λx} = λ e^{λx}.

Alternatively, the linear transformation could take the form of an n by n matrix, in which case the eigenvectors are n by 1 matrices. If the linear transformation is expressed in the form of an n by n matrix A, then the eigenvalue equation for a linear transformation above can be rewritten as the matrix multiplication

Av = λv,

where the eigenvector v is an n by 1 matrix. For a matrix, eigenvalues and eigenvectors can be used to decompose the matrix—for example by diagonalizing it.

Eigenvalues and eigenvectors give rise to many closely related mathematical concepts, and the prefix eigen- is applied liberally when naming them:

  • The set of all eigenvectors of a linear transformation, each paired with its corresponding eigenvalue, is called the eigensystem of that transformation. [8][9]
  • The set of all eigenvectors of T corresponding to the same eigenvalue, together with the zero vector, is called an eigenspace, or the characteristic space of T associated with that eigenvalue. [10]
  • If a set of eigenvectors of T forms a basis of the domain of T, then this basis is called an eigenbasis.

Eigenvalues are often introduced in the context of linear algebra or matrix theory. Historically, however, they arose in the study of quadratic forms and differential equations.

In the 18th century, Leonhard Euler studied the rotational motion of a rigid body, and discovered the importance of the principal axes. [a] Joseph-Louis Lagrange realized that the principal axes are the eigenvectors of the inertia matrix. [11]

In the early 19th century, Augustin-Louis Cauchy saw how their work could be used to classify the quadric surfaces, and generalized it to arbitrary dimensions. [12] Cauchy also coined the term racine caractéristique (characteristic root), for what is now called eigenvalue; his term survives in characteristic equation. [b]

Later, Joseph Fourier used the work of Lagrange and Pierre-Simon Laplace to solve the heat equation by separation of variables in his famous 1822 book Théorie analytique de la chaleur. [13] Charles-François Sturm developed Fourier's ideas further, and brought them to the attention of Cauchy, who combined them with his own ideas and arrived at the fact that real symmetric matrices have real eigenvalues. [12] This was extended by Charles Hermite in 1855 to what are now called Hermitian matrices. [14]

Around the same time, Francesco Brioschi proved that the eigenvalues of orthogonal matrices lie on the unit circle, [12] and Alfred Clebsch found the corresponding result for skew-symmetric matrices. [14] Finally, Karl Weierstrass clarified an important aspect in the stability theory started by Laplace, by realizing that defective matrices can cause instability. [12]

In the meantime, Joseph Liouville studied eigenvalue problems similar to those of Sturm; the discipline that grew out of their work is now called Sturm–Liouville theory. [15] Schwarz studied the first eigenvalue of Laplace's equation on general domains towards the end of the 19th century, while Poincaré studied Poisson's equation a few years later. [16]

At the start of the 20th century, David Hilbert studied the eigenvalues of integral operators by viewing the operators as infinite matrices. [17] He was the first to use the German word eigen, which means "own", [7] to denote eigenvalues and eigenvectors in 1904, [c] though he may have been following a related usage by Hermann von Helmholtz. For some time, the standard term in English was "proper value", but the more distinctive term "eigenvalue" is the standard today. [18]

The first numerical algorithm for computing eigenvalues and eigenvectors appeared in 1929, when Richard von Mises published the power method. One of the most popular methods today, the QR algorithm, was proposed independently by John G. F. Francis [19] and Vera Kublanovskaya [20] in 1961. [21] [22]

Eigenvalues and eigenvectors are often introduced to students in the context of linear algebra courses focused on matrices. [23] [24] Furthermore, linear transformations over a finite-dimensional vector space can be represented using matrices, [25] [4] which is especially common in numerical and computational applications. [26]

Consider n -dimensional vectors that are formed as a list of n scalars, such as the three-dimensional vectors

These vectors are said to be scalar multiples of each other, or parallel or collinear, if there is a scalar λ such that

Now consider the linear transformation of n-dimensional vectors defined by an n by n matrix A,

Av = w.

If it occurs that v and w are scalar multiples, that is if

Av = w = λv,    (1)

then v is an eigenvector of the linear transformation A and the scale factor λ is the eigenvalue corresponding to that eigenvector. Equation (1) is the eigenvalue equation for the matrix A .

Equation (1) can be stated equivalently as

(A − λI)v = 0,    (2)

where I is the n by n identity matrix and 0 is the zero vector.

Eigenvalues and the characteristic polynomial

Equation (2) has a nonzero solution v if and only if the determinant of the matrix (A − λI) is zero. Therefore, the eigenvalues of A are the values of λ that satisfy the equation

det(A − λI) = 0.    (3)

Using Leibniz' rule for the determinant, the left-hand side of Equation (3) is a polynomial function of the variable λ and the degree of this polynomial is n, the order of the matrix A. Its coefficients depend on the entries of A, except that its term of degree n is always (−1)^n λ^n. This polynomial is called the characteristic polynomial of A. Equation (3) is called the characteristic equation or the secular equation of A.

The fundamental theorem of algebra implies that the characteristic polynomial of an n-by-n matrix A, being a polynomial of degree n, can be factored into the product of n linear terms,

det(A − λI) = (λ1 − λ)(λ2 − λ)⋯(λn − λ),    (4)

where each λi may be real but in general is a complex number. The numbers λ1, λ2, …, λn, which may not all have distinct values, are roots of the polynomial and are the eigenvalues of A.

As a brief example, which is described in more detail in the examples section later, consider the matrix

Taking the determinant of (A − λI), the characteristic polynomial of A is

Setting the characteristic polynomial equal to zero, it has roots at λ = 1 and λ = 3, which are the two eigenvalues of A. The eigenvectors corresponding to each eigenvalue can be found by solving for the components of v in the equation (A − λI)v = 0. In this example, the eigenvectors are any nonzero scalar multiples of

If the entries of the matrix A are all real numbers, then the coefficients of the characteristic polynomial will also be real numbers, but the eigenvalues may still have nonzero imaginary parts. The entries of the corresponding eigenvectors therefore may also have nonzero imaginary parts. Similarly, the eigenvalues may be irrational numbers even if all the entries of A are rational numbers or even if they are all integers. However, if the entries of A are all algebraic numbers, which include the rationals, the eigenvalues are complex algebraic numbers.

The non-real roots of a real polynomial with real coefficients can be grouped into pairs of complex conjugates, namely with the two members of each pair having imaginary parts that differ only in sign and the same real part. If the degree is odd, then by the intermediate value theorem at least one of the roots is real. Therefore, any real matrix with odd order has at least one real eigenvalue, whereas a real matrix with even order may not have any real eigenvalues. The eigenvectors associated with these complex eigenvalues are also complex and also appear in complex conjugate pairs.
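
A quick experimental check of this paragraph (random matrices are used purely for illustration):

    import numpy as np

    rng = np.random.default_rng(42)

    for _ in range(5):
        A = rng.standard_normal((3, 3))        # a real matrix of odd order
        w = np.linalg.eigvals(A)
        real_eigs = w[np.abs(w.imag) < 1e-9]
        # Odd order guarantees at least one real eigenvalue; the remaining
        # eigenvalues, if non-real, occur in complex conjugate pairs.
        assert len(real_eigs) >= 1
        print(np.round(w, 3))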

Algebraic multiplicity

Let λi be an eigenvalue of an n by n matrix A. The algebraic multiplicity μA(λi) of the eigenvalue is its multiplicity as a root of the characteristic polynomial, that is, the largest integer k such that (λ − λi)^k divides that polynomial evenly. [10] [27] [28]

Suppose a matrix A has dimension n and d ≤ n distinct eigenvalues. Whereas Equation (4) factors the characteristic polynomial of A into the product of n linear terms with some terms potentially repeating, the characteristic polynomial can instead be written as the product of d terms each corresponding to a distinct eigenvalue and raised to the power of the algebraic multiplicity,

det(A − λI) = (λ1 − λ)^{μA(λ1)} (λ2 − λ)^{μA(λ2)} ⋯ (λd − λ)^{μA(λd)}.

If d = n then the right-hand side is the product of n linear terms and this is the same as Equation (4). The size of each eigenvalue's algebraic multiplicity is related to the dimension n as

1 ≤ μA(λi) ≤ n,   and   μA(λ1) + μA(λ2) + ⋯ + μA(λd) = n.

If μA(λi) = 1, then λi is said to be a simple eigenvalue. [28] If μA(λi) equals the geometric multiplicity of λi, γA(λi), defined in the next section, then λi is said to be a semisimple eigenvalue.

Eigenspaces, geometric multiplicity, and the eigenbasis for matrices

Given a particular eigenvalue λ of the n by n matrix A, define the set E to be all vectors v that satisfy Equation (2),

E = { v : (A − λI)v = 0 }.

On one hand, this set is precisely the kernel or nullspace of the matrix (A − λI). On the other hand, by definition, any nonzero vector that satisfies this condition is an eigenvector of A associated with λ. So, the set E is the union of the zero vector with the set of all eigenvectors of A associated with λ, and E equals the nullspace of (A − λI). E is called the eigenspace or characteristic space of A associated with λ. [29] [10] In general λ is a complex number and the eigenvectors are complex n by 1 matrices. A property of the nullspace is that it is a linear subspace, so E is a linear subspace of ℂ^n.

Because the eigenspace E is a linear subspace, it is closed under addition. That is, if two vectors u and v belong to the set E, written u, v ∈ E, then (u + v) ∈ E or equivalently A(u + v) = λ(u + v). This can be checked using the distributive property of matrix multiplication. Similarly, because E is a linear subspace, it is closed under scalar multiplication. That is, if v ∈ E and α is a complex number, (αv) ∈ E or equivalently A(αv) = λ(αv). This can be checked by noting that multiplication of complex matrices by complex numbers is commutative. As long as u + v and αv are not zero, they are also eigenvectors of A associated with λ.

The dimension of the eigenspace E associated with λ, or equivalently the maximum number of linearly independent eigenvectors associated with λ, is referred to as the eigenvalue's geometric multiplicity γA(λ). Because E is also the nullspace of (A − λI), the geometric multiplicity of λ is the dimension of the nullspace of (A − λI), also called the nullity of (A − λI), which relates to the dimension and rank of (A − λI) as

γA(λ) = n − rank(A − λI).

Because of the definition of eigenvalues and eigenvectors, an eigenvalue's geometric multiplicity must be at least one, that is, each eigenvalue has at least one associated eigenvector. Furthermore, an eigenvalue's geometric multiplicity cannot exceed its algebraic multiplicity. Additionally, recall that an eigenvalue's algebraic multiplicity cannot exceed n.
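
The rank–nullity relation just stated gives a direct way to compute geometric multiplicities. Using the 3-by-3 matrix from the worked example earlier in this page (a sketch for illustration):

    import numpy as np

    A = np.array([[ 5.0,  8.0,  16.0],
                  [ 4.0,  1.0,   8.0],
                  [-4.0, -4.0, -11.0]])
    n = A.shape[0]

    for lam in (1.0, -3.0):
        gamma = n - np.linalg.matrix_rank(A - lam * np.eye(n))   # nullity of (A - lam*I)
        print(f"lambda = {lam}: geometric multiplicity = {gamma}")
    # lambda = 1 has geometric multiplicity 1; lambda = -3 has geometric multiplicity 2,
    # matching its algebraic multiplicity of 2 (a semisimple eigenvalue).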


Left and right eigenvectors

Many disciplines traditionally represent vectors as matrices with a single column rather than as matrices with a single row. For that reason, the word "eigenvector" in the context of matrices almost always refers to a right eigenvector, namely a column vector that right multiplies the n × n matrix A in the defining equation, Equation (1),

Av = λv.

The eigenvalue and eigenvector problem can also be defined for row vectors that left multiply matrix A. In this formulation, the defining equation is

uA = κu,

where κ is a scalar and u is a 1 by n row vector; any such u is called a left eigenvector of A.

Diagonalization and the eigendecomposition

Suppose the eigenvectors of A form a basis, or equivalently A has n linearly independent eigenvectors v1, v2, …, vn with associated eigenvalues λ1, λ2, …, λn. The eigenvalues need not be distinct. Define a square matrix Q whose columns are the n linearly independent eigenvectors of A,

Q = [ v1  v2  ⋯  vn ].

Since each column of Q is an eigenvector of A, right multiplying A by Q scales each column of Q by its associated eigenvalue,

AQ = [ λ1v1  λ2v2  ⋯  λnvn ].

With this in mind, define a diagonal matrix Λ where each diagonal element Λii is the eigenvalue associated with the ith column of Q. Then

AQ = QΛ.

Because the columns of Q are linearly independent, Q is invertible. Right multiplying both sides of the equation by Q⁻¹,

A = QΛQ⁻¹,

or by instead left multiplying both sides by Q⁻¹,

Q⁻¹AQ = Λ.

A can therefore be decomposed into a matrix composed of its eigenvectors, a diagonal matrix with its eigenvalues along the diagonal, and the inverse of the matrix of eigenvectors. This is called the eigendecomposition and it is a similarity transformation. Such a matrix A is said to be similar to the diagonal matrix Λ or diagonalizable. The matrix Q is the change of basis matrix of the similarity transformation. Essentially, the matrices A and Λ represent the same linear transformation expressed in two different bases. The eigenvectors are used as the basis when representing the linear transformation as Λ.

Conversely, suppose a matrix A is diagonalizable. Let P be a non-singular square matrix such that P⁻¹AP is some diagonal matrix D. Left multiplying both sides by P gives AP = PD. Each column of P must therefore be an eigenvector of A whose eigenvalue is the corresponding diagonal element of D. Since the columns of P must be linearly independent for P to be invertible, there exist n linearly independent eigenvectors of A. It then follows that the eigenvectors of A form a basis if and only if A is diagonalizable.
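
The eigendecomposition described here can be reproduced directly with NumPy (a sketch; np.linalg.eig returns the eigenvector matrix whose columns play the role of Q and the eigenvalues that go on the diagonal of Λ):

    import numpy as np

    A = np.array([[ 2.0, -4.0],
                  [-1.0, -1.0]])

    eigenvalues, Q = np.linalg.eig(A)      # columns of Q are eigenvectors of A
    Lam = np.diag(eigenvalues)             # diagonal matrix of the eigenvalues

    # A = Q Lam Q^{-1} holds because this A has a full set of independent eigenvectors.
    print(np.allclose(A, Q @ Lam @ np.linalg.inv(Q)))          # True

    # Diagonalization makes matrix powers cheap: A^5 = Q Lam^5 Q^{-1}.
    print(np.allclose(np.linalg.matrix_power(A, 5),
                      Q @ np.diag(eigenvalues ** 5) @ np.linalg.inv(Q)))   # True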

A matrix that is not diagonalizable is said to be defective. For defective matrices, the notion of eigenvectors generalizes to generalized eigenvectors and the diagonal matrix of eigenvalues generalizes to the Jordan normal form. Over an algebraically closed field, any matrix A has a Jordan normal form and therefore admits a basis of generalized eigenvectors and a decomposition into generalized eigenspaces.




Chapter 5 Eigenvalues and Eigenvectors

This chapter constitutes the core of any first course on linear algebra: eigenvalues and eigenvectors play a crucial role in most real-world applications of the subject.

Example

In a population of rabbits,

  1. half of the newborn rabbits survive their first year
  2. of those, half survive their second year
  3. the maximum life span is three years
  4. rabbits produce 0, 6, 8 baby rabbits in their first, second, and third years, respectively.

What is the asymptotic behavior of this system? What will the rabbit population look like in 100 years?
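
One standard way to set this up (an assumed model, since the chapter only poses the question here) is a Leslie matrix whose dominant eigenvalue gives the asymptotic growth rate and whose corresponding eigenvector gives the stable age distribution:

    import numpy as np

    # Assumed Leslie-matrix encoding of the rules above.
    # Age classes: first, second, and third year.
    # Top row: offspring per rabbit in each age class (0, 6, 8).
    # Sub-diagonal: survival rates into the next age class (1/2 and 1/2).
    A = np.array([[0.0, 6.0, 8.0],
                  [0.5, 0.0, 0.0],
                  [0.0, 0.5, 0.0]])

    eigenvalues, eigenvectors = np.linalg.eig(A)
    k = np.argmax(np.abs(eigenvalues))

    print(eigenvalues[k].real)     # dominant eigenvalue 2: the population roughly doubles each year
    v = eigenvectors[:, k].real
    print(v / v.sum())             # stable age distribution (proportions 16 : 4 : 1)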

In Section 5.1, we will define eigenvalues and eigenvectors, and show how to compute the latter; in Section 5.2 we will learn to compute the former. In Section 5.4 we will use eigenvalues and eigenvectors to determine when a matrix is “similar” to a diagonal matrix, and we will see that the algebra and geometry of such a matrix is much simpler to understand. Finally, we spend Section 5.6 presenting a common kind of application of eigenvalues and eigenvectors to real-world problems, including searching the Internet using Google’s PageRank algorithm.


Mathematics Prelims

Suppose $A$ is similar to a diagonal matrix $D$, say $P^{-1}AP = D$, or equivalently $AP = PD$. Now, since $P$ is a change of basis matrix, each of its columns gives the coordinates of a basis vector of some basis. Let's call that basis $B$ and let $b_1$ through $b_n$ be the elements of that basis. Now, if we take the equation $AP = PD$ and look at it one column at a time, notice the following.

The $j$-th column of $AP$ is equal to the $j$-th column of $PD$, which (because $D$ is diagonal) is just $d_{jj}$ times the $j$-th column of $P$. Since each column of $AP$ is just $A$ applied to the corresponding column of $P$, though, we have
$$A b_j = d_{jj}\, b_j.$$

This means that when we plug the $j$-th column of $P$ into the linear transformation represented by $A$, we get back a multiple of that column. Calling the linear transformation $T$, we have that $T(b_j) = d_{jj}\, b_j$.

Vectors such as $b_j$, whose image under $T$ is just a multiple of the vector, are called eigenvectors of $T$. That multiple, the $d_{jj}$ above, is called an eigenvalue of $T$. These eigenvectors and eigenvalues are associated with a particular linear transformation, so when we talk about the eigenvectors and eigenvalues of a matrix, we really mean the eigenvectors and eigenvalues of the transformation represented by that matrix. Notice that this means that eigenvalues are independent of the chosen basis: since similar matrices represent the same transformation, just with respect to different bases, similar matrices have the same eigenvalues.

We assumed above that $A$ was similar to a diagonal matrix, but this isn't always true. If $A$ is similar to a diagonal matrix, say $P^{-1}AP = D$, then as we've just shown, the columns of $P$ are eigenvectors of $A$. Since these form the columns of a non-singular matrix, the eigenvectors of $A$ form a basis for the vector space. Also, if the eigenvectors of $A$ form a basis, we can take those basis vectors as the columns of $P$, and running the same argument in reverse shows that $P^{-1}AP$ is diagonal.

So a matrix is diagonalizable (similar to a diagonal matrix) if and only if its eigenvectors form a basis for the vector space.


Computing Eigenvalues and Eigenvectors

It is not too difficult to compute eigenvalues and their corresponding eigenvectors when the matrix transformation at hand has a clear geometric interpretation. For example, consider the diagonal matrix discussed above and the reflection matrix below:

For arbitrary matrices, however, a more general procedure is necessary.

In computations, the characteristic polynomial is extremely useful. To determine the eigenvalues of a matrix $A$, one solves for the roots of $p_A(x)$, and then checks if each root is an eigenvalue.
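
A minimal numerical version of this procedure (a sketch; np.poly returns the coefficients of the characteristic polynomial of a square matrix and np.roots finds its roots):

    import numpy as np

    A = np.array([[ 2.0, -4.0],
                  [-1.0, -1.0]])

    coeffs = np.poly(A)              # characteristic polynomial coefficients: [1, -1, -6]
    roots = np.roots(coeffs)         # its roots: 3 and -2 ...
    print(coeffs, roots)
    print(np.linalg.eigvals(A))      # ... which are exactly the eigenvalues of A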


Eigenvectors

Geometrically, the equation $A\vec{x}=\lambda\vec{x}$ means that the vectors $\vec{x}$ and $A\vec{x}$ lie on the same line. Take a look at the applet below. The default matrix, as in the previous applet, is $ A = \begin{bmatrix} 0 & 2 \\ 2 & 0 \end{bmatrix}. $ Leave $A$ set as it is for now. Notice that there are two vectors on the screen, the red vector is $\vec{x}$ and the blue is $A\vec{x}$. You can interact with this applet by moving the red arrow around. (You can click and drag. Or, if you want more control, click first to highlight the arrow (it will glow slightly), and then use the arrow buttons on the keyboard. If you're using the keyboard, holding down the shift button will allow finer-scale movement.)

If you move the red arrow around for a while, you'll notice that usually the red and blue arrows do not lie on the same line. (The line through the red arrow is shown to help you see this better.) However, if you set $\vec{x}=(1,1)$, you'll see that the blue arrow does lie on the same line as the red. Moreover, the blue arrow points in the same direction as the red one and is twice as long. This shows graphically that $(1,1)$ is an eigenvector for the eigenvalue $\lambda=2$. Notice that $(1,1)$ is not unique. Try $(2,2)$ and $(-0.5,-0.5)$. In fact, for any vector of the form $\vec{x}=(s,s)$ the blue arrow will point in the same direction as the red and be twice as long. Algebraically, this is because of the following fact. If $A\vec{x}=\lambda\vec{x}$, then $ A(s\vec{x}) = s(A\vec{x}) = s(\lambda\vec{x}) = \lambda(s\vec{x}). $

Similarly, if you set the red arrow to $(1,-1)$, you'll see that now the blue arrow is on the same line as the red, is twice as long as the red, and points in the opposite direction. This shows that $(1,-1)$ is an eigenvector for the eigenvalue $\lambda=-2$. (The minus sign signifies "opposite direction.")

In Exercise 2, we saw that the eigenvalues of $ A = \begin{bmatrix} 2 & 0 \\ 1 & 1 \end{bmatrix} $ are $\lambda=1$ and $\lambda=2$. Use the applet above to find the eigenvectors. (You may want to use the arrows on the keyboard along with the shift button when you get close.) Be sure to say which eigenvalue each of the eigenvectors corresponds to.
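
If you would rather check your applet answer numerically, here is a small sketch (assuming NumPy is available) for the matrix in this exercise:

    import numpy as np

    A = np.array([[2.0, 0.0],
                  [1.0, 1.0]])

    eigenvalues, eigenvectors = np.linalg.eig(A)
    for lam, v in zip(eigenvalues, eigenvectors.T):
        print(f"lambda = {lam}: eigenvector direction {v / np.max(np.abs(v))}")
    # Expect (up to sign) the direction (1, 1) for lambda = 2 and (0, 1) for lambda = 1.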

Now use the applet above to try to find the eigenvectors for the two matrices in Exercise 3. Describe what you find. (Since this isn't a linear algebra class, you might not yet have the tools or vocabulary necessary to do this perfectly. Just give it your best shot.)

Now put the matrix from Exercise 4 into the applet. Describe the relationship between $\vec{x}$ and $A\vec{x}$ as you move $\vec{x}$ around. Why does this mean that $A$ can't have any real eigenvalues?

