A system of linear equations in variables $x_1, \ldots, x_n$ is simply a set of equations in which the variables appear without any powers or products between them:
$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\ \quad\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m \end{cases}$$
where $a_{ij}$ and $b_i$ are some real numbers. E.g. the following is a system of three linear equations in four variables:
A solution of such a system of linear equations is a tuple of four numbers (four being the number of variables in this case) which satisfies all the equations in the system after substituting those numbers for the variables. E.g. $(2,0,0,0)$ in our case. But a system can have more than one solution. For example, $(22,-7,1,0)$ is also a solution of the above system. More precisely, a system of linear equations can have zero, one, or infinitely many solutions.
A general solution of a system of linear equations is just a description of the set of all its solutions (which sometimes means saying that there are no solutions at all, or pointing out the only one). How do we find a general solution? We will use the so-called Gaussian elimination method. Less formally, we will call it transforming our system of equations into an echelon form.
The first step is to write down the matrix of the given system of equations. A matrix is simply a rectangular table of numbers. Matrices will play an increasingly important role in this course, but for now we can think of the matrix of a system of equations as an abbreviated notation for the system itself. We simply write down the coefficients, separating the column of constant terms with a vertical line:
Next we perform some operations on this matrix. These operations simplify the matrix, but we must make sure that they do not change the set of solutions of the corresponding system of equations. Therefore only three types of operations are permitted:
- subtracting from one row another row multiplied by a number (corresponds to subtracting one equation from another),
- swapping two rows (corresponds to swapping two equations),
- multiplying a row by a non-zero number (corresponds to multiplying both sides of an equation by a non-zero number).
Our aim is to reach a staircase-like echelon form of the matrix (meaning that you can draw a staircase through the matrix with only zeros below it). In terms of equations, each successive equation involves fewer variables than the previous one.
What should we do to achieve such a form? The best method is to build the echelon from the left, using the first row. Below the 1 in the upper left corner we want zeros. To achieve this, we subtract the first row multiplied by 2 from the second row, and the first row multiplied by 4 from the third. So:
Now we want a zero below the 1 in the second row, so we subtract the second row from the third:
And so we have achieved an echelon form! There is a leading coefficient in the first row on the first variable and a leading coefficient in the second row on the second variable.
The next step is to reduce the matrix (transform it into the reduced echelon form). We would like to have zeros above the leading coefficients as well. In our case we only need to deal with the 3 in the second position of the first row. To get a zero there, we subtract the second row multiplied by 3 from the first:
We also need to make sure that the leading coefficients are always ones (sometimes we have to multiply a row by a fraction), but in our case this is already done. We have reached the reduced echelon form. To obtain the general solution it suffices to write down the corresponding equations, moving the free variables to the right-hand side:
We can also write the solution in parametrized form, substituting into a vector everything we know from the general solution. Notice that by substituting any real numbers for the parameters we get a tuple (a sequence of four numbers) which is a solution of our system of equations, so the system has infinitely many solutions. E.g. an appropriate substitution of the parameters gives the solution $(22,-7,1,0)$ mentioned above.
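Row reduction is mechanical enough to hand to a computer algebra system. Below is a minimal sketch using sympy's `rref`; the augmented matrix is an assumed reconstruction (the original display is elided), chosen only so that it matches the row operations and the solutions described above.

```python
from sympy import Matrix

# Assumed reconstruction of the augmented matrix of the system;
# it is consistent with the row operations described in the text.
augmented = Matrix([
    [1,  3,  1,  3, 2],
    [2,  7,  9,  7, 4],
    [4, 13, 11, 13, 8],
])

reduced, pivots = augmented.rref()  # reduced echelon form + pivot columns
print(reduced)  # rows encode x1 = 2 + 20*x3 and x2 = -7*x3 - x4
print(pivots)   # (0, 1): leading coefficients on x1 and x2
# substituting x3 = 1, x4 = 0 gives the solution (22, -7, 1, 0)
```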
Finally, let us introduce some terminology. A system of linear equations is:
- homogeneous if all constant terms are zero,
- inconsistent if it has no solutions.
For example, the following system:
is not homogeneous and has exactly one solution, so it is not inconsistent. Meanwhile:
is homogeneous and has exactly one solution, so it is not inconsistent. On the other hand, the following system of equations is inconsistent:
Note that a system of linear equations can have exactly one, infinitely many or no solutions.
In many applications we will use the notion of the determinant of a matrix. The determinant makes sense only for square matrices and is defined recursively: the determinant of a $1\times 1$ matrix $[a]$ is simply $a$, and
$$\det A = \sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det A_{1j},$$
where $A_{ij}$ denotes the matrix $A$ with the $i$-th row and $j$-th column crossed out. So (the determinant is denoted by $\det A$ or by putting the matrix between absolute-value-style brackets):
And so on. E.g.:
The above definition is just a special case of a more general fact called the Laplace expansion. Instead of the first row we can use any row or column (it is best to always choose the one with the most zeros). So:
$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij}$$
for any row $i$. An analogous fact is true for any column.
E.g. for the matrix below it is easiest to use the third column:
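The recursive definition translates directly into code. A minimal sketch of the cofactor (Laplace) expansion along the first row; the example matrix is an assumed one:

```python
# Recursive Laplace expansion along the first row.
def det(a):
    n = len(a)
    if n == 1:                # base case: 1x1 matrix
        return a[0][0]
    total = 0
    for j in range(n):
        # minor: cross out row 0 and column j
        minor = [row[:j] + row[j+1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(minor)
    return total

print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -> -3
```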
Determinant and operations on a matrix
Notice first that the Laplace expansion easily implies that if a matrix has a row (or column) of zeros, its determinant equals zero.
Consider now the different operations on the rows of a matrix which we use to compute an echelon ("stair-like") form. Using the Laplace expansion we can prove that swapping two rows multiplies the determinant by $-1$: indeed, calculating the determinant along the first column, we see that the signs in the sum may change, but the rows in the minor matrices also get swapped.
We can immediately notice that multiplying a row by a number multiplies the determinant by the same number; this is easy to see by calculating the Laplace expansion along this row.
Therefore multiplying the whole matrix by a number multiplies the determinant by this number once for each row; precisely:
$$\det(cA) = c^n \det A,$$
where $A$ is a matrix of size $n \times n$.
Notice also that the determinant of a matrix with two identical rows equals zero, because swapping those rows does not change the matrix but multiplies the determinant by $-1$, so $\det A = -\det A$, therefore $\det A = 0$. Combined with the row multiplication rule, this shows that if two rows of a matrix are linearly dependent, then its determinant equals $0$.
The Laplace expansion also implies that if matrices $A$, $B$, $C$ differ only in the $i$-th row, in such a way that the $i$-th row of $C$ is the sum of the $i$-th rows of $A$ and $B$, then $\det C = \det A + \det B$, e.g.:
But it is easy to see that in general $\det(A + B) \neq \det A + \det B$!
Finally, consider the most important operation: adding to a row another row multiplied by a number. This is exactly the situation described above. The resulting matrix is the matrix $C$, which differs from $A$ and $B$ only in the row we add to. Matrix $A$ is the original matrix, and matrix $B$ is matrix $A$ with the row we add to replaced by the other row multiplied by the number. Therefore $\det C = \det A + \det B$; but $B$ has two linearly dependent rows, so $\det B = 0$ and $\det C = \det A$. Therefore the operation of adding to a row another row multiplied by a number does not change the determinant of a matrix.
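All three rules are easy to spot-check numerically. A quick sketch on an assumed random matrix:

```python
import numpy as np

# Verify how the three row operations change the determinant.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))
d = np.linalg.det(a)

swapped = a[[1, 0, 2, 3]]                 # swap rows 0 and 1
scaled = a.copy(); scaled[2] *= 5.0       # multiply row 2 by 5
added = a.copy(); added[3] += 7.0 * a[1]  # add 7 * row 1 to row 3

print(np.isclose(np.linalg.det(swapped), -d))    # True: sign flips
print(np.isclose(np.linalg.det(scaled), 5 * d))  # True: scales by 5
print(np.isclose(np.linalg.det(added), d))       # True: unchanged
```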
Calculating the determinant via a triangular form of the matrix
If you look closely enough, you will see that the Laplace expansion also implies that the determinant of a matrix in echelon form (for square matrices usually called triangular) equals the product of the elements on its diagonal, so e.g.:
Because we know how the elementary operations change the determinant, to calculate the determinant of a matrix we can reduce it to a triangular form, compute the determinant of that form, and recover the determinant of the original matrix. This method is especially useful for large matrices, e.g.:
Therefore the determinant of the last matrix is the product of its diagonal entries. On the way we swapped rows once and multiplied one row by a number, so the determinant of the first matrix equals the determinant of the triangular one multiplied by $-1$ and divided by that number.
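This bookkeeping (track the sign of each swap, then multiply the diagonal) is exactly what the following minimal sketch does, using exact fractions; the test matrix is an assumed example:

```python
from fractions import Fraction

# Reduce to upper-triangular form, tracking how swaps change the sign;
# adding a multiple of one row to another leaves the determinant alone.
def det_by_elimination(rows):
    a = [[Fraction(x) for x in row] for row in rows]
    n, sign = len(a), 1
    for k in range(n):
        # find a nonzero pivot in column k, swapping rows if needed
        pivot = next((r for r in range(k, n) if a[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)          # zero column -> determinant 0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            sign = -sign                # a swap flips the sign
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            a[r] = [x - factor * y for x, y in zip(a[r], a[k])]
    result = Fraction(sign)
    for k in range(n):
        result *= a[k][k]               # product of the diagonal
    return result

print(det_by_elimination([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```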
The above fact also shows how to calculate the determinant of a matrix in the block form $\begin{pmatrix} A & C \\ 0 & B \end{pmatrix}$, with a bottom-left block of zeros. The determinant of such a matrix equals $\det A \cdot \det B$, e.g.:
Given a system of $n$ equations with $n$ variables, we may try to solve it with Cramer's rule. Let $A$ be the matrix of the system without the column of constant terms. Let $A_i$ be the matrix $A$ in which the $i$-th column is replaced by the column of constant terms. Then:
- if $\det A \neq 0$, the system has exactly one solution, given by the formula $x_i = \frac{\det A_i}{\det A}$,
- if $\det A = 0$ and at least one of the $\det A_i$ is not equal to $0$, the system has no solutions,
- if $\det A = 0$ and $\det A_i = 0$ for every $i$, there can be zero or infinitely many solutions; Cramer's rule does not give a definite answer.
E.g. let us solve the following system of equations:
Since $\det A \neq 0$, this system has exactly one solution. To determine it we calculate the remaining determinants:
And so $x_i = \frac{\det A_i}{\det A}$ for each $i$.
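A minimal sketch of Cramer's rule in numpy; the $3\times 3$ system below is an assumed example, not the elided one from the text:

```python
import numpy as np

# Solve Ax = b by Cramer's rule: x_i = det(A_i) / det(A).
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([4.0, 5.0, 6.0])

d = np.linalg.det(A)
assert not np.isclose(d, 0), "Cramer's rule needs det(A) != 0"

x = np.empty(3)
for i in range(3):
    Ai = A.copy()
    Ai[:, i] = b              # replace the i-th column with b
    x[i] = np.linalg.det(Ai) / d

print(x)                      # agrees with np.linalg.solve(A, b)
```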
Generally speaking, a vector space is a "space" consisting of vectors, which can be:
- added, where addition is associative ($(u+v)+w = u+(v+w)$) and commutative ($u+v = v+u$),
- multiplied by a number, where this multiplication is distributive with respect to vector addition ($a(u+v) = au+av$) and with respect to addition of numbers ($(a+b)v = av+bv$), and compatible with multiplication of numbers ($a(bv) = (ab)v$),
and in which there exists a zero vector $0$ such that $v + 0 = v$ for any vector $v$, and for each vector $v$ there exists a vector $-v$ (inverse to $v$) such that $v + (-v) = 0$.
The plane is a classic example of a vector space. Vectors are pairs of numbers $(x, y)$, which can be added coordinatewise, $(x_1,y_1) + (x_2,y_2) = (x_1+x_2,\ y_1+y_2)$, and multiplied by a number, $a(x,y) = (ax, ay)$. There is a zero vector $(0,0)$, and for each vector $(x,y)$ there is an inverse vector, $(-x,-y)$.
Obviously a space whose vectors are longer sequences of numbers, e.g. of length 4, is also a vector space. Although it is harder to imagine geometrically, addition and multiplication work in the same way as in the two-dimensional case.
A vector subspace is a subset $V$ of a vector space closed under the operations on vectors, meaning that if vectors $u, w$ are in $V$, then $u + w$ is also in $V$, and for any number $a$ and any vector $v$ in $V$, the vector $av$ is also in $V$.
To prove that a subset is a vector subspace, we have to show that the sum of any two vectors from the subset is again in the subset, and that any vector from the subset multiplied by any number stays in the subset. To prove that a subset is not a vector subspace, it suffices to find two vectors in it whose sum lies outside it, or a vector in it and a number whose product does not belong to it.
For example, the line $y = 2x$ (say) is a vector subspace of the plane: if two vectors belong to this line, they are of the form $(x_1, 2x_1)$ and $(x_2, 2x_2)$, and their sum $(x_1+x_2,\ 2(x_1+x_2))$ also belongs to the line. Similarly, for every number $a$, the vector $a(x, 2x) = (ax, 2ax)$ belongs to this line.
On the other hand, a set such as $\{(x,y) : y = x^2\}$ is not a vector subspace, because the vector $(1,1)$, which is in the set, multiplied by $2$ gives $(2,2)$, which is not in the set.
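These closure checks are easy to test numerically. A minimal sketch for the two example sets above (the concrete sets are assumed stand-ins):

```python
# Membership tests for the two example subsets of the plane.
def on_line(v):          # the line y = 2x
    return v[1] == 2 * v[0]

def on_parabola(v):      # the set y = x**2
    return v[1] == v[0] ** 2

u, w = (1, 2), (3, 6)                        # both on the line
print(on_line((u[0] + w[0], u[1] + w[1])))   # True: closed under +
print(on_line((5 * u[0], 5 * u[1])))         # True: closed under scaling

p = (1, 1)                                   # on the parabola
print(on_parabola((2 * p[0], 2 * p[1])))     # False: not a subspace
```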
Given a finite set of vectors $v_1, \ldots, v_n$, any vector of the form $a_1v_1 + \cdots + a_nv_n$, where $a_1, \ldots, a_n$ are some numbers, is called their linear combination. For example, the vector $(-2,1)$ is a linear combination of two given vectors of the plane. How do we calculate the coefficients without simply guessing them? Notice that we look for numbers $a, b$ such that $a v_1 + b v_2 = (-2,1)$. Comparing coordinates on both sides, this is actually a system of two linear equations:
it suffices to solve it:
so we obtain the coefficients $a$ and $b$. If this system were inconsistent, it would mean that our vector is not a linear combination of those two vectors.
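A minimal numpy sketch of this computation; since the original pair of vectors is elided, $v_1 = (1,1)$ and $v_2 = (0,1)$ are assumed stand-ins:

```python
import numpy as np

# Find coefficients a, b with a*v1 + b*v2 = target.
v1 = np.array([1.0, 1.0])
v2 = np.array([0.0, 1.0])
target = np.array([-2.0, 1.0])

# The columns of the matrix are the vectors we combine.
coeffs = np.linalg.solve(np.column_stack([v1, v2]), target)
print(coeffs)   # [-2.  3.], i.e. -2*v1 + 3*v2 = (-2, 1)
```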
The set of all linear combinations of the vectors $v_1, \ldots, v_n$ is denoted by $\operatorname{span}(v_1, \ldots, v_n)$.
A set of vectors is called linearly independent if none of them is a linear combination of the others. This definition is quite impractical: to check directly whether a set of four vectors is linearly independent, we would have to check that four systems of equations are inconsistent. We need a different, but equivalent, definition.
Notice that if one vector is a linear combination of the others, e.g. $v_1 = a_2v_2 + \cdots + a_nv_n$, then the zero vector can be written as $-v_1 + a_2v_2 + \cdots + a_nv_n = 0$, that is, as a linear combination of vectors from our set in which at least one coefficient is non-zero. This brings us to another definition of linear independence: a set of vectors is linearly independent if in every linear combination of them giving the zero vector all the coefficients are zero. In other words, the only way to get the zero vector is the trivial one: multiplying all of them by zero.
How do we check that a given set of vectors is linearly independent? We calculate the coefficients of a linear combination giving the zero vector; if the only solution is that all the coefficients are zero, then the set is linearly independent. So we need to check whether a homogeneous system of linear equations has exactly one solution.
For example, let us check whether three given vectors are linearly independent. We solve the corresponding system of equations:
We already know that the system has only one solution, because there is a step (a leading coefficient) in every column, leaving no free variables. Therefore this set of vectors is linearly independent.
There is also a second method of checking whether a set of vectors is linearly independent. Notice that row operations on a matrix simply create linear combinations of the vectors written in its rows. If we manage to obtain a row of zeros, we have found a non-trivial linear combination of the rows giving the zero vector, and so the set of vectors written in the rows is not linearly independent. Let us check whether the following vectors are linearly independent:
We get a row of zeros, so the vectors are not linearly independent.
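The same test can be phrased via the rank: vectors written in the rows are independent exactly when the rank of the matrix equals the number of vectors. A minimal numpy sketch on assumed vectors:

```python
import numpy as np

# Rows are the vectors to test; row 2 = row 0 + row 1 by construction.
vectors = np.array([[1.0, 2.0, 0.0],
                    [0.0, 1.0, 1.0],
                    [1.0, 3.0, 1.0]])

rank = np.linalg.matrix_rank(vectors)
print(rank, rank == len(vectors))   # 2 False -> dependent
```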
Describing a vector space
How can we describe a given vector subspace mathematically? There are a few ways:
- list a set of vectors which spans it,
- write out its basis,
- give a homogeneous system of linear equations which describes it.
The above needs a few more words of explanation. We would also like to know how to pass from one description to another.
A spanning system of vectors
It is easy to notice that the set of all linear combinations of a given set of vectors is always a vector space. Let us check this in the case of three vectors. Let $u, w \in \operatorname{span}(v_1, v_2, v_3)$; then $u = av_1 + bv_2 + cv_3$ and $w = dv_1 + fv_2 + gv_3$ for some numbers $a, b, c, d, f, g$. Therefore $u + w = (a+d)v_1 + (b+f)v_2 + (c+g)v_3$, and for any number $k$, $ku = (ka)v_1 + (kb)v_2 + (kc)v_3$; both are again linear combinations of $v_1, v_2, v_3$, so they are in $\operatorname{span}(v_1, v_2, v_3)$, and we are dealing with a vector space.
Therefore, spaces of the form $\operatorname{span}(v_1, \ldots, v_n)$, for concrete vectors $v_1, \ldots, v_n$, are examples of vector spaces, defined by listing a set of vectors which spans them.
A system of vectors which spans a given vector space and additionally is linearly independent is called its basis. A basis of a vector space is not unique: e.g. the so-called standard basis $((1,0),(0,1))$ is a basis of the plane, but so are many other pairs of vectors. However, every basis of a given vector space has the same number of vectors. This number is called the dimension of the vector space and is denoted by $\dim V$.
Obviously, a basis is in particular a system of vectors which spans the given space.
Every vector in the given space is a linear combination of the vectors from a basis. Furthermore, the coefficients of this combination are uniquely determined; they are called the coordinates of the vector with respect to the basis. So, e.g., with respect to the basis $((-1,0),(2,1))$ the vector $(1,1)$ has coordinates $1,1$, and the vector $(0,-1)$ has coordinates $-2,-1$. Obviously, we calculate the coordinates of a given vector with respect to a given basis by solving a system of equations; we have already done this when determining whether a vector is a linear combination of other vectors. This time the system always has exactly one solution.
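Coordinates are just the solution of a linear system whose columns are the basis vectors. A minimal numpy sketch, using the basis $((-1,0),(2,1))$ from the example above:

```python
import numpy as np

# Basis vectors as the columns of a matrix; solving B c = v gives the
# coordinates c of v in this basis.
basis = np.column_stack([(-1.0, 0.0), (2.0, 1.0)])

print(np.linalg.solve(basis, np.array([1.0, 1.0])))   # [ 1.  1.]
print(np.linalg.solve(basis, np.array([0.0, -1.0])))  # [-2. -1.]
```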
A system of equations describing a space
Notice that given a homogeneous system of linear equations and two of its solutions, both their sum and any solution multiplied by any number are also solutions of the system. Let us check this in the following example. The vectors $(1,-1,0)$ and $(0,2,2)$ are solutions of the equation $x + y - z = 0$. Indeed, $1 + (-1) - 0 = 0$ and $0 + 2 - 2 = 0$, so also $(1,-1,0) + (0,2,2) = (1,1,2)$ is a solution ($1 + 1 - 2 = 0$) and, e.g., $3\cdot(1,-1,0) = (3,-3,0)$ is a solution ($3 + (-3) - 0 = 0$).
Therefore, the set of solutions of a homogeneous system of linear equations is always a vector space! And the homogeneous system itself is one of the possible descriptions of a vector space.
Now let us learn how, given one form of the description, to find another one.
Finding a basis from a set of vectors which spans the space
Given a set of vectors, we would like to find a linearly independent set which spans the same space. We have mentioned that row operations on a matrix do not change the set of possible linear combinations of the vectors written in its rows. Furthermore, notice that in an echelon form of a matrix the vectors in the non-zero rows are linearly independent. Indeed, to get the zero vector as a combination of them, we have to multiply the first one by zero (because of the first leading position), then the second one by zero (because of the second leading position), and so on. Thus all we need to do is write the given vectors in the rows of a matrix and compute its echelon form. The non-zero rows form a basis of the space.
For example, let us find a basis of the space spanned by the following vectors:
So the non-zero rows form a basis of the given space, and the space has dimension 2.
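A minimal sketch of this procedure with sympy; the spanning vectors in the rows are assumed examples:

```python
from sympy import Matrix

# Write the spanning vectors in rows and reduce; the non-zero rows of
# the echelon form are a basis of the spanned space.
spanning = Matrix([[1, 2, 0],
                   [0, 1, 1],
                   [1, 3, 1]])   # third row = first + second

reduced, _ = spanning.rref()
basis = [reduced.row(i) for i in range(reduced.rows) if any(reduced.row(i))]
print(basis)                     # two non-zero rows -> dimension 2
```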
Finding a basis of a space described by a system of equations
We start by finding a general solution of the given system in parametrized form, e.g.:
so the general solution is of the following form: every vector in the given space is a combination determined by the free parameters. Notice that every such vector is a linear combination of three vectors, which we calculate by substituting one for one parameter and zero for the others. It is also easy to see that those three vectors are linearly independent. So they form a basis of the space, and the space has dimension 3. Notice also that $3 = 5 - 2$, where 5 is the number of variables and 2 is the number of independent equations in the system.
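sympy can produce such a basis directly: `Matrix.nullspace()` returns one basis vector per free parameter. The homogeneous system below (two independent equations in five variables) is an assumed example:

```python
from sympy import Matrix

# Coefficient matrix of an assumed homogeneous system with 5 variables.
A = Matrix([[1, 0, 2, -1, 0],
            [0, 1, 1, 0, -3]])

basis = A.nullspace()   # one basis vector per free parameter
print(len(basis))       # 3 = 5 variables - 2 independent equations
for v in basis:
    print(v.T)          # each basis vector printed as a row
```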
Finding a system of equations describing a vector space given by a system of vectors which spans it (or a basis)
Assume that we study a vector space spanned by three given vectors in $\mathbb{R}^4$. Therefore our system of equations will have 4 variables. Suppose the system contains an equation of the form $ax + by + cz + dt = 0$. The three given vectors have to satisfy this equation, which gives a system of equations that must be fulfilled by the coefficients $a, b, c, d$ of the system we are looking for. A system of equations describing the coefficients of a system of equations may look a bit confusing, but bear with me. Let us write this system down:
And solve it:
So the general solution, which tells us the possible coefficients of the system we are looking for, can be read off as before. Notice that the system we are trying to find is not unique: many different systems of equations are equivalent. In fact, the set of possible coefficient tuples of equations satisfied by every vector of the given space is itself a vector space. We write down its basis using the calculated general solution. If we write two equations whose coefficients are those two basis vectors, then every equation satisfied by all vectors of our space is a linear combination of them. So the system we are looking for is the system with coefficients from the calculated basis, namely:
And we are done!
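In matrix terms, the coefficient tuples we just solved for are exactly the nullspace of the matrix whose rows are the spanning vectors. A minimal sympy sketch on assumed spanning vectors:

```python
from sympy import Matrix

# Rows are the (assumed) spanning vectors in R^4; a coefficient tuple
# (a, b, c, d) describes the space iff V * (a, b, c, d)^T = 0.
V = Matrix([[1, 0, 1, 0],
            [0, 1, 0, 1],
            [1, 1, 1, 1]])   # third vector = first + second

for coeffs in V.nullspace():
    print(coeffs.T)          # each row: one equation a*x+b*y+c*z+d*t = 0
```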
To multiply matrices, the first must have as many columns as the second has rows. The resulting matrix has as many rows as the first and as many columns as the second. We multiply the rows of the first matrix by the columns of the second: in the resulting matrix, the entry in the $i$-th row and $j$-th column is the product of the $i$-th row of the first matrix with the $j$-th column of the second, where by the product of a row and a column we mean the sum of the products of their corresponding entries. E.g.:
Matrix multiplication is associative, which means that for any matrices $A, B, C$ of compatible sizes, $(AB)C = A(BC)$. But it is not commutative! Notice that if the matrices are not square, it may even be impossible to multiply them in the reverse order; if they are square, the result may simply be different.
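The row-times-column rule written out as code, on assumed example matrices:

```python
# Entry (i, j) of the product is the i-th row of a times the j-th
# column of b, summed entrywise.
def matmul(a, b):
    assert len(a[0]) == len(b), "columns of a must match rows of b"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
print(matmul(a, b))   # [[2, 1], [4, 3]]
print(matmul(b, a))   # [[3, 4], [1, 2]] -- not commutative
```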
A matrix $B$ is inverse to a matrix $A$ if $AB = BA = I$, where $I$ is the identity matrix (the matrix with ones on the diagonal and zeros everywhere else). The inverse matrix is denoted by $A^{-1}$. Since $\det(AB) = \det A \cdot \det B$ and $\det I = 1$, we see that $\det A \cdot \det A^{-1} = 1$. This implies that only matrices with non-zero determinant can have inverses; therefore we call such matrices invertible.
How do we calculate the inverse of a given matrix? We have recently mentioned that the row operations leading to the reduced echelon form amount to multiplication by a matrix. Imagine that we transform the matrix $[A\,|\,I]$, consisting of the matrix $A$ together with the identity matrix, into the reduced echelon form. Since $A$ is a square matrix with non-zero determinant, we will get the identity matrix on the left-hand side: $[I\,|\,X]$. But notice that if $C$ is the matrix of the row operations, then $[CA\,|\,CI] = [I\,|\,X]$. Therefore $CA = I$ and $CI = X$. The first equation implies that $C = A^{-1}$. The second, that $X = C$. So after those operations we get the inverse matrix on the right-hand side!
E.g. let us calculate the inverse of the following matrix:
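The worked matrix is elided here; as a stand-in, a minimal sympy sketch of the $[A\,|\,I] \to [I\,|\,A^{-1}]$ procedure on an assumed $2\times 2$ matrix:

```python
from sympy import Matrix, eye

A = Matrix([[2, 1], [5, 3]])     # assumed example with det(A) = 1

augmented = A.row_join(eye(2))   # build [A | I]
reduced, _ = augmented.rref()    # row-reduce to [I | A^-1]
inverse = reduced[:, 2:]         # the right half is the inverse
print(inverse)                   # Matrix([[3, -1], [-5, 2]])
print(A * inverse == eye(2))     # True
```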
A linear map is a map $\varphi$ which sends vectors from a given space $V$ to vectors of another linear space $W$ ($\varphi\colon V \to W$) and satisfies the linearity condition: for all vectors $v, w$ and numbers $a, b$ we have $\varphi(av + bw) = a\varphi(v) + b\varphi(w)$. E.g. a rotation around $(0,0)$ is a linear map from the plane to itself: given two vectors and two scalars, we get the same vector regardless of whether we first rotate the vectors and then multiply by the numbers and add, or first multiply by the numbers and add, and then rotate.
Therefore, to prove that a given map is linear, we need to show that it satisfies the linearity condition for any two vectors and any two numbers: writing $v = (x_1, y_1)$ and $w = (x_2, y_2)$, we expand both sides of $\varphi(av + bw) = a\varphi(v) + b\varphi(w)$ and check that they agree.
Meanwhile, to show that a map is not linear, it suffices to find a counterexample: two vectors and two numbers (or even a single vector and a single number) for which the linearity condition fails.
Usually, linear maps will be given by their formulas, e.g. $\varphi\colon \mathbb{R}^2 \to \mathbb{R}^2$, $\varphi(x,y) = (y, -x)$. Then to see what $\varphi$ does to a given vector, e.g. $(1,2)$, we substitute it into the formula: $\varphi(1,2) = (2,-1)$. By the way, it is easy to see that this is simply the rotation around $(0,0)$ by 90 degrees clockwise. A less geometric example: a map $\psi$ given by a formula such as $\psi(x,y) = (x+2y,\ x-y,\ 3x)$, which maps the vector $(1,2)$ to the vector $(5,-1,3)$.
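A minimal sketch that spot-checks the linearity condition numerically for the clockwise rotation above:

```python
import numpy as np

# Check phi(a*v + b*w) == a*phi(v) + b*phi(w) on random inputs.
def phi(v):
    x, y = v
    return np.array([y, -x])   # rotation by 90 degrees clockwise

rng = np.random.default_rng(1)
v, w = rng.standard_normal(2), rng.standard_normal(2)
a, b = 3.0, -2.0
print(np.allclose(phi(a * v + b * w), a * phi(v) + b * phi(w)))  # True
```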
Sometimes we can define a linear map by giving only its values on the vectors of some basis; this suffices to determine the map. E.g. let $\varphi$ be given by its values on two vectors which constitute a basis of the plane. We can calculate the formula of this map: first find the coefficients of the standard basis vectors in the given basis (by solving a system of equations or by guessing), and then use linearity to compute $\varphi(1,0)$ and $\varphi(0,1)$, which yield the formula.
In the above example we can also work out $\varphi(1,0)$ and $\varphi(0,1)$ by writing down a matrix with the vectors of the given basis on the left-hand side and their values on the right-hand side. Row operations do not break the principle that each row contains a vector on the left and its value on the right, precisely because $\varphi$ is a linear map. So after transforming the matrix to the reduced echelon form, we will find $\varphi(1,0)$ and $\varphi(0,1)$ on the right-hand side:
so we obtain the same formula as before.
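A minimal sketch of this trick with sympy; the basis and the values are assumed examples (say $\varphi(1,1) = (2,0)$ and $\varphi(0,1) = (1,1)$):

```python
from sympy import Matrix

# Each row is [basis vector | its value]; after rref the rows read
# [standard basis vector | phi of it].
table = Matrix([[1, 1, 2, 0],
                [0, 1, 1, 1]])

reduced, _ = table.rref()
print(reduced)   # rows: [1, 0, 1, -1] and [0, 1, 1, 1]
# so phi(1,0) = (1,-1) and phi(0,1) = (1,1), i.e. phi(x,y) = (x+y, -x+y)
```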
Given two linear maps $\varphi, \psi\colon V \to W$ and a number $c$, their sum $\varphi + \psi$ and multiple $c\varphi$ are linear maps from $V$ to $W$. Obviously, we add values and multiply values by coefficients: $(\varphi+\psi)(v) = \varphi(v) + \psi(v)$ and $(c\varphi)(v) = c\,\varphi(v)$. So, e.g., the sum of two maps given by formulas is computed by adding the formulas coordinatewise.
This means that the set of all linear maps between fixed vector spaces $V$ and $W$ is itself a vector space. It is denoted by $L(V, W)$.
Kernel and image of a linear transformation
If $\varphi\colon V \to W$ is a linear transformation, then it is easy to see that
$$\ker\varphi = \{v \in V : \varphi(v) = 0\},$$
called the kernel of $\varphi$, and
$$\operatorname{im}\varphi = \{\varphi(v) : v \in V\},$$
called the image of $\varphi$, are linear subspaces of $V$ and $W$ respectively. The dimension of the image of $\varphi$ is also called the rank of $\varphi$ and denoted by $\operatorname{rk}\varphi$. Notice that if $v_1, \ldots, v_n$ spans $V$, then $\varphi(v_1), \ldots, \varphi(v_n)$ spans $\operatorname{im}\varphi$. Moreover, if $v_1, \ldots, v_k$ is a basis of $\ker\varphi$, extended to a basis $v_1, \ldots, v_n$ of $V$, then $\varphi(v_{k+1}), \ldots, \varphi(v_n)$ is a basis of $\operatorname{im}\varphi$. Therefore,
$$\dim\ker\varphi + \operatorname{rk}\varphi = \dim V.$$
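For a map given by a matrix $A$, the kernel is the nullspace of $A$ and the image is its column space, so sympy can verify the dimension count above; the matrix is an assumed example:

```python
from sympy import Matrix

# For the map v -> A*v from R^3 to R^2.
A = Matrix([[1, 2, 3],
            [2, 4, 6]])      # rank 1

kernel = A.nullspace()       # basis of ker
image = A.columnspace()      # basis of im
print(len(kernel), len(image))             # 2 1
print(len(kernel) + len(image) == A.cols)  # True: rank-nullity
```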
Matrix of a linear map
Matrices will be important for us for one more reason. It turns out that multiplying a matrix by a vector is actually the same as calculating the value of a linear map. Observe the following:
the product $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax+by \\ cx+dy \end{pmatrix}$ is exactly the same as the value of the map $\varphi(x,y) = (ax+by,\ cx+dy)$ on the vector $(x,y)$.
More precisely, given a linear map $\varphi\colon V \to W$, a basis $\mathcal{A}$ of $V$ and a basis $\mathcal{B}$ of $W$, the matrix which, multiplied from the right by the (vertical) vector of coordinates of a vector $v$ in basis $\mathcal{A}$, gives the coordinates of $\varphi(v)$ in basis $\mathcal{B}$, will be called the matrix of $\varphi$ in bases $\mathcal{A}$ and $\mathcal{B}$, denoted $M(\varphi)^{\mathcal{A}}_{\mathcal{B}}$.
In particular in the above example:
is the matrix of $\varphi$ in the standard bases, and it can easily be written out from the formula defining $\varphi$, simply by writing its coefficients in the rows.
Changing bases of a matrix — first method: calculate coordinates.
Assume that we are given a formula defining $\varphi$ as above (or its matrix in the standard bases), together with a basis $\mathcal{A}$ of three vectors in the domain and a basis $\mathcal{B}$ of two vectors in the codomain. We would like to calculate $M(\varphi)^{\mathcal{A}}_{\mathcal{B}}$.
Notice that if we multiply this matrix from the right by the (vertical) vector $(1,0,0)$, we simply get the first column of the matrix. On the other hand, the first vector of $\mathcal{A}$ has coordinates $1,0,0$ in this basis. Therefore the result of the multiplication is the vector of coordinates of the image of the first basis vector in basis $\mathcal{B}$, and this is the first column of $M(\varphi)^{\mathcal{A}}_{\mathcal{B}}$.
So we apply $\varphi$ to the first vector of $\mathcal{A}$ and find the coordinates of the resulting vector in $\mathcal{B}$: the coordinates are $4, 8$, and this is the first column of the matrix we want to calculate.
Let us find the second column. We do the same as before, but with the second vector from basis $\mathcal{A}$: its image has coordinates $1, 0$ in $\mathcal{B}$.
The third column: we compute the coordinates of the image of the third vector of $\mathcal{A}$ in $\mathcal{B}$, so:
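A minimal sketch of this first method; the map $\varphi$ and the bases $\mathcal{A}$, $\mathcal{B}$ are assumed examples (the ones in the text are elided):

```python
import numpy as np

# Column j of M_B^A(phi) holds the B-coordinates of phi applied to the
# j-th vector of the basis A.
def phi(v):                       # an assumed linear map R^3 -> R^2
    x, y, z = v
    return np.array([x + z, y - z])

A = [np.array([1.0, 0, 0]), np.array([1.0, 1, 0]), np.array([1.0, 1, 1])]
B = np.column_stack([(1.0, 0.0), (1.0, 1.0)])  # basis vectors as columns

columns = [np.linalg.solve(B, phi(a)) for a in A]
print(np.column_stack(columns))   # the matrix M_B^A(phi)
```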
Given two linear maps $\varphi\colon V \to W$ and $\psi\colon W \to U$, we can compose them and consider the map which transforms a vector from $V$ first via $\varphi$ and then the result via $\psi$, getting a vector from $U$.
Such a map is denoted by $\psi\circ\varphi$. Given formulas defining $\varphi$ and $\psi$, we can easily get the formula for $\psi\circ\varphi$ by substituting one formula into the other: $(\psi\circ\varphi)(v) = \psi(\varphi(v))$.
Now look at the matrices of these maps. If $\mathcal{A}$ is a basis of $V$, $\mathcal{B}$ is a basis of $W$ and $\mathcal{C}$ is a basis of $U$, then, given the coordinates of $v$ in $\mathcal{A}$, to get the coordinates of $\psi(\varphi(v))$ in $\mathcal{C}$ we first multiply them by the matrix $M(\varphi)^{\mathcal{A}}_{\mathcal{B}}$ (getting coordinates in $\mathcal{B}$) and then multiply the result by $M(\psi)^{\mathcal{B}}_{\mathcal{C}}$. Altogether we have multiplied the coordinates by $M(\psi)^{\mathcal{B}}_{\mathcal{C}}\, M(\varphi)^{\mathcal{A}}_{\mathcal{B}}$, which means that:
$$M(\psi\circ\varphi)^{\mathcal{A}}_{\mathcal{C}} = M(\psi)^{\mathcal{B}}_{\mathcal{C}} \cdot M(\varphi)^{\mathcal{A}}_{\mathcal{B}}.$$
Notice which bases have to agree in this formula!
In particular in our example:
which is consistent with the formula we have calculated before.
There is a special linear map, called the identity, which does nothing: $\mathrm{id}(v) = v$; for example, in the plane it is $\mathrm{id}(x,y) = (x,y)$. Given two bases $\mathcal{A}$ and $\mathcal{B}$, consider the matrix $M(\mathrm{id})^{\mathcal{A}}_{\mathcal{B}}$: if we multiply it from the right by the coordinates of a vector $v$ in basis $\mathcal{A}$, we get the coordinates of the same vector $v$ (since $\mathrm{id}(v) = v$), but in basis $\mathcal{B}$. So the matrix $M(\mathrm{id})^{\mathcal{A}}_{\mathcal{B}}$ changes the basis from $\mathcal{A}$ to $\mathcal{B}$.
In particular, we will need matrices changing coordinates from the standard basis to a given one, and from a given basis to the standard one. Let us see how to calculate them.
It is easy to calculate the change-of-coordinates matrix from a given basis to the standard basis. We will find $M(\mathrm{id})^{\mathcal{A}}_{st}$ (the basis $\mathcal{A}$ is defined in the example above). Multiplying it from the right by $(1,0,0)$ gives its first column. On the other hand, the first vector of $\mathcal{A}$ has coordinates $1,0,0$ in $\mathcal{A}$, so the result of the multiplication is the coordinates of this vector in the standard basis, i.e. simply the vector itself. So the $i$-th column of this matrix is just the $i$-th vector of the basis. Therefore:
Now the other case: from the standard basis to a given basis. We will calculate $M(\mathrm{id})^{st}_{\mathcal{B}}$. It is easy to see that in the columns we should put the coordinates of the standard basis vectors in the given basis. Let us calculate them. Therefore:
Changing the basis of a matrix — the second method: multiplication by a change-of-coordinates matrix
We have a new tool for changing the bases of the matrix of a linear map. Because $\varphi = \mathrm{id}\circ\varphi\circ\mathrm{id}$, we have that:
$$M(\varphi)^{\mathcal{A}}_{\mathcal{B}} = M(\mathrm{id})^{st}_{\mathcal{B}} \cdot M(\varphi)^{st}_{st} \cdot M(\mathrm{id})^{\mathcal{A}}_{st},$$
and in this way, given the matrix of a linear map in the standard bases, we can calculate its matrix in bases from $\mathcal{A}$ to $\mathcal{B}$. In our example:
Obviously we get the same result as calculated by the first method.
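For comparison, a sketch of the second method on the same assumed $\varphi$, $\mathcal{A}$ and $\mathcal{B}$ as in the earlier first-method sketch; the two computations agree:

```python
import numpy as np

# M_B^A(phi) = M(id)_B^st @ M(phi)_st^st @ M(id)_st^A.
M_phi_std = np.array([[1.0, 0.0,  1.0],
                      [0.0, 1.0, -1.0]])  # read off phi(x,y,z) = (x+z, y-z)

# From A to standard: columns are the basis vectors of A.
M_id_A_to_std = np.column_stack([(1.0, 0, 0), (1.0, 1, 0), (1.0, 1, 1)])
# From standard to B: invert the matrix whose columns are B's vectors.
M_id_std_to_B = np.linalg.inv(np.column_stack([(1.0, 0.0), (1.0, 1.0)]))

print(M_id_std_to_B @ M_phi_std @ M_id_A_to_std)  # same as method one
```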