
6. Euclidean spaces

Part 1.: problems.

Positive and negative definite forms over \mathbb{R}

Given a real bilinear form h\colon V\times V\to \mathbb{R}, we can classify it according to the possible signs of the values h(v,v):

  • form h is positive definite, if for all non-zero v\in V, we get h(v,v)>0.
  • form h is negative definite, if for all non-zero v\in V, we get h(v,v)<0.
  • form h is positive semidefinite, if for all v\in V, we get h(v,v)\geq 0.
  • form h is negative semidefinite, if for all v\in V, we get h(v,v)\leq 0.

Obviously a form may fall into none of those categories, if for some v,w\in V we have h(v,v)>0 and h(w,w)<0. Such forms are called indefinite.

Notice that a form is positive definite if and only if r(h)=s(h)=\dim V, and negative definite if and only if r(h)=-s(h)=\dim V, i.e. if the entries on the diagonal of a congruent diagonal matrix are all strictly positive (positive definite) or all strictly negative (negative definite), respectively. To find out whether a form is positive or negative definite or semidefinite, we may find an orthogonal basis and look at the matrix of the form in this basis.

Alternatively, we may complete the squares in the formula of the form, making sure to first collect all terms containing the first variable, then all remaining terms with the second one, and so on.



we get



where x'=x+y-2z, y'=y-z and z'=z, so the form is indefinite. The basis \mathcal{A} in which this formula is expressed is (1,0,0),(-1,1,0),(1,1,1), because


Sylvester’s criterion

Since the sign of the determinant is preserved under congruence of matrices, we obtain the so-called Sylvester's criterion, which determines whether a form is positive definite or negative definite. Notice that it tells nothing about the semidefinite cases!

How does it work? We study the determinants of the leading principal minors: let A_k be the k\times k matrix in the upper left corner of the matrix of the form we study, and let n\times n be the size of the matrix of this form. Sylvester's criterion consists of the two following facts:

  • if for every k\leq n we have \det A_k > 0, the form is positive definite,
  • if for every k\leq n we have \det A_k > 0 for even k and \det A_k<0 for odd k, then the form is negative definite.

E.g. let h be such that its matrix is \left[\begin{array}{cc}2&-1\\-1&1\end{array}\right], so A_1=[2] and A_2=\left[\begin{array}{cc}2&-1\\-1&1\end{array}\right], therefore \det A_1=2>0 and \det A_2=2-1=1>0, so the form h is positive definite.

E.g., let now h be such that its matrix is \left[\begin{array}{ccc}-1&-2&0\\-2&-5&0\\0&0&-1\end{array}\right], so A_1=[-1] and A_2=\left[\begin{array}{cc}-1&-2\\-2&-5\end{array}\right], also A_3=\left[\begin{array}{ccc}-1&-2&0\\-2&-5&0\\0&0&-1\end{array}\right], therefore \det A_1=-1<0 and \det A_2=5-4=1>0 and \det A_3=-5+4=-1<0, so the form h is negative definite.

Finally let h be such that its matrix is \left[\begin{array}{cc}-2&-1\\-1&1\end{array}\right], so A_1=[-2] and A_2=\left[\begin{array}{cc}-2&-1\\-1&1\end{array}\right], therefore \det A_1=-2<0 and \det A_2=-2-1=-3<0, so the form h is neither positive definite nor negative definite.
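The criterion is easy to automate. Below is a minimal sketch in Python with NumPy; the function name `definiteness` is ours, not standard terminology:

```python
import numpy as np

def definiteness(A):
    """Classify a symmetric matrix via Sylvester's criterion.

    Returns 'positive definite', 'negative definite', or 'inconclusive'
    (the criterion says nothing in the remaining cases, e.g. for
    semidefinite forms)."""
    n = A.shape[0]
    # determinants of the leading principal minors A_1, ..., A_n
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    if all(d > 0 for d in minors):
        return 'positive definite'
    # alternating signs, starting with a negative 1x1 minor
    if all((d < 0 if k % 2 == 1 else d > 0) for k, d in enumerate(minors, start=1)):
        return 'negative definite'
    return 'inconclusive'
```

Running it on the three example matrices above reproduces the three verdicts: positive definite, negative definite, and neither.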

Scalar product

A symmetric positive definite bilinear form over \mathbb{R} is called a scalar product. Scalar products enable us to define angles between vectors and lengths of vectors.

Summing up the definitions, a scalar product is a function mapping vectors \alpha,\beta to \left<\alpha,\beta\right>, such that:

  • \langle au+bv,w\rangle=a\langle u,w\rangle +b\langle v,w\rangle,
  • \langle u,v\rangle=\langle v,u\rangle,
  • for every non-zero u, \langle u,u\rangle>0.

The standard scalar product in \mathbb{R}^n is the sum of products of corresponding coordinates, so e.g.: \left<(1,2,-1),(2,0,1)\right>=1\cdot 2+2\cdot 0+(-1)\cdot 1=2+0-1=1.

Euclidean space

A linear space over \mathbb{R} with a scalar product is called a Euclidean space.

Length of a vector and angles between vectors

By the Pythagorean Theorem it is easy to see (for the standard product) that \left<\alpha,\alpha\right> is the square of the length of a vector, e.g. for \alpha=(3,4) we get \left<(3,4),(3,4)\right>=9+16=25=\|\alpha\|^2. The length of a vector \alpha, also called the norm of \alpha, will be denoted by \|\alpha\|. This motivates the definition:

    \[\|\alpha\|=\sqrt{\left<\alpha,\alpha\right>}.\]
Assume now that we are given three vectors p, q and r forming a triangle, so r=p-q. Let \theta be the angle between p and q. The law of cosines states that:

    \[\|r\|^2=\|p\|^2+\|q\|^2-2\|p\|\|q\|\cos\theta.\]

On the other hand,

    \[\|r\|^2=\left<p-q,p-q\right>=\|p\|^2+\|q\|^2-2\left<p,q\right>.\]

Comparing the two expressions we get \left<p,q\right>=\|p\|\|q\|\cos\theta. So the cosine of an angle between vectors is defined by the following formula:

    \[\cos\theta=\frac{\left <p,q\right>}{\|p\|\|q\|}.\]

One more application of scalar product is calculating perpendicular projection of a vector onto a direction given by a second vector. Let r be the perpendicular projection of p onto direction given by q. It will have the same direction as q and length \|p\|\cos \theta, where \theta is the angle between p and q. Therefore:

    \[r=\|p\|\cos \theta \cdot \frac{q}{\|q\|}=q\cdot \frac{\left<p,q\right>}{\left<q,q\right>},\]

because \frac{q}{\|q\|} is the vector of length 1 in the direction of q.
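The projection formula above translates directly into code. A minimal sketch with NumPy (the function name `project` is ours):

```python
import numpy as np

def project(p, q):
    """Perpendicular projection of p onto the direction given by q
    (standard scalar product): q * <p,q> / <q,q>."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return q * (p @ q) / (q @ q)
```

As a sanity check, the residual p - project(p, q) is always perpendicular to q.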

Properties of norm and product

It is not very hard to prove the following facts:

  • \|u+v\|^2=\|u\|^2+\|v\|^2+2\langle u,v\rangle,
  • \|au\|=|a|\|u\|,
  • \|u\|=0 if and only if u is the zero vector,
  • |\langle u,v\rangle|\leq \|u\|\|v\| (Schwarz inequality),
  • \|u+v\|\leq \|u\|+\|v\| (triangle inequality),
  • if \langle u,v\rangle=0, then \|u+v\|^2=\|u\|^2+\|v\|^2 (Pythagorean Theorem).

Perpendicularity, perpendicular spaces

We know that two vectors v,w are perpendicular if the cosine of the angle between them equals zero. Therefore v\bot w if and only if \left<v,w\right>=0.

Notice that if we would like to find all vectors w perpendicular to v, then the above is the equation we have to solve. Moreover, this is a homogeneous linear equation. If we would like to find the set of vectors perpendicular to all vectors from a given list, then we get a system of homogeneous linear equations. So given a linear subspace V, the set V^\bot (called the orthogonal complement of V) of all vectors perpendicular to all the vectors from V is also a linear subspace! It is the space of solutions of some system of linear equations.

For example, let V=lin((1,1,0,-1),(-1,0,2,0)) in the space with the standard product. A vector (x,y,z,t) is perpendicular to those vectors (and so also to every vector of V), if \left<(1,1,0,-1), (x,y,z,t)\right>=0 and \left<(-1,0,2,0), (x,y,z,t)\right>=0, in other words, if it satisfies the following system of equations:

    \[\begin{cases}x+y-t=0\\-x+2z=0\end{cases}\]

We solve it:

    \[\left[\begin{array}{cccc|c}1&1&0&-1&0\\-1&0&2&0&0\end{array}\right]\underrightarrow{w_2+w_1} \left[\begin{array}{cccc|c}1&1&0&-1&0\\0&1&2&-1&0\end{array}\right]\underrightarrow{w_1-w_2} \left[\begin{array}{cccc|c}1&0&-2&0&0\\0&1&2&-1&0\end{array}\right]\]

So the general solution has the following form: (2z,-2z+t,z,t), and therefore we have the following basis of V^\bot: (2,-2,1,0),(0,1,0,1).
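This computation can also be done numerically. A sketch using NumPy's SVD (the helper name `orthogonal_complement` is ours): the null space of the matrix whose rows span V is exactly V^\bot.

```python
import numpy as np

# rows span V; V-perp is the null space of this matrix
V_rows = np.array([[ 1., 1., 0., -1.],
                   [-1., 0., 2.,  0.]])

def orthogonal_complement(rows):
    """Basis of the orthogonal complement of the row space, via SVD:
    the right-singular vectors beyond the rank span the null space."""
    _, s, vt = np.linalg.svd(rows)
    rank = int((s > 1e-10).sum())
    return vt[rank:]        # each returned row is a basis vector of V-perp

perp = orthogonal_complement(V_rows)
```

The SVD returns an orthonormal basis of V^\bot, which differs from the hand-computed basis (2,-2,1,0),(0,1,0,1) only by a change of basis of the same subspace.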

Notice that the coefficients in a system of equations describing a given linear space are vectors which span the perpendicular space! This gives a new insight into our method of finding a system of equations for a space given by its spanning vectors.

Orthogonal and orthonormal bases

A basis of a space will be called orthogonal, if every pair of vectors in it are perpendicular. E.g. (1,0,1),(-1,1,1),(-1,-2,1) is an orthogonal basis of \mathbb{R}^3 with standard product — indeed it is easy to check, that all pairs are perpendicular, e.g. \left<(-1,1,1),(-1,-2,1)\right>=1-2+1=0.

We will learn how to find such a basis later on. For now let us notice that calculating the coordinates of a vector in such a basis is fairly simple. Actually, we are calculating its projections onto the vectors from the basis. Given a vector v, its n-th coordinate in an orthogonal basis whose n-th vector is b_n is simply \frac{\left<v,b_n\right>}{\left<b_n,b_n\right>}. Therefore, the coordinates of (1,0,-2) in the above exemplary basis are: \frac{\left<(1,0,-2),(1,0,1)\right>}{\left<(1,0,1),(1,0,1)\right>}=\frac{-1}{2}, \frac{-3}{3}=-1 and \frac{-3}{6}=\frac{-1}{2}.

An orthonormal basis is an orthogonal basis in which additionally all the vectors have length 1. Given an orthogonal basis we can simply divide each of the vectors by its length to get an orthonormal basis. So \frac{(1,0,1)}{\sqrt{2}}, \frac{(-1,1,1)}{\sqrt{3}}, \frac{(-1,-2,1)}{\sqrt{6}} is an orthonormal basis of \mathbb{R}^3. Notice that calculating the coordinates of a vector in an orthonormal basis is even simpler. Since \left<b_n,b_n\right>=1, we get that the n-th coordinate of v is simply \left<v,b_n\right>, where b_n is the n-th vector of the basis. Therefore the coordinates of (1,0,-2) in the basis in the example are: \frac{-1}{\sqrt{2}}, \frac{-3}{\sqrt{3}} and \frac{-3}{\sqrt{6}}.
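The coordinate computation above can be sketched in a few lines of Python (the function name `coords` is ours); it reproduces the coordinates -1/2, -1, -1/2 of (1,0,-2) in the orthogonal basis from the example:

```python
import numpy as np

def coords(v, basis):
    """Coordinates of v in an orthogonal basis:
    the n-th coordinate is <v, b_n> / <b_n, b_n>."""
    v = np.asarray(v, float)
    return np.array([(v @ b) / (b @ b) for b in basis])

B = [np.array([1., 0., 1.]),
     np.array([-1., 1., 1.]),
     np.array([-1., -2., 1.])]
c = coords([1, 0, -2], B)
```

Summing the basis vectors with these coefficients recovers the original vector, which is a quick way to check the result.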

Projection onto a linear subspace

We already know how to calculate the projection of a vector onto a line defined by a given vector. To calculate the projection onto a higher-dimensional subspace V we have to find an orthogonal basis of V, calculate the projections onto the directions defined by each of the vectors from this basis, and sum those projections.

Notice also that if r is the projection of a vector v onto V, and r' is its projection onto V^{\bot}, then r=v-r'. This can be nicely used when calculating the projection of a vector onto a plane in three-dimensional space. Indeed, if V is a plane in \mathbb{R}^3, then the perpendicular space is a line given by a vector. Denote this vector by n. Then the projection of v onto the plane V is:

    \[r=v-n\cdot\frac{\left<v,n\right>}{\left<n,n\right>}.\]
Orthogonal reflection across a linear subspace

We can use our ability to calculate the projection r of v onto V to easily calculate its image v' under the reflection across V: since v'-r=-(v-r), we get

    \[v'=2r-v.\]

Gram-Schmidt orthogonalization

Assume that we are given a space spanned by some vectors, e.g. lin((1,0,1,0),(0,1,-1,1),(0,0,0,1))=lin(v_1,v_2,v_3). We would like to find an orthogonal basis of this space. The method of finding such a basis is called Gram-Schmidt orthogonalization. The idea is to take as the first vector of the new basis the first vector from the original basis:

    \[w_1=v_1.\]
The second vector has to be perpendicular to the first one, so we take the second vector and subtract its projection onto the first one:

    \[w_2=v_2-w_1\cdot\frac{\left<v_2,w_1\right>}{\left<w_1,w_1\right>},\]
so only the perpendicular "part" is left. In the case of the third vector we need to subtract the projections onto both already constructed vectors:

    \[w_3=v_3-w_1\cdot\frac{\left<v_3,w_1\right>}{\left<w_1,w_1\right>}-w_2\cdot\frac{\left<v_3,w_2\right>}{\left<w_2,w_2\right>}.\]
In the case of more dimensions we will continue this procedure further.

In our case, in the space with the standard product:

    \[w_1=(1,0,1,0),\]

    \[w_2=(0,1,-1,1)-(1,0,1,0)\cdot\frac{-1}{2}=\left(\frac{1}{2},1,-\frac{1}{2},1\right),\]

    \[w_3=(0,0,0,1)-(1,0,1,0)\cdot\frac{0}{2}-\left(\frac{1}{2},1,-\frac{1}{2},1\right)\cdot\frac{2}{5}=\left(-\frac{1}{5},-\frac{2}{5},\frac{1}{5},\frac{3}{5}\right).\]

So the basis we were looking for is (1,0,1,0),(1,2,-1,2),(-1,-2,1,3) (we can drop the fractions, since multiplication by a number changes neither the directions nor the angles).
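The procedure is short enough to sketch in full. A minimal Python implementation (the function name `gram_schmidt` is ours), run on the example above:

```python
import numpy as np

def gram_schmidt(vectors):
    """Gram-Schmidt: from each vector subtract its projections onto
    the already constructed orthogonal vectors."""
    basis = []
    for v in np.asarray(vectors, float):
        w = v - sum(b * (v @ b) / (b @ b) for b in basis)
        basis.append(w)
    return basis

w1, w2, w3 = gram_schmidt([(1, 0, 1, 0), (0, 1, -1, 1), (0, 0, 0, 1)])
```

Up to scaling, the result matches the hand computation: 2*w2 is (1,2,-1,2) and 5*w3 is (-1,-2,1,3).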

Linear maps: projection onto a linear subspace and reflection across a linear subspace

Notice that the projection onto a linear subspace V and the reflection across V are linear maps. Moreover, it is easy to describe their eigenvectors:

  • since projection does not change vectors in V, they are eigenvectors with eigenvalue 1. On the other hand, vectors from V^\bot are multiplied by zero, so they are eigenvectors with eigenvalue zero.
  • since reflection does not change vectors in V, they are eigenvectors with eigenvalue 1. On the other hand, vectors from V^\bot are multiplied by -1, so they are eigenvectors with eigenvalue -1.

So a basis consisting of vectors from a basis of V and vectors from a basis of V^\bot is a basis of eigenvectors of both those maps, which makes it possible to calculate their formulas.

E.g. let V=lin((1,0,1),(0,1,-1)) in the space with the standard product. Then a basis of V^\bot is \{(-1,1,1)\}. So if \phi is the projection onto V, and \psi is the reflection across V, then (1,0,1),(0,1,-1) are eigenvectors with eigenvalue 1 of both maps. Also (-1,1,1) is an eigenvector with eigenvalue zero for \phi, and -1 for \psi. Therefore \mathcal{A}=\{(1,0,1),(0,1,-1),(-1,1,1)\} is a basis of eigenvectors of both maps, and:

    \[M(\phi)_{\mathcal{A}}^{\mathcal{A}}=\left[\begin{array}{ccc}1&0&0\\0&1&0\\0&0&0\end{array}\right],\quad M(\psi)_{\mathcal{A}}^{\mathcal{A}}=\left[\begin{array}{ccc}1&0&0\\0&1&0\\0&0&-1\end{array}\right].\]
Let us calculate their formulas. Since n=(-1,1,1) spans V^\bot, we have:

    \[\phi((x,y,z))=(x,y,z)-n\cdot\frac{\left<(x,y,z),n\right>}{\left<n,n\right>}=(x,y,z)-\frac{-x+y+z}{3}\cdot(-1,1,1)=\]

    \[=\left(\frac{2x+y+z}{3},\frac{x+2y-z}{3},\frac{x-y+2z}{3}\right),\]

and since \psi((x,y,z))=2\phi((x,y,z))-(x,y,z),

    \[\psi((x,y,z))=\left(\frac{x+2y+2z}{3},\frac{2x+y-2z}{3},\frac{2x-2y+z}{3}\right).\]
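For the example with V=lin((1,0,1),(0,1,-1)) and (-1,1,1) spanning V^\bot, both maps can be sketched directly from the projection formula (function names `phi` and `psi` are ours), and the eigenvector claims checked numerically:

```python
import numpy as np

n = np.array([-1., 1., 1.])      # spans the orthogonal complement of V

def phi(v):
    """Projection onto V: subtract the component along n."""
    v = np.asarray(v, float)
    return v - n * (v @ n) / (n @ n)

def psi(v):
    """Reflection across V: go twice as far as the projection."""
    return 2 * phi(v) - np.asarray(v, float)
```

Vectors of V are fixed by both maps, while n is sent to zero by the projection and to -n by the reflection.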
Gram’s determinant

Given a system of vectors v_1,\ldots, v_k in a Euclidean space, the matrix G(v_1,\ldots , v_k)=[\langle v_i,v_j\rangle]_{i,j\leq k}\in M_{k\times k}(\mathbb{R}) is called the Gram matrix of this system of vectors, and its determinant is called the Gram determinant W(v_1,\ldots, v_k). One can immediately notice that the Gram matrix is symmetric.

Notice in particular that if the columns of a matrix A contain the coordinates of those vectors in an orthonormal basis, then G(v_1,\ldots , v_k)=A^TA. Thus, if the number of vectors k equals the dimension of the whole space (i.e. if A is a square matrix), then W(v_1,\ldots , v_k)=(\det A)^2.

In particular, always W(v_1,\ldots , v_k)\geq 0, and it is equal to zero if and only if the system of vectors is linearly dependent.
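Both observations are easy to check numerically. A sketch with NumPy (the function name `gram` and the example vectors are ours):

```python
import numpy as np

def gram(vectors):
    """Gram matrix [<v_i, v_j>] for the standard product; with the
    coordinates as columns of A this is exactly A^T A."""
    A = np.asarray(vectors, float).T      # vectors as columns
    return A.T @ A
```

For an independent square system the Gram determinant is the square of det A (hence non-negative), and for a dependent system it vanishes.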

4. Basic linear algebra

Part 1.: Problems, solutions.
Part 2.: Problems, solutions.
Part 3.: Problems.
Part 4.: Problems.

A system of k linear equations with variables x_{1}, x_{2},\ldots, x_{n} is simply a set of k equations, in which the variables appear without any powers or multiplication between them:

    \[\begin{cases} a_{1,1}x_1+a_{2,1}x_2+\ldots+a_{n,1}x_n=b_1\\ a_{1,2}x_1+a_{2,2}x_2+\ldots+a_{n,2}x_n=b_2\\ \ldots\\ a_{1,k}x_1+a_{2,k}x_2+\ldots+a_{n,k}x_n=b_k \end{cases},\]

where a_{i,j},b_j are some real numbers. E.g. the following is a system of 3 linear equations with four variables:

    \[\begin{cases}x_{1}+3x_{2}+x_{3}+5x_{4}=2\\ 2x_{1}+7x_{2}+9x_{3}+2x_{4}=4\\ 4x_{1}+13x_{2}+11x_{3}+12x_{4}=8\end{cases}\]

A solution of such a system of linear equations is a tuple of four numbers (four being the number of variables in this case) which satisfies all the equations in the system (after substituting those numbers for the variables), e.g. (2,0,0,0) in our case. But a system can have more than one solution. For example, (22,-7,1,0) is also a solution of the above system. More precisely, a system of linear equations can have 0, 1, or infinitely many solutions.

A general solution of a system of linear equations is just a description of the set of all its solutions (which sometimes may mean saying that there are no solutions at all, or pointing out the only one). How to find a general solution? We will use the so-called Gaussian elimination method. Less formally, we will call it transforming our system of equations into an echelon form.

The first step is to write down the matrix of the given system of equations. A matrix is simply a rectangular table of numbers. Matrices will play an increasingly important role in this course, but for now we can think of the matrix of a system of equations as an abbreviated notation for the system itself. We simply write down the coefficients, separating the column of free coefficients with a line:

    \[\left[\begin{array}{cccc|c}1 & 3 & 1 & 5 & 2\\ 2 & 7 & 9 & 2 & 4\\ 4 & 13 & 11 & 12 & 8\end{array}\right]\]

Next we make some operations on this matrix. Those operations simplify the matrix, but we shall make sure that they do not change the set of solutions of the corresponding system of equations. Therefore only three types of operations are permitted:

  • subtracting from a row another row multiplied by a number (corresponds to subtracting one equation from another),
  • swapping two rows (corresponds to swapping two equations),
  • multiplying a row by a non-zero number (corresponds to multiplying both sides of an equation by a non-zero number).

Our aim is to achieve a staircase-like echelon form of the matrix (meaning that you can draw a staircase through the matrix with only zeros below it). It means that each corresponding equation contains fewer variables than the previous one.

What should we do to achieve such a form? The best method is to generate the echelon from the left side using the first row. Under the 1 in the upper left corner we would like to have zeros. To achieve this, we subtract the first row multiplied by 2 from the second one, and the first row multiplied by 4 from the third one. So:

    \[\underrightarrow{w_{2}-2w_{1}, w_{3}-4w_{1}} \left[\begin{array}{cccc|c}1 & 3 & 1 & 5 & 2\\ 0 & 1 & 7 & -8 & 0\\ 0 & 1 & 7 & -8 & 0\end{array}\right]\]

Now we would like to have a zero under the 1 in the second row. Therefore we subtract the second row from the third one:

    \[\underrightarrow{w_{3}-w_{2}} \left[\begin{array}{cccc|c}1 & 3 & 1 & 5 & 2\\ 0 & 1 & 7 & -8 & 0\\ 0 & 0 & 0 & 0 & 0\end{array}\right]\]

And so we have achieved an echelon form! We have a leading coefficient in the first row on the first variable and a leading coefficient in the second row on the second variable.

The next step is to reduce the matrix (transform it into a reduced echelon form). We would like to have zeros also above the leading coefficients. In our case we only need to do something with the 3 in the second place of the first row. To get a zero there we subtract the second row multiplied by 3 from the first one:

    \[\underrightarrow{w_{1}-3w_{2}} \left[\begin{array}{cccc|c}1 & 0 & -20 & 29 & 2\\ 0 & 1 & 7 & -8 & 0\\ 0 & 0 & 0 & 0 & 0\end{array}\right],\]

We also need to make sure that the leading coefficients are all equal to one (sometimes we have to multiply a row by a fraction), but in our case this is already done. We have achieved a reduced echelon form. To get the general solution it suffices to write down the corresponding equations, moving the free variables to the right-hand side:

    \[\begin{cases}x_{1}=2+20x_{3}-29 x_{4}\\ x_{2}=-7x_{3}+8x_{4} \end{cases}\]

We can also write it in parametrized form by substituting everything we know into a vector (x_1,x_2,x_3,x_4), so: \{(2+20x_{3}-29 x_{4}, -7x_{3}+8x_{4}, x_{3},x_{4})\colon x_3,x_4\in\mathbb{R}\}. Notice that substituting any real numbers for x_3 and x_4 we get a tuple (a sequence of four numbers) which is a solution of our system of equations (so it has infinitely many solutions). E.g. substituting x_3=1,x_4=0 we get the solution (22,-7,1,0) mentioned above.
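The general solution can be verified mechanically: every choice of the free variables must satisfy the original system. A sketch with NumPy (the helper name `solution` is ours):

```python
import numpy as np

# the system from the example, as coefficient matrix and right-hand side
A = np.array([[1., 3., 1., 5.],
              [2., 7., 9., 2.],
              [4., 13., 11., 12.]])
b = np.array([2., 4., 8.])

def solution(x3, x4):
    """The general solution found above, parametrized by the free variables."""
    return np.array([2 + 20 * x3 - 29 * x4, -7 * x3 + 8 * x4, x3, x4])
```

Plugging in any values of the parameters, A @ solution(x3, x4) should always give back b.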

Finally let us introduce some terminology. A system of linear equations is:

  • homogeneous if all constant terms are zero,
  • inconsistent if it has no solutions.

For example the following system:

    \[\begin{cases} x+y=2\\x-2y=-1\end{cases}\]

is not homogeneous and has exactly one solution, so it is not inconsistent. Meanwhile:

    \[\begin{cases} x+y+z=0\\x-y-z=0\\2x+y=0\end{cases}\]

is homogeneous and has exactly one solution, so it is not inconsistent. On the other hand, the following system of equations is inconsistent:

    \[\begin{cases} x+y+z=0\\-2x-2y-2z=-1\end{cases}\]

Note that a system of linear equations can have exactly one, infinitely many or no solutions.


In many applications we will use the notion of determinant of a matrix. The determinant of a matrix makes sense for square matrices only and is defined recursively:

  • \det[a]=a
  •     \[\det \left[\begin{array}{cccc}a_{1,1}&a_{1,2}&\ldots&a_{1,n}\\a_{2,1}&a_{2,2}&\ldots&a_{2,n}\\\ldots&\ldots&&\ldots\\ a_{n,1}&a_{n,2}&\ldots&a_{n,n}\end{array}\right]=\]

        \[=a_{1,1}\det A_{1,1}-a_{1,2}\det A_{1,2}+a_{1,3}\det A_{1,3}-\ldots\pm a_{1,n}\det A_{1,n},\]

where A_{i,j} is the matrix A with the i-th row and j-th column crossed out. So (the determinant is denoted by \det or by absolute-value-style brackets around a matrix):

    \[\det\left[\begin{array}{cc}a&b\\c&d\end{array}\right]=ad-bc,\]

    \[\det\left[\begin{array}{ccc}a&b&c\\d&e&f\\g&h&i\end{array}\right]=a\det\left[\begin{array}{cc}e&f\\h&i\end{array}\right]-b\det\left[\begin{array}{cc}d&f\\g&i\end{array}\right]+c\det\left[\begin{array}{cc}d&e\\g&h\end{array}\right].\]
And so on. E.g.:



    \[=1\cdot 0-0+2\cdot 19-0=38\]
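The recursive definition translates almost verbatim into code. A minimal sketch in plain Python (the function name `det` is ours; for real work one would use a library routine, since cofactor expansion is exponential in n):

```python
def det(M):
    """Determinant by cofactor expansion along the first row,
    exactly as in the recursive definition."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # cross out row 1 and column j+1
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total
```

It reproduces the 2x2 formula ad - bc and the determinants computed elsewhere in these notes.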

Laplace expansion

The above definition is only a special case of a more general fact called Laplace expansion. Instead of using the first row we can use any row or column (choose always the one with most zeros). So:

    \[\det \left[\begin{array}{cccc}a_{1,1}&a_{1,2}&\ldots&a_{1,n}\\a_{2,1}&a_{2,2}&\ldots&a_{2,n}\\\ldots&\ldots&&\ldots\\ a_{n,1}&a_{n,2}&\ldots&a_{n,n}\end{array}\right]=\]

    \[=(-1)^{i+1}\left(a_{i,1}\det A_{i,1}-a_{i,2}\det A_{i,2}+a_{i,3}\det A_{i,3}-\ldots\pm a_{i,n}\det A_{i,n}\right),\]

for any row w_i. An analogous fact is true for any column.

E.g. for the below matrix it is easiest to use the third column:

    \[\left|\begin{array}{cccc}1&1&0&-1\\2&0&-1&-1\\3&-1&0&1\\0&1&0&-2\end{array}\right|=(-1)^{2+3}\cdot(-1)\cdot \left|\begin{array}{ccc}1&1&-1\\3&-1&1\\0&1&-2\end{array}\right|=\left|\begin{array}{ccc}1&1&-1\\3&-1&1\\0&1&-2\end{array}\right|=4\]

Determinant and operations on a matrix

Notice first that from the Laplace expansion we easily get that if a matrix has a row (or column) of zeros, its determinant equals zero.

Consider now the different operations on the rows of a matrix which we use to compute its echelon form. Using the Laplace expansion we can prove that swapping two rows multiplies the determinant by -1: indeed, calculating the determinant along the first column we see that the signs in the sum may change, but also the rows in the minor matrices get swapped.

Immediately we can notice that multiplying a row by a number multiplies the determinant by this number as well: you can see it easily by computing the Laplace expansion along this row.

Therefore multiplying the whole matrix by a number multiplies the determinant by this number n times; precisely:

    \[\det (aA)=a^n \det A, \]

where A is a matrix of size n\times n.

Notice also that the determinant of a matrix with two identical rows equals zero: swapping those rows does not change the matrix but multiplies the determinant by -1, so \det A=-\det A, and therefore \det A=0. Consequently, by the row multiplication rule, if two rows of a matrix are linearly dependent, then its determinant equals 0.

The Laplace expansion also implies that if matrices A, B, C differ only in the i-th row, in such a way that the i-th row of C is the sum of the i-th rows of A and B, then the determinant of C is the sum of the determinants of A and B.
But it can be easily seen that in general \det(A+B)\neq \det A+\det B!

Finally, consider the most important operation: adding to a row another row multiplied by a number. Here we are exactly in the situation described above. Let A be the original matrix, let B be the matrix A with the row we add to replaced by the other row multiplied by the number, and let C be the resulting matrix. Then C differs from A and B only in the row we add to, which in C is the sum of the corresponding rows of A and B. Therefore \det C=\det A+\det B, but B has two linearly dependent rows, so \det B=0 and \det C=\det A. Hence the operation of adding to a row another row multiplied by a number does not change the determinant of a matrix.

Calculating the determinant via triangular form of matrix

If you look closely, you will see that the Laplace expansion also implies that the determinant of a matrix in echelon form (usually called triangular for square matrices) equals the product of the elements on its diagonal, so e.g.:

    \[\left|\begin{array}{ccc}-1&3&-1\\0&1&2\\0&0&3\end{array}\right|=(-1)\cdot 1\cdot 3=-3.\]

Because we know how the elementary operations change the determinant, to calculate the determinant of a matrix we can compute a triangular form, calculate its determinant, and recover the determinant of the original matrix. This method is especially useful for large matrices, e.g.:



    \[\left[\begin{array}{ccccc}1&2&0&3&1\\0&1&1&-3&0\\0&3&1&-8&-4\\0&0&0&0&3\\0&0&0&-2&0\end{array}\right]\underrightarrow{w_3-3w_2, w_4\leftrightarrow w_5} \left[\begin{array}{ccccc}1&2&0&3&1\\0&1&1&-3&0\\0&0&-2&1&-4\\0&0&0&-2&0\\0&0&0&0&3\end{array}\right]\]

Therefore, the determinant of the last matrix is 1\cdot 1\cdot (-2)\cdot (-2)\cdot 3=12. On our way we have swapped rows once and we have multiplied one row by \frac{1}{2}, therefore the determinant of the first matrix equals \frac{12\cdot(-1)}{\frac{1}{2}}=-24.
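The displayed step can be cross-checked with NumPy: the triangular matrix on the right has determinant equal to the product of its diagonal, and since exactly one row swap was performed in this step, the matrix on the left has the opposite determinant:

```python
import numpy as np

before = np.array([[1., 2., 0., 3., 1.],
                   [0., 1., 1., -3., 0.],
                   [0., 3., 1., -8., -4.],
                   [0., 0., 0., 0., 3.],
                   [0., 0., 0., -2., 0.]])

after = np.array([[1., 2., 0., 3., 1.],
                  [0., 1., 1., -3., 0.],
                  [0., 0., -2., 1., -4.],
                  [0., 0., 0., -2., 0.],
                  [0., 0., 0., 0., 3.]])

# triangular: determinant is the product of the diagonal
d_after = float(np.prod(np.diag(after)))
# one row swap on the way, so the sign flips
d_before = float(np.linalg.det(before))
```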

The above fact also implies how to calculate the determinant of a matrix which is in the block form: \left[\begin{array}{cc}A&C\\0&B\end{array}\right] with left bottom block of zeros. The determinant of such a matrix equals \det A\cdot \det B, e.g.:

    \[\left|\begin{array}{ccccc}1&2&0&3&1\\2&6&2&0&2\\3&9&1&1&-1\\0&0&0&3&4\\0&0&0&1&1\end{array}\right|= \left|\begin{array}{ccc}1&2&0\\2&6&2\\3&9&1\end{array}\right|\cdot\left|\begin{array}{cc}3&4\\1&1\end{array}\right|.\]
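The block rule above is easy to confirm numerically for this very matrix:

```python
import numpy as np

M = np.array([[1., 2., 0., 3., 1.],
              [2., 6., 2., 0., 2.],
              [3., 9., 1., 1., -1.],
              [0., 0., 0., 3., 4.],
              [0., 0., 0., 1., 1.]])

A = M[:3, :3]    # upper left block
B = M[3:, 3:]    # lower right block
# with a zero lower-left block: det M = det A * det B
```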

2. Complex numbers

Part 1.: Problems, solutions.
Part 2.: Problems, solutions.
Part 3.: Problems, solutions,
Part 4.: Problems, solutions.

Complex numbers

To be able to perform all algebraic operations, we need to be able to calculate roots of negative numbers. Therefore we introduce a new number i such that i^2=-1.

Complex numbers \mathbb{C} are numbers of the form z=a+bi, where a,b\in\mathbb{R}. The reals a and b are called the real and the imaginary part of z, and we denote them by \text{Re} z and \text{Im} z.

Therefore any complex number a+bi can be shown on a plane as a point with coordinates (a,b). This plane will be usually called the complex plane.

Adding and multiplying

We can add and multiply complex numbers in a straightforward way, remembering that i^2=-1: (a+bi)+(c+di)=(a+c)+i(b+d) and (a+bi)(c+di)=(ac-bd)+i(bc+ad).

E.g.: (1+i)(-2-3i)+(-1+2i)=(1-5i)+(-1+2i)=-3i.


It is also easy to divide a complex number by a complex number. One proceeds similarly as when there is an irrational root in the denominator: we multiply the numerator and the denominator by the conjugate of the denominator. So:

    \[\frac{a+bi}{c+di}=\frac{(a+bi)(c-di)}{(c+di)(c-di)}=\frac{(ac+bd)+(bc-ad)i}{c^2+d^2}.\]
Polar form

Every non-zero complex number z=a+ib can be described in yet another way: it is determined by its distance from zero (its modulus) |z|=\sqrt{a^2+b^2} and the angle between it and the real axis, called the argument \text{Arg} z (\sin \text{Arg} z=\frac{b}{|z|}, \cos \text{Arg} z=\frac{a}{|z|}). Therefore, z=|z|\left(\cos \text{Arg} z+i\sin \text{Arg} z\right).

E.g.: let z=1-i, then |z|=\sqrt{2} and \text{Arg}{z}=\frac{-\pi}{4}. On the other hand, if \text{Arg} x=\frac{-\pi}{6}, |x|=2, then x=2\left(\frac{\sqrt{3}}{2}-\frac{i}{2}\right)=\sqrt{3}-i.

Geometrical view on addition and multiplication

You can notice immediately, that adding two complex numbers is the same as adding two vectors on the complex plane.

Multiplication is even more interesting. Namely, multiplying two complex numbers multiplies their moduli and adds their arguments. This is especially useful when calculating a power of a complex number, and it implies the so-called de Moivre's formula:

    \[z^n=|z|^n\left(\cos n\text{Arg} z+i\sin n\text{Arg} z\right).\]

E.g., Let us calculate (1+i)^6(\sqrt{3}-i). We get:

  • modulus of 1+i is \sqrt{2} and argument is \frac{\pi}{4}, so modulus of (1+i)^6 is 8 and argument is \frac{6\pi}{4}, in other words \frac{-\pi}{2}.
  • modulus of \sqrt{3}-i is 2 and argument is \frac{-\pi}{6}, so the modulus of (1+i)^6(\sqrt{3}-i) is 16 and the argument is -\frac{2\pi}{3}.

Therefore (1+i)^6(\sqrt{3}-i)=16\left(\cos \frac{-2\pi}{3}+i\sin \frac{-2\pi}{3}\right)=-8-8i\sqrt{3}.
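Python's built-in complex numbers make this computation easy to check (`cmath.phase` returns the argument in (-pi, pi]):

```python
import cmath
import math

z = 1 + 1j
w = math.sqrt(3) - 1j

# under multiplication moduli multiply and arguments add;
# in particular |z**6| = |z|**6 and Arg(z**6) = 6 * Arg z (mod 2*pi)
p = z ** 6 * w
```

The modulus of z**6 comes out as 8, its argument as -pi/2, and p agrees with the value -8 - 8*sqrt(3)*i computed above.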

By the way, de Moivre’s equality implies many trigonometric equalities. E.g., because

    \[(\cos \alpha+i\sin\alpha)^2= \cos 2\alpha +i\sin 2\alpha,\]

and

    \[(\cos \alpha+i\sin\alpha)^2= \cos^2\alpha-\sin^2\alpha+2i\sin\alpha\cos\alpha,\]

we get that:

    \[\cos 2\alpha =\cos^2\alpha - \sin^2\alpha\]

and

    \[\sin 2\alpha = 2\sin\alpha\cos\alpha.\]

Calculating roots

Notice that for every complex number z\neq 0 there are always exactly n numbers x such that x^n=z. Indeed, the first one can be obtained by taking the root of the modulus \sqrt[n]{|z|} and dividing the argument by n. But if we add to this argument any multiple of \frac{2\pi}{n}, then after raising the number to the n-th power we get the same result, because the added angles sum up to a multiple of 2\pi. All those numbers are called the n-th roots of z.

E.g. \sqrt[4]{-16} consists of 4 numbers. Since |-16|=16, every root has modulus 2. The argument of -16 is \pi, so the first root has \frac{\pi}{4} as its argument, and we can add to it any multiple of \frac{2\pi}{4}=\frac{\pi}{2}, so we get the following roots:

  • argument: \frac{\pi}{4}, so \sqrt{2}+i\sqrt{2}
  • argument: \frac{3\pi}{4}, so -\sqrt{2}+i\sqrt{2},
  • argument: \frac{-3\pi}{4}, so -\sqrt{2}-i\sqrt{2},
  • argument: \frac{-\pi}{4}, so \sqrt{2}-i\sqrt{2}.

The root of z whose argument lies in [0,\frac{2\pi}{n}) is called the main root and is denoted by \sqrt[n]{_+ z}. So \sqrt[4]{_+ -16}=\sqrt{2}+i\sqrt{2}.

Therefore we can calculate the roots of quadratic equations regardless of the sign of \Delta, and even when \Delta is not real. E.g. for x^2+2x+(1-i)=0 we have \Delta=4i, so \sqrt{\Delta} has modulus 2 and argument \frac{\pi}{4}, which gives \sqrt{2}+i\sqrt{2} (the other square root of \Delta is -\sqrt{2}-i\sqrt{2}). Therefore, x_1=-1+\frac{\sqrt{2}}{2}+i\frac{\sqrt{2}}{2} and x_2=-1-\frac{\sqrt{2}}{2}-i\frac{\sqrt{2}}{2}.
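The same computation can be sketched with Python's `cmath` module, whose `sqrt` returns one of the two complex square roots:

```python
import cmath

# roots of x**2 + 2x + (1 - i) = 0 via the usual quadratic formula
a, b, c = 1, 2, 1 - 1j
delta = b * b - 4 * a * c          # should be 4i
x1 = (-b + cmath.sqrt(delta)) / (2 * a)
x2 = (-b - cmath.sqrt(delta)) / (2 * a)
```

Substituting x1 and x2 back into the polynomial gives (numerically) zero, confirming the roots found above.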

Exponential form and calculating complex powers

We will see that for x\in\mathbb{R} we have e^x=\sum_{n=0}^\infty \frac{x^n}{n!}. This gives us an idea of how to define complex powers. If we substitute ix for x, we get e^{ix}=\sum_{n=0}^{\infty}\frac{(-1)^nx^{2n}}{(2n)!}+i\sum_{n=0}^{\infty}\frac{(-1)^n x^{2n+1}}{(2n+1)!}; moreover, \sin x= \sum_{n=0}^{\infty}\frac{(-1)^n x^{2n+1}}{(2n+1)!} and \cos x= \sum_{n=0}^{\infty}\frac{(-1)^nx^{2n}}{(2n)!}, so e^{ix}=\cos x+i\sin x. Therefore, we can define e to a complex power as follows:

    \[e^{a+bi}=e^a(\cos b+i\sin b),\]

which implies the famous equation linking the 5 constants of mathematics: e^{i\pi}+1=0. 🙂

We can also write any complex number in an exponential form, which is similar to the polar form: namely, z=e^{\ln|z|+i\text{Arg} z}.