13. Quadratic forms, polynomials and hypersurfaces

Quadratic form

A quadratic form is a function which assigns a number to each vector in such a way that the value is a sum of products of pairs of coordinates (with coefficients), e.g. q(x,y,z)=x^2-2xy+4xz-5z^2. The square of the norm is also a quadratic form: \|(x,y,z)\|^2=x^2+y^2+z^2.

In other words, a quadratic form can be described as q(v)=h(v,v), where h is a bilinear form. We can always take a symmetric bilinear form, if the characteristic of the field is not equal to 2, and we are going to make such an assumption from now on.

Positive and negative definite forms

We can classify forms according to the signs of the values they take:

  • form q\colon V\to\mathbb{R} is positive definite, if for all non-zero v\in V, we get q(v)>0.
  • form q\colon V\to\mathbb{R} is negative definite, if for all non-zero v\in V, we get q(v)<0.
  • form q\colon V\to\mathbb{R} is positive semidefinite, if for all v\in V, we get q(v)\geq 0.
  • form q\colon V\to\mathbb{R} is negative semidefinite, if for all v\in V, we get q(v)\leq 0.

Obviously a form may not fall in any of those categories, if for some v,w\in V we have q(v)>0 and q(w)<0. Such forms are called indefinite.

Matrix of a form

The matrix of a quadratic form q with respect to basis \mathcal{A} is the matrix G(h,\mathcal{A}), where h is a symmetric bilinear form such that h(v,v)=q(v). E.g., let q(x,y,z)=x^2-2xy+4xz-5z^2, then:

    \[q(x,y,z)=[x,y,z]\cdot \left[\begin{array}{ccc}1&-1&2\\-1&0&0\\2&0&-5\end{array}\right]\cdot \left[\begin{array}{c}x\\y\\z\end{array}\right]\]

Notice that the coefficients outside the diagonal are divided by 2, because each mixed term is generated twice in the product.
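A minimal sketch in Python of this rule (the helper names form_matrix and evaluate are hypothetical, and exact fractions are an arbitrary choice): the matrix is built from the coefficients of q by halving the ones outside the diagonal, and q(v) is then computed as v^T G v.

```python
from fractions import Fraction

def form_matrix(n, coeffs):
    """Symmetric matrix of a quadratic form in n variables.

    coeffs maps a pair (i, j) with i <= j to the coefficient of x_i * x_j.
    """
    G = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), c in coeffs.items():
        if i == j:
            G[i][i] += Fraction(c)
        else:
            # x_i * x_j arises twice in v^T G v, so each entry gets half
            G[i][j] += Fraction(c, 2)
            G[j][i] += Fraction(c, 2)
    return G

def evaluate(G, v):
    """Value v^T G v of the form with matrix G on the vector v."""
    n = len(v)
    return sum(v[i] * G[i][j] * v[j] for i in range(n) for j in range(n))

# q(x,y,z) = x^2 - 2xy + 4xz - 5z^2, variables indexed 0, 1, 2
G = form_matrix(3, {(0, 0): 1, (0, 1): -2, (0, 2): 4, (2, 2): -5})
```

Here G agrees with the matrix displayed above, and evaluate(G, (1, 2, 3)) gives the same number as substituting x=1, y=2, z=3 into the formula for q.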

Sylvester’s criterion

Sylvester’s criterion determines whether a form is positive definite or negative definite. Notice that it says nothing about semidefinite forms!

How does it work? We study the determinants of the leading principal minors: let A_k be the k\times k matrix in the upper left corner of the matrix of the form we study, and let n\times n be the size of the whole matrix. Sylvester’s criterion consists of the two following facts:

  • if \det A_k > 0 for every k\leq n, then the form is positive definite,
  • if \det A_k > 0 for every even k\leq n and \det A_k<0 for every odd k\leq n, then the form is negative definite.

E.g. let q(x,y)=2x^2+y^2-2xy, so its matrix is \left[\begin{array}{cc}2&-1\\-1&1\end{array}\right], and A_1=[2], A_2=\left[\begin{array}{cc}2&-1\\-1&1\end{array}\right]. Therefore \det A_1=2>0 and \det A_2=2-1=1>0, so the form q is positive definite.

E.g., let q(x,y,z)=-x^2-5y^2-4xy-z^2, so its matrix is \left[\begin{array}{ccc}-1&-2&0\\-2&-5&0\\0&0&-1\end{array}\right], and A_1=[-1], A_2=\left[\begin{array}{cc}-1&-2\\-2&-5\end{array}\right], A_3=\left[\begin{array}{ccc}-1&-2&0\\-2&-5&0\\0&0&-1\end{array}\right]. Therefore \det A_1=-1<0, \det A_2=5-4=1>0 and \det A_3=-5+4=-1<0, so the form q is negative definite.

Finally, let q(x,y)=-2x^2+y^2-2xy, so its matrix is \left[\begin{array}{cc}-2&-1\\-1&1\end{array}\right], and A_1=[-2], A_2=\left[\begin{array}{cc}-2&-1\\-1&1\end{array}\right]. Therefore \det A_1=-2<0 and \det A_2=-2-1=-3<0, so the form q is neither positive definite nor negative definite.
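The two conditions of the criterion can be checked mechanically. Below is a minimal sketch in Python using exact fractions; the helper names det, leading_minors and sylvester are hypothetical.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over exact fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        # find a row with a non-zero entry in column c
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return sign * result

def leading_minors(A):
    """Determinants det A_1, ..., det A_n of the upper left corners."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def sylvester(A):
    d = leading_minors(A)
    if all(x > 0 for x in d):
        return "positive definite"
    # k = i + 1, so an even index i corresponds to an odd k
    if all((x < 0) if i % 2 == 0 else (x > 0) for i, x in enumerate(d)):
        return "negative definite"
    return "neither"
```

Applied to the matrices of the three forms above, sylvester reports positive definite, negative definite and neither, respectively.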

Diagonalization of quadratic forms

But to check everything (including semidefiniteness), we have to diagonalize the form, i.e. find a basis in which its matrix is diagonal (a diagonal matrix congruent to the original one). Then, obviously, if:

  • it has only positive entries on the diagonal, then q is positive definite,
  • it has only negative entries on the diagonal, then q is negative definite,
  • it has only nonnegative entries on the diagonal, then q is positive semidefinite,
  • it has only nonpositive entries on the diagonal, then q is negative semidefinite,
  • it has a positive and a negative entry on the diagonal, then q is indefinite.

It can be done by the three following methods.

Diagonalization of a form: completing the squares

We may complete the formula of a form to squares, making sure to first use up all terms containing the first variable, then all remaining terms containing the second one, and so on.

E.g.

    \[q(x,y,z)=x^2+2xy-4xz-2yz=(x+y-2z)^2-y^2-4z^2+2yz=\]

    \[=(x+y-2z)^2-(y-z)^2-3z^2=x'^2-y'^2-3z'^2,\]

where x'=x+y-2z, y'=y-z and z'=z, so the form is indefinite. The basis \mathcal{A} in which the formula is expressed is (1,0,0),(-1,1,0),(1,1,1), because

    \[M(id)_{st}^{\mathcal{A}}=\left[\begin{array}{ccc}1&1&-2\\0&1&-1\\0&0&1\end{array}\right].\]
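On the level of matrices, completing the squares variable by variable amounts to a symmetric elimination. The sketch below (Python, hypothetical helper name complete_squares) returns only the coefficients of the squares and assumes that every pivot met on the way is non-zero; it is applied to the form q(x,y,z)=x^2-2xy+4xz-5z^2 from the section on the matrix of a form.

```python
from fractions import Fraction

def complete_squares(A):
    """Coefficients of the squares obtained by completing squares in order.

    A is the symmetric matrix of the form. Assumes every pivot met on the
    way is non-zero; otherwise the variables would have to be reordered or
    substituted first, which this sketch does not attempt.
    """
    A = [[Fraction(x) for x in row] for row in A]
    n = len(A)
    diag = []
    for k in range(n):
        d = A[k][k]
        if d == 0:
            if any(A[k][j] != 0 for j in range(k + 1, n)):
                raise ValueError("zero pivot: reorder the variables first")
            diag.append(Fraction(0))
            continue
        diag.append(d)
        # subtract d * (x_k + sum_j (A[k][j] / d) * x_j)^2 from the rest
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] -= A[k][i] * A[k][j] / d
    return diag

coefficients = complete_squares([[1, -1, 2], [-1, 0, 0], [2, 0, -5]])
```

The result corresponds to q(x,y,z)=(x-y+2z)^2-(y-2z)^2-5z^2, so this form is indefinite as well.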

Diagonalization of a form: orthogonal basis

We may also find a basis which is orthogonal with respect to the symmetric bilinear form related to the considered quadratic form. Then the entries on the diagonal are the values of the form on the vectors from this basis.
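A small illustration in Python, using the form q(x,y)=2x^2+y^2-2xy from the section on Sylvester’s criterion. The basis (1,0),(1,2) is a hypothetical choice made for this sketch, as are the helper names h and gram.

```python
def h(A, u, v):
    """Bilinear form u^T A v for the symmetric matrix A."""
    n = len(u)
    return sum(u[i] * A[i][j] * v[j] for i in range(n) for j in range(n))

def gram(A, basis):
    """Matrix of h in the given basis."""
    return [[h(A, u, v) for v in basis] for u in basis]

A = [[2, -1], [-1, 1]]      # matrix of q(x,y) = 2x^2 + y^2 - 2xy
basis = [(1, 0), (1, 2)]    # chosen so that h((1,0),(1,2)) = 0
D = gram(A, basis)
```

The matrix D is diagonal, and its diagonal entries are q(1,0) and q(1,2). Both are positive, which agrees with the earlier conclusion that this form is positive definite.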

Diagonalization of a form: eigenvalues

Finally, recall that for the self-adjoint endomorphism described by the same matrix there exists a basis consisting of its eigenvectors which is orthonormal with respect to the standard scalar product, and therefore orthogonal with respect to the symmetric bilinear form related to the considered quadratic form. In this basis the entries on the diagonal are the eigenvalues of the matrix.

E.g.: let q(x,y)=-2x^2+y^2-2xy, with matrix \left[\begin{array}{cc}-2&-1\\-1&1\end{array}\right]. Its characteristic polynomial (-2-\lambda)(1-\lambda)-1=\lambda^2+\lambda-3 has zeroes \frac{-1-\sqrt{13}}{2} and \frac{-1+\sqrt{13}}{2}, so the matrix has eigenvalues of both signs and q is indefinite.
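For 2\times 2 matrices the eigenvalues can be computed directly from the trace and the determinant. The following Python sketch (hypothetical helper name eigenvalues_2x2, floating-point arithmetic) applies this to the form q(x,y)=2x^2+y^2-2xy from the section on Sylvester’s criterion.

```python
import math

def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]].

    They are the roots of t^2 - (a + d) t + (a d - b^2).
    """
    tr, det = a + d, a * d - b * b
    # the discriminant equals (a - d)^2 + 4 b^2, so it is never negative
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr - disc) / 2, (tr + disc) / 2

low, high = eigenvalues_2x2(2, -1, 1)   # q(x,y) = 2x^2 + y^2 - 2xy
```

Both eigenvalues, (3-\sqrt{5})/2 and (3+\sqrt{5})/2, are positive, which agrees with the verdict of Sylvester’s criterion for this form.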

Polynomials and polynomial functions

A polynomial of degree n over field K with variables x_1,\ldots, x_k is an expression of form

    \[\sum_{1\leq i_1\leq \ldots\leq i_j\leq k, 0\leq j\leq n} a_{i_1,\ldots, i_j}x_{i_1}\cdot\ldots\cdot x_{i_j},\]

where a_{i_1,\ldots, i_j}\in K and at least one a_{i_1,\ldots, i_j} with j=n is not equal to zero. The space of all polynomials over K with variables x_1,\ldots, x_k is denoted by K[x_1,\ldots, x_k]. E.g. x_1^2x_2+3x_3^3-x_1x_3+5 is a polynomial of degree 3 in K[x_1,x_2,x_3].

Given an affine k-dimensional space H and a basic system p_0;v_1,\ldots, v_k, a polynomial p\in K[x_1,\ldots, x_k] defines a function, called a polynomial function, p\colon H\to K given as

    \[p(p_0+a_1v_1+\ldots+a_kv_k)=p(a_1,\ldots, a_k)\]

which obviously abuses the notation, since p denotes both the polynomial itself and the polynomial function. For infinite fields this is not a problem, because there is a bijection between polynomials and polynomial functions.
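Over a finite field the two notions really differ. As an illustration (a sketch in Python; the helper name poly_function_table is hypothetical and the field is taken to be \mathbb{Z}/p for a prime p), the non-zero polynomial x^2+x defines the zero function on \mathbb{Z}/2:

```python
def poly_function_table(coeffs, p):
    """Values of a one-variable polynomial at all elements of Z/p.

    coeffs[i] is the coefficient of x^i; p is assumed to be prime.
    """
    return [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
            for x in range(p)]

# x^2 + x is a non-zero polynomial over Z/2 ...
table = poly_function_table([0, 1, 1], 2)
# ... but every entry of table is 0, so it defines the zero function
```

The same happens for x^3-x over \mathbb{Z}/3.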

Notice also that if f\colon H\to K can be written as a polynomial function in one basic system, then it can be written in this form in any basic system. Moreover, the related polynomial is of the same degree.

Hypersurfaces and algebraic sets

X\subseteq H is an algebraic set, if

    \[X=\{p\in H\colon f_1(p)=0\land\ldots\land f_m(p)=0\},\]

where f_1,\ldots, f_m are polynomial functions. It is a hypersurface, if

    \[X=\{p\in H\colon f(p)=0\},\]

where f is a polynomial function.

It is easy to notice that over \mathbb{R} these are the same thing, since we can take f=f_1^2+\ldots+f_m^2.

Equivalence relation on polynomial functions and hypersurfaces

Two polynomial functions f,g\colon H\to K are equivalent, if there exist basic systems p;\mathcal{A} and q;\mathcal{B}, such that f in p;\mathcal{A} is described by the same polynomial as g in q;\mathcal{B}.

Two hypersurfaces are equivalent, if the functions describing them are equivalent. It is so if and only if the second hypersurface is an image of the first one under an affine isomorphism on H.

Canonical form of a polynomial function: hypersurfaces of second degree

Fix a hypersurface of second degree, i.e. one described by a polynomial function of second degree. We want to find an equivalent polynomial function of the simplest possible form, i.e. in a canonical form. In other words, we will be looking for a basic system in which the equation describing the hypersurface is as simple as possible.

Every polynomial function of second degree is a sum of a quadratic form and an affine function. E.g. if f((x,y))=2x^2+4xy+y^2+4x+4y+4, then f((x,y))=q(x,y)+\varphi(x,y), where q(x,y)=2x^2+4xy+y^2 and \varphi(x,y)=4x+4y+4.

For every polynomial function of second degree (thus every equation describing a hypersurface of second degree) it is possible to find a basic system in which the function takes one of the following forms:

    \[a_1x_1^2+\ldots+a_rx_r^2+c,\]

a_1,\ldots, a_r\neq 0, or

    \[a_1x_1^2+\ldots+a_rx_r^2+x_n,\]

r<n, a_1,\ldots, a_r\neq 0, where r is the rank of q.

How to transform the function to such a form? First we diagonalize the form q and express the function \varphi in the new basis. This changes the basis but not the origin of the basic system. Next, for each variable x which appears in \varphi with a non-zero coefficient b and for which x^2 appears with a non-zero coefficient a, we introduce a new variable x'=x+\frac{b}{2a}, because then ax'^2=ax^2+bx+\frac{b^2}{4a}. This changes the constant term and moves the origin from p to p-\frac{b}{2a}v, where v is the basic vector related to the variable x. If no other variables are left in \varphi, we have already reached the first form. If there are other variables (for which there is no x^2 in q), then we combine them together with the constant term into a new variable x_n', which obviously changes both the basis and the origin of the basic system. E.g.:

    \[2x^2+4xy+y^2+4x+4y+4=2(x+y)^2-y^2+4x+4y+4=2x'^2-y'^2+4x+4y+4=\ldots\]

where x'=x+y, y'=y, so x=x'-y', and for the basic system we get

    \[(0,0)+x(1,0)+y(0,1)=(0,0)+x'(1,0)+y'(-1,1).\]

And so

    \[\ldots= 2x'^2-y'^2+4x'-4y'+4y'+4=2x'^2-y'^2+4x'+4=2(x'+1)^2-y'^2+2=\]

    \[=2x''^2-y''^2+2,\]

where x''=x'+1 and y''=y', so the final basic system is (-1,0);(1,0),(-1,1).
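The result can be spot-checked by substituting points written in the final basic system into the original formula. A minimal sketch in Python (the helper names point and f_in_new_system are hypothetical; the check runs over a small grid of integer coordinates):

```python
from itertools import product

def f(x, y):
    return 2*x*x + 4*x*y + y*y + 4*x + 4*y + 4

def point(origin, basis, coords):
    """Point origin + sum of coords[i] * basis[i]."""
    return tuple(o + sum(c * v[k] for c, v in zip(coords, basis))
                 for k, o in enumerate(origin))

# the final basic system (-1,0);(1,0),(-1,1)
origin, basis = (-1, 0), [(1, 0), (-1, 1)]

def f_in_new_system(a, b):
    """Value of f at the point with coordinates (a, b) in the new system."""
    return f(*point(origin, basis, (a, b)))

# compare with the canonical form 2x''^2 - y''^2 + 2 on a grid
agrees = all(f_in_new_system(a, b) == 2*a*a - b*b + 2
             for a, b in product(range(-3, 4), repeat=2))
```

Two polynomials of degree 2 in two variables which agree on such a 7\times 7 grid are equal, so the check confirms the canonical form found above.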

To illustrate the second form, let f(x,y,z)=x^2+2xy+y^2+y+2z+1, and then:

    \[x^2+2xy+y^2+y+2z+1=(x+y)^2+y+2z+1=x'^2+y'+2z'+1=\ldots\]

where x'=x+y, y'=y, z'=z, and the basic system is (0,0,0);(1,0,0),(-1,1,0),(0,0,1).

    \[\ldots = x''^2+z'',\]

where x''=x', y''= y' and z''= y'+2z'+1, and since

    \[(0,0,0)+x'(1,0,0)+y'(-1,1,0)+z'(0,0,1)=\]

    \[=(0,0,-1/2)+x''(1,0,0)+y''(-1,1,-1/2)+z''(0,0,1/2),\]

so the final basic system is: (0,0,-1/2);(1,0,0),(-1,1,-1/2),(0,0,1/2).

Canonical form of a polynomial function of second degree: hypersurfaces over \mathbb{R} or \mathbb{C}

Notice that additionally an equation can always be divided by its constant term (if it is non-zero), changing it into 1. Moreover, over \mathbb{R}, for every expression ax^2 the basic vector related to x can be divided by \sqrt{|a|}, which changes this expression to \pm x^2 (and over \mathbb{C} even by \sqrt{a}, changing this expression to x^2).

Thus for every equation of second degree over \mathbb{R}, there is a basic system in which the equation takes one of the following forms:

    \[\pm x_1^2\pm\ldots\pm x_r^2+1=0,\]

1\leq r\leq n, or

    \[x_1^2\pm\ldots\pm x_r^2=0,\]

1\leq r\leq n, or

    \[\pm x_1^2\pm \ldots\pm x_r^2+x_n=0,\]

1\leq r<n.

Similarly, for every equation of second degree over \mathbb{C}, there is a basic system in which the equation takes one of the following forms:

    \[x_1^2+\ldots+x_r^2+1=0,\]

1\leq r\leq n, or

    \[x_1^2+\ldots+x_r^2=0,\]

1\leq r\leq n, or

    \[x_1^2+\ldots+x_r^2+x_n=0,\]

1\leq r<n.

E.g. transforming further the equation 2x''^2-y''^2+2=0 in basic system (-1,0);(1,0),(-1,1), we divide it by 2 and see that it is equivalent to x''^2-\frac{1}{2}y''^2+1=0, i.e. to the equation x'''^2-y'''^2+1=0 in basic system (-1,0);(1,0),\sqrt{2}(-1,1) (because x'''=x'' and y'''=y''/\sqrt{2}). Meanwhile, over \mathbb{C} it is equivalent to x'''^2+y'''^2+1=0 in basic system (-1,0);(1,0),\sqrt{2}(i,-i) (because this time y'''=iy''/\sqrt{2}).

Centres of symmetry

A point p is a centre of symmetry of a hypersurface described by the equation f(x)=0, if for every v\in T(H) we have f(p+v)=0 if and only if f(p-v)=0. It is easy to prove that p is a centre of symmetry if and only if it is a critical point of the function f, i.e. all partial derivatives of f are zero at p.

Thus, if we consider the canonical forms of the equations, we see that a hypersurface described by

    \[a_1x_1^2+\ldots+a_rx_r^2+c=0\]

has a centre of symmetry (the origin of the basic system), and this centre lies on the hypersurface if and only if c=0. On the other hand, a hypersurface described by

    \[a_1x_1^2+\ldots+a_rx_r^2+x_n=0\]

has no centre of symmetry.
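For a curve of second degree in the plane, a critical point can be found by solving a linear system. The sketch below (Python, hypothetical helper name centre_2d) writes f(p)=p^TAp+b\cdot p+c, assumes \det A\neq 0, and uses the function f(x,y)=2x^2+4xy+y^2+4x+4y+4 considered earlier.

```python
from fractions import Fraction

def centre_2d(A, b):
    """Critical point of f(p) = p^T A p + b . p + c for a 2x2 symmetric A.

    The gradient is 2 A p + b, so we solve 2 A p = -b by Cramer's rule.
    Assumes det A != 0, in which case the critical point is unique.
    """
    (a11, a12), (a21, a22) = A
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("singular matrix: no unique critical point")
    r1, r2 = Fraction(-b[0], 2), Fraction(-b[1], 2)
    return ((r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det)

def f(x, y):
    return 2*x*x + 4*x*y + y*y + 4*x + 4*y + 4

# quadratic part 2x^2 + 4xy + y^2, linear part 4x + 4y
p = centre_2d([[2, 2], [2, 1]], (4, 4))
```

The point p obtained here is the origin of the basic system in which f was brought to the form 2x''^2-y''^2+2, and f(p+v)=f(p-v) holds for every vector v.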

Affine types of hypersurfaces of second degree over \mathbb{R}

The above means that the type of equation which describes a given hypersurface is invariant, i.e. if in some basic system it takes one of the forms

    \[\pm x_1^2\pm\ldots\pm x_r^2+1=0,\]

1\leq r\leq n, or

    \[x_1^2\pm\ldots\pm x_r^2=0,\]

1\leq r\leq n, or

    \[\pm x_1^2\pm \ldots\pm x_r^2+x_n=0,\]

1\leq r<n, in any other basic system it cannot take any of the other forms.

Moreover, if s is the number of variables with coefficient +1 in the above equation, then if the equation is

    \[\pm x_1^2\pm\ldots\pm x_r^2+1=0,\]

then s is the same for every basic system in which this equation is in the canonical form.
If the equation is of form

    \[x_1^2\pm\ldots\pm x_r^2=0,\]

1\leq r\leq n, or

    \[\pm x_1^2\pm \ldots\pm x_r^2+x_n=0,\]

1\leq r<n, then we also get such a result but up to multiplication by -1, i.e. in every basic system in which the equation is in the canonical form, we get s or r-s variables with coefficient +1.

These possibilities are called the affine types of hypersurfaces.

We shall say that a hypersurface is proper, if it is not contained in any hyperplane.

Affine types of proper curves of second degree in \mathbb{R}^2

Thus we have the following affine types of proper curves of second degree in \mathbb{R}^2:

    \[-x^2+1=0\]

two parallel lines

    \[x^2-y^2+1=0\]

hyperbola

    \[-x^2-y^2+1=0\]

ellipse

    \[x^2-y^2=0\]

a pair of intersecting lines

    \[x^2+y=0\]

parabola

Affine types of proper surfaces of second degree in \mathbb{R}^3

Thus we have the following affine types of proper surfaces of second degree in \mathbb{R}^3:

    \[-x^2+1=0\]

a pair of parallel planes

    \[x^2-y^2+1=0\]

hyperbolic cylinder

    \[-x^2-y^2+1=0\]

elliptic cylinder

    \[x^2+y^2-z^2+1=0\]

hyperboloid of two sheets

    \[x^2-y^2-z^2+1=0\]

hyperboloid of one sheet

    \[-x^2-y^2-z^2+1=0\]

ellipsoid

    \[x^2-y^2=0\]

pair of intersecting planes

    \[x^2+y^2-z^2=0\]

elliptic cone

    \[x^2+z=0\]

parabolic cylinder

    \[x^2+y^2+z=0\]

elliptic paraboloid

    \[x^2-y^2+z=0\]

hyperbolic paraboloid