9. Extrema on manifolds

Manifolds in \mathbb{R}^n

A set M\subseteq \mathbb{R}^n (a curve in one-dimensional case, a surface — in two-dimensional case, etc.) will be called a manifold (officially: a submanifold embedded in \mathbb{R}^n) of dimension m (m<n), if

    \[M=\{(x_1,\ldots, x_n)\in \mathbb{R}^n\colon  F(x_1,\ldots, x_n)=(0,\ldots, 0)\},\]

where F\colon \mathbb{R}^n\to \mathbb{R}^{n-m} is a function of C^1 class such that the rows of matrix F' are linearly independent for any (x_1,\ldots, x_n)\in M.

E.g. the sphere S=\{(x,y,z)\in \mathbb{R}^3\colon x^2+y^2+z^2=1\} is a 2-dimensional manifold. Indeed,

    \[S=\{(x,y,z)\in \mathbb{R}^3\colon F(x,y,z)=0\},\]

where F\colon \mathbb{R}^3\to\mathbb{R} and F(x,y,z)=x^2+y^2+z^2-1. F'(x,y,z)=[2x,2y,2z] contains one row which is a non-zero vector for any point on the sphere.

By the implicit function theorem we immediately get that M is locally at any poiny a graph of a function H\colon \mathbb{R}^m\to \mathbb{R}^{n-m}.

Conditional extrema

Given a function g\colon \mathbb{R}^n\to \mathbb{R}, and a manifold M we may for example want to know what is the maximal value of this function on this manifold. More precisely, we are going to consider local extrema but with the function restricted only to the points on the manifold (so called conditional extrema). We shall say that a\in M is a conditional local maximum (respectively, minimum), if there exists a ball B around a, such that the largest value (respectively the least value) on the set of arguments B\cap M the function g takes at a.

Lagrange multipliers

To find such points we are going to use the Lagrange multipliers Theorem. It states that given a function g\colon \mathbb{R}^n\to \mathbb{R} of C^1 class defined on an open set V and a manifold M\subseteq \mathbb{R}^n defined as

    \[M=\{(x_1,\ldots, x_n)\in \mathbb{R}^n\colon  F(x_1,\ldots, x_n)=(0,\ldots, 0)\},\]

where F\colon \mathbb{R}^n\to \mathbb{R}^{n-m} (niech F=(F_1,\ldots, F_{n-m})) jest funkcją klasy C^1,
if a point a\in M\cap V is a conditional local extremum, then there exist numbers \lamda_1,\ldots, \lambda_{n-m}, such that

    \[g'(a)=\lambda_1 F'_1(a)+\ldots +\lambda_{n-m}F'_{n-m}(a).\]

Notice that this means that g'(a) being a linear combination of F'_i is perpendicular to M at a, because the tangent space is the space perpendicular to all F'_i(a).

Thus given a compact manifold without a boundary we may use this condition to find all candidates for points on which the function can take its maximal and minimal value on the considered manifold.

For example let us take the manifold described by the equation F(x,y,z)=x^2+y^2+z^2-1=0 and function g(x,y,z)=x^2-yz. Then F(x,y,z)=x^2+y^2+z^2-1=0 and g(x,y,z)=x^2-yz. We get F'(x,y,z)=[2x,2y,2z] and g'(x,y,z)=[2x,-z,-y]. We look for \lambda\in \mathbb{R}, such that [2x,-z,-y]=\lambda [2x,2y,2z], i.e.

    \[\begin{cases} 2x=2\lambda x\\-z=2\lambda y\\-y=2\lambda z\end{cases}\]

If x\neq 0 then \lambda =1, thus -y=-4y, so y=0=z, and x=\pm 1. So we get two points in which we may have extrema: (1,0,0) and (-1,0,0) (and \lambda =1). The value at these points is 1. On the other hand, if x=0, then z=\lambda^2z and y=4\lambda^2y. Both cannot be equal to zero, so \lambda =\pm 1/2 and y=\pm z, but since x^2+y^2+z^2-1=0, we get 2y^2=2z^2=1. Thus, for \lambda=1/2 we have points (0,\sqrt{2}/2,-\sqrt{2}/2) and (0,-\sqrt{2}/2,\sqrt{2}/2), and for \lambda=-1/2 we have (0,\sqrt{2}/2,\sqrt{2}/2) and (0,\sqrt{2}/2,\sqrt{2}/2). In these point the value of g is respectively -1/2, -1/2 , 1/2 and 1/2. So the minimal value is -1/2 and maximal is 1.

This is the necessary condition. But we can also formulate the sufficient condition for p to be a conditional local extremum. Consider the mapping

    \[L=g(x_1,\ldots, x_n)-(\lambda_1 F_1((x_1,\ldots, x_n))+\ldots +\lambda_{n-m}F_{n-m}((x_1,\ldots, x_n))),\]

Then if L''(p) positive definite on the space tangent to M at p, then it is a conditional minimum. But if L''(p) negative definite on the space tangent to M at p, then it is a conditional minimum.

Kuhn-Tucker method

If you want to find a maximum of a given function, eg.: f(x,y)=3x_1-x_2/2 in a given set, e.g. D=\{(x,y)\in \mathbb{R}^2\colon x^2+y^2\leq 1, x\geq 1, y\geq 1\}, you know how and can do it in two steps: find critical points in the interior of this set and then look for minima and maxima on the boundary of this set, which is a manifold, so we can use Lagrange multiplier here. The Kuhn-Tucker method does these two steps simultaneously.

Given function f\colon \mathbb{R^n}\to \mathbb{R} and a system of conditions F_1(x_1,\ldots, x_n)\leq 0, …, F_k(x_1,\ldots, x_n)\leq 0, function f has a minimum in one of points which satisfy the following conditions:

  1. \partial L/\partial x_1=0, \ldots, \partial L/\partial x_n=0, where

        \[L(x_1,\ldots, x_n,\lambda_1,\ldots, \lambda_k)=f(x_1,\ldots, x_n)+\lambda_1F_1(x_1,\ldots, x_n)+\ldots+ \lambda_k F_k(x_1,\ldots, x_n),\]

  2. \lambda_1 F_1(x_1,\ldots, x_n)=0, \ldots, \lambda_kF_k(x_1,\ldots, x_n),
  3. F_1(x_1,\ldots, x_n)\leq 0, …, F_k(x_1,\ldots, x_n)\leq 0,
  4. \lambda_1,\ldots, \lambda_k\geq 0.

Notice that this is indeed it! In particular, if \lambda_1=\ldots=\lambda_k=0, then the condition about the derivatives of L reduces to checking whether f is zero, i.e. finding critical points of f. But if \lambda_i\neq 0, we also include equation F_i(x_1,\ldots, x_n)=0 and we add the derivatives of F_i multiplied by \lambda_i to the equations about derivatives — so we are doing exactly the same which we would do using Lagrange multipliers.

In our example we are looking for maximum g(x,y)=3x_1-x_2/2 inside D=\{(x,y)\in \mathbb{R}^2\colon x^2+y^2\leq 1, x\geq 1, y\geq 1\}. Which is the same as finding the minimum of f(x,y)=-3x_1+x_2/2, and we have
F_1(x,y)=x^2+y^2-1\leq 0, F_2(x,y)=-x\leq 0 and F_3(x,y)=-y\leq 0. Thus,


and we are looking for points such that

  1. -3+2\lambda_1x-\lambda_2=0, 1/2+2\lambda_1y-\lambda_3=0
  2. \lambda_1 (x^2+y^2-1)=0, -\lambda_2x=0, -\lambda_3y=0
  3. x^2+y^2-1\leq 0, -x\leq 0, -y\leq 0,
  4. \lambda_1,\ldots, \lambda_k\geq 0.

Thus, if x=y=0, the value of f is 0 which also equals to the value of g. If x=0=\lambda_3, then the equation 1/2+2\lambda_1y=0 implies that \lambda_1<0, which is a contradiction. If y=0=\lambda_2, we get x=\pm 1, so the value of f is \pm 3, and the value of g is \mp 3. For \lambda_2=\lambda_3=0, similarly we get a contradiction. So we get that the maximum of g is 3.