Hilbert projection theorem

In mathematics, the Hilbert projection theorem is a famous result of convex analysis that says that for every vector $x$ in a Hilbert space $H$ and every nonempty closed convex set $C \subseteq H$, there exists a unique vector $m \in C$ for which $\|c - x\|$ is minimized over the vectors $c \in C$; that is, such that $\|m - x\| \leq \|c - x\|$ for every $c \in C$.
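The statement can be illustrated numerically. In the sketch below (the concrete set and point are assumptions of this example, not from the article), $H = \mathbb{R}^2$ and $C$ is the closed unit disk, a nonempty closed convex set; the unique minimizer of $\|c - x\|$ is $x$ itself when $\|x\| \leq 1$ and the radial projection $x / \|x\|$ otherwise.

```python
import math
import random

def project_unit_disk(x):
    """Closest point of the closed unit disk to x (unique by the theorem)."""
    norm = math.hypot(x[0], x[1])
    if norm <= 1.0:
        return x
    return (x[0] / norm, x[1] / norm)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

x = (3.0, 4.0)
m = project_unit_disk(x)          # (0.6, 0.8)

# Spot-check the minimizing property ||m - x|| <= ||c - x|| on sampled c in C.
random.seed(0)
for _ in range(1000):
    c = (random.uniform(-1, 1), random.uniform(-1, 1))
    if math.hypot(c[0], c[1]) <= 1.0:   # keep only points of the disk
        assert dist(m, x) <= dist(c, x) + 1e-12
```

The closed-form projection is specific to the disk; for a general closed convex $C$ the theorem guarantees existence and uniqueness but gives no formula.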

Finite dimensional case

Some intuition for the theorem can be obtained by considering the first-order condition of the optimization problem.

Consider a finite-dimensional real Hilbert space $H$ with a subspace $C$ and a point $x$. If $m \in C$ is a minimizer or minimum point of the function $N : C \to \mathbb{R}$ defined by $N(c) := \|c - x\|$ (which has the same minimum point as $c \mapsto \|c - x\|^2$), then the derivative must be zero at $m$.

In matrix derivative notation,
$$\begin{aligned}\partial \lVert x - c\rVert^2 &= \partial \langle c - x, c - x\rangle \\ &= 2\langle c - x, \partial c\rangle.\end{aligned}$$
Since $\partial c$ is a vector in $C$ that represents an arbitrary tangent direction, it follows that $m - x$ must be orthogonal to every vector in $C$.
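The orthogonality condition can be checked directly. In this sketch (the subspace and point are example data assumed here), $x$ is projected onto $C = \operatorname{span}\{v\}$ in $\mathbb{R}^2$ and the residual $m - x$ is verified to be orthogonal to $C$.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

v = (1.0, 2.0)                    # spans the one-dimensional subspace C
x = (0.0, 1.0)

# Orthogonal projection of x onto span{v}: m = (<x, v> / <v, v>) v.
t = dot(x, v) / dot(v, v)
m = (t * v[0], t * v[1])          # (0.4, 0.8)

residual = (m[0] - x[0], m[1] - x[1])
assert abs(dot(residual, v)) < 1e-12   # m - x is orthogonal to every vector in C
```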

Statement

Detailed elementary proof

Proof by reduction to a special case

It suffices to prove the theorem in the case $x = 0$, because the general case follows from this special case by replacing $C$ with $C - x$.
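The reduction can be sketched numerically: translating both the set and the point by $-x$ turns the problem into projecting $0$, and translating back recovers the original minimizer, i.e. $\min(C, x) = x + \min(C - x, 0)$. The affine line below is an example chosen for this sketch, not taken from the article.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def project_line(p, d, y):
    # closest point to y on the closed convex set {p + t d : t real}
    t = dot((y[0] - p[0], y[1] - p[1]), d) / dot(d, d)
    return (p[0] + t * d[0], p[1] + t * d[1])

p, d = (0.0, 1.0), (1.0, 2.0)     # C = {p + t d}, an affine line in R^2
x = (3.0, -1.0)

m = project_line(p, d, x)                                           # min(C, x)
m_shift = project_line((p[0] - x[0], p[1] - x[1]), d, (0.0, 0.0))   # min(C - x, 0)

# min(C, x) = x + min(C - x, 0)
assert abs(m[0] - (x[0] + m_shift[0])) < 1e-12
assert abs(m[1] - (x[1] + m_shift[1])) < 1e-12
```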

Consequences

Properties

Expression as a global minimum

The statement and conclusion of the Hilbert projection theorem can be expressed in terms of global minima of the following functions. Their notation will also be used to simplify certain statements.

Given a non-empty subset $C \subseteq H$ and some $x \in H$, define a function
$$d_{C,x} : C \to [0, \infty) \quad \text{ by } \quad c \mapsto \|x - c\|.$$
A global minimum point of $d_{C,x}$, if one exists, is any point $m$ in $\operatorname{domain} d_{C,x} = C$ such that
$$d_{C,x}(m) \leq d_{C,x}(c) \quad \text{ for all } c \in C,$$
in which case $d_{C,x}(m) = \|m - x\|$ is equal to the global minimum value of the function $d_{C,x}$, which is
$$\inf_{c \in C} d_{C,x}(c) = \inf_{c \in C} \|x - c\|.$$
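A small concrete instance of $d_{C,x}$ (the set and point below are assumptions of this sketch): with $C$ the closed upper half-plane, the global minimum point of $d_{C,x}$ is the vertical drop of $x$ onto the boundary, and the global minimum value is the distance to it.

```python
import math

# C = {(c1, c2) : c2 >= 0}, a nonempty closed convex subset of R^2.
def d(c, x):
    """d_{C,x}(c) = ||x - c||."""
    return math.hypot(x[0] - c[0], x[1] - c[1])

x = (1.0, -2.0)
m = (x[0], 0.0)                    # the global minimum point of d_{C,x}

assert d(m, x) == 2.0              # the global minimum value inf_{c in C} ||x - c||
# sampled points of C do no better
for c in [(0.0, 0.0), (1.0, 3.0), (-5.0, 0.5), (2.0, 0.0)]:
    assert d(m, x) <= d(c, x)
```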

When this global minimum point $m$ exists and is unique, denote it by $\min(C, x)$; explicitly, the defining properties of $\min(C, x)$ (if it exists) are:
$$\min(C, x) \in C \quad \text{ and } \quad \left\|x - \min(C, x)\right\| \leq \|x - c\| \quad \text{ for all } c \in C.$$
The Hilbert projection theorem guarantees that this unique minimum point exists whenever $C$ is a non-empty closed and convex subset of a Hilbert space. However, such a minimum point can also exist in non-convex or non-closed subsets; for instance, as long as $C$ is non-empty, if $x \in C$ then $\min(C, x) = x$.

Effects of translations and scalings

If $C \subseteq H$ is a non-empty subset, $s$ is any scalar, and $x, x_0 \in H$ are any vectors, then
$$\min\left(sC + x_0, sx + x_0\right) = s \min(C, x) + x_0$$
which implies:
$$\begin{aligned}\min(sC, sx) &= s \min(C, x) \\ \min(-C, -x) &= -\min(C, x)\end{aligned}$$
$$\begin{aligned}\min\left(C + x_0, x + x_0\right) &= \min(C, x) + x_0 \\ \min\left(C - x_0, x - x_0\right) &= \min(C, x) - x_0\end{aligned}$$
$$\begin{aligned}\min(C, -x) &= \min(C + x, 0) - x \\ \min(C, 0) + x &= \min(C + x, x) \\ \min(C - x, 0) &= \min(C, x) - x\end{aligned}$$
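The master identity $\min(sC + x_0, sx + x_0) = s\min(C, x) + x_0$ can be spot-checked numerically; the particular $C$, $x$, $s$, and $x_0$ below are assumptions of this sketch, with $C = L_2$, the line of slope $2$ through the origin (so $sC = C$).

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def project_line(p, d, y):
    # closest point to y on the line {p + t d : t real}
    t = dot((y[0] - p[0], y[1] - p[1]), d) / dot(d, d)
    return (p[0] + t * d[0], p[1] + t * d[1])

x = (0.0, 1.0)
s, x0 = 3.0, (1.0, -1.0)
d = (1.0, 2.0)                     # direction of C = L_2

# left side: min(sC + x0, sx + x0); sC + x0 is the line through x0 with direction d
lhs = project_line(x0, d, (s * x[0] + x0[0], s * x[1] + x0[1]))
# right side: s * min(C, x) + x0
m = project_line((0.0, 0.0), d, x)
rhs = (s * m[0] + x0[0], s * m[1] + x0[1])

assert abs(lhs[0] - rhs[0]) < 1e-12 and abs(lhs[1] - rhs[1]) < 1e-12
```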

Examples

The following counter-example demonstrates a continuous linear isomorphism $A : H \to H$ for which $\min(A(C), A(x)) \neq A(\min(C, x))$. Endow $H := \mathbb{R}^2$ with the dot product, let $x_0 := (0, 1)$, and for every real $s \in \mathbb{R}$, let $L_s := \{(x, sx) : x \in \mathbb{R}\}$ be the line of slope $s$ through the origin, where it is readily verified that $\min(L_s, x_0) = \frac{s}{1+s^2}(1, s)$. Pick a real number $r \neq 0$ and define $A : \mathbb{R}^2 \to \mathbb{R}^2$ by $A(x, y) := (rx, y)$ (so this map scales the $x$-coordinate by $r$ while leaving the $y$-coordinate unchanged). Then $A : \mathbb{R}^2 \to \mathbb{R}^2$ is an invertible continuous linear operator that satisfies $A(L_s) = L_{s/r}$ and $A(x_0) = x_0$, so that $\min(A(L_s), A(x_0)) = \frac{s}{r^2+s^2}(r, s)$ while $A(\min(L_s, x_0)) = \frac{s}{1+s^2}(r, s)$. These two points are equal if and only if $s = 0$ or $r^2 = 1$. Consequently, if $C := L_s$ with $s \neq 0$ and $r \neq \pm 1$, then $\min(A(C), A(x_0)) \neq A(\min(C, x_0))$.
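The counter-example can be verified numerically; the specific values $r = 2$, $s = 1$ below are chosen for this sketch.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def project_origin_line(d, y):
    # closest point to y on the line through the origin with direction d
    t = dot(y, d) / dot(d, d)
    return (t * d[0], t * d[1])

r, s = 2.0, 1.0
x0 = (0.0, 1.0)

A = lambda v: (r * v[0], v[1])    # scales the first coordinate by r

m = project_origin_line((1.0, s), x0)          # min(L_s, x0) = (0.5, 0.5)
lhs = project_origin_line((1.0, s / r), A(x0)) # min(A(L_s), A(x0)) = (0.4, 0.2)
rhs = A(m)                                     # A(min(L_s, x0)) = (1.0, 0.5)

assert lhs != rhs                 # projection does not commute with A
```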

See also

  • Orthogonal complement – Concept in linear algebra
  • Orthogonal projection – Idempotent linear transformation from a vector space to itself
  • Orthogonality principle – Condition for optimality of Bayesian estimator
  • Riesz representation theorem – Theorem about the dual of a Hilbert space

Bibliography

  • Rudin, Walter (1987). Real and Complex Analysis (Third ed.). New York, NY: McGraw-Hill.
  • Rudin, Walter (1991). Functional Analysis. International Series in Pure and Applied Mathematics. Vol. 8 (Second ed.). New York, NY: McGraw-Hill Science/Engineering/Math. ISBN 978-0-07-054236-5. OCLC 21163277.

Text is available under the CC-BY-SA license. Source: Hilbert projection theorem by Wikipedia (Historical)