Section 11.5 Lagrange Multipliers using Differentials
Suppose first that you want to extremize a function \(f\) of two variables subject to the constraint \(g=\text{constant}\text{.}\) In two dimensions, level sets are curves, and we can introduce a coordinate \(v\) along the level curves of \(g\text{.}\) Then \(v\) and \(g\) can be used as coordinates, at least in a small region about any point. Thus,
\begin{equation}
df = \frac{\partial f}{\partial g}\,dg + \frac{\partial f}{\partial v}\,dv\text{.}\tag{11.5.1}
\end{equation}
The condition that \(f\) be extremized at a point \(P\) on a given level curve of \(g\) is precisely that
\begin{equation}
\left.\frac{\partial f}{\partial v}\right|_P = 0\text{.}\tag{11.5.2}
\end{equation}
Thus, at \(P\text{,}\) we have
\begin{equation}
df = \lambda\,dg\tag{11.5.3}
\end{equation}
where
\begin{equation}
\lambda = \left.\frac{\partial f}{\partial g}\right|_P\text{.}\tag{11.5.4}
\end{equation}
Equation (11.5.3) can be used to find the point(s) \(P\) on the level curve where \(f\) is extremized: simply expand both sides in terms of the original variables \(x^i\text{,}\) and realize that corresponding partial derivatives must therefore be proportional, that is,
\begin{equation}
\frac{\partial f}{\partial x^i} = \lambda\,\frac{\partial g}{\partial x^i} \qquad (i=1,2)\text{.}\tag{11.5.5}
\end{equation}
These two equations (together with the constraint condition) can be solved for \(x^i\) (and \(\lambda\)); this is the method of Lagrange multipliers.
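For example (an illustration, not part of the original text), to extremize \(f(x,y)=xy\) on the circle \(g(x,y)=x^2+y^2=1\text{,}\) Equation (11.5.5) becomes \(y=2\lambda x\) and \(x=2\lambda y\text{,}\) which together force \(x^2=y^2\text{;}\) the constraint then yields the four candidate points \(\left(\pm\tfrac{1}{\sqrt2},\pm\tfrac{1}{\sqrt2}\right)\text{,}\) with maximum value \(f=\tfrac12\) when \(x=y\) and minimum value \(f=-\tfrac12\) when \(x=-y\text{.}\)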
It is straightforward to generalize this procedure to higher dimensions. If \(f\) is a function of \(n\) variables, and the constraint function \(g\) is as well, then the level sets of \(g\) are \((n-1)\)-dimensional surfaces, on which we can introduce the \(n-1\) coordinates \(v^i\text{;}\) \(\{g,v^i\}\) can therefore be used as (local) coordinates. Thus,
\begin{equation}
df = \frac{\partial f}{\partial g}\,dg + \sum_{i=1}^{n-1}\frac{\partial f}{\partial v^i}\,dv^i\text{,}\tag{11.5.6}
\end{equation}
and the condition that \(f\) be extremized at a point \(P\) on a given level surface of \(g\) is that
\begin{equation}
\left.\frac{\partial f}{\partial v^i}\right|_P = 0\tag{11.5.7}
\end{equation}
for each \(i\text{.}\) Thus, at \(P\text{,}\) we again have (11.5.3); only the number of “components” of this equation has changed.
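These component equations are also easy to hand to a computer algebra system. The following is a minimal sketch in Python using the sympy library (the library choice and the particular \(f\) and \(g\) are illustrative assumptions, not taken from the text):
\begin{verbatim}
# Minimal sketch (assumes the sympy library is available); the choices
# of f and g below are illustrative, not taken from the text.
import sympy as sp

x, y, z, lam = sp.symbols('x y z lam', real=True)
f = x + 2*y + 3*z            # function to extremize
g = x**2 + y**2 + z**2       # constraint function, g = 1

# Corresponding partial derivatives must be proportional,
# together with the constraint itself.
eqs = [sp.diff(f, v) - lam*sp.diff(g, v) for v in (x, y, z)]
eqs.append(g - 1)

for s in sp.solve(eqs, [x, y, z, lam], dict=True):
    print(s, '  f =', sp.simplify(f.subs(s)))
\end{verbatim}
Each dictionary returned by \(\texttt{solve}\) describes one candidate point \(P\) together with its multiplier \(\lambda\text{.}\)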
We can further generalize this procedure to multiple constraints, involving constraint functions \(g^j\text{.}\) The intersection of \(k\) such constraint surfaces will in general be an \((n-k)\)-dimensional surface, on which we can introduce \(n-k\) coordinates \(v^i\text{,}\) and we must now use \(\{g^j,v^i\}\) as coordinates, leading to
\begin{equation}
df = \sum_{j=1}^{k}\frac{\partial f}{\partial g^j}\,dg^j + \sum_{i=1}^{n-k}\frac{\partial f}{\partial v^i}\,dv^i\text{.}\tag{11.5.8}
\end{equation}
The condition that \(f\) be extremized on this intersection surface is the same as before, namely that the second sum above vanish, but now we obtain
\begin{equation}
df = \sum_{j=1}^{k}\lambda^j\,dg^j\tag{11.5.9}
\end{equation}
at the point \(P\text{,}\) where \(\lambda^j = \left.\partial f/\partial g^j\right|_P\text{.}\) Thus, the differential of \(f\) is a linear combination of the differentials of the constraint functions, a condition which must hold for each partial derivative separately, that is, \(\partial f/\partial x^i = \sum_j \lambda^j\,\partial g^j/\partial x^i\text{.}\) The resulting system of equations (together with the \(k\) constraint conditions) can be solved for \(P\) (and the \(\lambda^j\)), at least in principle.
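Continuing the computational sketch above (same assumptions about sympy; the particular \(f\) and constraints are again illustrative), the multiple-constraint version simply subtracts one multiplier term per constraint. Here \(f=x^2+y^2+z^2\) is minimized on the line where two planes intersect:
\begin{verbatim}
# Two constraints: extremize f = x^2 + y^2 + z^2 on the line where
# the planes g1: x + y + z = 1 and g2: x - y = 0 intersect.
import sympy as sp

x, y, z, l1, l2 = sp.symbols('x y z l1 l2', real=True)
f = x**2 + y**2 + z**2
g1 = x + y + z - 1
g2 = x - y

# df is a linear combination of dg1 and dg2, component by component.
eqs = [sp.diff(f, v) - l1*sp.diff(g1, v) - l2*sp.diff(g2, v)
       for v in (x, y, z)]
eqs += [g1, g2]

print(sp.solve(eqs, [x, y, z, l1, l2], dict=True))
# One solution: x = y = z = 1/3, with l1 = 2/3 and l2 = 0.
\end{verbatim}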