Multivariable Calculus with Nilpotent Infinitesimals: More Smooth Infinitesimal Analysis

This is a continuation of my earlier post on smooth infinitesimal analysis. In this installment, I’ll show how the definition of a “stationary point” in Smooth Infinitesimal Analysis leads directly to a nice substitute for the Lagrange multipliers method. Then I’ll show how you can define differential forms as objects which assign a “signed volume” to genuinely infinitesimal objects, and how you can get Stokes’s Theorem (and the divergence theorem, etc.) in SIA.

Multivariable Calculus

Definition [Partial Derivatives].
Let f(x,y) be a function from R^2 to R. We define the partial derivative \partial f/\partial x (also written f_x) as follows: Given y, let g_y(x) = f(x,y). Then f_x(x_0,y_0) is defined to be g'_{y_0}(x_0). A similar definition is made for f_y, and for functions of more than two variables.

Definition [D(n)].
For n\in \mathbb{N}, let D(n) = \{(d_1,\ldots, d_n)\in D^n \mid \forall i,j\, d_id_j = 0\}. Note that D(1) = D.

The sets D(n) play the role in multivariable calculus that D played in single-variable calculus. For example, we have the following.

Proposition. Let f(x,y) be a function from R^2 to R. Then, for all (d_1, d_2) \in D(2),

f(x_0 + d_1, y_0 + d_2) = f(x_0,y_0) + d_1 f_x(x_0,y_0) + d_2 f_y(x_0,y_0)

and furthermore, f_x(x_0,y_0) and f_y(x_0,y_0) are unique with those properties.

The analogous statement is also true for functions of more than two variables. \square
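The proposition can be mimicked on a computer by doing arithmetic with expressions a + b_1 d_1 + ... + b_n d_n and setting every product d_i d_j to 0. The sketch below is only an illustration of that algebra, not a model of SIA itself; the class name Jet and the sample function f(x,y) = x^2 y + 3y are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Jet:
    """Represents a + b_1*d_1 + ... + b_n*d_n with every product d_i*d_j equal to 0."""
    real: float
    infs: Tuple[float, ...]

    @staticmethod
    def _lift(x, n):
        # Treat a plain number as a Jet with no infinitesimal part.
        return x if isinstance(x, Jet) else Jet(float(x), (0.0,) * n)

    def __add__(self, other):
        o = Jet._lift(other, len(self.infs))
        return Jet(self.real + o.real,
                   tuple(p + q for p, q in zip(self.infs, o.infs)))

    __radd__ = __add__

    def __mul__(self, other):
        o = Jet._lift(other, len(self.infs))
        # (a + b.d)(a' + b'.d) = a a' + (a b' + a' b).d because d_i d_j = 0.
        return Jet(self.real * o.real,
                   tuple(self.real * q + o.real * p
                         for p, q in zip(self.infs, o.infs)))

    __rmul__ = __mul__


def f(x, y):
    # Hypothetical sample function: f(x, y) = x^2 y + 3 y.
    return x * x * y + 3 * y


# Evaluate f at (x_0 + d_1, y_0 + d_2) with (x_0, y_0) = (2, 5).
x = Jet(2.0, (1.0, 0.0))
y = Jet(5.0, (0.0, 1.0))
result = f(x, y)
print(result.real, result.infs)
```

The real part is f(x_0,y_0), and the coefficients of d_1 and d_2 are f_x(x_0,y_0) = 2 x_0 y_0 and f_y(x_0,y_0) = x_0^2 + 3, as the proposition predicts.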

We also have the following.

Proposition [Extended Microcancellation]. Let a_1, \ldots, a_n\in R. Suppose that for all (d_1,\ldots,d_n) \in D(n), \sum a_id_i = 0. Then each a_i equals 0. \square

Stationary Points and Lagrange Multipliers

There is an interesting substitute for the method of Lagrange multipliers in Smooth Infinitesimal Analysis. To introduce it, I’ll first discuss the concept of stationary points.

Suppose that we’ve forgotten what a stationary point (or critical point) is, and we need to redefine the concept in Smooth Infinitesimal Analysis. How should we do it? We want every local maximum and local minimum to be a stationary point. A point x gives rise to a local maximum (x,f(x)) of a single-variable function f just in case there is some neighborhood of x such that f(x) \geq f(x_0) for all x_0 in that neighborhood.

However, in Smooth Infinitesimal Analysis, there is always a neighborhood of x on which f is linear, namely x + D. That means that for x to be a local maximum, f must be \emph{constant} on that neighborhood. Obviously, the same is true if x is a local minimum. This suggests that we say that f has a stationary point at x just in case f(x) = f(x + d) for all d\in D.

Definition [Stationary Point of a Single-Variable Function]. Let f\in R^R and x\in R. We say that f has a stationary point at x if for all d\in D, f(x + d) = f(x).

Similarly, given a function f(x,y) of two variables, and a point (x_0,y_0), f is linear on the set (x_0,y_0) + D(2). This suggests that we define (x_0,y_0) to be a stationary point of f just in case f(x_0,y_0) = f(x_0 + d_1, y_0 +d_2) for all (d_1,d_2)\in D(2).

Definition [Stationary Point of a Multivariable Function]. Let f\colon R^n\to R. We say that \bar{x}\in R^n is a stationary point of f if for all \bar{d}\in D(n), f(\bar{x} + \bar{d}) = f(\bar{x}).

Now, suppose we want to maximize or minimize a function f(x,y) subject to the constraint that (x,y) lie on some level surface g(x,y) = k, where k is a constant. In that case, we should require of (x_0,y_0) not that f(x_0 + d_1, y_0 + d_2) = f(x_0,y_0) for all (d_1, d_2) \in D(2), but only for those (d_1,d_2)\in D(2) which keep (x_0,y_0) on the same level surface of g; that is, those (d_1,d_2)\in D(2) for which g(x_0 + d_1, y_0 + d_2) = g(x_0,y_0). I’ll record this in a definition.

Definition [Constrained Stationary Point]. Let f, g\colon R^n\to R. A point \bar{x}\in R^n is a stationary point of f constrained by g if for all \bar{d}\in D(n), if g(\bar{x} + \bar{d}) = g(\bar{x}) then f(\bar{x} + \bar{d}) = f(\bar{x}).

I’ll show how this definition leads immediately to a method of solving constrained extrema problems by doing an example.

This example (and this method) are from [Bell2]. Suppose we want to find the radius and height of the cylindrical can (with top and bottom) of least surface area that holds a volume of k cubic centimeters. The surface area is f(r,h) = 2\pi r h + \pi r^2 + \pi r^2, and we are constrained by the volume, which is g(r,h) = \pi r^2 h.

We want to find those (r,h) such that f(r + d_1, h + d_2) = f(r,h) for all those (d_1,d_2)\in D(2) such that g(r + d_1, h + d_2) = g(r,h). So, the first question is to figure out which (d_1,d_2)\in D(2) satisfy that property.

We have

g(r + d_1, h + d_2) = \pi (r + d_1)^2 (h + d_2)

which is

\pi(r^2 + 2rd_1)(h + d_2) = \pi(r^2h + 2rd_1h + r^2 d_2)

If this is to equal \pi r^2h, then we must have \pi(2rd_1h + r^2 d_2) = 0. Since r and h are positive, and hence invertible, we can divide by 2\pi r h to get d_1 = -(r/(2h)) d_2.

Now, we want to find an (r,h) so that f(r + d_1, h + d_2) = f(r,h) where d_1 = -(r/(2h)) d_2.

We have

f(r + d_1, h + d_2) = 2\pi((r + d_1)(h + d_2) + (r + d_1)^2)

which is

2\pi(rh + d_1h + d_2 r + r^2 + 2rd_1) = 2\pi(rh + r^2 + d_1(h + 2r) + d_2 r)

If this is to equal 2\pi(rh + r^2) then we must have d_1(h + 2r) + d_2 r = 0. Substituting d_1 = -(r/(2h)) d_2, we get (-(r/(2h))(h + 2r) + r)d_2 = 0. By microcancellation, we have -(r/(2h))(h + 2r) + r = 0, from which it follows that 2r =h.
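As a sanity check on the computation above, one can redo it mechanically with pairs (a, b) standing for a + b d_2, where d_2^2 = 0, and d_1 replaced by -(r/(2h)) d_2. This is a minimal sketch under that assumption; the helper names add, mul, const, and first_order_changes are invented for the illustration.

```python
import math


def add(p, q):
    # (a + b e) + (a' + b' e)
    return (p[0] + q[0], p[1] + q[1])


def mul(p, q):
    # (a + b e)(a' + b' e) = a a' + (a b' + b a') e, because e^2 = 0
    return (p[0] * q[0], p[0] * q[1] + p[1] * q[0])


def const(c):
    return (c, 0.0)


def area(r, h):
    # f(r, h) = 2 pi (r h + r^2)
    return mul(const(2 * math.pi), add(mul(r, h), mul(r, r)))


def volume(r, h):
    # g(r, h) = pi r^2 h
    return mul(const(math.pi), mul(mul(r, r), h))


def first_order_changes(r, h):
    """Return the coefficients of d_2 in g and f after moving along the constraint."""
    r_moved = (r, -r / (2 * h))  # r + d_1 with d_1 = -(r/(2h)) d_2
    h_moved = (h, 1.0)           # h + d_2
    return volume(r_moved, h_moved)[1], area(r_moved, h_moved)[1]


print(first_order_changes(1.0, 2.0))  # h = 2r: both coefficients vanish
print(first_order_changes(1.0, 1.0))  # h != 2r: the area coefficient does not vanish
```

The volume coefficient is zero in both cases because the displacement was chosen to stay on the level surface of g; the area coefficient is zero only when h = 2r.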

Stokes’s Theorem

It is interesting that not only can the theorems of vector calculus such as Green’s theorem, Stokes’s theorem, and the Divergence theorem be stated and proved in Smooth Infinitesimal Analysis, but, just as in the classical case, they are all special cases of a generalized Stokes’s theorem.

In this section I will state Stokes’s theorem.

Definition. Given x, y\in R, we say that x\leq y if \neg(y < x). We define \lbrack x,y\rbrack to be the set \{z\in R\mid x \leq z \leq y \}.

Definition. Let C\colon \lbrack 0,1\rbrack\to R^3 be a curve, and F = \langle M,N,P\rangle \colon R^3 \to R^3 be a vector field. The line integral \int_C F\cdot dr is defined to be \int_0^1 F(C(t))\cdot C'(t)\,dt.

Definition. Let S = S(u,v)\colon [0,1]^2\to R^3 be a surface, and f\colon R^3 \to R be a function. The surface integral \iint_S f\,d\sigma is defined to be \int_0^1 \int_0^1 f(S(u,v))\cdot |S_u(u,v) \times S_v(u,v)|\,du\,dv.

This definition may be intuitively justified in the same manner that the arclength of a function was derived in an earlier section.

Definition. Let S = S(u,v)\colon [0,1]^2\to R^3 be a surface, and F\colon R^3\to R^3 be a vector field. The surface integral \iint_S F\cdot n\,d\sigma is defined to be

\iint_S F\cdot \left(\frac{S_u \times S_v}{|S_u \times S_v|}\right)\,d\sigma.

Note that this equals \int_0^1 \int_0^1 F(S(u,v))\cdot (S_u(u,v) \times S_v(u,v))\,du\,dv.

We extend both definitions to cover formal R-linear combinations of curves and surfaces, and we define the boundary \partial S of a surface S to be the formal R-linear combination of curves S(\cdot,0) + S(1,\cdot) - S(\cdot,1) - S(0,\cdot). This orientation is the one compatible with the normal direction S_u \times S_v used above.

The curl of a vector field F = \langle M,N,P\rangle is defined as usual, and we can prove the usual Stokes’s Theorem:

Theorem. Let S be a surface and F a vector field. Then

\iint_S \mathop{\mathrm{curl}} F \cdot n\,d\sigma = \int_{\partial S} F\cdot dr

This theorem may be used to compute answers to standard multivariable calculus problems requiring Stokes’s theorem in the usual way.
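For a concrete instance, the two sides of the theorem can be compared numerically using the definitions above. The following is a rough classical sketch, not an SIA computation: the surface S(u,v) = (u, v, uv), the field F = <-y, x, z> (whose curl is <0, 0, 2>), the midpoint rule, and the central-difference derivatives are all choices made for this example. The boundary is traversed as S(., 0) + S(1, .) - S(., 1) - S(0, .), the orientation compatible with the normal S_u x S_v.

```python
def S(u, v):
    # Hypothetical sample surface on [0,1]^2.
    return (u, v, u * v)


def F(p):
    # Hypothetical sample vector field F = <-y, x, z>.
    x, y, z = p
    return (-y, x, z)


def curl_F(p):
    # Curl of <-y, x, z>, computed by hand: <0, 0, 2>.
    return (0.0, 0.0, 2.0)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def derivative(curve, t, step=1e-5):
    # Central-difference approximation to curve'(t).
    return tuple((a - b) / (2 * step)
                 for a, b in zip(curve(t + step), curve(t - step)))


def surface_integral(n=100):
    # Midpoint rule for the double integral of curl F . (S_u x S_v).
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) / n, (j + 0.5) / n
            s_u = derivative(lambda a: S(a, v), u)
            s_v = derivative(lambda b: S(u, b), v)
            total += dot(curl_F(S(u, v)), cross(s_u, s_v))
    return total / (n * n)


def line_integral(curve, n=1000):
    # Midpoint rule for the integral of F(C(t)) . C'(t) over [0,1].
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += dot(F(curve(t)), derivative(curve, t))
    return total / n


def boundary_integral():
    # S(., 0) + S(1, .) - S(., 1) - S(0, .)
    return (line_integral(lambda t: S(t, 0.0))
            + line_integral(lambda t: S(1.0, t))
            - line_integral(lambda t: S(t, 1.0))
            - line_integral(lambda t: S(0.0, t)))


print(surface_integral(), boundary_integral())
```

Both numbers come out close to 2 for this choice of S and F; reversing the orientation of the boundary would flip the sign of the line integral.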

As an exercise, state the divergence theorem in SIA.

Generalized Stokes’s Theorem

The definitions in this section are directly from [Moerdijk-Reyes].

Definition [Infinitesimal n-cubes]. For n\in \mathbb{N}, and S any set, an infinitesimal n-cube in S is some (\bar{d},f) where \bar{d}\in D^n and f\colon D^n\to S.

Intuitively, an infinitesimal n-cube on a set S is specified by saying how you want to map D^n into your set, and how far you want to go along each coordinate.

Note that an infinitesimal 0-cube is simply a point.

Definition [Infinitesimal n-chains]. An infinitesimal n-chain is a formal R-linear combination of infinitesimal n-cubes.

Definition [Boundary of n-chains]. Let C be a 1-cube (d,f). The boundary \partial C is defined to be the 0-chain f(d) - f(0), where this is a formal R-linear combination of 0-cubes identified as points.

Let C be a 2-cube ((d_1,d_2),f). The boundary \partial C is defined to be the 1-chain (d_1,f(\cdot,0)) + (d_2,f(d_1,\cdot)) - (d_1,f(\cdot,d_2)) - (d_2,f(0,\cdot)).

In general, if C is an n-cube (\bar{d},f), the boundary \partial C is defined to be the (n-1)-chain \sum_{i = 1}^n \sum_{\alpha = 0,1} (-1)^{i + \alpha} ((d_1,\ldots,\hat{d_i},\ldots,d_n),(x_1,\ldots,\hat{x_i},\ldots,x_n) \mapsto f(x_1,\ldots, \alpha \cdot d_i,\ldots, x_n)), where a hat marks an omitted entry and \alpha\cdot d_i sits in the i-th argument of f.

The boundary map is extended to chains in the usual way.
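The bookkeeping in the general boundary formula can be spelled out in a few lines of code. In this sketch a cube is a pair (d, f) with d a tuple and f a Python function of len(d) arguments, and a chain is a list of (coefficient, cube) pairs. Ordinary floats stand in for the entries of \bar{d} purely as placeholders (an assumption of the illustration, since real numbers other than 0 are not nilpotent), and the function name boundary is invented here.

```python
def boundary(cube):
    """Return the boundary of an n-cube (d, f) as a list of (sign, face) pairs."""
    d, f = cube
    n = len(d)
    chain = []
    for i in range(1, n + 1):
        # Drop d_i from the tuple of side lengths.
        rest = d[:i - 1] + d[i:]
        for alpha in (0, 1):
            sign = (-1) ** (i + alpha)
            fixed = alpha * d[i - 1]

            # The face takes n-1 arguments and plugs alpha*d_i into slot i.
            def face(*xs, _i=i, _fixed=fixed):
                return f(*xs[:_i - 1], _fixed, *xs[_i - 1:])

            chain.append((sign, (rest, face)))
    return chain


# A 2-cube with placeholder side lengths d_1 = 0.5, d_2 = 0.25.
cube = ((0.5, 0.25), lambda x, y: (x, y))
chain = boundary(cube)
for sign, (rest, face) in chain:
    print(sign, rest, face(0.7))
```

For n = 2 the four terms are -(d_2, f(0, .)), +(d_2, f(d_1, .)), +(d_1, f(., 0)), and -(d_1, f(., d_2)), which is the 2-cube boundary given above up to the order of the terms. For n = 1 the same code yields -f(0) + f(d).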

Definition [Differential Forms]. An n-form on a set S is a mapping \omega from the infinitesimal n-cubes on S to R satisfying

1. Homogeneity. Let a\in R, 1\leq i\leq n, and f\colon D^n\to S. Define g\colon D^n\to S by g(\bar{d}) = f(d_1,\ldots,ad_i,\ldots,d_n). Then for all \bar{d}\in D^n, \omega((\bar{d},g)) = a\omega((\bar{d},f)).

2. Alternation. Let \sigma be a permutation of \{1,2,\ldots,n\}. Then \omega(\bar{d},\sigma f) = \mathrm{sgn}(\sigma)\cdot \omega(\sigma\bar{d},f), where \sigma f = (x_1,\ldots, x_n)\mapsto f(x_{\sigma(1)},\ldots, x_{\sigma(n)}) and \sigma\bar{d} = (d_{\sigma(1)},\ldots, d_{\sigma(n)}).

3. Degeneracy. If d_i = 0 for some i, then \omega(\bar{d},f) = 0.

We often write \omega as

C\mapsto \int_C\omega.

We extend \omega to act on all n-chains in the usual way.

These axioms intuitively say that \omega is a reasonable way of assigning an oriented size to the infinitesimal n-cubes.

The homogeneity condition says that if you double the length of one side of an infinitesimal n-cube, you double its size.

The alternation condition says that if you swap the order of two coordinates in an infinitesimal n-cube, then you negate its oriented size.

The degeneracy condition says that if any side of the infinitesimal n-cube is of length 0, its oriented size is 0.

By the Kock-Lawvere axiom, for all differential n-forms \omega, there is a unique map \tilde{\omega}\colon S^{D^n}\to R such that for all \bar{d}\in D^n and f\colon D^n\to S we have \omega(\bar{d},f) = d_1\cdots d_n\cdot \tilde{\omega}(f).

Definition [Exterior Derivative]. The exterior derivative d\omega of a differential n-form \omega is an (n+1)-form defined by

\int_C d\omega = \int_{\partial C} \omega

for all infinitesimal (n+1)-cubes C.

Definition [Finite n-cubes]. A finite n-cube in S is a map M from \lbrack 0,1\rbrack^n to S.

The boundary of a finite n-cube is defined in the same way that the boundary of an infinitesimal n-cube was defined.

In the previous section on Stokes’s Theorem, a curve was a finite 1-cube in R^3 and a surface was a finite 2-cube in R^3.

Definition [Integration of forms over finite cubes]. Let \omega be an n-form on S and M a finite n-cube on S. Then \int_M \omega is defined to be

\int_0^1\cdots \int_0^1 \tilde{\omega}(\bar{d} \mapsto M(\bar{t} + \bar{d}))\,dt_1\ldots dt_n.

Generalized Stokes’s theorem (for finite n-cubes) is provable in SIA (see [Moerdijk-Reyes] for the proof).

Theorem [Generalized Stokes’s Theorem]. Let S be a set, \omega an n-form on S, and M a finite (n+1)-cube on S. Then

\int_{\partial M}\omega = \int_M d\omega

Let’s see how this gives the Fundamental Theorem of Calculus.

Let F\in R^R and let f = F'. We would like to see how \int_0^1 f(t)\,dt = F(1) - F(0) is a special case of Generalized Stokes’s Theorem. (On the other hand, that it’s true is immediate from the way we defined integration.)

Let \omega be the 0-form on R defined by \omega(x) = F(x). (Recall that 0-cubes are identified with points.)

Then d\omega is the 1-form which takes infinitesimal 1-cubes (d,g) to \int_{\partial(d,g)} \omega. We must show that for the finite 1-cube \lbrack 0,1\rbrack, \int_{[0,1]} d\omega = \int_0^1 f(t)\,dt.

The boundary of (d,g) is g(d) - g(0) (as a formal linear combination, not as a subtraction in R). Therefore, \int_{\partial(d,g)} \omega = \omega(g(d)) - \omega(g(0)) = F(g(d)) - F(g(0)). Since g is a function from D to R, there is a unique a such that g(d) = g(0) + ad for all d\in D. Then F(g(d)) - F(g(0)) = F(g(0) + ad) - F(g(0)) = F(g(0)) + (F'(g(0)))ad - F(g(0)) = f(g(0))ad. Therefore, \tilde{d\omega}(g) = f(g(0))a, where g(d) = g(0) + ad for all d\in D.

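Therefore, \int_{[0,1]}d\omega = \int_0^1 \tilde{d\omega}(d\mapsto d + t)\,dt = \int_0^1 f(t)\,dt, since the map g(d) = t + d has g(0) = t and a = 1. On the other side, the boundary of the finite 1-cube \lbrack 0,1\rbrack is the 0-chain 1 - 0, so \int_{\partial [0,1]}\omega = F(1) - F(0), and Generalized Stokes’s Theorem gives \int_0^1 f(t)\,dt = F(1) - F(0).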
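The same calculation can be traced numerically for a specific F. In this sketch, pairs (a, b) stand for a + b d with d^2 = 0, the coefficient of d in F(t + d) plays the role of \tilde{d\omega}(d \mapsto t + d), and a midpoint rule stands in for the integral. The choice F(x) = x^3 and all function names are assumptions made for the illustration.

```python
def mul(p, q):
    # (a + b d)(a' + b' d) = a a' + (a b' + b a') d, because d^2 = 0
    return (p[0] * q[0], p[0] * q[1] + p[1] * q[0])


def F(x):
    # Hypothetical sample antiderivative F(x) = x^3, acting on pairs.
    return mul(x, mul(x, x))


def d_omega_tilde(t):
    # g(d) = t + d, so F(g(d)) - F(g(0)) = d * (coefficient returned here).
    return F((t, 1.0))[1]


def integrate(fn, n=1000):
    # Midpoint rule on [0, 1].
    return sum(fn((i + 0.5) / n) for i in range(n)) / n


lhs = integrate(d_omega_tilde)                 # integral of d(omega) over [0,1]
rhs = F((1.0, 0.0))[0] - F((0.0, 0.0))[0]      # omega on the boundary: F(1) - F(0)
print(lhs, rhs)
```

Here d_omega_tilde(t) evaluates to 3 t^2 = F'(t), so the two printed values agree up to the quadrature error.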

One can show in a similar manner that Stokes’s theorem and the Divergence theorem are special cases of Generalized Stokes’s theorem, although the computations are significantly more arduous.
