13 February, 2016

Linear Algebra

While linear algebra is, strictly speaking, a subject of algebra and not of analysis or differential equations (which are, along with probability theory, the basic mathematical tools used for studying quantitative finance), the theory of vector spaces is so fundamental to the modern formulation of either that it is necessary to study it before proceeding too far.  Indeed, the theory has proven itself to be perhaps the most influential in unifying significant branches of modern mathematics.

07 December, 2015

Numerical Analysis Review Part 1a: Solutions of Equations in One Variable (Bisection Method)

- Introduction -

This post begins a sequence of posts reviewing topics from numerical analysis that are important in quantitative finance. In this first post we will cover root solving techniques for single equations, discussing aspects of the theory and exhibiting several computational examples for different types of equations.

By subtracting from both sides, any system of $m$ equations in $n$ variables can be written in the form
$$F(x)=0,\;\;\;\;(1)$$
where $F:\mathbb{R}^{n}\to\mathbb{R}^{m}$ is assumed to be a sufficiently smooth $\mathbb{R}^{m}$-valued function of $x\in\mathbb{R}^{n}$, where the meaning of sufficient depends on the particular application or numerical method being applied.  Such a system is said to be in explicit form.  If the form (1) is not possible, then the system is said to be in implicit form and must be written as
$$F(x,y)=0,\;\;\;\;(2)$$
where $x\in\mathbb{R}^{n}$, $y\in\mathbb{R}^{m}$ and $F:\mathbb{R}^{m+n}\to\mathbb{R}^{m}$.  We will assume that every system we encounter is explicit unless otherwise stated.  For example,
is in explicit form and
is in implicit form and cannot be put into explicit form by using elementary functions and operations alone.  Note that we do not limit ourselves to purely algebraic equations and allow for analytic constructs such as definite integrals and special functions defined by power series; however, we do not allow for equations that involve integrals or derivatives of $y$ (in particular, we are solving for variables or numbers, not functions; the latter is the topic of later posts covering integral and differential equations).  In any case, it is easy to find examples of purely algebraic equations that cannot be put into explicit form.

In general, for there to be any hope of a solution to (1) it is required that $m\leq n$ (if $m>n$ then $F$ is not surjective and hence there may be no $x\in\mathbb{R}^{n}$ such that $F(x)=0$).  If $m=n$ then we can expect any solution (if one exists) to be unique.  If $m<n$, then there could be many solutions and the system is said to be underdetermined (in such a case, we can often introduce $n-m$ additional equations called constraint conditions so that we obtain a new system $G(x)=0$ where $G:\mathbb{R}^{n}\to\mathbb{R}^{m+(n-m)}=\mathbb{R}^{n}$).  In the special case that $F$ is linear (i.e., $F$ can be represented by matrix multiplication), these dimensional heuristics become actual theorems and much more can be definitively stated.  In any case, we shall come to these topics in future posts on systems of both linear and nonlinear equations.

For the remainder of this post, we shall assume $m=n=1$ (we will not deal with the case $n>1$ here because this necessarily leads us to consider constraints, which in turn leads us to consider systems of equations where $m>1$).

- Single Equations in Explicit Form -

Let us consider an explicit equation in the form of (1) for $m=n=1$.  We shall consider several numerical methods for this problem:
  1. Bisection
  2. Fixed Point
  3. Newton-Raphson
  4. Secant Line
Each of these methods belongs to the class called iterative methods, as opposed to direct methods which are really only applicable to systems of linear equations where systematic direct procedures are possible because of certain theorems from linear algebra (indeed, if we could solve (1) directly, we would not need to resort to numerical methods, and there is no general procedure for solving systems of nonlinear equations in a systematic and direct fashion).

There are some general considerations that apply to all iterative methods, among them:
  1. What requirements (including regularity conditions) must the problem satisfy?
  2. How fast does the method converge?
  3. What situations can lead to numerical instability for the method?
  4. How sensitive is the method to choice of initial approximation?
We shall address these considerations when reviewing each of the methods listed above.

- Bisection Method -

Let $F:\mathbb{R}\to\mathbb{R}$ be continuous on $[a,b]$.  The intermediate value theorem from elementary point-set topology states that for such an $F$, if $F(a)$ and $F(b)$ have opposite signs (say $F(a)\leq0$ and $F(b)\geq0$), then there exists some $x'\in[a,b]$ such that $F(x')=0$.  Although the method we are about to describe will converge even if $x'$ is not unique, we assume that it is (this can often be justified from considerations of the problem).
Assuming that $F(a)$ and $F(b)$ have opposite signs and are non-zero (if either is equal to $0$ then we are done), we sample a point $x_{0}\in(a,b)$ and compute $F(x_{0})F(a)$ and $F(x_{0})F(b)$.  Since $F(a)$ and $F(b)$ have opposite signs, $F(x_{0})$ (if non-zero) has the opposite sign from either $F(a)$ or $F(b)$, and we define $I_{1}=(x_{0},b)$ or $I_{1}=(a,x_{0})$ according to whether
$$F(x_{0})F(b)<0\;\;\;\;\text{or}\;\;\;\;F(x_{0})F(a)<0$$
holds.  We then iterate this procedure, obtaining a sequence of intervals $\{I_{n}\}_{n\geq0}$ with $x'\in I_{n}$ for every $n$ and $|I_{n}|\to0$ as $n\to\infty$.  In principle we would stop once $F(x_{n})=0$, although we usually use a condition like
$$|x_{n}-x_{n-1}|<\epsilon\;\;\;\;\text{or}\;\;\;\;|F(x_{n})|<\epsilon$$
for a prescribed tolerance $\epsilon>0$.
We have thus established the main requirements for our method (Consideration #1): continuity of $F$ (so that the intermediate value theorem holds) and determination of an interval $I_{0}=[a,b]$ where $x'\in(a,b)$ and $F(a)F(b)<0$.  Furthermore, in order for the method to converge to the root applicable to the problem in mind (in the event that $F$ has multiple roots in its domain), we have the additional requirement that $I_{0}$ be chosen so that $F$ has only one root in $I_{0}$ (it is clear that the method will converge to a root as long as one exists in $I_{0}$).

To address Consideration #2, we present the following theorem.
Theorem 1.  Suppose that $F\in\mathcal{C}[a,b]$, that $F(a)F(b)<0$, and that $x_{0}=\frac{b+a}{2}$ (the midpoint of $[a,b]$).  Then the bisection method generates a sequence of points $\{x_{n}\}_{n=0}^{\infty}\subset[a,b]$ such that $x_{n}\to x'$ where $F(x')=0$.  Moreover, we have the quantitative bound $$|x_{n}-x'|\leq\frac{b-a}{2^{n}}$$ for every $n\geq1$.
Proof.  We assume that the midpoint is taken to advance each step of the iteration, i.e. $x_{n}=\frac{1}{2}(b_{n}+a_{n})$ for all $n\geq1$ is the midpoint of $I_{n}=[a_{n},b_{n}]$.  In that case, since for all $n\geq1$
$$b_{n}-a_{n}=\frac{b-a}{2^{n}}$$
and $x'\in(a_{n},b_{n})$, we have
$$|x_{n}-x'|\leq\frac{1}{2}(b_{n}-a_{n})=\frac{b-a}{2^{n+1}}\leq\frac{b-a}{2^{n}}.$$
In particular, $x_{n}\to x'$ as $n\to\infty$.
It is instructive to solve for $n$ given an error tolerance $\epsilon$.  We require
$$\frac{b-a}{2^{n}}\leq\epsilon,$$
thus taking logarithms (which is valid since $\log x$ is an increasing function) we obtain
$$n\geq C\log\left(\epsilon^{-1}(b-a)\right)$$
where $C=(\log(2))^{-1}$.  If $\epsilon=10^{-6}$, then $$N(\epsilon)\approx20+C\log(b-a),$$ thus if $(b-a)=1$ then at least 20 iterations are required before $|x_{n}-x'|<10^{-6}$ can be guaranteed.  In practice the required number of iterations can be lower (for instance if an iterate happens to land on the root), but even so, as we shall see there are methods which are guaranteed to converge much faster provided additional regularity conditions on $F$ are satisfied.

As for Considerations #3 and #4, it should be evident that the bisection method will always converge as long as $F$ is continuous.  Moreover, the approximating sequence is guaranteed to stay bracketed within the initial interval $[a,b]$ for all $n\geq 1$, and the convergence is not affected asymptotically by the initial choice $x_{0}$.  This is one of the great virtues of the bisection method and one of the reasons that it is frequently used to start some of the more sensitive iteration schemes that we will describe below.

To illustrate the bisection method and its convergence properties, we will implement it in C++ to solve some equations that we will also solve with the other methods to be discussed below.  There are by now many ways in C++ to represent functions for numerical methods.  Classically, this was done in C using function pointers, and in C++ using functors (objects with the () operator overloaded) or generically via templates.  C++11 (and, prior to C++11, the Boost library) offers a number of convenient alternatives, including lambda expressions and variadic templates, as well as library facilities such as std::function objects and argument binding.  Throughout this series, we will apply all of these old and new methodologies.  We shall also use MATLAB to prototype and/or check our C++ methods (and in some cases use the MATLAB API in our C++ programs).
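To make the alternatives concrete, here is a minimal, purely illustrative sketch (the function names and the extra parameter are hypothetical) of four ways a residual function might be packaged before being handed to a solver such as the one below:

#include <cmath>
#include <functional>

double f_ptr(double x) { return 1 - std::exp(x); }   // a plain function (usable via function pointer)

struct FResidual {                                    // a functor: operator() overloaded
 double operator()(double x) const { return 1 - std::exp(x); }
};

double shifted(double x, double c) { return c - std::exp(x); }  // hypothetical two-argument variant

int main()
{
 std::function<double(double)> F1 = f_ptr;                                            // from a function pointer
 std::function<double(double)> F2 = FResidual();                                      // from a functor
 std::function<double(double)> F3 = [](double x) { return 1 - std::exp(x); };         // from a lambda expression
 std::function<double(double)> F4 = std::bind(shifted, std::placeholders::_1, 1.0);   // via argument binding
 return 0;
}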

// bisection_method.h
#pragma once

#include <functional>

double bisection_method(std::function<double(double)>& F, int& error, double a = 0, double b = 1, double eps = 0.0000001);
int sgn(double x);

// bisection_method implementation
#include <cmath>
#include <utility>

double bisection_method(std::function<double(double)>& F, int& error, double a, double b, double eps)
{
 if (!F) {                          // no callable target supplied
  error = 1;
  return 1;
 }
 if (sgn(F(a)) == sgn(F(b))) {      // no sign change on [a, b]
  error = 2;
  return 1;
 }
 if (b < a)
  std::swap(a, b);
 int N = std::round(std::log((b - a) / eps) / std::log(2)) + 1;  // theoretical iteration bound N(eps)
 double x_n = (a + b) / 2;          // current midpoint
 double x = x_n + 1;                // previous midpoint (offset to force entry into the loop)
 int n = 0;
 while ((n < N) && (std::abs(x_n - x) > eps)) {
  x = x_n;
  if (sgn(F(a)) * sgn(F(x)) < 0) {  // sign change on [a, x]: update b_n
   b = x;
   x_n = (a + x) / 2;
  } else {                          // sign change on [x, b]: update a_n
   a = x;
   x_n = (x + b) / 2;
  }
  ++n;
 }
 if (n == N)
  error = 3;                        // iteration bound reached before meeting the tolerance
 else
  error = 0;
 return x_n;
}

int sgn(double x)
{
 return x < 0 ? -1 : 1;
}

#include <iostream>
#include <cstdlib>
#include <functional>
#include <cmath>
#include "bisection_method.h"

int main(void)
{
 int error = 0;
 std::function<double(double)> F = [](double x) { return 1 - exp(x); };
 double x = bisection_method(F, error, -1, 3);
 std::cout << "Root estimated to be x' = " << x << " (Error Code: " << error << ")" << std::endl;
 return 0;
}

In this code we have asked our solver to find the root of $$F(x)=1-e^{x}$$
which it determines, within the default tolerance $\epsilon=10^{-7}$, to be $x'=0$.  Note that the routine returns with error code $0$, indicating that the bisection method succeeded before reaching the theoretical upper bound on the number of iterations, $N(\epsilon)$. The following figure illustrates the situation - observe that the root $x'=0$ is a fixed point of $F$. We shall discuss this situation further when we get to fixed-point iteration.


12 July, 2015

Gaussian Form of Geometric Brownian Motion

In numerical analysis, it can in some cases be more efficient to evaluate a function when it is expressed in a certain form.  For example, it is numerically more efficient to compute a second degree polynomial
$$P(x)=a_{2}x^{2}+a_{1}x+a_{0}$$
when it is written in the form
$$P(x)=a_{0}+x\left(a_{1}+a_{2}x\right).$$
That is because evaluating the former requires three multiplication operations and two addition operations, whereas the latter requires only two multiplication operations and two addition operations.  This way of evaluating polynomials numerically is known as Horner's Method, which generalizes to a polynomial of degree $n$ with coefficients $\{a_{j}\}_{0\leq j\leq n}.$  A simple implementation of the algorithm in C can be found here.  In general, evaluation of a degree $n$ polynomial $P(x)$ will require $2n-1$ multiplication operations and $n$ addition operations using the naïve method, whereas it will require just $n$ multiplication operations and $n$ addition operations using Horner's Method.  Although from the point of view of algorithmic analysis both methods are equivalent (they both require $O(n)$ operations to compute $P(x)$), on a practical/implementation level Horner's Method requires only $2n$ operations whereas the naïve approach requires $3n-1$ operations.  On the other hand, since the savings come only from the multiplication operations, the performance gain (unless each $a_{j}$ and $x$ are all integers) is somewhat tempered by the fact that on most modern hardware a floating point multiplication costs about the same as a floating point addition, so the eliminated multiplications are not disproportionately expensive (for integer arithmetic, by contrast, multiplication is considerably more expensive than addition).  Note also that if $x^{2}$ is expressed explicitly as $x*x$, then a modern compiler will very likely optimize the naïve expression into something similar to Horner's Method; however, for high degree polynomials, expanding the powers with the multiplication operator like this is not feasible without producing very ugly, static, and difficult to maintain code - a power function of some sort is likely to be used, and it is then significantly less likely that the compiler will be able (or, if it can, choose) to optimize the resulting code.
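As a brief sketch of the general formulation (assuming the coefficients are stored in order of increasing degree), Horner's Method reduces to a single accumulation loop; the function name and container choice here are just illustrative:

#include <vector>

// Evaluate P(x) = a[0] + a[1]*x + ... + a[n]*x^n using Horner's Method:
// exactly n multiplications and n additions for a polynomial of degree n.
double horner(const std::vector<double>& a, double x)
{
 double p = a.back();                                   // start from the leading coefficient a[n]
 for (auto it = a.rbegin() + 1; it != a.rend(); ++it)   // work downward through a[n-1], ..., a[0]
  p = p * x + *it;
 return p;
}

For the quadratic above, horner({a0, a1, a2}, x) performs exactly the two multiplications and two additions counted earlier.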

Now if the function in question is evaluated repeatedly, as in the case of finite difference methods and especially Monte Carlo simulations, then it is sometimes possible to achieve significant performance gains.  In financial applications, the formula for the exact solution to the SDE
$$dS_{t}=rS_{t}\,dt+\sigma S_{t}\,dW_{t},$$
where $W_{t}$ is a standard Brownian motion process, frequently needs to be evaluated repeatedly in the context of Monte Carlo simulations.  The formula is given by
$$S_{t}=S_{0}\exp\left\{\left(r-\frac{1}{2}\sigma^{2}\right)t+\sigma W_{t}\right\}.$$
Frequently, certain parameters in the formula are known and can be synthesized as constants before entering into the main control loop, which could be on the order of millions or even billions of iterations.  In the case of a very simple Monte Carlo simulation for the pricing of a European vanilla call option, as discussed in this post, all the parameters except for the samples $W_{t}$ are known ($t$ is fixed in this situation).  Thus we can rewrite $S_{t}$ as
$$S_{t}=Ae^{bN},\;\;\;\;\text{where}\;\;A=S_{0}\exp\left\{\left(r-\frac{1}{2}\sigma^{2}\right)t\right\},\;\;b=\sigma\sqrt{t},$$
and $N$ is a random variable with a standard normal distribution that is re-sampled at each iteration.  $A$ and $b$ are both computed before entering into the Monte Carlo simulation loop, thereby realizing a significant gain in computational efficiency when the number of iterations (trials) is large.
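A minimal sketch of this idea for a European vanilla call is given below; the parameter values and names are purely illustrative (they are not taken from the referenced post), and the point is simply that $A$ and $b$ are synthesized once, outside the simulation loop:

#include <algorithm>
#include <cmath>
#include <iostream>
#include <random>

int main()
{
 // Illustrative (hypothetical) parameters for a European vanilla call.
 const double S0 = 100.0, K = 100.0, r = 0.05, sigma = 0.2, t = 1.0;
 const long trials = 1000000;

 // Constants synthesized once, before entering the Monte Carlo loop.
 const double A = S0 * std::exp((r - 0.5 * sigma * sigma) * t);
 const double b = sigma * std::sqrt(t);

 std::mt19937_64 gen(12345);
 std::normal_distribution<double> N(0.0, 1.0);

 double payoff_sum = 0.0;
 for (long i = 0; i < trials; ++i) {
  const double S_t = A * std::exp(b * N(gen));   // S_t = A * exp(b * N), N ~ N(0, 1)
  payoff_sum += std::max(S_t - K, 0.0);
 }
 const double price = std::exp(-r * t) * payoff_sum / trials;
 std::cout << "Estimated call price: " << price << std::endl;
 return 0;
}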

In other situations, different functional forms of $S_{t}$ can be useful.  A number of optimized numerical routines are available for Gaussian functions, which are functions of the form
$$G(x)=\alpha\exp\left\{-\frac{(x-\beta)^{2}}{2\gamma^{2}}\right\}.$$
If we match $S_{t}$ with $G(x)$, then the obvious candidate for identification with the argument $x$ is $\sigma$.  If we complete the square in the argument of the exponential function in $S_{t}$, we get
$$S_{t}=S_{0}\exp\left\{rt+\frac{W_{t}^{2}}{2t}\right\}\exp\left\{-\frac{t}{2}\left(\sigma-\frac{W_{t}}{t}\right)^{2}\right\}.$$
Matching the triple of parameters $(\alpha,\beta,\gamma)$ in $G(\sigma)$, we find
$$\alpha=S_{0}\exp\left\{rt+\frac{W_{t}^{2}}{2t}\right\}, \;\;\;\;  \beta=\frac{W_{t}}{t}, \;\;\;\; \gamma=\frac{1}{\sqrt{t}}.$$
If we map $W_{t}\mapsto\sqrt{t}\,N$ where $N$ is a standard normal random variable, then
$$\alpha=S_{0}\exp\left\{rt+\frac{N^{2}}{2}\right\}, \;\;\;\; \beta=\frac{N}{\sqrt{t}}, \;\;\;\; \gamma=\frac{1}{\sqrt{t}}.$$

This is the Gaussian form of $S_{t}$, with independent variable $\sigma$.  This representation is particularly useful for numerical pricing routines when a stochastic/local volatility model is being used.
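As a sketch of how this form might be used (the routine below is only a stand-in for whatever optimized Gaussian evaluation routine is actually available, and the parameter values are illustrative), for a fixed draw $N$ the triple $(\alpha,\beta,\gamma)$ is computed once and $S_{t}$ is then evaluated over an entire grid of volatilities:

#include <cmath>
#include <cstddef>
#include <vector>

// G(x) = alpha * exp(-(x - beta)^2 / (2 * gamma^2)), evaluated over a grid of points x.
std::vector<double> gaussian_on_grid(double alpha, double beta, double gamma,
                                     const std::vector<double>& x)
{
 std::vector<double> out(x.size());
 const double c = 1.0 / (2.0 * gamma * gamma);
 for (std::size_t i = 0; i < x.size(); ++i)
  out[i] = alpha * std::exp(-(x[i] - beta) * (x[i] - beta) * c);
 return out;
}

int main()
{
 const double S0 = 100.0, r = 0.05, t = 1.0;
 const double N = 0.3;                                  // a fixed standard normal draw (illustrative)
 const double alpha = S0 * std::exp(r * t + 0.5 * N * N);
 const double beta  = N / std::sqrt(t);
 const double gamma = 1.0 / std::sqrt(t);
 const std::vector<double> sigmas = {0.10, 0.15, 0.20, 0.25, 0.30};
 const std::vector<double> S = gaussian_on_grid(alpha, beta, gamma, sigmas);  // S_t as a function of sigma
 return 0;
}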

23 March, 2015

A Primer in Harmonic Analysis

I picked these problems from Modern Fourier Analysis Vol I - I think that they serve as a good primer for the basic techniques and theorems in harmonic analysis (a subject that I have recently started looking back into in order to deal with some of the techniques used when working with Lévy processes in mathematical finance).
Problem I.  Fix $d\geq1$ and suppose $\psi:(0,\infty)\to[0,\infty)$ is $C^{1}$, non-increasing, and $\int_{\mathbb{R}^{d}}\psi(|x|)\;dx\leq A<\infty.$  Define
$$[M_{\psi}f](x)=\sup_{r>0}\left|\int_{\mathbb{R}^{d}}r^{-d}\psi\left(\frac{|y|}{r}\right)f(x-y)\;dy\right|$$
and show that $$[M_{\psi}f](x)\leq A[Mf](x),$$
where $M$ is the usual Hardy-Littlewood maximal function.
Solution.  We first observe that the translation invariance of the indicated estimate implies that it is sufficient to prove the case $x=0$ (this can be seen more explicitly by replacing $f$ by $\tau_{x}f$, where $\tau_{x}$ is the translation-by-$x$ operator, and applying the $x=0$ case to see that the estimate holds for all $x$).  For convenience let us define $\psi_{r}(|y|)=r^{-d}\psi(|y|/r)$.  The radial properties of the terms in the estimate suggest that polar coordinates will be useful in dealing with the resulting integrals.  Let us recall that the polar coordinates formula implies that
$$\frac{d}{ds}\int_{B(0,s)}f(y)\;dy=\frac{d}{ds}\int_{0}^{s}dt\int_{\partial B(0,t)}f(\omega)\;dS(\omega)=\int_{\partial B(0,s)}f(\omega)\;dS(\omega)=s^{d-1}\int_{S^{d-1}}f(s\omega)\;dS(\omega).$$
In the last term we have used a change of variables in order to place the integration over the unit sphere (and in particular, to keep the domain fixed).  In order to apply this formula in an integration by parts without causing notational chaos, let us define
$$\beta(s)=\int_{B(0,s)}|f(y)|\;dy.$$
Note that $\beta(s)$ is majorized by $\omega(d)s^{d}[Mf](0)$, where $\omega(d)$ is the measure of the unit ball in $\mathbb{R}^{d}$.
Let us make the further assumption that $\psi$ is compactly supported on a ball of radius $\delta$, so that $\psi_{r}$ is also compactly supported (and thus bounded) on a ball of radius $r\delta$.  Invoking polar coordinates in the left hand side, setting $x=0$, using $\psi(|y|)=\psi(|-y|)$ (so that the convolution may be written with the roles of the two factors exchanged), and integrating by parts together with the fact that $\beta(0)=0=\psi_{r}(r\delta)$, we estimate at last
\begin{align*}
|(\psi_{r}*f)(0)|&\leq\int_{\mathbb{R}^{d}}\psi_{r}(|y|)|f(y)|\;dy=\int_{0}^{\infty}\psi_{r}(s)\beta'(s)\;ds\\
&=-\int_{0}^{\infty}\psi_{r}'(s)\beta(s)\;ds\\
&\leq-\omega(d)[Mf](0)\int_{0}^{\infty}\psi_{r}'(s)s^{d}\;ds\\
&=d\,\omega(d)[Mf](0)\int_{0}^{\infty}\psi_{r}(s)s^{d-1}\;ds\\
&=[Mf](0)\int_{0}^{\infty}\psi_{r}(s)\;ds\int_{\partial B(0,s)}\;dS(\omega)\\
&=[Mf](0)\int_{\mathbb{R}^{d}}\psi_{r}(|y|)\;dy=[Mf](0)\int_{\mathbb{R}^{d}}\psi(|y|)\;dy\leq A\,[Mf](0),
\end{align*}
and taking the supremum over $r>0$ gives the estimate at $x=0$, as desired.  To complete the proof in general, simply take an increasing sequence $\psi_{n}\to\psi$ of compactly supported $C^{1}$ functions.  Since the estimate holds for each $\psi_{n}$, it holds for the limit function $\psi$ by monotone convergence.

Problem II.  Consider the heat kernel $$G(x,t)=\frac{1}{(4\pi t)^{\frac{d}{2}}}e^{-\frac{|x|^{2}}{4t}}.$$
  1. Given $\alpha>0$ find constants $\beta$ and $C$ so that $$G(x+y,t)\leq CG(x,\beta t)$$ holds for every $x\in\mathbb{R}^{d}$, $t>0$ and $|y|\leq\alpha\sqrt{t}$.
  2. Deduce that for $f\in L^{1}$ and $u(x,t)=(G(\cdot,t)*f)(x)$, $$\mu\left(\left\{y:|u(x,t)|\geq\lambda\;\text{for some}\;t>0\;\text{and}\;x\in B(y,\alpha\sqrt{t})\right\}\right)\leq\frac{||f||_{L^{1}}}{\lambda}.$$
(1) Solution. We estimate the quadratic form $|x|^{2}$ as follows:
\begin{align*}
|x|^{2}&=|(x+y)-y|^{2}\\
&=|x+y|^{2}-2(x+y)\cdot y+|y|^{2}\\
&\leq|x+y|^{2}+2|x+y||y|+|y|^{2}\\
&\leq2|x+y|^{2}+2|y|^{2}\\
&\leq2|x+y|^{2}+2\alpha^{2}t,
\end{align*}
where we have used the fact that $2ab\leq a^{2}+b^{2}$ for $a,b\in\mathbb{R}$ and also the restriction $|y|\leq\alpha\sqrt{t}.$  Consequently,
$$-\frac{|x+y|^{2}}{4t}\leq-\frac{|x|^{2}}{8t}+\frac{\alpha^{2}}{4},$$
and thus
\begin{align*}
G(x+y,t)&=(4\pi t)^{-d/2}\exp\left\{\frac{-|x+y|^{2}}{4t}\right\}\\
&\leq(4\pi t)^{-d/2}\exp\left\{\frac{-|x|^{2}}{8t}+\frac{\alpha^{2}}{4}\right\}\\
&=e^{\alpha^{2}/4}2^{d/2}(8\pi t)^{-d/2}\exp\left\{\frac{-|x|^{2}}{8t}\right\}\\
&=e^{\alpha^{2}/4}2^{d/2}\,G(x,2t).
\end{align*}
This completes the proof, with the estimate holding for $\beta=2$ and $C=e^{\alpha^{2}/4}2^{d/2}.$

Remark.  It is interesting that this result can be obtained as a special case of the Hardy-Moser-Trudinger inequality available here http://arxiv.org/abs/1012.5591.  Indeed, generalizing a bit, we are asked to produce constants $a,\beta$ such that
$$\sup_{x\in\mathbb{R}^d,y\in B(0,a)}\frac{G(x+y,t)}{G(x,\beta t)}<\infty.$$
But the continuity and non-vanishing of $G$ implies that the above estimate will hold if
$$\int_{\mathbb{R}^d} \frac{G(x+y,t)}{G(x,\beta t)} dx$$
converges.  Bounding the integral, we get
$$D\int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|x+y|^2-|x|^2\right\}\right) dx \\ \leq \int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|y|^2-2x\cdot y\right\}\right) dx:=J.$$
Now, using the fact that $|y|\leq a$, we get
$$J\leq I(a)=\int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|a|^2-2x\cdot y\right\}\right) dx.$$
The Moser-Trudinger inequality states that $I(a)<\infty$ for values $$a^2\frac{-\beta+1}{4t\beta} \leq 4\pi,$$ which readily gives $a\leq C\sqrt{t}$, from which the claim follows.

(2) Solution.  From Problem I we have
$$\sup_{t>0}\left|\left(G(\cdot,t)*f\right)(x)\right|\leq A\,[Mf](x),$$
and, since part (1) shows that $|u(x,t)|\leq C\left(G(\cdot,\beta t)*|f|\right)(y)$ whenever $x\in B(y,\alpha\sqrt{t})$, the claim follows from the weak-type $(1,1)$ inequality for the Hardy-Littlewood maximal function.

Problem III.  Let $f:\mathbb{R}^{d-1}\to\mathbb{R}$ belong to $L^{\infty}$ and let $u:\mathbb{R}^{d}_{+}\to\mathbb{R}$ be the Poisson integral of $f$.
  1. Show that $u(x,y)$ converges non-tangentially almost everywhere to $f(x)$ (i.e. from approach regions contained in any cone with vertex at $x$).  Specifically, for fixed $x_{0}$ and $$\mathcal{C}_{\alpha}(x_{0}):=\{(x,y)\in\mathbb{R}^{d}_{+}\;:\;|x-x_{0}|<\alpha y\},$$ we have $$\lim_{(x,y)\to(x_{0},0),\;(x,y)\in \mathcal{C}_{\alpha}(x_{0})}u(x,y)=f(x_{0}).$$ The convention for the Poisson kernel is $$P_{y}(x)=\frac{c_{n}y}{(|x|^{2}+y^{2})^{d/2}},$$ where we regard $y>0$ and $x\in\mathbb{R}^{d-1}$ so that a typical point $z\in\mathbb{R}^{d}_{+}$ is $z=(x,y)$ ($y$ is the distance from $z$ to $\partial\mathbb{R}^{d}_{+}$).  The Poisson integral of $f$ is then $$u(x,y)=\int_{\mathbb{R}^{d-1}}P_{y}(t)f(x-t)\;dt.$$ Since it will be needed below, we note that this is equivalent to the more usual representation $$P_{y}(x-t)=\frac{c_{n}y}{|(x,y)-(t,0)|^{d}}.$$  The expression $P_{y}(x-t)$ of course arises from the commutativity of the convolution above.  This last form will be used to obtain the final estimate for $J$ in the proof below. 
  2. Show that $u$ matches the boundary values in a distributional sense. 
  3.  Show that if $v\in L^{\infty}(\mathbb{R}^{d}_{+})$ is distributionally harmonic and matches the boundary values $f$ in a distributional sense, then $u=v$ as distributions.
(1) Solution.  Fix $x_{0}\in\mathbb{R}^{d-1}=\partial\mathbb{R}^{d}_{+}$.  To obtain a.e. nontangential convergence, we need to show that $u(x_{0}-t,y)\to f(x_{0})$ for a.e. $x_{0}$, where $t\in\mathbb{R}^{d-1}$ and $|t|\leq\alpha y$.  Since $\int_{\mathbb{R}^{d-1}}P_{y}(x-t)\;dx=1$, we have as a first attempt
$$|u(x_{0}-t,y)-f(x_{0})|\leq\int_{\mathbb{R}^{d-1}}P_{y}(x-t)|f(x_{0}-x)-f(x_{0})|\;dx.$$
If we estimate the integral directly, we get the inferior estimate $\leq2||f||_{\infty}.$  To gain better control on $f$, we suppose now that $x_{0}\in\mathcal{L}(f),$ the Lebesgue set of $f$.  Since $L^{\infty}\subset L^{1}_{\text{loc}}$, we see that $m(\mathcal{L}(f)^{c})=0$ by the corresponding well-known result for $L^{1}$.  Now let $\epsilon>0$.  Our choice of $x_{0}$ and the Lebesgue differentiation theorem imply the existence of a $\delta>0$ such that if $r<\delta$ we have
$$\frac{1}{m(B_{r})}\int_{|x-x_{0}|\leq r}|f(x)-f(x_{0})|\;dx=\frac{1}{m(B_{r})}\int_{|x|\leq r}|f(x_{0}-x)-f(x_{0})|\;dx<\epsilon.$$
With this $\delta$, we now estimate the integral above by splitting it over the complementary sets $B(0,\delta)$ and $(B(0,\delta))^{c}$, calling the two pieces $I$ and $J$, respectively.  Note that $I$ would be trivial if $f$ were continuous; however, we are saved by the pointwise majorization property of the maximal function.  Indeed, if $g(x)=|f(x)-f(x_{0})|\chi_{\{|x-x_{0}|<\delta\}}(x)$, then $[Mg](x_{0})\leq\epsilon$ by the above estimate, and combining Problem I with the nontangential bound on $P_{y}$ established below, we obtain
$$I\leq A_{\alpha}\int_{\mathbb{R}^{d-1}}P_{y}(x)g(x_{0}-x)\;dx\leq A_{\alpha}[Mg](x_{0})\leq A_{\alpha}\epsilon.$$
We now proceed to estimate $J$.  We first observe that if $y$ is fixed, $t\in\mathbb{R}^{d-1}$ and $|t|<\alpha y$, then
$$P_{y}(x-t)\leq A_{\alpha}P_{y}(x)$$
for some constant $A_{\alpha}$ independent of $f$ (this is very similar to the estimate in Problem II, except that there the analogue of $y$ is not fixed and a second constant $\beta$ must be introduced).  To prove this, take $d=2$ for simplicity; then for $\alpha,y>0$ and $|t|\leq\alpha y$,
$$(x-t)^{2}+y^{2}\geq(1-\epsilon)x^{2}+(1-\epsilon^{-1})t^{2}+y^{2}\geq\frac{x^{2}}{1+2\alpha^{2}}+\frac{y^{2}}{2}\geq\frac{x^{2}+y^{2}}{2(1+2\alpha^{2})},$$
where we have used the well-known Cauchy inequality ``with $\epsilon$,'' $2|x||t|\leq\epsilon x^{2}+\epsilon^{-1}t^{2}$, with the choice $\epsilon=\frac{2\alpha^{2}}{1+2\alpha^{2}}$ in the first inequality, and the restriction $|t|\leq\alpha y$ in the second.  Therefore
$$P_{y}(x-t)=\frac{c_{2}y}{(x-t)^{2}+y^{2}}\leq2(1+2\alpha^{2})\frac{c_{2}y}{x^{2}+y^{2}}=A_{\alpha}P_{y}(x),$$
as desired, with $A_{\alpha}=2(1+2\alpha^{2})$ (the generalization to $d>2$ is essentially the same, with $A_{\alpha}=\left(2(1+2\alpha^{2})\right)^{d/2}$).  Now, using the fact that $f\in L^{\infty}$, our choice of $x_{0}$, $\delta$, and $t$, and the latter expression for $P_{y}$ given in the problem statement, we get
\begin{align*}
J&\leq A_{\alpha}\int_{|x|>\delta}P_{y}(x)|f(x_{0}-x)-f(x_{0})|\;dx\\
&\leq2A_{\alpha}||f||_{\infty}\int_{|x|>\delta}\frac{c_{n}y}{(|x|^{2}+y^{2})^{d/2}}\;dx\\
&\leq2A_{\alpha}||f||_{\infty}c_{n}\,y\int_{|x|>\delta}\frac{dx}{|x|^{d}}\\
&\longrightarrow0\;\;\text{as}\;y\to0,
\end{align*}
since the latter integral is finite (the integrand $|x|^{-d}$ is integrable at infinity over $\mathbb{R}^{d-1}$).  Putting this together, we get
$$\limsup_{|t|<\alpha y,y\to0}|u(x_{0}-t,y)-f(x_{0})|\leq A_{\alpha}\epsilon\to0\;\text{as}\;\epsilon\to0,$$
from which the desired nontangential convergence follows.

(2) Solution.  Since $f\in L^{\infty}$, we have $Mf\in L^{\infty}$ (indeed, $[Mf](x)\leq||f||_{\infty}$ pointwise, since $Mf$ is a supremum of averages of $|f|$).  It follows from Problem I that
$$\sup_{y>0}|u(x,y)|=\sup_{y>0}|P_{y}*f|(x)\leq A(Mf)(x)\leq A||Mf||_{\infty}<\infty,$$
and thus $u$ is bounded uniformly in $y$ on $\mathbb{R}^{d}_{+}$.
Fixing a test function $\phi\in C_{c}^{\infty}(\mathbb{R}^{d-1})$, the dominated convergence theorem, together with the a.e. convergence $u(x,y)\to f(x)$ as $y\to0$ from part (1), now implies (since $|u(x,y)\phi(x)|\leq A||Mf||_{\infty}|\phi(x)|,$ which is integrable)
$$\lim_{y\to0}\int_{\mathbb{R}^{d-1}}u(x,y)\phi(x)\;dx=\int_{\mathbb{R}^{d-1}}f(x)\phi(x)\;dx.$$
Since this holds for every test function, it follows that $u(x,0)=f(x)$ in the sense of distributions (the $0$ is just suggestive notation to indicate the restriction of $u$ to the boundary of the half space; $u$ is \emph{not} defined by convolution with $P_{y}$ when $y=0$ and must be obtained by a limiting process).
In fact, the limit function $\lim_{y\to0}u(x,y)$ is equal to $f(x)$ a.e. by the fundamental lemma of the calculus of variations.

(3) Solution.  For test functions $\phi\in C_{c}^{\infty}(\mathbb{R}^{d-1})$ and $\psi\in C_{c}^{\infty}(\mathbb{R}^{d}_{+})$, we have
$$(1)\;\;\int_{\mathbb{R}^{d-1}}v(x,0)\phi(x)\;dx=\int_{\mathbb{R}^{d-1}}f(x)\phi(x)\;dx$$
$$(2)  \int_{\mathbb{R}^{d}_{+}}v(x,y)\Delta\psi(x,y)\;dxdy=0.$$

(Again, setting $y=0$ is just notation for the restriction of $u$ or $v$ to $\mathbb{R}^{d-1}.$)  By part (2), $(1)=\int_{\mathbb{R}^{d-1}}u(x,0)\phi(x)\;dx$, and so we see already that $u=v=f$ in the sense of distributions on $\mathbb{R}^{d-1},$ and in fact pointwise a.e.  Weyl's lemma implies that if $v\in L^{1}_{\text{loc}}(\mathbb{R}^{d}_{+})$ (recall that the half space is \emph{open}) is a weak solution to Laplace's equation, then $v$ is a classical solution after possibly a correction on a set of measure zero.  The uniqueness of solutions to the Dirichlet problem and the preceding application of Weyl's lemma (since $L^{\infty}\subset L^{1}_{\text{loc}}$) now imply that $u=v$ a.e. in $\overline{\mathbb{R}^{d}_{+}}$, and so they are equal in the sense of distributions and, after a correction of $v$ on a set of measure zero, they are equal pointwise too.
Remark.  The proof of Weyl's lemma is a mollification argument.  For details, see problem #5 in my linked blog post, in which I proved the lemma in a different course: http://mathtm.blogspot.com/2013/02/math-266b-assignment-2.html

22 March, 2015

Divergence of Harmonic Series on a Sequence of Decreasing Sub-Domains of $\mathbb{N}$

The series $\sum_{n\in\mathbb{N}}n^{-p}$ diverges if $p\leq1$ and converges if $p>1$, and so it may seem plausible that (being a "bifurcation point" of this condition) the harmonic series $\sum_{n\in\mathbb{N}}n^{-1}$ could converge on some proper subset $A\subset\mathbb{N}$. This is obvious if $A$ is finite. If $A$ is infinite, then a moment's thought reveals that there are many subsets on which the harmonic series converges, since any convergent series whose terms are reciprocals of distinct natural numbers is a sub-series of it. So for instance $$\sum_{n\in\mathbb{N}}n^{-2}=\frac{\pi^{2}}{6},$$ $$\sum_{n\in\mathbb{N}}\frac{1}{n!}=e-1,$$ $$\sum_{n\in\mathbb{N}}\frac{1}{2^{n}}=1,$$ and so on. Given that rather "large" subsets of $\mathbb{N}$ lead to convergence of the harmonic series, the following result was somewhat surprising to me when I was first asked to prove it.
Claim. Let $$A_\epsilon := \{a \in \mathbb{N} : 1 - \cos(a) < \epsilon\}.$$ Then $$\sum_{n\in A_\epsilon } \frac{1}{n}$$ diverges for all $0<\epsilon<1.$
Proof.  For $0<\epsilon< 1$, the inequality $1-\cos(a)< \epsilon$ has solutions for $$a\in(2k\pi-\theta,2k\pi+\theta),\;\;\;\;k\in\mathbb{N},$$ where $\theta=\cos^{-1}(1-\epsilon)$ (note that $\theta\in(0,\frac{\pi}{2})$ and, by using a Taylor expansion, it is easy to see that $\theta=O(\epsilon^{\frac{1}{2}})$, although all that is important is that $\theta\to0$ as $\epsilon\to0$). For there to be any positive integer $a:=a_{k}$ in such an interval, it is necessary and sufficient that
$$[2k\pi+\theta]-[2k\pi-\theta]\geq1,$$
where $[\cdot]$ is the "floor" function (round down, i.e. truncate the decimals). Intuitively, this condition just says that there is an integer in the $k$th solution interval (note that there could be multiple integer solutions in a given interval, though this is not very important since we are mostly interested in the case of small $\epsilon$; furthermore, since $\theta=O(\epsilon^{\frac{1}{2}})$, once $\epsilon$ is sufficiently small (say $\epsilon<0.1$, so that $2\theta<1$), each interval can contain at most one integer).

From the above observations and the fact that $2\pi<6.3$ (the circumference of the unit circle), it is not difficult to ascertain that $\#A=\infty$, where we write $A:=A_{\epsilon}$ and $\#A$ denotes the cardinality of the set $A$ (this, and the finiteness of $D$ below, can be made rigorous using the equidistribution of $2k\pi$ modulo $1$ and the three-distance theorem for irrational rotations, since $2\pi$ is irrational).  Therefore $A$ is countably infinite, with its elements forming an "approximate" arithmetic sequence of integers in the sense that
$$D:=\max_{a_{i}\in A}|a_{i+1}-a_{i}|<\infty$$
and, on "average," the difference between two successive elements is comparable to $D$ (sharp analytical results are unnecessary in the present situation, as we are only after qualitative facts like convergence or divergence).

We can now determine whether or not the sum converges. Define sequences $a_{j}:=\frac{1}{j}$ for $j\in A$ ($a_{j}:=0$ otherwise) and $b_{j}:=\frac{1}{j}$ for all $j=1,2,\ldots$  Then $c_{j}:=\frac{a_{j}}{b_{j}}=1$ for $j\in A$, and $0$ otherwise. Therefore, the partial sums of $c_{j}$ look like $$1+0+\ldots+0+1+0+\ldots+0+1+\ldots$$ Define one more sequence $d_{j}:=1$ if $j\in\{a_{1},a_{1}+(D+1),a_{1}+2(D+1),\ldots\}$ ($a_{1}$ being the first integral solution to the original inequality) and $d_{j}:=0$ otherwise; in other words, $d_{j}$ is the indicator of an arithmetic sequence with exactly $D$ zeros between consecutive ones. Recall from the theory of Cesaro summation that for zero-spacing $D$, $$\frac{1+0+\ldots+0+1+0+\ldots+0+\ldots+0+1_{n}}{n}\to\frac{1}{D+1}\;\;\text{as}\;n\to\infty$$ (note that because Cesaro summation is an averaging process, the limit holds even if the spacing is improper for a finite number of terms). Consequently, $$\frac{d_{1}+\ldots+d_{n}}{n}\to\frac{1}{D+1}\;\;\text{as}\;n\to\infty.$$ Moreover, since consecutive elements of $A$ differ by at most $D$, every block of $D+1$ consecutive integers beyond $a_{1}$ contains an element of $A$, so asymptotically the averages of the $c_{j}$ dominate those of the $d_{j}$. Consequently, \begin{align*} \lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}\frac{a_{j}}{b_{j}} &=\lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}c_{j}\\ &\geq\lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}d_{j}\\ &=\frac{1}{D+1}\\ &>0 \end{align*} for all $0<\epsilon<1$, no matter how small (note that $D$ behaves something like $O(\theta^{-1})$, and by extension something like $O(\epsilon^{-\frac{1}{2}})$). It follows that $$\sum\limits_{j=1}^{\infty}a_{j}=\infty,$$ i.e. the series diverges for every such $\epsilon$ (if you don't see why, or don't recognize the convergence argument used, just apply the summation by parts formula to $\sum a_{j}$ together with the established lower bound on the partial sums of the $c_{j}$).
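As a quick numerical sanity check (a sketch only, not part of the proof), one can accumulate the partial sums of $\sum_{n\in A_{\epsilon}}n^{-1}$ for a small $\epsilon$ and watch them grow, slowly but without bound:

#include <cmath>
#include <iostream>

int main()
{
 const double eps = 0.01;                  // the epsilon in the definition of A_eps
 double sum = 0.0;
 for (long n = 1; n <= 100000000L; ++n) {
  if (1.0 - std::cos(static_cast<double>(n)) < eps)
   sum += 1.0 / n;                         // n belongs to A_eps
  if (n % 10000000L == 0)
   std::cout << "n = " << n << "  partial sum = " << sum << std::endl;
 }
 return 0;
}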