23 March, 2015

A Primer in Harmonic Analysis



I picked these problems from Modern Fourier Analysis Vol I - I think that they serve as a good primer for the basic techniques and theorems in harmonic analysis (a subject that I have recently started looking back into in order to deal with some of the techniques used when working with Levy processes in mathematical finance).
Problem I.  Fix $d\geq1$ and suppose $\psi:(0,\infty)\mapsto[0,\infty)$ is $C^{1}$, non-increasing, and $\int_{\mathbb{R}^{d}}\psi(|x|)\;dx\leq A<\infty.$  Define
$$[M_{\psi}f](x):=\sup_{0<r<\infty}\frac{1}{r^{d}}\int_{\mathbb{R}^{d}}|f(x-y)|\psi\left(\frac{|y|}{r}\right)\;dy$$
and show that $$[M_{\psi}f](x)\leq A[Mf](x)$$
where $M$ is the usual Hardy-Littlewood maximal function.
Solution.  We first observe that the translation invariance of the indicated estimate implies that it is sufficient to prove the case $x=0$ (this can be seen explicitly by replacing $f$ with $\tau_{x}f$, where $\tau_{x}$ denotes translation by $x$, and applying the case $x=0$ to conclude that the estimate holds for all $x$).   For convenience let us define $\psi_{r}(|y|)=r^{-d}\psi(|y|/r)$.  The radial nature of the terms in the estimate suggests polar coordinates will be useful in dealing with the resulting integrals.  Let us recall that the polar coordinate formula implies that
$$\frac{d}{ds}\int_{B(0,s)}f(y)\;dy=\frac{d}{ds}\int_{0}^{s}dt\int_{\partial B(0,t)}f(\omega)\;dS(\omega)=\int_{\partial B(0,s)}f(\omega)\;dS(\omega)=s^{d-1}\int_{S^{d-1}}f(s\omega)\;dS(\omega).$$
In the last term we have used a change of variables in order to place the integration over the unit sphere (and in particular, to keep the domain fixed).  In order to apply this formula in an integration by parts without causing notational chaos, let us define
$$\alpha(s)=\int_{S^{d-1}}|f(s\omega)|\;dS(\omega)$$
and
$$\beta(s)=\int_{0}^{s}\alpha(t)t^{d-1}\;dt.$$
Note that $\beta(s)$ is majorized by $\omega(d)s^{d}[Mf](0)$ where $\omega(d)$ is the measure of the unit ball in $\mathbb{R}^{d}$.
Let us make the further assumption that $\psi$ is compactly supported in the ball of radius $\delta$, so that $\psi_{r}$ is also compactly supported (and thus bounded) in the ball of radius $r\delta$.  Invoking polar coordinates on the left hand side, setting $x=0$, using the change of variables $y\mapsto-y$ together with $\psi(|-y|)=\psi(|y|)$, and integrating by parts along with the fact that $\beta(0)=0=\psi_{r}(r\delta)$, we estimate at last

$$\begin{align*}
\int_{\mathbb{R}^{d}}|f(-y)|\psi_{r}(|y|)\;dy&=\int_{\mathbb{R}^{d}}|f(y)|\psi_{r}(|y|)\;dy\\
&=\int_{0}^{\infty}\psi_{r}(s)s^{d-1}\;ds\int_{\mathcal{S}^{d-1}}|f(s\omega)|\;dS(\omega)\\
&=\int_{0}^{r\delta}\psi_{r}(s)s^{d-1}\;ds\int_{\mathcal{S}^{d-1}}|f(s\omega)|\;dS(\omega)\\
&=\int_{0}^{r\delta}\psi_{r}(s)s^{d-1}\alpha(s)\;ds\\
&=\beta(r\delta)\psi_{r}(r\delta)-\beta(0)\psi_{r}(0)-\int_{0}^{r\delta}\beta(s)d\psi_{r}(s)\\
&=\int_{0}^{r\delta}\beta(s)d(-\psi_{r}(s))\\
&\leq[Mf](0)\int_{0}^{\infty}\omega(d)s^{d}d(-\psi_{r}(s))\\
&=[Mf](0)\int_{0}^{\infty}d\omega(d)s^{d-1}\psi_{r}(s)\;ds\\
&=[Mf](0)\int_{0}^{\infty}\psi_{r}(s)\;ds\int_{\partial B(0,s)}\;dS(\omega)\\
&=[Mf](0)\int_{\mathbb{R}^{d}}\frac{1}{r^{d}}\psi\left(\frac{|y|}{r}\right)\;dy\\
&=[Mf](0)\int_{\mathbb{R}^{d}}\psi(|y|)\;dy\\
&=A[Mf](0),
\end{align*}$$
as desired.  To complete the proof, take an increasing sequence $\psi_{n}\nearrow\psi$ of compactly supported, non-increasing $C^{1}$ functions.  Since the estimate holds for each $\psi_{n}$, monotone convergence shows that it holds for the limit function $\psi$.
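As a quick sanity check (not part of the proof), here is a minimal numerical sketch of the inequality in dimension $d=1$, under the arbitrary choices $\psi(s)=e^{-s^{2}}$ (so $A=\sqrt{\pi}$) and a test function concentrated away from the origin; the grids and the radius range are likewise arbitrary.

```python
# Minimal numerical spot-check of M_psi f(0) <= A * Mf(0) in d = 1.
# Assumptions: psi(s) = exp(-s^2) (C^1, non-increasing, integrable), so A = sqrt(pi),
# and a test function f concentrated near y = 3.  This is a sketch, not a proof.
import numpy as np

y = np.linspace(-60.0, 60.0, 240001)
dy = y[1] - y[0]
f_pos = np.exp(-(y - 3.0) ** 2)        # |f(y)| sampled on the grid (f is non-negative)
f_neg = np.exp(-(-y - 3.0) ** 2)       # |f(-y)| on the same grid
psi = lambda s: np.exp(-s ** 2)
A = np.sqrt(np.pi)                     # integral of psi(|x|) dx over R

radii = np.geomspace(0.05, 50.0, 400)

# Hardy-Littlewood maximal function of f at x = 0: sup over r of the average of |f| on [-r, r].
Mf0 = max(f_pos[np.abs(y) <= r].sum() * dy / (2.0 * r) for r in radii)

# M_psi f(0) = sup over r of (1/r) * integral |f(-y)| psi(|y|/r) dy.
Mpsi0 = max((f_neg * psi(np.abs(y) / r)).sum() * dy / r for r in radii)

print(f"M_psi f(0) = {Mpsi0:.4f}   A * Mf(0) = {A * Mf0:.4f}")
```

Any other non-increasing, integrable $C^{1}$ profile $\psi$ and locally integrable $f$ should exhibit the same inequality, up to discretization error.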

Problem II.  Consider the heat kernel $$G(x,t)=\frac{1}{(4\pi t)^{\frac{d}{2}}}e^{-\frac{|x|^{2}}{4t}}.$$
  1. Given $\alpha>0$ find constants $\beta$ and $C$ so that $$G(x+y,t)\leq CG(x,\beta t)$$ holds for every $x\in\mathbb{R}^{d}$, $t>0$, and $|y|\leq\alpha\sqrt{t}$.
  2. Deduce that for $f\in L^{1}$ and $u(x,t)=(G(\cdot,t)*f)(x)$ we have $$\mu\left(\left\{y:|u(x,t)|\geq\lambda\;\text{for some}\;t>0\;\text{and}\;x\in B(y,\alpha\sqrt{t})\right\}\right)\leq\frac{C_{d,\alpha}||f||_{L^{1}}}{\lambda}$$ for some constant $C_{d,\alpha}$ depending only on $d$ and $\alpha$.
(1) Solution. We estimate the quadratic form
$$\begin{align*}
|x|^{2}
&=|x+y-y|^{2}\\
&=|x+y|^{2}-2(x+y)\cdot y+|y|^{2}\\
&\leq|x+y|^{2}+2|x+y||y|+|y|^{2}\\
&\leq|x+y|^{2}+|x+y|^{2}+|y|^{2}+|y|^{2}\\
&=2|x+y|^{2}+2|y|^{2}\\
&\leq2|x+y|^{2}+2\alpha^{2}t,
\end{align*}$$
where we have used the fact that $2ab\leq a^{2}+b^{2}$ for $a,b\in\mathbb{R}$ and also the restriction $|y|\leq\alpha\sqrt{t}.$  Consequently,
$$\frac{|x+y|^{2}}{4t}\geq\frac{\frac{1}{2}|x|^{2}-\alpha^{2}t}{4t}=\frac{|x|^{2}}{8t}-\frac{\alpha^{2}}{4}$$
and thus
$$\begin{align*}
G(x+y,t)
&=(4\pi t)^{-d/2}\exp\left\{\frac{-|x+y|^{2}}{4t}\right\}\\
&\leq(4\pi t)^{-d/2}\exp\left\{\frac{-|x|^{2}}{8t}+\frac{\alpha^{2}}{4}\right\}\\
&=e^{\alpha^{2}/4}2^{d/2}(8\pi t)^{-d/2}\exp\left\{\frac{-|x|^{2}}{8t}\right\}.
\end{align*}$$
This completes the proof, with the estimate holding for $\beta=2$ and $C=e^{\alpha^{2}/4}2^{d/2}.$
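A quick numerical spot-check of this bound (a sketch under arbitrary choices of $d$, $\alpha$ and sampling ranges; logarithms are compared to avoid underflow):

```python
# Spot-check of G(x+y, t) <= C * G(x, beta*t) with beta = 2, C = exp(alpha^2/4) * 2^(d/2),
# for randomly drawn x, t and |y| <= alpha*sqrt(t).  All sampling choices are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
d, alpha, beta = 3, 1.5, 2.0
logC = alpha ** 2 / 4 + (d / 2) * np.log(2.0)

def logG(x, t):
    """Logarithm of the heat kernel G(x, t) in dimension d."""
    return -(d / 2) * np.log(4 * np.pi * t) - np.dot(x, x) / (4 * t)

worst = -np.inf
for _ in range(20_000):
    t = rng.uniform(0.01, 10.0)
    x = rng.normal(scale=5.0, size=d)
    y = rng.normal(size=d)
    y *= rng.uniform(0.0, alpha * np.sqrt(t)) / np.linalg.norm(y)  # enforce |y| <= alpha*sqrt(t)
    worst = max(worst, logG(x + y, t) - (logC + logG(x, beta * t)))

print(f"max of log G(x+y,t) - log[C G(x,2t)] over samples = {worst:.4f}  (should be <= 0)")
```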

Remark.  It is interesting that this result can be obtained as a special case of the Hardy-Moser-Trudinger inequality available here http://arxiv.org/abs/1012.5591.  Indeed, generalizing a bit, we are asked to produce constants $a,\beta$ such that
$$\sup_{x\in\mathbb{R}^d,y\in B(0,a)}\frac{G(x+y,t)}{G(x,\beta t)}<\infty.$$
But the continuity and non-vanishing of $G$ imply that the above estimate will hold if
$$\int_{\mathbb{R}^d} \frac{G(x+y,t)}{G(x,\beta t)} dx$$
converges.  Bounding the integral, we get
$$D\int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|x+y|^2-|x|^2\right\}\right) dx \leq \int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|y|^2-2x\cdot y\right\}\right) dx:=J.$$
Now, using the fact that $|y|\leq a$, we get
$$J\leq I(a)=\int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|a|^2-2x\cdot y\right\}\right) dx.$$
The Moser-Trudinger inequality states that $I(a)<\infty$ for values $$a^2\frac{-\beta+1}{4t\beta} \leq 4\pi,$$ which readily gives $a\leq C\sqrt{t}$, from which the claim follows.

(2) Solution.  Let $C$ and $\beta$ be the constants from part (1).  If $x\in B(y,\alpha\sqrt{t})$, write $x=y+h$ with $|h|\leq\alpha\sqrt{t}$; then part (1) together with Problem I (the kernel $G(\cdot,\beta t)$ has the form $r^{-d}\psi(|\cdot|/r)$ with $r=\sqrt{\beta t}$ and $\psi(s)=(4\pi)^{-d/2}e^{-s^{2}/4}$) gives
$$|u(x,t)|\leq C\,(G(\cdot,\beta t)*|f|)(y)\leq CA\,[Mf](y).$$
Hence the set in question is contained in $\{y:[Mf](y)\geq\lambda/(CA)\}$, and the claim follows from the weak-type $(1,1)$ inequality for the Hardy-Littlewood maximal function.

Problem III.  Let $f:\mathbb{R}^{d-1}\to\mathbb{R}$ belong to $L^{\infty}$ and let $u:\mathbb{R}^{d}_{+}\to\mathbb{R}$ be the Poisson integral of $f$.
  1. Show that $u(x,y)$ converges non-tangentially almost everywhere to $f(x)$ (i.e. from approach regions contained in any cone with vertex at $x$).  Specifically, for fixed $x_{0}$ and $$\mathcal{C}_{\alpha}(x_{0}):=\{(x,y)\in\mathbb{R}^{d}_{+}\;:\;|x-x_{0}|<\alpha y\},$$ we have $$\lim_{(x,y)\to(x_{0},0),\;(x,y)\in \mathcal{C}_{\alpha}(x_{0})}u(x,y)=f(x_{0}).$$ The convention for the Poisson kernel is $$P_{y}(x)=\frac{c_{n}y}{(|x|^{2}+y^{2})^{d/2}},$$ where we regard $y>0$ and $x\in\mathbb{R}^{d-1}$ so that a typical point $z\in\mathbb{R}^{d}_{+}$ is $z=(x,y)$ ($y$ is the distance from $z$ to $\partial\mathbb{R}^{d}_{+}$).  The Poisson integral of $f$ is then $$u(x,y)=\int_{\mathbb{R}^{d-1}}P_{y}(t)f(x-t)\;dt.$$ Since it will be needed below, we note that this is equivalent to the more usual representation $$P_{y}(x-t)=\frac{c_{n}y}{|(x,y)-(t,0)|^{d}}.$$  The expression $P_{y}(x-t)$ of course arises from the commutativity of the convolution above (a change of variables in the integral).  This last form will be used to obtain the final estimate for $J$ in the proof below. 
  2. Show that $u$ matches the boundary values in a distributional sense. 
  3.  Show that if $v\in L^{\infty}(\mathbb{R}^{d}_{+})$ is distributionally harmonic and matches the boundary values $f$ in a distributional sense, then $u=v$ as distributions.
(1) Solution.  Fix $x_{0}\in\mathbb{R}^{d-1}=\partial\mathbb{R}^{d}_{+}.$  To obtain a.e. nontangential convergence, we need to show $u(x_{0}-t,y)\to f(x_{0})$ for a.e. $x_{0}$, where $t\in\mathbb{R}^{d-1}$ and $|t|\leq\alpha y.$  We have as a first attempt
$$\begin{align*}
|u(x_{0}-t,y)-f(x_{0})|
&=\left|\int_{\mathbb{R}^{d-1}}P_{y}(x)f(x_{0}-t-x)\;dx-f(x_{0})\right|\\
&\leq\int_{\mathbb{R}^{d-1}}P_{y}(x-t)|f(x_{0}-x)-f(x_{0})|\;dx.
\end{align*}$$
(Here we used a change of variables in the convolution together with $\int_{\mathbb{R}^{d-1}}P_{y}(x-t)\;dx=1$.)  If we estimate the integral directly, we get only the inferior bound $2||f||_{\infty}.$  To gain better control on $f$, we now suppose $x_{0}\in\mathcal{L}(f),$ the Lebesgue set of $f$.  Since $L^{\infty}\subset L^{1}_{\text{loc}}$, we see that $m(\mathcal{L}(f)^{c})=0$ by the corresponding well-known result for locally integrable functions.  Now let $\epsilon>0$.  Our choice of $x_{0}$ and the Lebesgue differentiation theorem imply the existence of a $\delta>0$ such that if $r<\delta$ we have
$$\frac{1}{m(B_{r})}\int_{|x-x_{0}|\leq r}|f(x)-f(x_{0})|\;dx=\frac{1}{m(B_{r})}\int_{|x|\leq r}|f(x_{0}-x)-f(x_{0})|\;dx<\epsilon.$$
With this $\delta$, we now estimate the last integral by splitting it over the complementary sets $B(0,\delta)$ and $(B(0,\delta))^{c}$, denoting the two pieces by $I$ and $J$, respectively.  Note that $I$ would be trivial if $f$ were continuous; here we are saved by the pointwise majorization property of the maximal function (Problem I), together with the kernel estimate $P_{y}(x-t)\leq A_{\alpha}P_{y}(x)$ proved below.  Indeed, set $g(x)=|f(x)-f(x_{0})|\chi_{|x-x_{0}|<\delta}$, so that the Lebesgue point estimate above gives $[Mg](x_{0})\leq\epsilon$.  Hence,
$$I=\int_{|x|\leq\delta}P_{y}(x-t)|f(x_{0}-x)-f(x_{0})|\;dx=\int_{\mathbb{R}^{d-1}}P_{y}(x-t)g(x_{0}-x)\;dx\leq A_{\alpha}\int_{\mathbb{R}^{d-1}}P_{y}(x)g(x_{0}-x)\;dx\leq A_{\alpha}A\,[Mg](x_{0})\leq A_{\alpha}A\,\epsilon,$$
where $A$ is the constant from Problem I applied (in dimension $d-1$) to the profile of the Poisson kernel.
We now proceed to estimate $J$.  We first observe that if $y>0$ is fixed, $t\in\mathbb{R}^{d-1}$, and $|t|<\alpha y$, then
$$P_{y}(x-t)\leq A_{\alpha}P_{y}(x)$$
for some constant $A_{\alpha}$ independent of $f$ (this is very similar to the estimate in Problem II above, except that there the analog of $y$ is not fixed, so a second constant $\beta$ is introduced).  To prove this, note that for $\alpha,y>0$ and $|t|\leq\alpha y$, and for simplicity $d=2$,
$$\begin{align*}
x^{2}+y^{2}
&=((x-t)+t)^{2}+y^{2}\\
&=(x-t)^{2}+2(x-t)t+t^{2}+y^{2}\\
&\leq(x-t)^{2}+\alpha(x-t)^{2}+\frac{1}{\alpha}t^{2}+t^{2}+y^{2}\\
&\leq(1+\alpha)(x-t)^{2}+(1+\alpha+\alpha^{2})y^{2}\\
&\leq(1+\alpha+\alpha^{2})[(x-t)^{2}+y^{2}],
\end{align*}$$
where we have used the well-known Cauchy inequality ``with $\epsilon=\alpha$'' in the first inequality and the restriction $|t|\leq\alpha y$ in the second.  Therefore
$$\frac{y}{(x-t)^{2}+y^{2}}\leq\frac{A_{\alpha}y}{x^{2}+y^{2}}$$
with $A_{\alpha}=1+\alpha+\alpha^{2}$,
as desired (the generalization to $d>2$ is essentially the same, with $A_{\alpha}$ raised to the power $d/2$).  Now, using the fact that $f\in L^{\infty}$, our choice of $x_{0}$, $\delta$, and $t$, and the latter expression for $P_{y}(x)$ given in the problem statement, we get
$$\begin{align*}
J&\leq A_{\alpha}\int_{|x|>\delta}P_{y}(x)|f(x_{0}-x)-f(x_{0})|\;dx\\
&\leq2||f||_{\infty}A_{\alpha}\int_{|x|>\delta}P_{y}(x)\;dx\\
&\leq2||f||_{\infty}A_{\alpha}c_{n}y\int_{|x|>\delta}|x|^{-d}\;dx\\
&\to0\;\text{as}\;y\to0^{+}
\end{align*}$$
since the latter integral is finite.  Putting this together, we get
$$\limsup_{|t|<\alpha y,\;y\to0}|u(x_{0}-t,y)-f(x_{0})|\leq A_{\alpha}A\,\epsilon\to0\;\text{as}\;\epsilon\to0,$$
from which the desired nontangential convergence follows.
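For intuition, here is a small numerical illustration of the nontangential convergence in the case $d=2$ (boundary $\mathbb{R}$, normalization $c_{n}=1/\pi$), with an arbitrarily chosen bounded step function $f$, boundary point $x_{0}$, and aperture $\alpha$; $x_{0}$ is a point of continuity of $f$ and hence a Lebesgue point.

```python
# Sketch: Poisson integral of a step function approached inside a cone |x - x0| < alpha*y.
# The boundary data f, the point x0, the aperture alpha and the grid are arbitrary choices.
import numpy as np

def u(x, y, f, grid, dgrid):
    """Poisson integral u(x, y) = integral P_y(t) f(x - t) dt, computed by a Riemann sum."""
    P = (1.0 / np.pi) * y / (grid ** 2 + y ** 2)   # half-plane Poisson kernel (c_n = 1/pi)
    return np.sum(P * f(x - grid)) * dgrid

f = lambda s: (np.asarray(s) > 0).astype(float)    # bounded boundary data: a step function
x0, alpha = 0.5, 2.0

grid = np.linspace(-500.0, 500.0, 2_000_001)
dgrid = grid[1] - grid[0]

for y in (1.0, 0.3, 0.1, 0.03, 0.01):
    t = 0.9 * alpha * y                            # stay inside the cone as y -> 0
    print(f"y = {y:5.2f}:  u(x0 - t, y) = {u(x0 - t, y, f, grid, dgrid):.4f}   (f(x0) = 1.0)")
```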

(2) Solution.  Since $f\in L^{\infty}$, $Mf\in L^{\infty}$ as well: every average of $|f|$ is at most $||f||_{\infty}$, so $||Mf||_{\infty}\leq||f||_{\infty}$.  It follows from Problem I that
$$\sup_{y>0}|u(x,y)|=\sup_{y>0}|P_{y}*f|(x)\leq A(Mf)(x)\leq A||Mf||_{\infty}<\infty,$$
and thus
$$\sup_{y>0}||u(\cdot,y)||_{\infty}<\infty.$$
Fixing a test function $\phi\in C_{c}^{\infty}(\mathbb{R}^{d-1})$, the dominated convergence theorem now implies (since $|u(x,y)\phi(x)|\leq A||Mf||_{\infty}|\phi(x)|,$ which is integrable)
$$\lim_{y\to0}\int_{\mathbb{R}^{d-1}}u(x,y)\phi(x)\;dx=\int_{\mathbb{R}^{d-1}}\lim_{y\to0}u(x,y)\phi(x)\;dx=\int_{\mathbb{R}^{d-1}}f(x)\phi(x)\;dx.$$
Since this holds for every test function, it follows that $u(x,0)=f(x)$ in the sense of distributions (the $0$ is just suggestive notation to indicate the restriction of $u$ to the boundary of the half space; $u$ is \emph{not} defined by convolution with $P_{y}$ when $y=0$ and must be obtained by a limiting process).
In fact, the limit function $\lim_{y\to0}u(x,y)$ is equal to $f(x)$ a.e. by the fundamental lemma of the calculus of variations.

(3) Solution.  For test functions $\phi\in C_{c}^{\infty}(\mathbb{R}^{d-1})$ and $\psi\in C_{c}^{\infty}(\mathbb{R}^{d}_{+})$, we have
$$(1)\;\;\;\;\int_{\mathbb{R}^{d-1}}v(x,0)\phi(x)\;dx=\int_{\mathbb{R}^{d-1}}f(x)\phi(x)\;dx$$
and
$$(2)  \int_{\mathbb{R}^{d}_{+}}v(x,y)\Delta\psi(x,y)\;dxdy=0.$$

(Again, the setting of $y=0$ is just notation for the restriction of $u$ or $v$ to $\mathbb{R}^{d-1}.$)  By part (2), $(1)=\int_{\mathbb{R}^{d-1}}u(x,0)\phi(x)\;dx$, and so we see already that $u=v=f$ in the sense of distributions on $\mathbb{R}^{d-1},$ and in fact pointwise a.e.  Weyl's lemma implies that if $v\in L^{1}_{\text{loc}}(\mathbb{R}^{d}_{+})$ (recall that the half space is \emph{open}) is a weak solution to Laplace's equation, then $v$ is a classical solution after possibly a correction on a set of measure zero.  The uniqueness of the bounded solution to the Dirichlet problem and the preceding application of Weyl's lemma (since $L^{\infty}\subset L^{1}_{\text{loc}}$) now imply that $u=v$ a.e. in $\overline{\mathbb{R}^{d}_{+}}$; they are therefore equal in the sense of distributions, and after a correction of $v$ on a set of measure zero they are equal pointwise as well.
Remark.  The proof of Weyl's lemma is a mollification argument.  For details, see problem #5 in my linked blog post, in which I proved the lemma in a different course: http://mathtm.blogspot.com/2013/02/math-266b-assignment-2.html.

22 March, 2015

Divergence of Harmonic Series on a Sequence of Decreasing Sub-Domains of $\mathbb{N}$


The series $\sum_{n\in\mathbb{N}}n^{-p}$ diverges if $p\leq1$ and converges if $p>1$, and so it may seem plausible that (being a "bifurcation point" of this condition) the harmonic series $\sum_{n\in\mathbb{N}}n^{-1}$ could converge on some proper subset $A\subset\mathbb{N}$. This is obvious if $A$ is finite. If $A$ is infinite, then a moment's thought reveals that there are many subsets on which the harmonic series converges, since any series whose terms are reciprocals of distinct positive integers is a sub-series of the harmonic series. So for instance $$\sum_{n\in\mathbb{N}}n^{-2}=\frac{\pi^{2}}{6},$$ $$\sum_{n\in\mathbb{N}}\frac{1}{n!}=e-1,$$ $$\sum_{n\in\mathbb{N}}\frac{1}{2^{n}}=1,$$ and so on. Given that rather "large" subsets of $\mathbb{N}$ lead to convergence of the harmonic series, the following result was somewhat surprising to me when I was first asked to prove it.
Claim. Let $$A_\epsilon := \{a \in \mathbb{N} : 1 - \cos(a) < \epsilon\}.$$ Then $$\sum_{n\in A_\epsilon } \frac{1}{n}$$ diverges for all $0<\epsilon<1.$
Proof.  For $0<\epsilon< 1$, the inequality $1-\cos(a)< \epsilon$ has solutions for $$a\in(2k\pi-\theta,2k\pi+\theta)$$ where $\theta=\cos^{-1}(1-\epsilon)$ (note that $\theta\in(0,\frac{\pi}{2})$ and by using a Taylor expansion, it is easy to see $\theta=O(\epsilon^{\frac{1}{2}})$, although all that is important is $\theta\to0$ as $\epsilon\to0$). For there to be any positive integers $a:=a_{k}$ in such an interval, it is necessary and sufficient that
$$\lfloor2k\pi-\theta\rfloor<\lfloor2k\pi+\theta\rfloor,$$
where $\lfloor\cdot\rfloor$ is the "floor" function (round down, i.e. truncate the decimals). Intuitively, this condition just says there is an integer in the $k$th solution interval. (There could be multiple integer solutions in a given interval, though this is not very important since we are mostly interested in small $\epsilon$; indeed, since $\theta=O(\epsilon^{\frac{1}{2}})$, once $\epsilon$ is sufficiently small, say small enough that $2\theta<1$, each interval contains at most one integer.)

From the above observations and the fact that $2\pi<6.3$ (the circumference of the unit circle), it is not difficult to ascertain that $\#A_{\epsilon}=\infty$ (here and below we abbreviate $A:=A_{\epsilon}$).  Therefore $A$ is countably infinite, with its elements forming an "approximate" arithmetic sequence of integers in the sense that, with
$$D:=\max_{a_{i}\in A}|a_{i+1}-a_{i}|<\infty,$$
on "average" the difference of two successive integers is approximately $D$ (having analytical results that are sharp is unnecessary in the present situation as we are only after qualitative facts like convergence).

We can now determine whether or not the sum converges. Define sequences $a_{j}:=\frac{1}{j}$ for $j\in A$ and $a_{j}:=0$ otherwise, and $b_{j}:=\frac{1}{j}$ for all $j=1,2,\ldots.$ Then $c_{j}:=\frac{a_{j}}{b_{j}}=1$ for $j\in A$, and $0$ otherwise. Therefore, the sum of the $c_{j}$ looks like $$1+0+\ldots+0+1+0+\ldots+0+1+\ldots$$ Define one more sequence $d_{j}:=1$ if $j\in\{a_{1},a_{1}+D,a_{1}+2D,\ldots\}$ ($a_{1}$ being the first integer solution to the original inequality) and $d_{j}:=0$ otherwise; in other words, $d_{j}$ is the indicator of an arithmetic progression with common difference $D$. Recall from the theory of Cesaro summation that for ones spaced $D$ apart, $$\frac{1+0+\ldots+0+1+0+\ldots+0+\ldots+0+1_{n}}{n}\to\frac{1}{D}\;\text{as}\;n\to\infty$$ (because Cesaro summation is an averaging process, the limit is unaffected by improper spacing among finitely many terms). Consequently, $$\frac{d_{1}+\ldots+d_{n}}{n}\to\frac{1}{D}\;\text{as}\;n\to\infty.$$ Since the gaps between consecutive elements of $A$ are at most $D$, we also have $\sum_{j=1}^{n}c_{j}\geq\sum_{j=1}^{n}d_{j}$ for every $n$, and therefore \begin{align*} \lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}\frac{a_{j}}{b_{j}} &=\lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}c_{j}\\ &\geq\lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}d_{j}\\ &=\frac{1}{D}\\ &>0 \end{align*} for all $\epsilon>0$, no matter how small (note that $D$ behaves something like $O(\theta^{-1})$, and by extension something like $O(\epsilon^{-\frac{1}{2}})$). It follows that $$\sum\limits_{j=1}^{\infty}a_{j}=\infty,$$ i.e. the sum diverges for every $\epsilon>0$ (if you don't see why, or don't recognize the convergence argument used, apply the summation by parts formula to $\sum a_{j}$ together with the established bound).
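Numerically, the divergence is easy to see, if slow; the following sketch (the cutoff $N$ and the values of $\epsilon$ are arbitrary) shows the partial sums of $\sum_{n\in A_{\epsilon}}1/n$ growing roughly logarithmically in the cutoff.

```python
# Partial sums of the harmonic series restricted to A_eps = {n : 1 - cos(n) < eps}.
# The cutoff N and the eps values are arbitrary; reduce N if memory is a concern.
import numpy as np

N = 10_000_000
n = np.arange(1.0, N + 1.0)

for eps in (0.5, 0.1, 0.01):
    in_A = (1.0 - np.cos(n)) < eps                       # membership in A_eps
    partial = np.cumsum(np.where(in_A, 1.0 / n, 0.0))    # running partial sums
    line = ", ".join(f"S(1e{k}) = {partial[10**k - 1]:.3f}" for k in range(3, 8))
    print(f"eps = {eps}: {line}")
```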

21 March, 2015

Does the Trigonometric Harmonic Series Converge?



It is well known that the harmonic series $H(x)=\sum_{n=1}^{\infty} xn^{-1}$ diverges for every $x\neq0$, but what about the trigonometric harmonic series $T(x)=\sum_{n=1}^{\infty}e^{inx}n^{-1}$?  Obviously for $k=0,1,2,\ldots$ we have $T(2k\pi)=H(1)=+\infty$.  It is an interesting fact that the cancellation properties inherent in $T$ imply convergence for every other real $x$.  This is relatively straight-forward to prove using Dirichlet's test, a modification of Leibniz's alternating series test.  More remarkable is that, although the convergence is only conditional, the sum can be computed in closed form in terms of elementary functions.

In order to investigate the convergence of
$$(1)\;\;\;\;\;T(x)=\sum_{n=1}^{\infty}\frac{e^{inx}}{n}<\infty,$$
first note that
$$|z^{n}|\to0\;\text{as}\;n\to\infty$$
for every $z\in\mathbb{C}$ with $|z|<1$.  Since
$$1\geq\frac{1}{n}>\frac{1}{n+1}>0$$ for all $n\geq1$, we find $\frac{1}{n}\searrow0$ (monotonically decreases to zero) and so Dirichlet's test implies
$$(2)\;\;\;\;\;\sum\limits_{n=1}^{\infty}\frac{z^{n}}{n}<\infty,$$
the convergence taking place for every $z$ with $|z|<1$ (and being absolute there, by comparison with the geometric series).  To deal with the boundary $|z|=1$, note that if $|z|=1$ and $z\neq1$ (i.e. $z\neq1+0i$), then we have
$$\left|\sum_{n=1}^{N}z^{n}\right|=\left|\frac{z-z^{N+1}}{1-z}\right|\leq\frac{2}{|1-z|}<\infty.$$
The upper bound $M=\frac{2}{|1-z|}$ is independent of $N$, and so by Dirichlet's test the series (2) converges for all $|z|\leq1$ except $z=1$ (though the convergence on the boundary is no longer absolute).  Putting $z=e^{ix}$ shows that (1) converges for every $x\neq 2k\pi$ ($k\in\mathbb{Z}$).

To carry out the actual summation of $T(x)$ in general is a tedious exercise in complex analytic methods, and the resulting formulas are unwieldy (although, again rather remarkably, they contain only elementary functions).  Another approach is to recognize that $T(x)$ is the Fourier series of some periodic function with Fourier coefficients $\hat{f}(0)=0$ and, for $n\geq1$,
$$\hat{f}(n)=\frac{1}{n}.$$
Despite this, the computation is relatively straight-forward for certain values of $x$.  For example, take $x=1$ and note that
$$T(1)=\sum_{n=1}^{\infty}\frac{e^{in}}{n}.$$
Writing
$$\int\left(\underbrace{(e^{iz})^{1}+(e^{iz})^{2}+\ldots}_{\text{geometric series with ratio }r=e^{iz}}\right)dz=\int\frac{e^{iz}}{1-e^{iz}}\;dz,$$
we find that (with $u=1-e^{iz}$, so that $e^{iz}\;dz=i\,du$)
$$\sum_{n=1}^{\infty}\frac{(e^{iz})^{n}}{in}=i\int\frac{du}{u}=i\ln(1-e^{iz}),\qquad\text{that is,}\qquad\sum_{n=1}^{\infty}\frac{(e^{iz})^{n}}{n}=-\ln(1-e^{iz}).$$
Combining all of this together, we obtain
$$\begin{align*}
T(1)
&=\left(-\ln(1-e^{iz})\right)\Big|_{z=1}\\
&=-\ln\left(e^{i/2}\left(e^{-i/2}-e^{i/2}\right)\right)\\
&=-\ln\left(e^{i/2}\right)-\ln\left(-2i\sin\left(\tfrac{1}{2}\right)\right)\\
&=-\frac{i}{2}-\ln\left(-i\right)-\ln\left(2\sin\left(\tfrac{1}{2}\right)\right)\\
&=-\ln\left(2\sin\left(\tfrac{1}{2}\right)\right)+i\,\frac{\pi-1}{2},
\end{align*}$$
where in the last step we used $\ln(-i)=-\frac{i\pi}{2}$ (principal branch).
Since
$$T(1)=\sum_{n=1}^{\infty}\left(\frac{\cos n}{n}+i\frac{\sin n}{n}\right),$$
taking real and imaginary parts yields
$$\sum_{n=1}^{\infty}\frac{\cos n}{n}=\frac{-\ln(2-2\cos(1))}{2}\qquad\text{and}\qquad\sum_{n=1}^{\infty}\frac{\sin n}{n}=\frac{\pi-1}{2},$$
that is, $T(1)=\frac{-\ln(2-2\cos(1))}{2}+i\,\frac{\pi-1}{2}.$
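As a quick sanity check on this closed form, here is a small numerical comparison; the truncation level $N$ is arbitrary, and since the convergence is only conditional the partial sums approach the limit slowly (roughly like $1/N$).

```python
# Compare a partial sum of sum e^{in}/n with the closed form derived above.
import numpy as np

N = 1_000_000
n = np.arange(1, N + 1)
partial = np.sum(np.exp(1j * n) / n)
closed = -0.5 * np.log(2.0 - 2.0 * np.cos(1.0)) + 1j * (np.pi - 1.0) / 2.0

print(f"partial sum (N = {N}): {partial:.6f}")
print(f"closed form:           {closed:.6f}")
```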

The graphic at the beginning of the post shows the graph of $\sin n/n$ on the $(n,x)$ plane.

20 March, 2015

A Rigorous Proof of Ito's Lemma

In this post we state and prove Ito's lemma.  To get directly to the proof, go to II Proof of Ito's Lemma.

For all its importance, Ito's lemma is rarely proved in finance texts, where one often finds only a heuristic justification involving Taylor's series and the intuition of the "differential form" of the lemma.  There are various reasons for this.  Ito's lemma is really a statement about integration, not differentiation.  Indeed, differentiation is not even defined in the realm of stochastic processes due to the non-differentiability of Brownian paths.  Thus, in order to present a proof of Ito's lemma, one must first cover stochastic integrals and, prior to that, the basic properties of Brownian motion, topics which for reasons of scope/audience cannot always be covered.  However, even more mathematically inclined texts often provide only a sketch and skirt the technical details of convergence.  The purpose of this article is to remedy this situation, and we begin with

I. MOTIVATION AND A REVIEW OF ORDINARY CALCULUS

If $f$ is $k+1$ times differentiable then Taylor's theorem asserts
$$(1)\;\;\;\;f(t+h)-f(t)=hf'(t)+\frac{h^{2}}{2}f''(t)+\ldots+\frac{h^{k+1}}{(k+1)!}f^{(k+1)}(t^{*})$$
where $t^{*}\in[t,t+h]$ if $h>0$ and $t^{*}\in[t+h,t]$ if $h<0$.

Fix $T>0$ ($T$ not necessarily small) and consider the difference $f(T)-f(0)$.  This can be computed as a sum of non-overlapping differences, i.e. if $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ is a partition of $[0,T]$, then with the aid of (1) using $h=t_{i+1}-t_{i}$, we get

$$\begin{align*}
(2)\;\;\;\;f(T)-f(0)&=\sum_{i=0}^{n-1}f(t_{i+1})-f(t_{i})\\
&=\sum_{i=0}^{n-1}f'(t_{i})(t_{i+1}-t_{i})+\frac{1}{2}\sum_{i=0}^{n-1}f''(t_{i})(t_{i+1}-t_{i})^{2}+\sum_{i=0}^{n-1}o\left(||\Pi||^{2}\right).\end{align*}$$

As $n\to\infty$ (or $||\Pi||\to0$, i.e. $\max_{i}(t_{i+1}-t_{i})\to0$), we get
$$\sum_{i=0}^{n-1}f'(t_{i})(t_{i+1}-t_{i})\to\int_{0}^{T}f'(s)\;ds$$
and for $k\geq2$
$$\left|\frac{1}{k!}\sum_{i=0}^{n-1}f^{(k)}(t_{i})(t_{i+1}-t_{i})^{k}\right|\leq||\Pi||^{k-1}\sum_{i=0}^{n-1}|f^{(k)}(t_{i})|(t_{i+1}-t_{i})\to0\cdot\int_{0}^{T}|f^{(k)}(s)|\;ds=0.$$

That is, $f(T)-f(0)=\int_{0}^{T}f'(s)\;ds$, which is the second fundamental theorem of calculus.  Now suppose $f$ and $g$ are smooth functions with $k+1$ derivatives and consider the composition $h=f\circ g$.  The familiar chain rule implies $h$ is differentiable and that
$$(3)\;\;\;\;h'(t)=f'(g(t))g'(t).$$

By substituting $h$ into (2) and computing $h^{(k)}$ iteratively according to (3), we get
$$(4)\;\;\;\;f(g(T))-f(g(0))=\int_{0}^{T}f'(g(x))g'(x)\;dx.$$

We shall now see what happens when $g$ is not differentiable.  In that case, $h$ is not differentiable, and (1) through (4) are no longer valid.  However, we can write (4) instead as
$$(5)\;\;\;\;f(g(T))-f(g(0))=\int_{0}^{T}f'(g(x))\;dg$$
where the integral is now taken as a Riemann-Stieltjes integral.  If $g$ is differentiable, then (5) reduces to (4), but (5) still makes sense even if $g$ is merely continuous (continuity is needed since $\int h\;dg$ is not well-defined if $h$ and $g$ share a common discontinuity, and $h=f(g(t))$ will in general be discontinuous wherever $g$ is).  Moreover, since $f$ is smooth, we may rewrite (2) as
$$\begin{align*}
(6)\;\;\;\;f(g(T))-f(g(0))&=\sum_{i=0}^{n-1}f(g(t_{i+1}))-f(g(t_{i}))\\
&=\sum_{i=0}^{n-1}f'(g(t_{i}))(g(t_{i+1})-g(t_{i})) + \frac{1}{2}\sum_{i=0}^{n-1}f''(g(t_{i}))(g(t_{i+1})-g(t_{i}))^{2}+\ldots\end{align*}$$

Despite $g$ being non-differentiable, if it is sufficently "nice" then the terms converge to the same values as in (2) and we will recover (5).  A useful sufficient condition is that $g$ be continuous and of bounded variation.  This means
$$[g](T)=\sup_{\Pi}\sum_{i\in\Pi}|g(t_{i+1})-g(t_{i})|<\infty.$$
It is easy to prove that if $g$ is differentiable, then it is of bounded variation, since then an easy application of the above (or the mean-value theorem) gives (for a norm decreasing sequence of partitions $\Pi_{1},\Pi_{2},\ldots$)
$$[g](T)=\lim_{n\to\infty}\sum_{j\in\Pi_{n}}|g(t_{j+1})-g(t_{j})|=\int_{0}^{T}|g'(t)|\;dt<\infty.$$
For $\int f\;dg$, the most general sufficient condition in common use for existence is that $g$ be of bounded variation and share no common discontinuities with $f$, though this is not strictly necessary.  When $g$ is not of bounded variation, $\int f\;dg$ may or may not exist, and its value may even depend on the particular sample points used in the approximating sums, as we shall see below.

Now, Ito's lemma deals with the special case $g(t)=W(t)$ where $W$ is a Brownian motion sample path.  It turns out that almost surely
$$[W](T)=\infty,$$
and
$$[W,W](T):=\sup_{\Pi}\sum_{i\in\Pi}|W(t_{i+1})-W(t_{i})|^{2}=\infty.$$
The latter quantity is called the quadratic (or second) variation of $W$.  (For continuous functions $g$ of bounded variation, the analogous sums of squared increments tend to $0$ as the mesh of the partition tends to $0$; this follows from estimating the higher order terms in (2).)  Moreover, along partitions with mesh tending to zero,
$$[W]^{(3)}(T):=\lim_{||\Pi||\to0}\sum_{i\in\Pi}|W(t_{i+1})-W(t_{i})|^{3}=0.$$
In fact, almost surely $[W]^{(\alpha)}(T)=\infty$ in the supremum sense for every $\alpha\leq2$, while the corresponding mesh-to-zero limits vanish for every $\alpha>2$.  It would seem that the regularity on which integration theory depends so directly (i.e. the variation of the integrator) is not tractable for $W$.  It turns out, though, that we can obtain something useful by weakening the definition slightly.  Let $\Pi_{1},\Pi_{2},\ldots$ be a sequence of partitions with $||\Pi_{n}||\to0$ as $n\to\infty$.  Then we can redefine the quadratic variation as
$$[W,W](T):=\lim_{n\to\infty}\sum_{i\in\Pi_{n}}|W(t_{i+1})-W(t_{i})|^{2}.$$
Unfortunately, even this is not well-defined without further qualification.  The reason that the supremum definition of the quadratic variation is a.s. infinite is that, for any $C>0$, it is possible to find a sequence of partitions $\{\Pi^{C}_{n}\}_{n}$ along which the above limit exceeds $C$ for a fixed sample path $\omega$.  However, the limit converges to $T$ in $L^{2}(\Omega)$ (or in probability, if you prefer).  That is to say, it converges in the $L^{2}$ norm to a random variable $Q(\omega)$ with $Q(\omega)=T$ a.s. $\omega$ (recall that $L^{2}$ limits are defined only up to a set of measure $0$).  It turns out that if we make the further restriction that $\sum_{n=1}^{\infty}||\Pi_{n}||<\infty$ (for instance, successive refinements with rapidly shrinking mesh), then the limit also holds pointwise a.s. $\omega$ (Borel-Cantelli).  In the remainder of this post we will not distinguish between these modes of convergence and will state freely that $[W,W](T)=T$, without further reference to the technicalities behind this claim.
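A short simulation makes the statement $[W,W](T)=T$ plausible; the horizon, mesh sizes and seed below are arbitrary choices, and each mesh uses an independently sampled set of increments (so this illustrates the $L^{2}$/in-probability statement rather than pathwise refinement).

```python
# Sums of squared Brownian increments over uniform partitions cluster around T as the mesh shrinks.
import numpy as np

rng = np.random.default_rng(1)
T = 2.0

for n in (100, 1_000, 10_000, 100_000):
    dt = T / n
    dW = rng.normal(0.0, np.sqrt(dt), size=n)   # Brownian increments over a uniform partition
    print(f"n = {n:6d}:  sum of squared increments = {np.sum(dW ** 2):.4f}   (T = {T})")
```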

Since $[W](T)=\infty$ and $[W,W](T)=T$, we must take care in computing the various limits appearing in
$$\begin{align*}(7)\;\;\;\;f(W(T))-f(W(0))&=\sum_{i=0}^{n-1}f(W(t_{i+1}))-f(W(t_{i}))\\
&=\sum_{i=0}^{n-1}f'(W(t_{i}))(W(t_{i+1})-W(t_{i})) + \frac{1}{2}\sum_{i=0}^{n-1}f''(W(t_{i}))(W(t_{i+1})-W(t_{i}))^{2}+\ldots\end{align*}$$

Since $[W,W](T)=T<\infty$, it follows that $[W]^{(k)}(T)=0$ for all $k\geq3$ by a simple estimate, as has been done several times above.  Thus the $\ldots$ terms can safely be ignored.  And since $\sup_{t\in[0,T]}|f''(W(t))|<\infty$ (by the continuity of $f''$ and of the path), the second sum converges.  We shall see that it converges to
$$\int_{0}^{T}f''(W(s))\;ds.$$
(Incidentally, this is where the commonly used, though mathematically meaningless, notation $dWdW=dt$ comes from).  The first term also converges, though this is not immediately obvious since the Riemann-Stieltjes theory does not apply to it as the integrator $W(t)$ is not of bounded variation.  It turns out that it converges to
$$\int_{0}^{T}f'(W(s))\;dW$$
where the integral is what is known as an Ito integral.  This integral is constructed exactly like a Riemann-Stieltjes integral, except that the sample point used in the approximating sums must always be the left-hand endpoint of the interval.  Different approximation schemes (i.e. mid-point, right-point, etc.) lead to different limiting values.  If the mid-point is used, the result is referred to as the Stratonovich integral.  We shall not need this integral here.  The reason that the Ito integral is used (i.e. left-hand point approximation) is that $f(W(t_{i}))$ is interpreted as the position we take in a stock at time $t_{i}$ with the information available at time $t_{i}$, and the capital gain on the stock is then $f(W(t_{i}))(W(t_{i+1})-W(t_{i}))$ if we assume the stock price follows a Brownian motion (which strictly speaking it doesn't, but we shall ignore this fact here since it can be corrected by replacing $W$ with a geometric Brownian motion $X$).  Summing the individual gains and taking the limit as $\max|t_{i+1}-t_{i}|\to0$ gives us the net capital gains on a portfolio resulting from taking positions $f(W(t))$ in continuous time.

In light of the above, we conclude that
$$(8)\;\;\;\;f(W(T))-f(W(0))=\int_{0}^{T}f'(W(s))\;dW(s)+\frac{1}{2}\int_{0}^{T}f''(W(s))\;ds.$$
Compare this to (4): we obtain one, and only one, extra term, $\frac{1}{2}\int_{0}^{T}f''(W(s))\;ds$, which can be traced back to the facts that $[W,W](T)=T$ and $[W]^{(k)}=0$ for $k\geq3.$  This is often recast in differential notation (which, again, is mathematically meaningless)
$$(9)\;\;\;\;df=f'dW+\frac{1}{2}f''dt.$$

The mathematically meaningful form is (8), though (9) is used more often for calculations since it is accompanied by what is known as a "box" calculus that facilitates computations.  This will be discussed in more detail below.
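As a numerical illustration of (8) and of the role of the sample point, consider $f(x)=\frac{1}{2}x^{2}$, so that the Ito integral $\int_{0}^{T}W\;dW$ should equal $\frac{1}{2}W(T)^{2}-\frac{1}{2}T$, while a Stratonovich-type scheme (averaging the endpoint values of the integrand) gives exactly $\frac{1}{2}W(T)^{2}$.  The horizon, step count and seed in the sketch below are arbitrary.

```python
# Left-endpoint (Ito) versus endpoint-average (Stratonovich-type) sums for the integrand W(t).
import numpy as np

rng = np.random.default_rng(2)
T, n = 1.0, 1_000_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))            # W(t_0), ..., W(t_n) on a uniform grid

ito_sum = np.sum(W[:-1] * dW)                         # left endpoints: the Ito scheme
strat_sum = np.sum(0.5 * (W[:-1] + W[1:]) * dW)       # endpoint averages: Stratonovich-type scheme

print(f"Ito sum          = {ito_sum:.4f}   vs  W(T)^2/2 - T/2 = {0.5 * W[-1] ** 2 - 0.5 * T:.4f}")
print(f"Stratonovich sum = {strat_sum:.4f}   vs  W(T)^2/2       = {0.5 * W[-1] ** 2:.4f}")
```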


II. PROOF OF ITO'S LEMMA

Let $\{W(t)\}_{t\geq0}$ be a standard Brownian motion with the natural filtration $\{\mathcal{F}_{t}\}_{t\geq0}$, and let $f(x,t)\in\mathcal{C}^{2}(\mathbb{R}\times[0,T])$ jointly in $(x,t)$.  We will consider the stochastic process $\Delta(t)=f(W(t),t)$, which is clearly adapted to $\{\mathcal{F}_{t}\}_{t\geq0}.$

We take the following preliminary facts for granted, and defer to previous blog posts covering Brownian motion and stochastic integration for proofs.
  1. Almost surely, we have the variation formulas $[W]^{1}(t)=+\infty,[W]^{2}(t)=t$ and $[W]^{k}(t)=0$ for $k\geq3$.
  2. Almost surely (in the sense clarified in Section III below), we have the convergence of $\lim_{||\Pi_{[0,T]}||\to0}\sum_{i=0}^{n-1}\Delta(t_{i})(W(t_{i+1})-W(t_{i}))$ for any continuous and adapted process $\Delta(t)$.  We denote this limit by $\int_{0}^{T}\Delta(t)\;dW(t)$ and refer to it as the Ito integral of $\Delta$.  The limit is taken in $L^{2}(\Omega).$
Theorem (Ito's Lemma).  With the notation above, we have for all $T>0$ $$\begin{align*}f(W(T),T)-f(W(0),0)=\\\int_{0}^{T}f_{t}(W(t),t)\;dt+\int_{0}^{T}f_{x}(W(t),t)\;dW(t)+\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt.\end{align*}$$  We sometimes write for $f=f(W(t),t)$ $$df=f_{t}dt+f_{x}dW+\frac{1}{2}f_{xx}dt.$$

Proof.  Fix $T>0$ and let $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ be a partition of $[0,T]$ and compute using Taylor's expansion
$$\begin{align*}
f(W(T),T)-f(W(0),0)&=\sum_{i=0}^{n-1}(f(W(t_{i+1}),t_{i+1})-f(W(t_{i}),t_{i}))\\
&=\sum_{i=0}^{n-1}f_{t}(W(t_{i}),t_{i})(t_{i+1}-t_{i})\\
&+\sum_{i=0}^{n-1}f_{x}(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))\\
&+\frac{1}{2}\sum_{i=0}^{n-1}f_{xx}(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}\\
&+\sum_{i=0}^{n-1}O((t_{i+1}-t_{i})(W(t_{i+1})-W(t_{i})))\\
&+\sum_{i=0}^{n-1}O((t_{i+1}-t_{i})^{2})\\
&+\sum_{i=0}^{n-1}O((W(t_{i+1})-W(t_{i}))^{3})\\
&:= A+B+C+D+E+F.\end{align*}$$

The left hand side is unaffected by taking limits as $||\Pi||\to0$, and so we may do so in computing the right hand side terms.  Without loss of generality we assume $\Pi$ is uniform, so we consider equivalently $n\to\infty.$

The regularity of $f$ implies that
$$A\to\int_{0}^{T}f_{t}(W(t),t)\;dt\;\text{as}\;n\to\infty,$$
the integral being an ordinary Lebesgue (Riemann) integral.  By item 2 above we have
$$B\to\int_{0}^{T}f_{x}(W(t),t)\;dW(t)\;\text{as}\;n\to\infty,$$
the integral being an Ito integral as discussed here.  To deal with $D$, $E$ and $F$ we estimate (writing $X\ll_{\beta}Y$ to mean $X\leq CY$ for a constant $C$ depending only on a bound $\beta$ for the second-order derivatives of $f$ on the relevant compact set)
$$|D|\ll_{\beta}\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|\sum_{i=0}^{n-1}(t_{i+1}-t_{i})=T\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|,$$
$$|E|\ll_{\beta}\sup_{0\leq i\leq n-1}|t_{i+1}-t_{i}|\sum_{i=0}^{n-1}(t_{i+1}-t_{i})=T\sup_{0\leq i\leq n-1}|t_{i+1}-t_{i}|,$$
and
$$|F|\ll_{\beta}\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|\sum_{i=0}^{n-1}(W(t_{i+1})-W(t_{i}))^{2}.$$
Appealing to item 1 above (which keeps $\sum_{i}(W(t_{i+1})-W(t_{i}))^{2}$ bounded) and to the uniform continuity of the maps $t\mapsto t$ and $t\mapsto W(t)$ on $[0,T]$ (which sends the suprema of the increments to $0$), we then conclude that
$$D,E,F\to0\;\text{as}\;n\to\infty.$$
It remains to establish the limit
$$C\to\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt\;\text{as}\;n\to\infty.$$
Intuitively this should be true since $[W]^{2}(T)=T,$ a fact that we sometimes write as $dWdW=dt.$  However, a rigorous proof requires some effort, and this is precisely the point in the proof (assuming Brownian motion and stochastic integration are covered) that almost every mathematical finance text skips over.  (Note that the theorem has already been proved in the special case that $f=p(x,t)$ is a second degree polynomial; as an example, consider the special case $f(x,t)=\frac{1}{2}x^{2}$ in order to compute the Ito integral $\int_{0}^{T}W(t)\;dW(t)$).

Because this fact is of interest in and of itself, we isolate the proof that $C\to\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt\;\text{as}\;n\to\infty$ in the following lemma.

Lemma.  Let $f$ be a bounded continuous function on $\mathbb{R}$ and $\{W(t)\}_{t \geq 0}$ a standard one-dimensional Brownian motion. Then, in $L^{2}(\Omega)$ (and, with the provisos discussed above, almost surely), $$\sum_{i=0}^{n-1} f(W(t_{i}))(W(t_{i+1})-W(t_{i}))^{2}\to\int_{0}^{T}f(W(t))\;dt\;\text{as}\;n\to\infty$$ where $n\to\infty$ means (WLOG) $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ is a uniform partition of $[0,T]$ and $|\Pi| := \max_j |t_j-t_{j-1}|\to0$.

Proof.  Since $t \mapsto f(W(t))$ is (almost surely) continuous, $$\sum_{i=0}^{n-1} f(W(t_{i}))(t_{i+1}-t_{i}) \to \int_0^T f(W(t))\;dt\;\text{as}\;n\to\infty.$$
Therefore, it suffices to show
$$I_n := \sum_{i=0}^{n-1} f(W(t_{i})) \bigg[ (W(t_{i+1})-W(t_{i}))^2 - (t_{i+1}-t_{i}) \bigg] \to 0\;\text{as}\;n\to\infty.$$

At this point it is convenient to define $\Delta t_{i} := t_{i+1}-t_{i}$ and $\Delta W_i := W(t_{i+1})-W(t_{i})$.  Recalling that $\{W(t)^2-t\}_{t \geq 0}$ is a martingale with respect to the canonical filtration $(\mathcal{F}_t)_{t \geq 0}$, we compute for $j<i$

$$\begin{align*} &\quad \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))\,[\Delta W_i^2 - \Delta t_i]\,[\Delta W_j^2-\Delta t_j]\bigg)\\ &= \mathbb{E} \bigg( \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))\,[\Delta W_i^2 - \Delta t_i]\,[\Delta W_j^2-\Delta t_j] \;\Big|\; \mathcal{F}_{t_{i}} \bigg) \bigg) \\ &= \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))\,[\Delta W_j^2-\Delta t_j]\,\underbrace{\mathbb{E} \bigg( \Delta W_i^2 - \Delta t_i \;\Big|\; \mathcal{F}_{t_{i}} \bigg)}_{=\,\mathbb{E}(\Delta W_i^2)-\Delta t_i=0} \bigg) = 0, \end{align*}$$

and thus

$$\mathbb{E}(I_n^2) = \mathbb{E}\left(\sum_{i=0}^{n-1} f(W(t_{i}))^2 (\Delta W_i^2-\Delta t_i)^2 \right).$$

(Observe that the cross-terms vanish.)  Using that $f$ is bounded and $W(t)-W(s) \sim W(t-s) \sim \sqrt{t-s} W(1)$ we find

$$\begin{align*} \mathbb{E}(I_n^2) &\leq \|f\|_{\infty}^2 \sum_{i=0}^{n-1} \mathbb{E}\bigg[(\Delta W_i^2-\Delta t_i)^2\bigg] \\ &= \|f\|_{\infty}^2 \sum_{i=0}^{n-1} \Delta t_i^2\;\mathbb{E}\big[(W(1)^2-1)^2\big] \\ &\leq C |\Pi| \sum_{i=0}^{n-1} \Delta t_i = C |\Pi| T \end{align*}$$


for $C := \|f\|_{\infty}^2\, \mathbb{E}\big[(W(1)^2-1)^2\big]$ (in fact $\mathbb{E}[(W(1)^2-1)^2]=2$). Letting $|\Pi| \to 0$, the claim follows.
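The lemma is easy to observe in simulation.  In the sketch below the choices $f(x)=\cos x$, the horizon, the finest mesh and the seed are all arbitrary; a single Brownian path is generated on the finest grid and the coarser sums are formed by sub-sampling the same path.

```python
# Weighted sums of squared increments of one Brownian path versus the time integral of f(W(t)).
import numpy as np

rng = np.random.default_rng(3)
T, n_max = 1.0, 2 ** 20
dt = T / n_max
W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n_max))))
f = np.cos                                            # a bounded continuous test function

# Reference value: the time integral of f(W(t)) dt on the finest grid (left Riemann sum).
reference = np.sum(f(W[:-1])) * dt

for step in (2 ** 10, 2 ** 6, 2 ** 2, 1):             # coarser to finer sub-partitions of the same path
    Wn = W[::step]
    dWn = np.diff(Wn)
    s = np.sum(f(Wn[:-1]) * dWn ** 2)
    print(f"n = {len(dWn):7d}:  sum f(W(t_i)) (Delta W_i)^2 = {s:.4f}")
print(f"integral of f(W(t)) dt over [0, T]  = {reference:.4f}")
```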



III.  CLARIFICATION OF "ALMOST SURE" CONVERGENCE

We assume the reader is familiar with the various modes of convergence in real analysis: pointwise, uniform, almost uniform, in measure/probability, $L^{p}$, etc.  This short section is just to help clarify what is meant by almost sure convergence in the context of this and related topics.

Statements of convergence involving Brownian motion are almost always established in $L^{2}(\Omega,P)$, which in turn implies convergence in probability because Chebyshev's inequality states for a sequence of random variables $X_{n}$ and proposed limit $X$ that
$$P(|X_{n}-X|\geq\epsilon)\leq\frac{1}{\epsilon^{2}}\mathbb{E}\left[|X_{n}-X|^{2}\right]\to0\;\text{as}\;n\to\infty\;\text{for all}\;\epsilon>0\;\text{fixed}.$$

For example, in the proof of Ito's lemma we really proved that $$\lim_{n\to\infty}\sum_{i=0}^{n-1}f(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}=\int_{0}^{T}f(W(t),t)\;dt$$ in $L^{2}(\Omega)$, hence in probability, and along sequences of partitions with summable mesh also almost surely.  To clarify, the last statement means that for almost every sample path, or outcome $\omega\in\Omega$, we have
$$\lim_{n\to\infty}X_{n}(\omega):=\lim_{n\to\infty}\sum_{i=0}^{n-1}f(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}=\int_{0}^{T}f(W(t),t)\;dt.$$

The situation is similar for statements like "almost surely $[W,W](t)=t$" and "almost surely $\int f(t)\;dW(t)$ exists in the Ito sense."



Bifurcating Lease Embedded FX Derivatives



Section I.  Overview

Suppose an entity enters into an agreement to lease property and make rental payments each month, but that the fixed notional underlying the lease payments is denominated in some other currency.  This introduces an exposure for the lessee (and lessor), since now the lessee must pay (and the lessor receive) a domestic currency equivalent of some fixed amount in a foreign currency - in other words, the actual payment in the functional currency is (Foreign-Denominated Lease Notional) x (Exchange Rate), whatever that might be at the time the payment becomes due.  If the entity is a corporate entity, accounting regulations require the entity to "bifurcate" the embedded derivative from the contract and account for it as though it were a standalone derivative, per the rules of derivative accounting.  This introduces accounting complexities, but the problem must also of course be solved from a valuation point of view.

In this post we consider lease agreements as just described, as well as those with caps and floors on the exchange rate with strikes contractually written into the agreements.

Section II.  Valuation Methodology

For leases determined to have embedded derivatives (from the point of view of the domestic entity), we value the embedded derivative as a strip of component derivatives corresponding to each future cash flow. That is, each cash flow represents the notional (denominated in the foreign currency CUR1) for each component derivative, and value of the lease embedded derivative is the aggregate value of these component derivatives.

Our FX convention is CUR1/CUR2, where this rate is the number of units of CUR2 per 1 unit of CUR1 - such a quantity has units [CUR2]/[CUR1]. We refer to CUR2 as the domestic, functional and settlement currency and CUR1 as the foreign, deal and notional currency.

Our valuation methodology is based on usual market-practice - in particular, no arbitrage and discounted cash flow principles. For options, we use the additional assumption of no-arbitrage for an asset price following a simple geometric Brownian motion (Black-Scholes-Merton model). Consider a present valuation date $t$, future maturity date $T>t$, future cash flow $N=N(T)$, corresponding strike rate $K=K(T)$, forward rate $F=F_{t}(T)$, discount rate $D=D_{t}(T)$ and volatility $\sigma=\sigma_{t}(K,T)$. Let $V=V_{t}(T,N,K,F,D,\sigma)$ denote the value of a derivative written on CUR1/CUR2 with the previous parameters. Then our previous assumptions lead us to the following valuation formulas: $$(1)\;\;\;\;V^{\text{fwd}}_{t}(T)=N(T)\cdot(F_{t}(T)-K(T))\cdot D_{t}(T)$$
$$(2)\;\;\;\;V^{\text{call}}_{t}(T)=N(T)\cdot(\Phi(d_{+})F_{t}(T)-\Phi(d_{-})K(T))\cdot D_{t}(T),$$
and
$$(3)\;\;\;\;V^{\text{put}}_{t}(T)=N(T)\cdot(\Phi(-d_{-})K(T)-\Phi(-d_{+})F_{t}(T))\cdot D_{t}(T).$$
In (2) and (3) we define
$$d_{\pm}=\frac{1}{\sigma_{t}(K,T)\sqrt{T-t}}\left[\log\left(\frac{F_{t}(T)}{K(T)}\right)\pm\frac{1}{2}\sigma_{t}(K,T)^{2}(T-t)\right]$$
and
$$\Phi(x)=(2\pi)^{-1/2}\int_{-\infty}^{x}e^{\frac{-y^{2}}{2}}\;dy,$$
the standard normal cumulative distribution function.

(Note the dependence of $\sigma_{t}$ on $(K,T)$ is due to the nature of FX option markets exhibiting term structure variation and ``smiles.'')
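For concreteness, here is a minimal sketch of formulas (1)-(3) in code, assuming the forward rate, discount factor and volatility are supplied directly as flat inputs; the function and variable names are ours (not from any particular library), and the numbers in the example are made up.

```python
# Sketch of the forward / call / put valuation formulas (1)-(3) with flat made-up inputs.
from math import erf, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal CDF Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def d_plus_minus(F, K, sigma, tau):
    v = sigma * sqrt(tau)
    d_plus = (log(F / K) + 0.5 * sigma ** 2 * tau) / v
    return d_plus, d_plus - v

def fwd_value(N, F, K, D):                     # formula (1)
    return N * (F - K) * D

def call_value(N, F, K, D, sigma, tau):        # formula (2)
    dp, dm = d_plus_minus(F, K, sigma, tau)
    return N * (norm_cdf(dp) * F - norm_cdf(dm) * K) * D

def put_value(N, F, K, D, sigma, tau):         # formula (3)
    dp, dm = d_plus_minus(F, K, sigma, tau)
    return N * (norm_cdf(-dm) * K - norm_cdf(-dp) * F) * D

# Made-up example: a 9-month cash flow of 1,000,000 CUR1.
N, F, K, D, sigma, tau = 1_000_000, 1.10, 1.05, 0.98, 0.12, 0.75
print(fwd_value(N, F, K, D))
print(call_value(N, F, K, D, sigma, tau))
print(put_value(N, F, K, D, sigma, tau))
print(call_value(N, F, K, D, sigma, tau) - put_value(N, F, K, D, sigma, tau) - fwd_value(N, F, K, D))
```

The final line is a put-call parity sanity check: with a common strike, call minus put reproduces the forward value (1).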

Section III.  Extraction Methodology

Section III(a).  Specifying the Strike

We extract the embedded derivative in accordance with the principle that the stated value of the cash flow at inception of the lease agreement should be such that the value of the embedded derivative at inception is $0$. We apply this principle to the forward component of the embedded derivative, and approximate it by assuming that the cancellation between the cap and floor values (because one is a short position and the other is a long position - see below) would net $0$ if we assumed that they constitute a forward in combination. This is exactly true from put-call parity when the strikes are the same, but only approximately true if they are different (which they must be, since otherwise the combination of the three instruments would net $0$ and there would be no embedded derivative). In particular, if the lease agreement has $i=1,2,3,\ldots,n$ future cash flows, a cap $\overline{S}=\overline{S}(T_{i})$ and a floor $\underline{S}=\underline{S}(T_{i})$, then for each corresponding component derivative we set (where again, $t=0$ is the inception date of the lease): $$(4)\;\;\;\;K^{\text{fwd}}(T_{i})=F_{0}(T_{i}),$$ $$(5)\;\;\;\;K^{\text{cap}}(T_{i})=\overline{S}(T_{i}),$$ and $$(6)\;\;\;\;K^{\text{flr}}(T_{i})=\underline{S}(T_{i}).$$ Accounting rules indicate that this is the proper approach from a valuation point of view.

Section III(b).  Specifying the Derivative - A Decomposition

In keeping with our notation, we let $L_{t}(T_{i})$ denote the present fair value at time $t$ of the future cash flow made at time $T_{i}$. This is always a negative quantity from the entity's point of view. The idea in order to obtain the embedded derivative is to separate the risky portion of this value from the non-risky portion. In particular, we decompose $L_{t}(T_{i})$ as
$$L_{t}(T_{i})=B_{t}(T_{i})+Z_{t}(T_{i}),$$
where $B_{t}(T_{i})$ only depends on $t$ through the discount factor $D_{t}(T_{i})$ (in particular, it is independent of market variables like $F_{t}(T_{i})$), and $Z_{t}(T_{i})$ is a function of all random market variables inherent in $L_{t}(T_{i})$. There are infinitely many ways to structure such a decomposition, but the accounting guidance discussed above is equivalent to certain initial and terminal conditions which allow us to uniquely solve for $B(T_{i})$ and $Z(T_{i})$.

Section III(c).  Specifying the Derivative - Forward Only Case

If the lease payment $L_{t}(T_{i})$ lacks any optional features, then its payoff is
$$L_{T_{i}}(T_{i})=-N\cdot S_{T_{i}}$$
and therefore its fair present value for $0<t<T_{i}$ is given by $$(7)\;\;\;\;L_{t}(T_{i})=-N(T_{i})\cdot F_{t}(T_{i})\cdot D_{t}(T_{i}).$$ Observe that this quantity has units of CUR2 and is the present value of what the entity has to pay at time $T_{i}$. Since it depends on the forward rate $F_{t}(T_{i})$, it has an exposure to CUR1/CUR2 and is therefore risky. The idea previously discussed involves decomposing $L_{t}(T_{i})$ into two parts $$L_{t}(T_{i})=B_{t}(T_{i})+Z_{t}(T_{i}),$$ where $B_{t}(T_{i})$ only depends on $t$ through $D_{t}(T_{i})$ (in particular, it is independent of $F_{t}(T_{i})$), and $Z_{t}(T_{i})$ is a function of $F_{t}(T_{i}).$ The initial condition $$Z_{0}(T_{i})=0$$
and terminal payoff condition
$$L_{T_{i}}(T_{i})=B_{T_{i}}(T_{i})+Z_{T_{i}}(T_{i})=-N(T_{i})\cdot F_{T_{i}}(T_{i})\cdot D_{T_{i}}(T_{i})=-N(T_{i})\cdot S_{T_{i}}$$
allow us to uniquely solve for the payoffs of $B(T_{i})$ and $Z(T_{i})$; the principle of rational pricing and the fact that $B_{t}(T_{i})$ is constant in $t$ (up to discounting) then give us $L$, $B$, and $Z$ for all $0<t<T_{i}$. Indeed, from the terminal condition we have
$$Z_{T_{i}}(T_{i})=-N(T_{i})\cdot S_{T_{i}}-B_{T_{i}}(T_{i})$$
and from the initial condition
$$B_{0}(T_{i})=L_{0}(T_{i})=-N(T_{i})\cdot F_{0}(T_{i})\cdot D_{0}(T_{i})=-N(T_{i})\cdot K^{\text{fwd}}(T_{i})\cdot D_{0}(T_{i}).$$
Hence (dropping the ``fwd'' from $K$),
$$Z_{T_{i}}(T_{i})=N(T_{i})\cdot(K(T_{i})-S_{T_{i}}).$$
This shows that the payoff of $Z(T_{i})$ is equal to that of a short position in a CUR1/CUR2 forward with notional $N(T_{i})$. Applying our reasoning above, we discover that $Z_{t}(T_{i})$ is given by (1) with the sign reversed. Explicitly,
$$(8)\;\;\;\;Z_{t}(T_{i})=N(T_{i})\cdot(K(T_{i})-F_{t}(T_{i}))\cdot D_{t}(T_{i})$$
and
$$(9)\;\;\;\;B_{t}(T_{i})=-N(T_{i})\cdot K(T_{i})\cdot D_{t}(T_{i}).$$
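A tiny numerical check of this decomposition under made-up market data: with $K=F_{0}(T_{i})$ as in (4), the embedded derivative is worth zero at inception and thereafter coincides with a short forward, i.e. formula (1) with the sign reversed.

```python
# Forward-only decomposition check with made-up inputs.
N = 1_000_000                # CUR1 notional of one cash flow
F0, D0 = 1.10, 0.990         # forward rate and discount factor at inception (t = 0)
Ft, Dt = 1.20, 0.985         # the same quantities at a later valuation date t
K = F0                       # strike chosen so that Z_0 = 0, per (4)

L0, Lt = -N * F0 * D0, -N * Ft * Dt     # PV of the lease cash flow, formula (7)
B0, Bt = -N * K * D0, -N * K * Dt       # the fixed ("non-risky") leg, formula (9)
Z0, Zt = L0 - B0, Lt - Bt               # the embedded derivative, formula (8)

short_fwd = -N * (Ft - K) * Dt          # minus formula (1): a short forward position
print(Z0, Zt, short_fwd)                # expect 0.0, and Zt equal to short_fwd
```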

Section III(d).  Specifying the Derivative - Ranged Forward Case (Caps & Floors)

If $L_{t}(T_{i})$ has optional features, then the initial condition $Z_{0}(T_{i})=0$ is replaced by the value of these optional features, using the strike prices given by (5) and (6). For a ranged forward, we have a cap $\overline{S}(T_{i})$ and a floor $\underline{S}(T_{i})$. With our FX convention CUR1/CUR2, the terms ``cap'' and ``floor'' are really such from the counter-party's perspective, or from the entity's perspective when considering the value of the overall lease $L_{t}(T_{i})$ (a cap and floor on how much the entity has to pay). However, when considering the value of $Z_{t}(T_{i})$ from the entity's perspective, the cap $\overline{S}(T_{i})$ is an upper bound on how much CUR2 can weaken against CUR1, hence a floor ($=1/\overline{S}(T_{i})$ in CUR2/CUR1 terms) on their losses from their short position in the forward component of the embedded derivative. Conversely, the floor $\underline{S}(T_{i})$ is a lower bound on how much CUR2 can strengthen against CUR1, hence a cap ($=1/\underline{S}(T_{i})$) on their gains.

The previous paragraph shows that $Z_{t}(T_{i})$ is the sum of three distinct derivatives, $\sum_{k=1}^{3}Z^{k}_{t}(T_{i})$: a short position in a put option on CUR1/CUR2 struck at $\underline{S}(T_{i})$, a long position in a call option on CUR1/CUR2 struck at $\overline{S}(T_{i})$, and a short position in a forward on CUR1/CUR2 struck at $K(T_{i})=F_{0}(T_{i}).$ This can be proved as we did for the case of a forward, where the initial condition is now $$Z_{0}(T_{i})=\sum_{k=1}^{3}Z^{k}_{0}(T_{i})=V^{\text{call}}_{0}(T_{i})-V^{\text{put}}_{0}(T_{i})+\underbrace{\left(-V^{\text{fwd}}_{0}(T_{i})\right)}_{=0},$$
as given by (2), (3) and (1), respectively.
The terminal condition is
$$L_{T_{i}}(T_{i})=\left\{\begin{array}{ll}-N(T_{i})\cdot\overline{S}(T_{i}),&S_{T_{i}}>\overline{S}(T_{i})\\-N(T_{i})\cdot S_{T_{i}},&\underline{S}(T_{i})\leq S_{T_{i}}\leq\overline{S}(T_{i})\\-N(T_{i})\cdot \underline{S}(T_{i}),&S_{T_{i}}<\underline{S}(T_{i}).\end{array}\right.$$
It follows that
$$B_{0}(T_{i})=-N(T_{i})\cdot K(T_{i})\cdot D_{0}(T_{i})$$
and hence
$$B_{t}(T_{i})=-N(T_{i})\cdot K(T_{i})\cdot D_{t}(T_{i})$$
for all $0<t<T_{i}.$ Now,
$$Z_{T_{i}}(T_{i})=L_{T_{i}}(T_{i})-B_{T_{i}}(T_{i})=\left\{\begin{array}{ll}N(T_{i})\cdot(K(T_{i})-\overline{S}(T_{i})),&S_{T_{i}}>\overline{S}(T_{i})\\N(T_{i})\cdot(K(T_{i})-S_{T_{i}}),&\underline{S}(T_{i})\leq S_{T_{i}}\leq\overline{S}(T_{i})\\N(T_{i})\cdot(K(T_{i})-\underline{S}(T_{i})),&S_{T_{i}}<\underline{S}(T_{i}).\end{array}\right.$$
One verifies easily that this is equal to
$$Z_{T_{i}}(T_{i})=N(T_{i})\cdot\Big[-(S_{T_{i}}-K(T_{i}))+\max(S_{T_{i}}-\overline{S}(T_{i}),0)-\max(\underline{S}(T_{i})-S_{T_{i}},0)\Big],$$
which are the payoff functions of the indicated derivatives. Thus,
$$(10)\;\;\;\;Z_{t}(T_{i})=-A+B-C$$
for all $0<t<T_{i}$ where $A$ is given by (1), $B$ by (2) and $C$ by (3).
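The payoff identity behind this decomposition is easy to verify numerically, per unit of notional $N(T_{i})$; the strike, cap and floor below are made up.

```python
# Check: K - min(max(S, floor), cap) equals a short forward plus a long call at the cap
# and a short put at the floor, for a grid of terminal spot values S.
import numpy as np

K, cap, floor = 1.10, 1.25, 1.00
S = np.linspace(0.7, 1.6, 1001)

piecewise = K - np.clip(S, floor, cap)
replication = -(S - K) + np.maximum(S - cap, 0.0) - np.maximum(floor - S, 0.0)
print(np.max(np.abs(piecewise - replication)))    # expect 0.0 (up to rounding)
```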

Section IV. Lease Modifications (Introduction)

In a subsequent post I will elaborate on the bifurcation and valuation of modifications to lease agreements.  For now, let us keep the above in mind and consider a typical lease cash flow $L_{t}(T)$ with a notional of $N_{0}$.  At the time the lease is entered into, FASB requires bifurcation of any implied derivative $Z$. Suppose $Z$ is just an FX forward (short CUR1/CUR2).  At inception ($t=0$) the strike is $K_{0}=F_{0}(T)$, the forward rate corresponding to the future time $T$ as calculated at time $t=0$. The value of $Z$ at any time $0<t<T$ is
$$Z_{t}(N_{0},K_{0},T)=N_{0}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T),$$
where $D_{t}(T)$ is the discount factor for term $T$ at time $t$. This valuation methodology makes the embedded derivative $0$ at inception of the lease.

Suppose at some time $0<\tau<T$ we have the modification $N_{0}\mapsto N_{\tau}<N_{0}$ (the lease payment decreases). Then this is economically equivalent to maintaining the unmodified lease and entering into another lease, with notional $\Delta N_{0,\tau}:=N_{0}-N_{\tau}$, as a lessor at the time of modification $\tau$. FASB would then require the lessor to put the resulting embedded derivative on their balance sheet at that time.  The value of this derivative (since it is equivalent to a long position in CUR1/CUR2, or a short position in CUR2/CUR1) is
$$\tilde{Z_{t}}(\Delta N_{0,\tau},K_{\tau},T)=\Delta N_{0,\tau}\cdot(F_{t}(T)-K_{\tau})\cdot D_{t}(T).$$
Now, from an operational lease accounting point of view, the net payment at the cash flow date is just $N_{0}-\Delta N_{0,\tau}=N_{\tau}.$ Therefore, the net embedded derivative of this overall lease contract is $Z+\tilde{Z}$ (the ``+'' is effectively a ``-'' since we modeled $\tilde{Z}$ as a long position). Thus, the derivative's value at all times $\tau<t<T$ is
$$\begin{align*}
Z^{\tau}_{t}(N_{\tau},K_{\tau},T)
&=Z_{t}(N_{0},K_{0},T)+\tilde{Z_{t}}(\Delta N_{0,\tau},K_{\tau},T)\\
&=N_{0}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T)+\Delta N_{0,\tau}\cdot(F_{t}(T)-K_{\tau})\cdot D_{t}(T)\\
&=D_{t}(T)\Big[N_{0}K_{0}-N_{0}F_{t}(T)+N_{0}F_{t}(T)-N_{0}K_{\tau}-N_{\tau}F_{t}(T)+N_{\tau}K_{\tau}\Big]\\
&=N_{0}\cdot(K_{0}-K_{\tau})\cdot D_{t}(T)+N_{\tau}\cdot(K_{\tau}-F_{t}(T))\cdot D_{t}(T)\\
&=N_{0}\Delta K_{0,\tau}D_{t}(T)+N_{\tau}\cdot(K_{\tau}-F_{t}(T))\cdot D_{t}(T)\\
&=N_{\tau}(K_{\tau}-F_{t}(T))\cdot D_{t}(T)+C
\end{align*}$$
where the constant $C:=N_{0}\Delta K_{0,\tau}D_{t}(T)$ (with $\Delta K_{0,\tau}:=K_{0}-K_{\tau}$) is the settlement price of the original embedded derivative established at time $\tau$, but settled at time $T$ and discounted back to time $t$.  The value of the new derivative is then the sum of this settlement and a newly entered forward with the revised notional.  There is a more intuitive way to rewrite this result.  Indeed,
$$\begin{align*}Z^{\tau}_{t}(N_{\tau},K_{\tau},T)
&=Z_{t}(N_{0},K_{0},T)+\tilde{Z_{t}}(\Delta N_{0,\tau},K_{\tau},T)\\
&=N_{0}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T)+\Delta N_{0,\tau}\cdot(F_{t}(T)-K_{\tau})\cdot D_{t}(T)\\
&=N_{\tau}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T)+\Delta N_{0,\tau}\Delta K_{0,\tau}\cdot D_{t}(T)\\
&=N_{\tau}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T)+C,
\end{align*}$$
where now $C:=\Delta N_{0,\tau}\Delta K_{0,\tau}D_{t}(T)$ is the settlement price of the cancelled notional, and the value of the new embedded derivative is the sum of this settlement cost and the original forward contract with reduced notional.  In other words, it is the value of the same forward, but with a revised notional that represents the decreased exposure, plus a constant settlement amount that is carried through the valuation.  This constant is the cost of reducing the exposure at time $\tau$ from $N_{0}$ to $N_{\tau}$.
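The algebra above is easily spot-checked with made-up numbers; the three expressions below are the combined original-plus-offsetting position and the two rewritten forms, and they should agree to within rounding.

```python
# Spot-check of the lease-modification algebra with made-up inputs.
N0, N_tau = 1_000_000.0, 600_000.0   # original and reduced notionals (CUR1)
K0, K_tau = 1.10, 1.02               # strikes fixed at inception and at modification time tau
Ft, Dt = 1.07, 0.98                  # current forward rate and discount factor for term T
dN, dK = N0 - N_tau, K0 - K_tau      # Delta N_{0,tau} and Delta K_{0,tau}

combined = N0 * (K0 - Ft) * Dt + dN * (Ft - K_tau) * Dt   # Z + Z-tilde
form_1   = N_tau * (K_tau - Ft) * Dt + N0 * dK * Dt       # reduced forward at K_tau plus settlement
form_2   = N_tau * (K0 - Ft) * Dt + dN * dK * Dt          # reduced forward at K_0 plus settlement
print(combined, form_1, form_2)                           # all three should agree
```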

The entity can account for this in two ways: either recognize the settlement cost $C$ in P/L on the modification date, or expense it over time by carrying the settlement cost from valuation period to valuation period up to expiration of the lease.  The two are equivalent from an accounting point of view, but from a valuation point of view one will lead to a jump in the value of the embedded derivative and the other will maintain continuity of the valuation.

A similar approach can be taken to deal with lease increases and extensions - these valuations and related accounting issues will be taken up in a subsequent post.


17 March, 2015

Can a Derivative's Value Exceed the Underlying Notional Value?



On a recent project I valued some derivatives, the results of which the client balked at because the values exceeded the notional on which they were written.  So is it ever possible for a derivative's valuation to exceed its underlying notional?


The answer, of course, depends, and first we need to clarify what we mean by a derivative's value exceeding its notional.  A typical derivative (like an option or future) is written on some underlying asset with price $S_{t}$ and some quantity or notional $N$.  The term quantity is frequently used for assets like stocks and the term notional for assets like currencies - so in the latter case, if I have a USD/EUR call option, then I view the asset as the US dollar (that I want to buy a call option on) priced in euros (that is what the USD/EUR exchange rate is - the cost of a US dollar in euros), with a notional (i.e. quantity) equal to (say) $\$100,000,000$ USD.

Notice that the quantity/notional has units in the underlying asset and that the spot price $S_{t}$ has units of value in the numeraire/settlement currency per 1 asset.  Hence, when we ask if a derivative's value can exceed its underlying notional, we are really asking whether the value at time $t$, denoted $V_{t}$, can exceed the quantity $NS_{t}$, which has units in the settlement currency (EUR in the above example).  In other words, we ask whether
$$V_{t}>NS_{t}$$
can hold without introducing an arbitrage.

Essentially, the purpose of the above discussion was to express precisely what we mean for the valuation to exceed the underlying notional and, moreover, to emphasize that a derivative's valuation cannot be directly compared to the notional in order to answer the question, since the units are not the same - the underlying notional $N$ needs to be multiplied by the spot price $S_{t}$ so that each has units in the valuation currency.

The classic counter-example to answering this question affirmatively in all instances comes from considering a call option written on a quantity $N$ of some asset with price $S_{t}$.  If you value this option at time $t$, then it is clear that
$$V_{t}<NS_{t},$$
for otherwise one could short a covered option at no cost and an arbitrage would exist.  But this argument no longer holds for instruments with payoffs that are not artificially bounded by some optionality mechanism.  Indeed, the example from my experience involved an FX forward on CUR1/CUR2 (to be generic).  For such a forward, let $K$ be the strike, $N$ the notional (denominated in CUR1), $D$ the discount factor and $\alpha$ the CUR2/CUR1 exchange rate ($1/S_{t}$).  Then if the entity is in the short position we have
$$-\alpha N\cdot(F-K)\cdot D>N$$
$$(F-K)<-\frac{1}{\alpha D}$$
$$F<K-\frac{1}{\alpha D}.$$
Hence, if the forward rate is sufficiently small (i.e. price of CUR1 declined) with respect to the inception strike, then the value of the forward will exceed the notional as an asset. Conversely,
$$-\alpha N\cdot(F-K)\cdot D<-N$$
$$(F-K)>\frac{1}{\alpha D}$$
$$F>K+\frac{1}{\alpha D}$$
Hence, if the forward rate is sufficiently large (i.e. the price of CUR1 increased) with respect to the inception strike, then the value of the forward will exceed the notional as a liability.
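A made-up numerical example: a short CUR1/CUR2 forward whose value, converted into CUR1 via $\alpha=1/S_{t}$, exceeds the CUR1 notional once the forward rate has fallen sufficiently far below the inception strike.

```python
# Short FX forward: value in CUR1 terms versus the CUR1 notional (all numbers are made up).
N = 1_000_000            # notional in CUR1
K = 1.50                 # strike (CUR2 per CUR1), set at inception
S = 0.60                 # current spot (CUR2 per CUR1), so alpha = 1/S
F = 0.62                 # current forward rate
D = 0.95                 # discount factor

alpha = 1.0 / S
value_in_CUR1 = -alpha * N * (F - K) * D
print(value_in_CUR1, N, value_in_CUR1 > N)      # the value exceeds the notional
print(K - 1.0 / (alpha * D))                    # threshold: value > N once F < K - 1/(alpha*D)
```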

It is not difficult to come up with similar bounds for other basic instruments such as swaps either.