23 March, 2015

A Primer in Harmonic Analysis



I picked these problems from Modern Fourier Analysis Vol I - I think that they serve as a good primer for the basic techniques and theorems in harmonic analysis (a subject that I have recently started looking back into in order to deal with some of the techniques used when working with Levy processes in mathematical finance).
Problem I.  Fix $d\geq1$ and suppose $\psi:(0,\infty)\mapsto[0,\infty)$ is $C^{1}$, non-increasing, and $\int_{\mathbb{R}^{d}}\psi(|x|)\;dx\leq A<\infty.$  Define
$$[M_{\psi}f](x):=\sup_{0<r<\infty}\frac{1}{r^{d}}\int_{\mathbb{R}^{d}}|f(x-y)|\psi\left(\frac{|y|}{r}\right)\;dy$$
and show that $$[M_{\psi}f](x)\leq A[Mf](x)$$
where $M$ is the usual Hardy-Littlewood maximal function.
Solution.  We first observe that the translation invariance of the indicated estimate implies that it is sufficient to prove the case $x=0$ (this can be seen explicitly by replacing $f$ with $\tau_{x}f$, where $\tau_{x}$ denotes translation by $x$, and applying the case $x=0$ to conclude that the estimate holds for all $x$).   For convenience let us define $\psi_{r}(|y|)=r^{-d}\psi(|y|/r)$.  The radial nature of the terms in the estimate suggests polar coordinates will be useful in dealing with the resulting integrals.  Let us recall that the polar coordinate formula implies that
$$\frac{d}{ds}\int_{B(0,s)}f(y)\;dy=\frac{d}{ds}\int_{0}^{s}dt\int_{\partial B(0,t)}f(\omega)\;dS(\omega)=\int_{\partial B(0,s)}f(\omega)\;dS(\omega)=s^{d-1}\int_{S^{d-1}}f(s\omega)\;dS(\omega).$$
In the last term we have used a change of variables in order to place the integration over the unit sphere (and in particular, to keep the domain fixed).  In order to apply this formula in an integration by parts without causing notational chaos, let us define
$$\alpha(s)=\int_{S^{d-1}}|f(s\omega)|\;dS(\omega)$$
and
$$\beta(s)=\int_{0}^{s}\alpha(t)t^{d-1}\;dt.$$
Note that $\beta(s)$ is majorized by $\omega(d)s^{d}[Mf](0)$ where $\omega(d)$ is the measure of the unit ball in $\mathbb{R}^{d}$.
Let us make the further assumption that $\psi$ is compactly supported in the ball of radius $\delta$, so that $\psi_{r}$ is also compactly supported (and thus bounded) in the ball of radius $r\delta$.  Invoking polar coordinates on the left hand side, setting $x=0$, using the change of variables $y\mapsto-y$ together with $\psi(|-y|)=\psi(|y|)$, and integrating by parts along with the fact that $\beta(0)=0=\psi_{r}(r\delta)$, we estimate at last

$$\begin{align*}
\int_{\mathbb{R}^{d}}|f(-y)|\psi_{r}(|y|)\;dy&=\int_{\mathbb{R}^{d}}|f(y)|\psi_{r}(|y|)\;dy\\
&=\int_{0}^{\infty}\psi_{r}(s)s^{d-1}\;ds\int_{\mathcal{S}^{d-1}}|f(s\omega)|\;dS(\omega)\\
&=\int_{0}^{r\delta}\psi_{r}(s)s^{d-1}\;ds\int_{\mathcal{S}^{d-1}}|f(s\omega)|\;dS(\omega)\\
&=\int_{0}^{r\delta}\psi_{r}(s)s^{d-1}\alpha(s)\;ds\\
&=\beta(r\delta)\psi_{r}(r\delta)-\beta(0)\psi_{r}(0)-\int_{0}^{r\delta}\beta(s)d\psi_{r}(s)\\
&=\int_{0}^{r\delta}\beta(s)d(-\psi_{r}(s))\\
&\leq[Mf](0)\int_{0}^{\infty}\omega(d)s^{d}d(-\psi_{r}(s))\\
&=[Mf](0)\int_{0}^{\infty}d\omega(d)s^{d-1}\psi_{r}(s)\;ds\\
&=[Mf](0)\int_{0}^{\infty}\psi_{r}(s)\;ds\int_{\partial B(0,s)}\;dS(\omega)\\
&=[Mf](0)\int_{\mathbb{R}^{d}}\frac{1}{r^{d}}\psi\left(\frac{|y|}{r}\right)\;dy\\
&=[Mf](0)\int_{\mathbb{R}^{d}}\psi(|y|)\;dy\\
&=A[Mf](0),
\end{align*}$$
as desired.  To complete the proof, take an increasing sequence $\psi_{n}\nearrow\psi$ of compactly supported, non-increasing $C^{1}$ functions.  Since the estimate holds for each $\psi_{n}$, monotone convergence shows that it holds for the limit function $\psi$.
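As a quick sanity check (not part of the proof), here is a minimal numerical sketch of the inequality in dimension $d=1$, under the arbitrary choices $\psi(s)=e^{-s^{2}}$ (so $A=\sqrt{\pi}$) and a test function concentrated away from the origin; the grids and the radius range are likewise arbitrary.

```python
# Minimal numerical spot-check of M_psi f(0) <= A * Mf(0) in d = 1.
# Assumptions: psi(s) = exp(-s^2) (C^1, non-increasing, integrable), so A = sqrt(pi),
# and a test function f concentrated near y = 3.  This is a sketch, not a proof.
import numpy as np

y = np.linspace(-60.0, 60.0, 240001)
dy = y[1] - y[0]
f_pos = np.exp(-(y - 3.0) ** 2)        # |f(y)| sampled on the grid (f is non-negative)
f_neg = np.exp(-(-y - 3.0) ** 2)       # |f(-y)| on the same grid
psi = lambda s: np.exp(-s ** 2)
A = np.sqrt(np.pi)                     # integral of psi(|x|) dx over R

radii = np.geomspace(0.05, 50.0, 400)

# Hardy-Littlewood maximal function of f at x = 0: sup over r of the average of |f| on [-r, r].
Mf0 = max(f_pos[np.abs(y) <= r].sum() * dy / (2.0 * r) for r in radii)

# M_psi f(0) = sup over r of (1/r) * integral |f(-y)| psi(|y|/r) dy.
Mpsi0 = max((f_neg * psi(np.abs(y) / r)).sum() * dy / r for r in radii)

print(f"M_psi f(0) = {Mpsi0:.4f}   A * Mf(0) = {A * Mf0:.4f}")
```

Any other non-increasing, integrable $C^{1}$ profile $\psi$ and locally integrable $f$ should exhibit the same inequality, up to discretization error.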

Problem II.  Consider the heat kernel $$G(x,t)=\frac{1}{(4\pi t)^{\frac{d}{2}}}e^{-\frac{|x|^{2}}{4t}}.$$
  1. Given $\alpha>0$ find constants $\beta$ and $C$ so that $$G(x+y,t)\leq CG(x,\beta t)$$ holds for every $x\in\mathbb{R}^{d}$, $t>0$, and $|y|\leq\alpha\sqrt{t}$.
  2. Deduce that for $f\in L^{1}$ and $u(x,t)=(G(\cdot,t)*f)(x)$ we have $$\mu\left(\left\{y:|u(x,t)|\geq\lambda\;\text{for some}\;t>0\;\text{and}\;x\in B(y,\alpha\sqrt{t})\right\}\right)\leq\frac{C_{d,\alpha}||f||_{L^{1}}}{\lambda}$$ for some constant $C_{d,\alpha}$ depending only on $d$ and $\alpha$.
(1) Solution. We estimate the quadratic form
$$\begin{align*}
|x|^{2}
&=|x+y-y|^{2}\\
&=|x+y|^{2}-2(x+y)\cdot y+|y|^{2}\\
&\leq|x+y|^{2}+2|x+y||y|+|y|^{2}\\
&\leq|x+y|^{2}+|x+y|^{2}+|y|^{2}+|y|^{2}\\
&=2|x+y|^{2}+2|y|^{2}\\
&\leq2|x+y|^{2}+2\alpha^{2}t,
\end{align*}$$
where we have used the fact that $2ab\leq a^{2}+b^{2}$ for $a,b\in\mathbb{R}$ and also the restriction $|y|\leq\alpha\sqrt{t}.$  Consequently,
$$\frac{|x+y|^{2}}{4t}\geq\frac{\frac{1}{2}|x|^{2}-\alpha^{2}t}{4t}=\frac{|x|^{2}}{8t}-\frac{\alpha^{2}}{4}$$
and thus
$$\begin{align*}
G(x+y,t)
&=(4\pi t)^{-d/2}\exp\left\{\frac{-|x+y|^{2}}{4t}\right\}\\
&\leq(4\pi t)^{-d/2}\exp\left\{\frac{-|x|^{2}}{8t}+\frac{\alpha^{2}}{4}\right\}\\
&=e^{\alpha^{2}/4}2^{d/2}(8\pi t)^{-d/2}\exp\left\{\frac{-|x|^{2}}{8t}\right\}.
\end{align*}$$
This completes the proof, with the estimate holding for $\beta=2$ and $C=e^{\alpha^{2}/4}2^{d/2}.$
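A quick numerical spot-check of this bound (a sketch under arbitrary choices of $d$, $\alpha$ and sampling ranges; logarithms are compared to avoid underflow):

```python
# Spot-check of G(x+y, t) <= C * G(x, beta*t) with beta = 2, C = exp(alpha^2/4) * 2^(d/2),
# for randomly drawn x, t and |y| <= alpha*sqrt(t).  All sampling choices are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
d, alpha, beta = 3, 1.5, 2.0
logC = alpha ** 2 / 4 + (d / 2) * np.log(2.0)

def logG(x, t):
    """Logarithm of the heat kernel G(x, t) in dimension d."""
    return -(d / 2) * np.log(4 * np.pi * t) - np.dot(x, x) / (4 * t)

worst = -np.inf
for _ in range(20_000):
    t = rng.uniform(0.01, 10.0)
    x = rng.normal(scale=5.0, size=d)
    y = rng.normal(size=d)
    y *= rng.uniform(0.0, alpha * np.sqrt(t)) / np.linalg.norm(y)  # enforce |y| <= alpha*sqrt(t)
    worst = max(worst, logG(x + y, t) - (logC + logG(x, beta * t)))

print(f"max of log G(x+y,t) - log[C G(x,2t)] over samples = {worst:.4f}  (should be <= 0)")
```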

Remark.  It is interesting that this result can be obtained as a special case of the Hardy-Moser-Trudinger inequality available here http://arxiv.org/abs/1012.5591.  Indeed, generalizing a bit, we are asked to produce constants $a,\beta$ such that
$$\sup_{x\in\mathbb{R}^d,y\in B(0,a)}\frac{G(x+y,t)}{G(x,\beta t)}<\infty.$$
But the continuity and non-vanishing of $G$ imply that the above estimate will hold if
$$\int_{\mathbb{R}^d} \frac{G(x+y,t)}{G(x,\beta t)} dx$$
converges.  Bounding the integral, we get
$$D\int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|x+y|^2-|x|^2\right\}\right) dx \leq \int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|y|^2-2x\cdot y\right\}\right) dx:=J.$$
Now, using the fact that $|y|\leq a$, we get
$$J\leq I(a)=\int_{\mathbb{R}^d} \exp\left(\frac{-\beta+1}{4t\beta} \left\{|a|^2-2x\cdot y\right\}\right) dx.$$
The Moser-Trudinger inequality states that $I(a)<\infty$ for values $$a^2\frac{-\beta+1}{4t\beta} \leq 4\pi,$$ which readily gives $a\leq C\sqrt{t}$, from which the claim follows.

(2) Solution.  Let $C$ and $\beta$ be the constants from part (1).  If $x\in B(y,\alpha\sqrt{t})$, write $x=y+h$ with $|h|\leq\alpha\sqrt{t}$; then part (1) together with Problem I (the kernel $G(\cdot,\beta t)$ has the form $r^{-d}\psi(|\cdot|/r)$ with $r=\sqrt{\beta t}$ and $\psi(s)=(4\pi)^{-d/2}e^{-s^{2}/4}$) gives
$$|u(x,t)|\leq C\,(G(\cdot,\beta t)*|f|)(y)\leq CA\,[Mf](y).$$
Hence the set in question is contained in $\{y:[Mf](y)\geq\lambda/(CA)\}$, and the claim follows from the weak-type $(1,1)$ inequality for the Hardy-Littlewood maximal function.

Problem III.  Let $f:\mathbb{R}^{d-1}\to\mathbb{R}$ belong to $L^{\infty}$ and let $u:\mathbb{R}^{d}_{+}\to\mathbb{R}$ be the Poisson integral of $f$.
  1. Show that $u(x,y)$ converges non-tangentially almost everywhere to $f(x)$ (i.e. from approach regions contained in any cone with vertex at $x$).  Specifically, for fixed $x_{0}$ and $$\mathcal{C}_{\alpha}(x_{0}):=\{(x,y)\in\mathbb{R}^{d}_{+}\;:\;|x-x_{0}|<\alpha y\},$$ we have $$\lim_{(x,y)\to(x_{0},0),\;(x,y)\in \mathcal{C}_{\alpha}(x_{0})}u(x,y)=f(x_{0}).$$ The convention for the Poisson kernel is $$P_{y}(x)=\frac{c_{n}y}{(|x|^{2}+y^{2})^{d/2}},$$ where we regard $y>0$ and $x\in\mathbb{R}^{d-1}$ so that a typical point $z\in\mathbb{R}^{d}_{+}$ is $z=(x,y)$ ($y$ is the distance from $z$ to $\partial\mathbb{R}^{d}_{+}$).  The Poisson integral of $f$ is then $$u(x,y)=\int_{\mathbb{R}^{d-1}}P_{y}(t)f(x-t)\;dt.$$ Since it will be needed below, we note that this is equivalent to the more usual representation $$P_{y}(x-t)=\frac{c_{n}y}{|(x,y)-(t,0)|^{d}}.$$  The expression $P_{y}(x-t)$ of course arises from the commutativity of the convolution above (a change of variables in the integral).  This last form will be used to obtain the final estimate for $J$ in the proof below. 
  2. Show that $u$ matches the boundary values in a distributional sense. 
  3.  Show that if $v\in L^{\infty}(\mathbb{R}^{d}_{+})$ is distributionally harmonic and matches the boundary values $f$ in a distributional sense, then $u=v$ as distributions.
(1) Solution.  Fix $x_{0}\in\mathbb{R}^{d-1}=\partial\mathbb{R}^{d}_{+}.$  To obtain a.e. nontangential convergence, we need to show $u(x_{0}-t,y)\to f(x_{0})$ for a.e. $x_{0}$, where $t\in\mathbb{R}^{d-1}$ and $|t|\leq\alpha y.$  We have as a first attempt
$$\begin{align*}
|u(x_{0}-t,y)-f(x_{0})|
&=\left|\int_{\mathbb{R}^{d-1}}P_{y}(x)f(x_{0}-t-x)\;dx-f(x_{0})\right|\\
&\leq\int_{\mathbb{R}^{d-1}}P_{y}(x-t)|f(x_{0}-x)-f(x_{0})|\;dx.
\end{align*}$$
(Here we used a change of variables in the convolution together with $\int_{\mathbb{R}^{d-1}}P_{y}(x-t)\;dx=1$.)  If we estimate the integral directly, we get only the inferior bound $2||f||_{\infty}.$  To gain better control on $f$, we now suppose $x_{0}\in\mathcal{L}(f),$ the Lebesgue set of $f$.  Since $L^{\infty}\subset L^{1}_{\text{loc}}$, we see that $m(\mathcal{L}(f)^{c})=0$ by the corresponding well-known result for locally integrable functions.  Now let $\epsilon>0$.  Our choice of $x_{0}$ and the Lebesgue differentiation theorem imply the existence of a $\delta>0$ such that if $r<\delta$ we have
$$\frac{1}{m(B_{r})}\int_{|x-x_{0}|\leq r}|f(x)-f(x_{0})|\;dx=\frac{1}{m(B_{r})}\int_{|x|\leq r}|f(x_{0}-x)-f(x_{0})|\;dx<\epsilon.$$
With this $\delta$, we now estimate the last integral by splitting it over the complementary sets $B(0,\delta)$ and $(B(0,\delta))^{c}$, denoting the two pieces by $I$ and $J$, respectively.  Note that $I$ would be trivial if $f$ were continuous; here we are saved by the pointwise majorization property of the maximal function (Problem I), together with the kernel estimate $P_{y}(x-t)\leq A_{\alpha}P_{y}(x)$ proved below.  Indeed, set $g(x)=|f(x)-f(x_{0})|\chi_{|x-x_{0}|<\delta}$, so that the Lebesgue point estimate above gives $[Mg](x_{0})\leq\epsilon$.  Hence,
$$I=\int_{|x|\leq\delta}P_{y}(x-t)|f(x_{0}-x)-f(x_{0})|\;dx=\int_{\mathbb{R}^{d-1}}P_{y}(x-t)g(x_{0}-x)\;dx\leq A_{\alpha}\int_{\mathbb{R}^{d-1}}P_{y}(x)g(x_{0}-x)\;dx\leq A_{\alpha}A\,[Mg](x_{0})\leq A_{\alpha}A\,\epsilon,$$
where $A$ is the constant from Problem I applied (in dimension $d-1$) to the profile of the Poisson kernel.
We now proceed to estimate $J$.  We first observe that if $y>0$ is fixed, $t\in\mathbb{R}^{d-1}$, and $|t|<\alpha y$, then
$$P_{y}(x-t)\leq A_{\alpha}P_{y}(x)$$
for some constant $A_{\alpha}$ independent of $f$ (this is very similar to the estimate in Problem II above, except that there the analog of $y$ is not fixed, so a second constant $\beta$ is introduced).  To prove this, note that for $\alpha,y>0$ and $|t|\leq\alpha y$, and for simplicity $d=2$,
$$\begin{align*}
x^{2}+y^{2}
&=((x-t)+t)^{2}+y^{2}\\
&=(x-t)^{2}+2(x-t)t+t^{2}+y^{2}\\
&\leq(x-t)^{2}+\alpha(x-t)^{2}+\frac{1}{\alpha}t^{2}+t^{2}+y^{2}\\
&\leq(1+\alpha)(x-t)^{2}+(1+\alpha+\alpha^{2})y^{2}\\
&\leq(1+\alpha+\alpha^{2})[(x-t)^{2}+y^{2}],
\end{align*}$$
where we have used the well-known Cauchy inequality ``with $\epsilon=\alpha$'' in the first inequality and the restriction $|t|\leq\alpha y$ in the second.  Therefore
$$\frac{y}{(x-t)^{2}+y^{2}}\leq\frac{A_{\alpha}y}{x^{2}+y^{2}}$$
with $A_{\alpha}=1+\alpha+\alpha^{2}$,
as desired (the generalization to $d>2$ is essentially the same, with $A_{\alpha}$ raised to the power $d/2$).  Now, using the fact that $f\in L^{\infty}$, our choice of $x_{0}$, $\delta$, and $t$, and the latter expression for $P_{y}(x)$ given in the problem statement, we get
$$\begin{align*}
J&\leq A_{\alpha}\int_{|x|>\delta}P_{y}(x)|f(x_{0}-x)-f(x_{0})|\;dx\\
&\leq2||f||_{\infty}A_{\alpha}\int_{|x|>\delta}P_{y}(x)\;dx\\
&\leq2||f||_{\infty}A_{\alpha}c_{n}y\int_{|x|>\delta}|x|^{-d}\;dx\\
&\to0\;\text{as}\;y\to0^{+}
\end{align*}$$
since the latter integral is finite.  Putting this together, we get
$$\limsup_{|t|<\alpha y,\;y\to0}|u(x_{0}-t,y)-f(x_{0})|\leq A_{\alpha}A\,\epsilon\to0\;\text{as}\;\epsilon\to0,$$
from which the desired nontangential convergence follows.
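For intuition, here is a small numerical illustration of the nontangential convergence in the case $d=2$ (boundary $\mathbb{R}$, normalization $c_{n}=1/\pi$), with an arbitrarily chosen bounded step function $f$, boundary point $x_{0}$, and aperture $\alpha$; $x_{0}$ is a point of continuity of $f$ and hence a Lebesgue point.

```python
# Sketch: Poisson integral of a step function approached inside a cone |x - x0| < alpha*y.
# The boundary data f, the point x0, the aperture alpha and the grid are arbitrary choices.
import numpy as np

def u(x, y, f, grid, dgrid):
    """Poisson integral u(x, y) = integral P_y(t) f(x - t) dt, computed by a Riemann sum."""
    P = (1.0 / np.pi) * y / (grid ** 2 + y ** 2)   # half-plane Poisson kernel (c_n = 1/pi)
    return np.sum(P * f(x - grid)) * dgrid

f = lambda s: (np.asarray(s) > 0).astype(float)    # bounded boundary data: a step function
x0, alpha = 0.5, 2.0

grid = np.linspace(-500.0, 500.0, 2_000_001)
dgrid = grid[1] - grid[0]

for y in (1.0, 0.3, 0.1, 0.03, 0.01):
    t = 0.9 * alpha * y                            # stay inside the cone as y -> 0
    print(f"y = {y:5.2f}:  u(x0 - t, y) = {u(x0 - t, y, f, grid, dgrid):.4f}   (f(x0) = 1.0)")
```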

(2) Solution.  Since $f\in L^{\infty}$, $Mf\in L^{\infty}$ as well: every average of $|f|$ is at most $||f||_{\infty}$, so $||Mf||_{\infty}\leq||f||_{\infty}$.  It follows from Problem I that
$$\sup_{y>0}|u(x,y)|=\sup_{y>0}|P_{y}*f|(x)\leq A(Mf)(x)\leq A||Mf||_{\infty}<\infty,$$
and thus
$$\sup_{y>0}||u(\cdot,y)||_{\infty}<\infty.$$
Fixing a test function $\phi\in C_{c}^{\infty}(\mathbb{R}^{d-1})$, the dominated convergence theorem now implies (since $|u(x,y)\phi(x)|\leq A||Mf||_{\infty}|\phi(x)|,$ which is integrable)
$$\lim_{y\to0}\int_{\mathbb{R}^{d-1}}u(x,y)\phi(x)\;dx=\int_{\mathbb{R}^{d-1}}\lim_{y\to0}u(x,y)\phi(x)\;dx=\int_{\mathbb{R}^{d-1}}f(x)\phi(x)\;dx.$$
Since this holds for every test function, it follows that $u(x,0)=f(x)$ in the sense of distributions (the $0$ is just suggestive notation to indicate the restriction of $u$ to the boundary of the half space; $u$ is \emph{not} defined by convolution with $P_{y}$ when $y=0$ and must be obtained by a limiting process).
In fact, the limit function $\lim_{y\to0}u(x,y)$ is equal to $f(x)$ a.e. by the fundamental lemma of the calculus of variations.

(3) Solution.  For test functions $\phi\in C_{c}^{\infty}(\mathbb{R}^{d-1})$ and $\psi\in C_{c}^{\infty}(\mathbb{R}^{d}_{+})$, we have
$$(1)\;\;\;\;\int_{\mathbb{R}^{d-1}}v(x,0)\phi(x)\;dx=\int_{\mathbb{R}^{d-1}}f(x)\phi(x)\;dx$$
and
$$(2)  \int_{\mathbb{R}^{d}_{+}}v(x,y)\Delta\psi(x,y)\;dxdy=0.$$

(Again, the setting of $y=0$ is just notation for the restriction of $u$ or $v$ to $\mathbb{R}^{d-1}.$)  By part (2), $(1)=\int_{\mathbb{R}^{d-1}}u(x,0)\phi(x)\;dx$, and so we see already that $u=v=f$ in the sense of distributions on $\mathbb{R}^{d-1},$ and in fact pointwise a.e.  Weyl's lemma implies that if $v\in L^{1}_{\text{loc}}(\mathbb{R}^{d}_{+})$ (recall that the half space is \emph{open}) is a weak solution to Laplace's equation, then $v$ is a classical solution after possibly a correction on a set of measure zero.  The uniqueness of the bounded solution to the Dirichlet problem and the preceding application of Weyl's lemma (since $L^{\infty}\subset L^{1}_{\text{loc}}$) now imply that $u=v$ a.e. in $\overline{\mathbb{R}^{d}_{+}}$; they are therefore equal in the sense of distributions, and after a correction of $v$ on a set of measure zero they are equal pointwise as well.
Remark.  The proof of Weyl's lemma is a mollification argument.  For details, see problem #5 in my linked blog post, in which I proved the lemma in a different course: http://mathtm.blogspot.com/2013/02/math-266b-assignment-2.html.

22 March, 2015

Divergence of Harmonic Series on a Sequence of Decreasing Sub-Domains of $\mathbb{N}$


The series $\sum_{n\in\mathbb{N}}n^{-p}$ diverges if $p\leq1$ and converges if $p>1$, and so it may seem plausible that (being a "bifurcation point" of this condition) the harmonic series $\sum_{n\in\mathbb{N}}n^{-1}$ could converge on some proper subset $A\subset\mathbb{N}$. This is obvious if $A$ is finite. If $A$ is infinite, then a moment's thought reveals that there are many subsets on which the harmonic series converges, since any series whose terms are reciprocals of distinct positive integers is a sub-series of the harmonic series. So for instance $$\sum_{n\in\mathbb{N}}n^{-2}=\frac{\pi^{2}}{6},$$ $$\sum_{n\in\mathbb{N}}\frac{1}{n!}=e-1,$$ $$\sum_{n\in\mathbb{N}}\frac{1}{2^{n}}=1,$$ and so on. Given that rather "large" subsets of $\mathbb{N}$ lead to convergence of the harmonic series, the following result was somewhat surprising to me when I was first asked to prove it.
Claim. Let $$A_\epsilon := \{a \in \mathbb{N} : 1 - \cos(a) < \epsilon\}.$$ Then $$\sum_{n\in A_\epsilon } \frac{1}{n}$$ diverges for all $0<\epsilon<1.$
Proof.  For $0<\epsilon< 1$, the inequality $1-\cos(a)< \epsilon$ has solutions for $$a\in(2k\pi-\theta,2k\pi+\theta)$$ where $\theta=\cos^{-1}(1-\epsilon)$ (note that $\theta\in(0,\frac{\pi}{2})$ and by using a Taylor expansion, it is easy to see $\theta=O(\epsilon^{\frac{1}{2}})$, although all that is important is $\theta\to0$ as $\epsilon\to0$). For there to be any positive integers $a:=a_{k}$ in such an interval, it is necessary and sufficient that
$$\lfloor2k\pi-\theta\rfloor<\lfloor2k\pi+\theta\rfloor,$$
where $\lfloor\cdot\rfloor$ is the "floor" function (round down, i.e. truncate the decimals). Intuitively, this condition just says there is an integer in the $k$th solution interval. (There could be multiple integer solutions in a given interval, though this is not very important since we are mostly interested in small $\epsilon$; indeed, since $\theta=O(\epsilon^{\frac{1}{2}})$, once $\epsilon$ is sufficiently small, say small enough that $2\theta<1$, each interval contains at most one integer.)

From the above observations and the fact that $2\pi<6.3$ (the circumference of the unit circle), it is not difficult to ascertain that $\#A_{\epsilon}=\infty$ (here and below we abbreviate $A:=A_{\epsilon}$).  Therefore $A$ is countably infinite, with its elements forming an "approximate" arithmetic sequence of integers in the sense that, with
$$D:=\max_{a_{i}\in A}|a_{i+1}-a_{i}|<\infty,$$
on "average" the difference of two successive integers is approximately $D$ (having analytical results that are sharp is unnecessary in the present situation as we are only after qualitative facts like convergence).

We can now determine whether or not the sum converges. Define sequences $a_{j}:=\frac{1}{j}$ for $j\in A$ and $a_{j}:=0$ otherwise, and $b_{j}:=\frac{1}{j}$ for all $j=1,2,\ldots.$ Then $c_{j}:=\frac{a_{j}}{b_{j}}=1$ for $j\in A$, and $0$ otherwise. Therefore, the sum of the $c_{j}$ looks like $$1+0+\ldots+0+1+0+\ldots+0+1+\ldots$$ Define one more sequence $d_{j}:=1$ if $j\in\{a_{1},a_{1}+D,a_{1}+2D,\ldots\}$ ($a_{1}$ being the first integer solution to the original inequality) and $d_{j}:=0$ otherwise; in other words, $d_{j}$ is the indicator of an arithmetic progression with common difference $D$. Recall from the theory of Cesaro summation that for ones spaced $D$ apart, $$\frac{1+0+\ldots+0+1+0+\ldots+0+\ldots+0+1_{n}}{n}\to\frac{1}{D}\;\text{as}\;n\to\infty$$ (because Cesaro summation is an averaging process, the limit is unaffected by improper spacing among finitely many terms). Consequently, $$\frac{d_{1}+\ldots+d_{n}}{n}\to\frac{1}{D}\;\text{as}\;n\to\infty.$$ Since the gaps between consecutive elements of $A$ are at most $D$, we also have $\sum_{j=1}^{n}c_{j}\geq\sum_{j=1}^{n}d_{j}$ for every $n$, and therefore \begin{align*} \lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}\frac{a_{j}}{b_{j}} &=\lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}c_{j}\\ &\geq\lim\limits_{n\to\infty}\frac{1}{n}\sum\limits_{j=1}^{n}d_{j}\\ &=\frac{1}{D}\\ &>0 \end{align*} for all $\epsilon>0$, no matter how small (note that $D$ behaves something like $O(\theta^{-1})$, and by extension something like $O(\epsilon^{-\frac{1}{2}})$). It follows that $$\sum\limits_{j=1}^{\infty}a_{j}=\infty,$$ i.e. the sum diverges for every $\epsilon>0$ (if you don't see why, or don't recognize the convergence argument used, apply the summation by parts formula to $\sum a_{j}$ together with the established bound).
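Numerically, the divergence is easy to see, if slow; the following sketch (the cutoff $N$ and the values of $\epsilon$ are arbitrary) shows the partial sums of $\sum_{n\in A_{\epsilon}}1/n$ growing roughly logarithmically in the cutoff.

```python
# Partial sums of the harmonic series restricted to A_eps = {n : 1 - cos(n) < eps}.
# The cutoff N and the eps values are arbitrary; reduce N if memory is a concern.
import numpy as np

N = 10_000_000
n = np.arange(1.0, N + 1.0)

for eps in (0.5, 0.1, 0.01):
    in_A = (1.0 - np.cos(n)) < eps                       # membership in A_eps
    partial = np.cumsum(np.where(in_A, 1.0 / n, 0.0))    # running partial sums
    line = ", ".join(f"S(1e{k}) = {partial[10**k - 1]:.3f}" for k in range(3, 8))
    print(f"eps = {eps}: {line}")
```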

21 March, 2015

Does the Trigonometric Harmonic Series Converge?



It is well known that the harmonic series $H(x)=\sum_{n=1}^{\infty} xn^{-1}$ diverges for every $x\neq0$, but what about the trigonometric harmonic series $T(x)=\sum_{n=1}^{\infty}e^{inx}n^{-1}$?  Obviously for $k=0,1,2,\ldots$ we have $T(2k\pi)=H(1)=+\infty$.  It is an interesting fact that the cancellation properties inherent in $T$ imply convergence for every other real $x$.  This is relatively straight-forward to prove using Dirichlet's test, a modification of Leibniz's alternating series test.  More remarkable is that, although the convergence is only conditional, the sum can be computed in closed form in terms of elementary functions.

In order to investigate the convergence of
$$(1)\;\;\;\;\;T(x)=\sum_{n=1}^{\infty}\frac{e^{inx}}{n}<\infty,$$
first note that
$$|z^{n}|\to0\;\text{as}\;n\to\infty$$
for every $z\in\mathbb{C}$ with $|z|<1$.  Since
$$1\geq\frac{1}{n}>\frac{1}{n+1}>0$$ for all $n\geq1$, we find $\frac{1}{n}\searrow0$ (monotonically decreases to zero) and so Dirichlet's test implies
$$(2)\;\;\;\;\;\sum\limits_{n=1}^{\infty}\frac{z^{n}}{n}<\infty,$$
the convergence taking place for every $z$ with $|z|<1$ (and being absolute there, by comparison with the geometric series).  To deal with the boundary $|z|=1$, note that if $|z|=1$ and $z\neq1$ (i.e. $z\neq1+0i$), then we have
$$\left|\sum_{n=1}^{N}z^{n}\right|=\left|\frac{z-z^{N+1}}{1-z}\right|\leq\frac{2}{|1-z|}<\infty.$$
The upper bound $M=\frac{2}{|1-z|}$ is independent of $N$, and so by Dirichlet's test the series (2) converges for all $|z|\leq1$ except $z=1$ (though the convergence on the boundary is no longer absolute).  Putting $z=e^{ix}$ shows that (1) converges for every $x\neq 2k\pi$ ($k\in\mathbb{Z}$).

To carry out the actual summation of $T(x)$ in general is a tedious exercise in complex analytic methods, and the resulting formulas are unwieldy (although, again rather remarkably, they contain only elementary functions).  Another approach is to recognize that $T(x)$ is the Fourier series of some periodic function with Fourier coefficients $\hat{f}(0)=0$ and, for $n\geq1$,
$$\hat{f}(n)=\frac{1}{n}.$$
Despite this, the computation is relatively straight-forward for certain values of $x$.  For example, take $x=1$ and note that
$$T(1)=\sum_{n=1}^{\infty}\frac{e^{in}}{n}.$$
Writing
$$\int\left(\underbrace{(e^{iz})^{1}+(e^{iz})^{2}+\ldots}_{\text{geometric series with ratio }r=e^{iz}}\right)dz=\int\frac{e^{iz}}{1-e^{iz}}\;dz,$$
we find that (with $u=1-e^{iz}$, so that $e^{iz}\;dz=i\,du$)
$$\sum_{n=1}^{\infty}\frac{(e^{iz})^{n}}{in}=i\int\frac{du}{u}=i\ln(1-e^{iz}),\qquad\text{that is,}\qquad\sum_{n=1}^{\infty}\frac{(e^{iz})^{n}}{n}=-\ln(1-e^{iz}).$$
Combining all of this together, we obtain
$$\begin{align*}
T(1)
&=\left(-\ln(1-e^{iz})\right)\Big|_{z=1}\\
&=-\ln\left(e^{i/2}\left(e^{-i/2}-e^{i/2}\right)\right)\\
&=-\ln\left(e^{i/2}\right)-\ln\left(-2i\sin\left(\tfrac{1}{2}\right)\right)\\
&=-\frac{i}{2}-\ln\left(-i\right)-\ln\left(2\sin\left(\tfrac{1}{2}\right)\right)\\
&=-\ln\left(2\sin\left(\tfrac{1}{2}\right)\right)+i\,\frac{\pi-1}{2},
\end{align*}$$
where in the last step we used $\ln(-i)=-\frac{i\pi}{2}$ (principal branch).
Since
$$T(1)=\sum_{n=1}^{\infty}\left(\frac{\cos n}{n}+i\frac{\sin n}{n}\right),$$
taking real and imaginary parts yields
$$\sum_{n=1}^{\infty}\frac{\cos n}{n}=\frac{-\ln(2-2\cos(1))}{2}\qquad\text{and}\qquad\sum_{n=1}^{\infty}\frac{\sin n}{n}=\frac{\pi-1}{2},$$
that is, $T(1)=\frac{-\ln(2-2\cos(1))}{2}+i\,\frac{\pi-1}{2}.$
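As a quick sanity check on this closed form, here is a small numerical comparison; the truncation level $N$ is arbitrary, and since the convergence is only conditional the partial sums approach the limit slowly (roughly like $1/N$).

```python
# Compare a partial sum of sum e^{in}/n with the closed form derived above.
import numpy as np

N = 1_000_000
n = np.arange(1, N + 1)
partial = np.sum(np.exp(1j * n) / n)
closed = -0.5 * np.log(2.0 - 2.0 * np.cos(1.0)) + 1j * (np.pi - 1.0) / 2.0

print(f"partial sum (N = {N}): {partial:.6f}")
print(f"closed form:           {closed:.6f}")
```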

The graphic at the beginning of the post shows the graph of $\sin n/n$ on the $(n,x)$ plane.

20 March, 2015

A Rigorous Proof of Ito's Lemma

In this post we state and prove Ito's lemma.  To get directly to the proof, go to II Proof of Ito's Lemma.

For all its importance, Ito's lemma is rarely proved in finance texts, where one often finds only a heuristic justification involving Taylor's series and the intuition of the "differential form" of the lemma.  There are various reasons for this.  Ito's lemma is really a statement about integration, not differentiation.  Indeed, differentiation is not even defined in the realm of stochastic processes due to the non-differentiability of Brownian paths.  Thus, in order to present a proof of Ito's lemma, one must first cover stochastic integrals and, prior to that, the basic properties of Brownian motion, topics which for reasons of scope/audience cannot always be covered.  However, even more mathematically inclined texts often provide only a sketch and skirt the technical details of convergence.  The purpose of this article is to remedy this situation, and we begin with

I. MOTIVATION AND A REVIEW OF ORDINARY CALCULUS

If $f$ is $k+1$ times differentiable then Taylor's theorem asserts
$$(1)\;\;\;\;f(t+h)-f(t)=hf'(t)+\frac{h^{2}}{2}f''(t)+\ldots+\frac{h^{k+1}}{(k+1)!}f^{(k+1)}(t^{*})$$
where $t^{*}\in[t,t+h]$ if $h>0$ and $t^{*}\in[t+h,t]$ if $h<0$.

Fix $T>0$ ($T$ not necessarily small) and consider the difference $f(T)-f(0)$.  This can be computed as a sum of non-overlapping differences, i.e. if $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ is a partition of $[0,T]$, then with the aid of (1) using $h=t_{i+1}-t_{i}$, we get

$$\begin{align*}
(2)\;\;\;\;f(T)-f(0)&=\sum_{i=0}^{n-1}f(t_{i+1})-f(t_{i})\\
&=\sum_{i=0}^{n-1}f'(t_{i})(t_{i+1}-t_{i})+\frac{1}{2}\sum_{i=0}^{n-1}f''(t_{i})(t_{i+1}-t_{i})^{2}+\sum_{i=0}^{n-1}o\left(||\Pi||^{2}\right).\end{align*}$$

As $n\to\infty$ (or $||\Pi||\to0$, i.e. $\max_{i}(t_{i+1}-t_{i})\to0$), we get
$$\sum_{i=0}^{n-1}f'(t_{i})(t_{i+1}-t_{i})\to\int_{0}^{T}f'(s)\;ds$$
and for $k\geq2$
$$\left|\frac{1}{k!}\sum_{i=0}^{n-1}f^{(k)}(t_{i})(t_{i+1}-t_{i})^{k}\right|\leq||\Pi||^{k-1}\sum_{i=0}^{n-1}|f^{(k)}(t_{i})|(t_{i+1}-t_{i})\to0\cdot\int_{0}^{T}|f^{(k)}(s)|\;ds=0.$$

That is, $f(T)-f(0)=\int_{0}^{T}f'(s)\;ds$, which is the second fundamental theorem of calculus.  Now suppose $f$ and $g$ are smooth functions with $k+1$ derivatives and consider the composition $h=f\circ g$.  The familiar chain rule implies $h$ is differentiable and that
$$(3)\;\;\;\;h'(t)=f'(g(t))g'(t).$$

By substituting $h$ into (2) and computing $h^{(k)}$ iteratively according to (3), we get
$$(4)\;\;\;\;f(g(T))-f(g(0))=\int_{0}^{T}f'(g(x))g'(x)\;dx.$$

We shall now see what happens when $g$ is not differentiable.  In that case, $h$ is not differentiable, and (1) through (4) are no longer valid.  However, we can write (4) instead as
$$(5)\;\;\;\;f(g(T))-f(g(0))=\int_{0}^{T}f'(g(x))\;dg$$
where the integral is now taken as a Riemann-Stieltjes integral.  If $g$ is differentiable, then (5) reduces to (4), but (5) still makes sense even if $g$ is merely continuous (continuity is needed since $\int h\;dg$ is not well-defined if $h$ and $g$ share a common discontinuity, and $h=f(g(t))$ will in general be discontinuous wherever $g$ is).  Moreover, since $f$ is smooth, we may rewrite (2) as
$$\begin{align*}
(6)\;\;\;\;f(g(T))-f(g(0))&=\sum_{i=0}^{n-1}f(g(t_{i+1}))-f(g(t_{i}))\\
&=\sum_{i=0}^{n-1}f'(g(t_{i}))(g(t_{i+1})-g(t_{i})) + \frac{1}{2}\sum_{i=0}^{n-1}f''(g(t_{i}))(g(t_{i+1})-g(t_{i}))^{2}+\ldots\end{align*}$$

Despite $g$ being non-differentiable, if it is sufficently "nice" then the terms converge to the same values as in (2) and we will recover (5).  A useful sufficient condition is that $g$ be continuous and of bounded variation.  This means
$$[g](T)=\sup_{\Pi}\sum_{i\in\Pi}|g(t_{i+1})-g(t_{i})|<\infty.$$
It is easy to prove that if $g$ is differentiable, then it is of bounded variation, since then an easy application of the above (or the mean-value theorem) gives (for a norm decreasing sequence of partitions $\Pi_{1},\Pi_{2},\ldots$)
$$[g](T)=\lim_{n\to\infty}\sum_{j\in\Pi_{n}}|g(t_{j+1})-g(t_{j})|=\int_{0}^{T}|g'(t)|\;dt<\infty.$$
For $\int f\;dg$, the most general sufficient condition in common use for existence is that $g$ be of bounded variation and share no common discontinuities with $f$, though this is not strictly necessary.  When $g$ is not of bounded variation, $\int f\;dg$ may or may not exist, and its value may even depend on the particular sample points used in the approximating sums, as we shall see below.

Now, Ito's lemma deals with the special case $g(t)=W(t)$ where $W$ is a Brownian motion sample path.  It turns out that almost surely
$$[W](T)=\infty,$$
and
$$[W,W](T):=\sup_{\Pi}\sum_{i\in\Pi}|W(t_{i+1})-W(t_{i})|^{2}=\infty.$$
The latter quantity is called the quadratic (or second) variation of $W$.  (For continuous functions $g$ of bounded variation, the analogous sums of squared increments tend to $0$ as the mesh of the partition tends to $0$; this follows from estimating the higher order terms in (2).)  Moreover, along partitions with mesh tending to zero,
$$[W]^{(3)}(T):=\lim_{||\Pi||\to0}\sum_{i\in\Pi}|W(t_{i+1})-W(t_{i})|^{3}=0.$$
In fact, almost surely $[W]^{(\alpha)}(T)=\infty$ in the supremum sense for every $\alpha\leq2$, while the corresponding mesh-to-zero limits vanish for every $\alpha>2$.  It would seem that the regularity on which integration theory depends so directly (i.e. the variation of the integrator) is not tractable for $W$.  It turns out, though, that we can obtain something useful by weakening the definition slightly.  Let $\Pi_{1},\Pi_{2},\ldots$ be a sequence of partitions with $||\Pi_{n}||\to0$ as $n\to\infty$.  Then we can redefine the quadratic variation as
$$[W,W](T):=\lim_{n\to\infty}\sum_{i\in\Pi_{n}}|W(t_{i+1})-W(t_{i})|^{2}.$$
Unfortunately, even this is not well-defined without further qualification.  The reason that the supremum definition of the quadratic variation is a.s. infinite is that, for any $C>0$, it is possible to find a sequence of partitions $\{\Pi^{C}_{n}\}_{n}$ along which the above limit exceeds $C$ for a fixed sample path $\omega$.  However, the limit converges to $T$ in $L^{2}(\Omega)$ (or in probability, if you prefer).  That is to say, it converges in the $L^{2}$ norm to a random variable $Q(\omega)$ with $Q(\omega)=T$ a.s. $\omega$ (recall that $L^{2}$ limits are defined only up to a set of measure $0$).  It turns out that if we make the further restriction that $\sum_{n=1}^{\infty}||\Pi_{n}||<\infty$ (for instance, successive refinements with rapidly shrinking mesh), then the limit also holds pointwise a.s. $\omega$ (Borel-Cantelli).  In the remainder of this post we will not distinguish between these modes of convergence and will state freely that $[W,W](T)=T$, without further reference to the technicalities behind this claim.
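A short simulation makes the statement $[W,W](T)=T$ plausible; the horizon, mesh sizes and seed below are arbitrary choices, and each mesh uses an independently sampled set of increments (so this illustrates the $L^{2}$/in-probability statement rather than pathwise refinement).

```python
# Sums of squared Brownian increments over uniform partitions cluster around T as the mesh shrinks.
import numpy as np

rng = np.random.default_rng(1)
T = 2.0

for n in (100, 1_000, 10_000, 100_000):
    dt = T / n
    dW = rng.normal(0.0, np.sqrt(dt), size=n)   # Brownian increments over a uniform partition
    print(f"n = {n:6d}:  sum of squared increments = {np.sum(dW ** 2):.4f}   (T = {T})")
```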

Since $[W](T)=\infty$ and $[W,W](T)=T$, we must take care in computing the various limits appearing in
$$\begin{align*}(7)\;\;\;\;f(W(T))-f(W(0))&=\sum_{i=0}^{n-1}f(W(t_{i+1}))-f(W(t_{i}))\\
&=\sum_{i=0}^{n-1}f'(W(t_{i}))(W(t_{i+1})-W(t_{i})) + \frac{1}{2}\sum_{i=0}^{n-1}f''(W(t_{i}))(W(t_{i+1})-W(t_{i}))^{2}+\ldots\end{align*}$$

Since $[W,W](T)=T<\infty$, it follows that $[W]^{(k)}(T)=0$ for all $k\geq3$ by a simple estimate, as has been done several times above.  Thus the $\ldots$ terms can safely be ignored.  And since $\sup_{t\in[0,T]}|f''(W(t))|<\infty$ (by the continuity of $f''$ and of the path), the second sum converges.  We shall see that it converges to
$$\int_{0}^{T}f''(W(s))\;ds.$$
(Incidentally, this is where the commonly used, though mathematically meaningless, notation $dWdW=dt$ comes from).  The first term also converges, though this is not immediately obvious since the Riemann-Stieltjes theory does not apply to it as the integrator $W(t)$ is not of bounded variation.  It turns out that it converges to
$$\int_{0}^{T}f'(W(s))\;dW$$
where the integral is what is known as an Ito integral.  This integral is constructed exactly like a Riemann-Stieltjes integral, except that the sample point used in the approximating sums must always be the left-hand endpoint of the interval.  Different approximation schemes (i.e. mid-point, right-point, etc.) lead to different limiting values.  If the mid-point is used, the result is referred to as the Stratonovich integral.  We shall not need this integral here.  The reason that the Ito integral is used (i.e. left-hand point approximation) is that $f(W(t_{i}))$ is interpreted as the position we take in a stock at time $t_{i}$ with the information available at time $t_{i}$, and the capital gain on the stock is then $f(W(t_{i}))(W(t_{i+1})-W(t_{i}))$ if we assume the stock price follows a Brownian motion (which strictly speaking it doesn't, but we shall ignore this fact here since it can be corrected by replacing $W$ with a geometric Brownian motion $X$).  Summing the individual gains and taking the limit as $\max|t_{i+1}-t_{i}|\to0$ gives us the net capital gains on a portfolio resulting from taking positions $f(W(t))$ in continuous time.

In light of the above, we conclude that
$$(8)\;\;\;\;f(W(T))-f(W(0))=\int_{0}^{T}f'(W(s))\;dW(s)+\frac{1}{2}\int_{0}^{T}f''(W(s))\;ds.$$
Compare this to (4): we obtain one, and only one, extra term, $\frac{1}{2}\int_{0}^{T}f''(W(s))\;ds$, which can be traced back to the facts that $[W,W](T)=T$ and $[W]^{(k)}=0$ for $k\geq3.$  This is often recast in differential notation (which, again, is mathematically meaningless)
$$(9)\;\;\;\;df=f'dW+\frac{1}{2}f''dt.$$

The mathematically meaningful form is (8), though (9) is used more often for calculations since it is accompanied by what is known as a "box" calculus that facilitates computations.  This will be discussed in more detail below.
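As a numerical illustration of (8) and of the role of the sample point, consider $f(x)=\frac{1}{2}x^{2}$, so that the Ito integral $\int_{0}^{T}W\;dW$ should equal $\frac{1}{2}W(T)^{2}-\frac{1}{2}T$, while a Stratonovich-type scheme (averaging the endpoint values of the integrand) gives exactly $\frac{1}{2}W(T)^{2}$.  The horizon, step count and seed in the sketch below are arbitrary.

```python
# Left-endpoint (Ito) versus endpoint-average (Stratonovich-type) sums for the integrand W(t).
import numpy as np

rng = np.random.default_rng(2)
T, n = 1.0, 1_000_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))            # W(t_0), ..., W(t_n) on a uniform grid

ito_sum = np.sum(W[:-1] * dW)                         # left endpoints: the Ito scheme
strat_sum = np.sum(0.5 * (W[:-1] + W[1:]) * dW)       # endpoint averages: Stratonovich-type scheme

print(f"Ito sum          = {ito_sum:.4f}   vs  W(T)^2/2 - T/2 = {0.5 * W[-1] ** 2 - 0.5 * T:.4f}")
print(f"Stratonovich sum = {strat_sum:.4f}   vs  W(T)^2/2       = {0.5 * W[-1] ** 2:.4f}")
```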


II. PROOF OF ITO'S LEMMA

Let $\{W(t)\}_{t\geq0}$ be a standard Brownian motion with the natural filtration $\{\mathcal{F}_{t}\}_{t\geq0}$, and let $f(x,t)\in\mathcal{C}^{2}(\mathbb{R}\times[0,T])$ jointly in $(x,t)$.  We will consider the stochastic process $\Delta(t)=f(W(t),t)$, which is clearly adapted to $\{\mathcal{F}_{t}\}_{t\geq0}.$

We take the following preliminary facts for granted, and defer to previous blog posts covering Brownian motion and stochastic integration for proofs.
  1. Almost surely, we have the variation formulas $[W]^{1}(t)=+\infty,[W]^{2}(t)=t$ and $[W]^{k}(t)=0$ for $k\geq3$.
  2. Almost surely (in the sense clarified in Section III below), we have the convergence of $\lim_{||\Pi_{[0,T]}||\to0}\sum_{i=0}^{n-1}\Delta(t_{i})(W(t_{i+1})-W(t_{i}))$ for any continuous and adapted process $\Delta(t)$.  We denote this limit by $\int_{0}^{T}\Delta(t)\;dW(t)$ and refer to it as the Ito integral of $\Delta$.  The limit is taken in $L^{2}(\Omega).$
Theorem (Ito's Lemma).  With the notation above, we have for all $T>0$ $$\begin{align*}f(W(T),T)-f(W(0),0)=\\\int_{0}^{T}f_{t}(W(t),t)\;dt+\int_{0}^{T}f_{x}(W(t),t)\;dW(t)+\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt.\end{align*}$$  We sometimes write for $f=f(W(t),t)$ $$df=f_{t}dt+f_{x}dW+\frac{1}{2}f_{xx}dt.$$

Proof.  Fix $T>0$ and let $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ be a partition of $[0,T]$ and compute using Taylor's expansion
$$\begin{align*}
f(W(T),T)-f(W(0),0)&=\sum_{i=0}^{n-1}(f(W(t_{i+1}),t_{i+1})-f(W(t_{i}),t_{i}))\\
&=\sum_{i=0}^{n-1}f_{t}(W(t_{i}),t_{i})(t_{i+1}-t_{i})\\
&+\sum_{i=0}^{n-1}f_{x}(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))\\
&+\frac{1}{2}\sum_{i=0}^{n-1}f_{xx}(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}\\
&+\sum_{i=0}^{n-1}O((t_{i+1}-t_{i})(W(t_{i+1})-W(t_{i})))\\
&+\sum_{i=0}^{n-1}O((t_{i+1}-t_{i})^{2})\\
&+\sum_{i=0}^{n-1}O((W(t_{i+1})-W(t_{i}))^{3})\\
&:= A+B+C+D+E+F.\end{align*}$$

The left hand side is unaffected by taking limits as $||\Pi||\to0$, and so we may do so in computing the right hand side terms.  Without loss of generality we assume $\Pi$ is uniform, so we consider equivalently $n\to\infty.$

The regularity of $f$ implies that
$$A\to\int_{0}^{T}f_{t}(W(t),t)\;dt\;\text{as}\;n\to\infty,$$
the integral being an ordinary Lebesgue (Riemann) integral.  By item 2 above we have
$$B\to\int_{0}^{T}f_{x}(W(t),t)\;dW(t)\;\text{as}\;n\to\infty,$$
the integral being an Ito integral as discussed here.  To deal with $D$, $E$ and $F$ we estimate (writing $X\ll_{\beta}Y$ to mean $X\leq CY$ for a constant $C$ depending only on a bound $\beta$ for the second-order derivatives of $f$ on the relevant compact set)
$$|D|\ll_{\beta}\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|\sum_{i=0}^{n-1}(t_{i+1}-t_{i})=T\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|,$$
$$|E|\ll_{\beta}\sup_{0\leq i\leq n-1}|t_{i+1}-t_{i}|\sum_{i=0}^{n-1}(t_{i+1}-t_{i})=T\sup_{0\leq i\leq n-1}|t_{i+1}-t_{i}|,$$
and
$$|F|\ll_{\beta}\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|\sum_{i=0}^{n-1}(W(t_{i+1})-W(t_{i}))^{2}.$$
Appealing to item 1 above (which keeps $\sum_{i}(W(t_{i+1})-W(t_{i}))^{2}$ bounded) and to the uniform continuity of the maps $t\mapsto t$ and $t\mapsto W(t)$ on $[0,T]$ (which sends the suprema of the increments to $0$), we then conclude that
$$D,E,F\to0\;\text{as}\;n\to\infty.$$
It remains to establish the limit
$$C\to\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt\;\text{as}\;n\to\infty.$$
Intuitively this should be true since $[W]^{2}(T)=T,$ a fact that we sometimes write as $dWdW=dt.$  However, a rigorous proof requires some effort, and this is precisely the point in the proof (assuming Brownian motion and stochastic integration are covered) that almost every mathematical finance text skips over.  (Note that the theorem has already been proved in the special case that $f=p(x,t)$ is a second degree polynomial; as an example, consider the special case $f(x,t)=\frac{1}{2}x^{2}$ in order to compute the Ito integral $\int_{0}^{T}W(t)\;dW(t)$).

Because this fact is of interest in and of itself, we isolate the proof that $C\to\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt\;\text{as}\;n\to\infty$ in the following lemma.

Lemma.  Let $f$ be a bounded continuous function on $\mathbb{R}$ and $\{W(t)\}_{t \geq 0}$ a standard one-dimensional Brownian motion. Then, in $L^{2}(\Omega)$ (and, with the provisos discussed above, almost surely), $$\sum_{i=0}^{n-1} f(W(t_{i}))(W(t_{i+1})-W(t_{i}))^{2}\to\int_{0}^{T}f(W(t))\;dt\;\text{as}\;n\to\infty$$ where $n\to\infty$ means (WLOG) $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ is a uniform partition of $[0,T]$ and $|\Pi| := \max_j |t_j-t_{j-1}|\to0$.

Proof.  Since $t \mapsto f(W(t))$ is (almost surely) continuous, $$\sum_{i=0}^{n-1} f(W(t_{i}))(t_{i+1}-t_{i}) \to \int_0^T f(W(t))\;dt\;\text{as}\;n\to\infty.$$
Therefore, it suffices to show
$$I_n := \sum_{i=0}^{n-1} f(W(t_{i})) \bigg[ (W(t_{i+1})-W(t_{i}))^2 - (t_{i+1}-t_{i}) \bigg] \to 0\;\text{as}\;n\to\infty.$$

At this point it is convenient to define $\Delta t_{i} := t_{i+1}-t_{i}$ and $\Delta W_i := W(t_{i+1})-W(t_{i})$.  Recalling that $\{W(t)^2-t\}_{t \geq 0}$ is a martingale with respect to the canonical filtration $(\mathcal{F}_t)_{t \geq 0}$, we compute for $j<i$

$$\begin{align*} &\quad \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))\,[\Delta W_i^2 - \Delta t_i]\,[\Delta W_j^2-\Delta t_j]\bigg)\\ &= \mathbb{E} \bigg( \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))\,[\Delta W_i^2 - \Delta t_i]\,[\Delta W_j^2-\Delta t_j] \;\Big|\; \mathcal{F}_{t_{i}} \bigg) \bigg) \\ &= \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))\,[\Delta W_j^2-\Delta t_j]\,\underbrace{\mathbb{E} \bigg( \Delta W_i^2 - \Delta t_i \;\Big|\; \mathcal{F}_{t_{i}} \bigg)}_{=\,\mathbb{E}(\Delta W_i^2)-\Delta t_i=0} \bigg) = 0, \end{align*}$$

and thus

$$\mathbb{E}(I_n^2) = \mathbb{E}\left(\sum_{i=0}^{n-1} f(W(t_{i}))^2 (\Delta W_i^2-\Delta t_i)^2 \right).$$

(Observe that the cross-terms vanish.)  Using that $f$ is bounded and $W(t)-W(s) \sim W(t-s) \sim \sqrt{t-s} W(1)$ we find

$$\begin{align*} \mathbb{E}(I_n^2) &\leq \|f\|_{\infty}^2 \sum_{i=0}^{n-1} \mathbb{E}\bigg[(\Delta W_i^2-\Delta t_i)^2\bigg] \\ &= \|f\|_{\infty}^2 \sum_{i=0}^{n-1} \Delta t_i^2\;\mathbb{E}\big[(W(1)^2-1)^2\big] \\ &\leq C |\Pi| \sum_{i=0}^{n-1} \Delta t_i = C |\Pi| T \end{align*}$$


for $C := \|f\|_{\infty}^2\, \mathbb{E}\big[(W(1)^2-1)^2\big]$ (in fact $\mathbb{E}[(W(1)^2-1)^2]=2$). Letting $|\Pi| \to 0$, the claim follows.
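The lemma is easy to observe in simulation.  In the sketch below the choices $f(x)=\cos x$, the horizon, the finest mesh and the seed are all arbitrary; a single Brownian path is generated on the finest grid and the coarser sums are formed by sub-sampling the same path.

```python
# Weighted sums of squared increments of one Brownian path versus the time integral of f(W(t)).
import numpy as np

rng = np.random.default_rng(3)
T, n_max = 1.0, 2 ** 20
dt = T / n_max
W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n_max))))
f = np.cos                                            # a bounded continuous test function

# Reference value: the time integral of f(W(t)) dt on the finest grid (left Riemann sum).
reference = np.sum(f(W[:-1])) * dt

for step in (2 ** 10, 2 ** 6, 2 ** 2, 1):             # coarser to finer sub-partitions of the same path
    Wn = W[::step]
    dWn = np.diff(Wn)
    s = np.sum(f(Wn[:-1]) * dWn ** 2)
    print(f"n = {len(dWn):7d}:  sum f(W(t_i)) (Delta W_i)^2 = {s:.4f}")
print(f"integral of f(W(t)) dt over [0, T]  = {reference:.4f}")
```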



III.  CLARIFICATION OF "ALMOST SURE" CONVERGENCE

We assume the reader is familiar with the various modes of convergence in real analysis: pointwise, uniform, almost uniform, in measure/probability, $L^{p}$, etc.  This short section is just to help clarify what is meant by almost sure convergence in the context of this and related topics.

Statements of convergence involving Brownian motion are almost always established in $L^{2}(\Omega,P)$, which in turn implies convergence in probability because Chebyshev's inequality states for a sequence of random variables $X_{n}$ and proposed limit $X$ that
$$P(|X_{n}-X|\geq\epsilon)\leq\frac{1}{\epsilon^{2}}\mathbb{E}\left[|X_{n}-X|^{2}\right]\to0\;\text{as}\;n\to\infty\;\text{for all}\;\epsilon>0\;\text{fixed}.$$

For example, in the proof of Ito's lemma we really proved that $$\lim_{n\to\infty}\sum_{i=0}^{n-1}f(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}=\int_{0}^{T}f(W(t),t)\;dt$$ in $L^{2}(\Omega)$, hence in probability, and along sequences of partitions with summable mesh also almost surely.  To clarify, the last statement means that for almost every sample path, or outcome $\omega\in\Omega$, we have
$$\lim_{n\to\infty}X_{n}(\omega):=\lim_{n\to\infty}\sum_{i=0}^{n-1}f(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}=\int_{0}^{T}f(W(t),t)\;dt.$$

The situation is similar for statements like "almost surely $[W,W](t)=t$" and "almost surely $\int f(t)\;dW(t)$ exists in the Ito sense."



Bifurcating Lease Embedded FX Derivatives



Section I.  Overview

Suppose an entity enters into an agreement to lease property and make rental payments each month, but that the fixed notional underlying the lease payments is denominated in some other currency.  This introduces an exposure for the lessee (and lessor), since now the lessee must pay (and the lessor receive) a domestic currency equivalent of some fixed amount in a foreign currency - in other words, the actual payment in the functional currency is (Foreign-Denominated Lease Notional) x (Exchange Rate), whatever that might be at the time the payment becomes due.  If the entity is a corporate entity, accounting regulations require the entity to "bifurcate" the embedded derivative from the contract and account for it as though it were a standalone derivative, per the rules of derivative accounting.  This introduces accounting complexities, but the problem must also of course be solved from a valuation point of view.

In this post we consider lease agreements as just described, as well as those with caps and floors on the exchange rate with strikes contractually written into the agreements.

Section II.  Valuation Methodology

For leases determined to have embedded derivatives (from the point of view of the domestic entity), we value the embedded derivative as a strip of component derivatives corresponding to each future cash flow. That is, each cash flow represents the notional (denominated in the foreign currency CUR1) for each component derivative, and value of the lease embedded derivative is the aggregate value of these component derivatives.

Our FX convention is CUR1/CUR2, where this rate is the number of units of CUR2 per 1 unit of CUR1 - such a quantity has units [CUR2]/[CUR1]. We refer to CUR2 as the domestic, functional and settlement currency and CUR1 as the foreign, deal and notional currency.

Our valuation methodology is based on usual market-practice - in particular, no arbitrage and discounted cash flow principles. For options, we use the additional assumption of no-arbitrage for an asset price following a simple geometric Brownian motion (Black-Scholes-Merton model). Consider a present valuation date $t$, future maturity date $T>t$, future cash flow $N=N(T)$, corresponding strike rate $K=K(T)$, forward rate $F=F_{t}(T)$, discount rate $D=D_{t}(T)$ and volatility $\sigma=\sigma_{t}(K,T)$. Let $V=V_{t}(T,N,K,F,D,\sigma)$ denote the value of a derivative written on CUR1/CUR2 with the previous parameters. Then our previous assumptions lead us to the following valuation formulas: $$(1)\;\;\;\;V^{\text{fwd}}_{t}(T)=N(T)\cdot(F_{t}(T)-K(T))\cdot D_{t}(T)$$
$$(2)\;\;\;\;V^{\text{call}}_{t}(T)=N(T)\cdot(\Phi(d_{+})F_{t}(T)-\Phi(d_{-})K(T))\cdot D_{t}(T),$$
and
$$(3)\;\;\;\;V^{\text{put}}_{t}(T)=N(T)\cdot(\Phi(-d_{-})K(T)-\Phi(-d_{+})F_{t}(T))\cdot D_{t}(T).$$
In (2) and (3) we define
$$d_{\pm}=\frac{1}{\sigma_{t}(K,T)\sqrt{T-t}}\left[\log\left(\frac{F_{t}(T)}{K(T)}\right)\pm\frac{1}{2}\sigma_{t}(K,T)^{2}(T-t)\right]$$
and
$$\Phi(x)=(2\pi)^{-1/2}\int_{-\infty}^{x}e^{\frac{-y^{2}}{2}}\;dy,$$
the standard normal cumulative distribution function.

(Note the dependence of $\sigma_{t}$ on $(K,T)$ is due to the nature of FX option markets exhibiting term structure variation and ``smiles.'')
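For concreteness, here is a minimal sketch of formulas (1)-(3) in code, assuming the forward rate, discount factor and volatility are supplied directly as flat inputs; the function and variable names are ours (not from any particular library), and the numbers in the example are made up.

```python
# Sketch of the forward / call / put valuation formulas (1)-(3) with flat made-up inputs.
from math import erf, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal CDF Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def d_plus_minus(F, K, sigma, tau):
    v = sigma * sqrt(tau)
    d_plus = (log(F / K) + 0.5 * sigma ** 2 * tau) / v
    return d_plus, d_plus - v

def fwd_value(N, F, K, D):                     # formula (1)
    return N * (F - K) * D

def call_value(N, F, K, D, sigma, tau):        # formula (2)
    dp, dm = d_plus_minus(F, K, sigma, tau)
    return N * (norm_cdf(dp) * F - norm_cdf(dm) * K) * D

def put_value(N, F, K, D, sigma, tau):         # formula (3)
    dp, dm = d_plus_minus(F, K, sigma, tau)
    return N * (norm_cdf(-dm) * K - norm_cdf(-dp) * F) * D

# Made-up example: a 9-month cash flow of 1,000,000 CUR1.
N, F, K, D, sigma, tau = 1_000_000, 1.10, 1.05, 0.98, 0.12, 0.75
print(fwd_value(N, F, K, D))
print(call_value(N, F, K, D, sigma, tau))
print(put_value(N, F, K, D, sigma, tau))
print(call_value(N, F, K, D, sigma, tau) - put_value(N, F, K, D, sigma, tau) - fwd_value(N, F, K, D))
```

The final line is a put-call parity sanity check: with a common strike, call minus put reproduces the forward value (1).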

Section III.  Extraction Methodology

Section III(a).  Specifying the Strike

We extract the embedded derivative in accordance with the principle that the stated value of the cash flow at inception of the lease agreement should be such that the value of the embedded derivative at inception is $0$. We apply this principle to the forward component of the embedded derivative, and approximate it by assuming that the cancellation between the cap and floor values (because one is a short position and the other is a long position - see below) would net $0$ if we assumed that they constitute a forward in combination. This is exactly true from put-call parity when the strikes are the same, but only approximately true if they are different (which they must be, since otherwise the combination of the three instruments would net $0$ and there would be no embedded derivative). In particular, if the lease agreement has $i=1,2,3,\ldots,n$ future cash flows, a cap $\overline{S}=\overline{S}(T_{i})$ and a floor $\underline{S}=\underline{S}(T_{i})$, then for each corresponding component derivative we set (where again, $t=0$ is the inception date of the lease): $$(4)\;\;\;\;K^{\text{fwd}}(T_{i})=F_{0}(T_{i}),$$ $$(5)\;\;\;\;K^{\text{cap}}(T_{i})=\overline{S}(T_{i}),$$ and $$(6)\;\;\;\;K^{\text{flr}}(T_{i})=\underline{S}(T_{i}).$$ Accounting rules indicate that this is the proper approach from a valuation point of view.

Section III(b).  Specifying the Derivative - A Decomposition

In keeping with our notation, we let $L_{t}(T_{i})$ denote the present fair value at time $t$ of the future cash flow made at time $T_{i}$. This is always a negative quantity from the entity's point of view. The idea in order to obtain the embedded derivative is to separate the risky portion of this value from the non-risky portion. In particular, we decompose $L_{t}(T_{i})$ as
$$L_{t}(T_{i})=B_{t}(T_{i})+Z_{t}(T_{i}),$$
where $B_{t}(T_{i})$ only depends on $t$ through the discount factor $D_{t}(T_{i})$ (in particular, it is independent of market variables like $F_{t}(T_{i})$), and $Z_{t}(T_{i})$ is a function of all random market variables inherent in $L_{t}(T_{i})$. There are infinitely many ways to structure such a decomposition, but the accounting guidance discussed above is equivalent to certain initial and terminal conditions which allow us to uniquely solve for $B(T_{i})$ and $Z(T_{i})$.

Section III(c).  Specifying the Derivative - Forward Only Case

If the lease payment $L_{t}(T_{i})$ lacks any optional features, then its payoff is
$$L_{T_{i}}(T_{i})=-N\cdot S_{T_{i}}$$
and therefore its fair present value for $0<t<T_{i}$ is given by $$(7)\;\;\;\;L_{t}(T_{i})=-N(T_{i})\cdot F_{t}(T_{i})\cdot D_{t}(T_{i}).$$ Observe that this quantity has units of CUR2 and is the present value of what the entity has to pay at time $T_{i}$. Since it depends on the forward rate $F_{t}(T_{i})$, it has an exposure to CUR1/CUR2 and is therefore risky. The idea previously discussed involves decomposing $L_{t}(T_{i})$ into two parts $$L_{t}(T_{i})=B_{t}(T_{i})+Z_{t}(T_{i}),$$ where $B_{t}(T_{i})$ only depends on $t$ through $D_{t}(T_{i})$ (in particular, it is independent of $F_{t}(T_{i})$), and $Z_{t}(T_{i})$ is a function of $F_{t}(T_{i}).$ The initial condition $$Z_{0}(T_{i})=0$$
and terminal payoff condition
$$L_{T_{i}}(T_{i})=B_{T_{i}}(T_{i})+Z_{T_{i}}(T_{i})=-N(T_{i})\cdot F_{T_{i}}(T_{i})\cdot D_{T_{i}}(T_{i})=-N(T_{i})\cdot S_{T_{i}}$$
allow us to uniquely solve for the payoffs of $B(T_{i})$ and $Z(T_{i})$; the principle of rational pricing and the fact that $B_{t}(T_{i})$ is constant in $t$ (up to discounting) then give us $L$, $B$, and $Z$ for all $0<t<T_{i}$. Indeed, from the terminal condition we have
$$Z_{T_{i}}(T_{i})=-N(T_{i})\cdot S_{T_{i}}-B_{T_{i}}(T_{i})$$
and from the initial condition
$$B_{0}(T_{i})=L_{0}(T_{i})=-N(T_{i})\cdot F_{0}(T_{i})\cdot D_{0}(T_{i})=-N(T_{i})\cdot K^{\text{fwd}}(T_{i})\cdot D_{0}(T_{i}).$$
Hence (dropping the ``fwd'' from $K$),
$$Z_{T_{i}}(T_{i})=N(T_{i})\cdot(K(T_{i})-S_{T_{i}}).$$
This shows that the payoff of $Z(T_{i})$ is equal to that of a short position in a CUR1/CUR2 forward with notional $N(T_{i})$. Applying our reasoning above, we discover that $Z_{t}(T_{i})$ is given by (1) with the sign reversed. Explicitly,
$$(8)\;\;\;\;Z_{t}(T_{i})=N(T_{i})\cdot(K(T_{i})-F_{t}(T_{i}))\cdot D_{t}(T_{i})$$
and
$$(9)\;\;\;\;B_{t}(T_{i})=-N(T_{i})\cdot K(T_{i})\cdot D_{t}(T_{i}).$$
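A tiny numerical check of this decomposition under made-up market data: with $K=F_{0}(T_{i})$ as in (4), the embedded derivative is worth zero at inception and thereafter coincides with a short forward, i.e. formula (1) with the sign reversed.

```python
# Forward-only decomposition check with made-up inputs.
N = 1_000_000                # CUR1 notional of one cash flow
F0, D0 = 1.10, 0.990         # forward rate and discount factor at inception (t = 0)
Ft, Dt = 1.20, 0.985         # the same quantities at a later valuation date t
K = F0                       # strike chosen so that Z_0 = 0, per (4)

L0, Lt = -N * F0 * D0, -N * Ft * Dt     # PV of the lease cash flow, formula (7)
B0, Bt = -N * K * D0, -N * K * Dt       # the fixed ("non-risky") leg, formula (9)
Z0, Zt = L0 - B0, Lt - Bt               # the embedded derivative, formula (8)

short_fwd = -N * (Ft - K) * Dt          # minus formula (1): a short forward position
print(Z0, Zt, short_fwd)                # expect 0.0, and Zt equal to short_fwd
```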

Section III(d).  Specifying the Derivative - Ranged Forward Case (Caps & Floors)

If $L_{t}(T_{i})$ has optional features, then the initial condition $Z_{0}(T_{i})=0$ is replaced by the value of these optional features, using the strike prices given by (5) and (6). For a ranged forward, we have a cap $\overline{S}(T_{i})$ and a floor $\underline{S}(T_{i})$. With our FX convention CUR1/CUR2, the terms ``cap'' and ``floor'' are really such from the counter-party's perspective, or from the entity's perspective when considering the value of the overall lease $L_{t}(T_{i})$ (a cap and floor on how much the entity has to pay). However, when considering the value of $Z_{t}(T_{i})$ from the entity's perspective, the cap $\overline{S}(T_{i})$ is an upper bound on how much CUR2 can weaken against CUR1, hence a floor ($=1/\overline{S}(T_{i})$ in CUR2/CUR1 terms) on their losses from their short position in the forward component of the embedded derivative. Conversely, the floor $\underline{S}(T_{i})$ is a lower bound on how much CUR2 can strengthen against CUR1, hence a cap ($=1/\underline{S}(T_{i})$) on their gains.

The previous paragraph shows that $Z_{t}(T_{i})$ is the sum of three distinct derivatives, $\sum_{k=1}^{3}Z^{k}_{t}(T_{i})$: a short position in a put option on CUR1/CUR2 struck at $\underline{S}(T_{i})$, a long position in a call option on CUR1/CUR2 struck at $\overline{S}(T_{i})$, and a short position in a forward on CUR1/CUR2 struck at $K(T_{i})=F_{0}(T_{i}).$ This can be proved as we did for the case of a forward, where the initial condition is now $$Z_{0}(T_{i})=\sum_{k=1}^{3}Z^{k}_{0}(T_{i})=V^{\text{call}}_{0}(T_{i})-V^{\text{put}}_{0}(T_{i})+\underbrace{\left(-V^{\text{fwd}}_{0}(T_{i})\right)}_{=0},$$
as given by (2), (3) and (1), respectively.
The terminal condition is
$$L_{T_{i}}(T_{i})=\left\{\begin{array}{ll}-N(T_{i})\cdot\overline{S}(T_{i}),&S_{T_{i}}>\overline{S}(T_{i})\\-N(T_{i})\cdot S_{T_{i}},&\underline{S}(T_{i})\leq S_{T_{i}}\leq\overline{S}(T_{i})\\-N(T_{i})\cdot \underline{S}(T_{i}),&S_{T_{i}}<\underline{S}(T_{i}).\end{array}\right.$$
It follows that
$$B_{0}(T_{i})=-N(T_{i})\cdot K(T_{i})\cdot D_{0}(T_{i})$$
and hence
$$B_{t}(T_{i})=-N(T_{i})\cdot K(T_{i})\cdot D_{t}(T_{i})$$
for all $0<t<T_{i}.$ Now,
$$Z_{T_{i}}(T_{i})=L_{T_{i}}(T_{i})-B_{T_{i}}(T_{i})=\left\{\begin{array}{ll}N(T_{i})\cdot(K(T_{i})-\overline{S}(T_{i})),&S_{T_{i}}>\overline{S}(T_{i})\\N(T_{i})\cdot(K(T_{i})-S_{T_{i}}),&\underline{S}(T_{i})\leq S_{T_{i}}\leq\overline{S}(T_{i})\\N(T_{i})\cdot(K(T_{i})-\underline{S}(T_{i})),&S_{T_{i}}<\underline{S}(T_{i}).\end{array}\right.$$
One verifies easily that this is equal to
$$Z_{T_{i}}(T_{i})=N(T_{i})\cdot\Big[-(S_{T_{i}}-K(T_{i}))+\max(S_{T_{i}}-\overline{S}(T_{i}),0)-\max(\underline{S}(T_{i})-S_{T_{i}},0)\Big],$$
which are the payoff functions of the indicated derivatives. Thus,
$$(10)\;\;\;\;Z_{t}(T_{i})=-A+B-C$$
for all $0<t<T_{i}$ where $A$ is given by (1), $B$ by (2) and $C$ by (3).
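The payoff identity behind this decomposition is easy to verify numerically, per unit of notional $N(T_{i})$; the strike, cap and floor below are made up.

```python
# Check: K - min(max(S, floor), cap) equals a short forward plus a long call at the cap
# and a short put at the floor, for a grid of terminal spot values S.
import numpy as np

K, cap, floor = 1.10, 1.25, 1.00
S = np.linspace(0.7, 1.6, 1001)

piecewise = K - np.clip(S, floor, cap)
replication = -(S - K) + np.maximum(S - cap, 0.0) - np.maximum(floor - S, 0.0)
print(np.max(np.abs(piecewise - replication)))    # expect 0.0 (up to rounding)
```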

Section IV. Lease Modifications (Introduction)

In a subsequent post I will elaborate on the bifurcation and valuation of modifications to lease agreements.  For now, let us keep the above in mind and consider a typical lease cash flow $L_{t}(T)$ with a notional of $N_{0}$.  At the time the lease is entered into, FASB requires bifurcation of any implied derivative $Z$. Suppose $Z$ is just an FX forward (short CUR1/CUR2).  At inception ($t=0$) the strike is $K_{0}=F_{0}(T)$, the forward rate corresponding to the future time $T$ as calculated at time $t=0$. The value of $Z$ at any time $0<t<T$ is
$$Z_{t}(N_{0},K_{0},T)=N_{0}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T),$$
where $D_{t}(T)$ is the discount factor for term $T$ at time $t$. This valuation methodology makes the embedded derivative $0$ at inception of the lease.

Suppose at some time $0<\tau<T$ we have the modification $N_{0}\mapsto N_{\tau}<N_{0}$ (the lease payment decreases). Then this is economically equivalent to maintaining the unmodified lease and entering into another lease, with notional $\Delta N_{0,\tau}:=N_{0}-N_{\tau}$, as a lessor at the time of modification $\tau$. FASB would then require the lessor to put the resulting embedded derivative on their balance sheet at that time.  The value of this derivative (since it is equivalent to a long position in CUR1/CUR2, or a short position in CUR2/CUR1) is
$$\tilde{Z_{t}}(\Delta N_{0,\tau},K_{\tau},T)=\Delta N_{0,\tau}\cdot(F_{t}(T)-K_{\tau})\cdot D_{t}(T).$$
Now, from an operational lease accounting point of view, the net payment at the cash flow date is just $N_{0}-\Delta N_{0,\tau}=N_{\tau}.$ Therefore, the net embedded derivative of this overall lease contract is $Z+\tilde{Z}$ (the ``+'' is effectively a ``-'' since we modeled $\tilde{Z}$ as a long position). Thus, the derivative's value at all times $\tau<t<T$ is
$$\begin{align*}
Z^{\tau}_{t}(N_{\tau},K_{\tau},T)
&=Z_{t}(N_{0},K_{0},T)+\tilde{Z_{t}}(\Delta N_{0,\tau},K_{\tau},T)\\
&=N_{0}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T)+\Delta N_{0,\tau}\cdot(F_{t}(T)-K_{\tau})\cdot D_{t}(T)\\
&=D_{t}(T)\Big[N_{0}K_{0}-N_{0}F_{t}(T)+N_{0}F_{t}(T)-N_{0}K_{\tau}-N_{\tau}F_{t}(T)+N_{\tau}K_{\tau}\Big]\\
&=N_{0}\cdot(K_{0}-K_{\tau})\cdot D_{t}(T)+N_{\tau}\cdot(K_{\tau}-F_{t}(T))\cdot D_{t}(T)\\
&=N_{0}\Delta K_{0,\tau}D_{t}(T)+N_{\tau}\cdot(K_{\tau}-F_{t}(T))\cdot D_{t}(T)\\
&=N_{\tau}(K_{\tau}-F_{t}(T))\cdot D_{t}(T)+C
\end{align*}$$
where the constant $C:=N_{0}\Delta K_{0,\tau}D_{t}(T)$ (with $\Delta K_{0,\tau}:=K_{0}-K_{\tau}$) is the settlement price of the original embedded derivative established at time $\tau$, but settled at time $T$ and discounted back to time $t$.  The value of the new derivative is then the sum of this settlement and a newly entered forward with the revised notional.  There is a more intuitive way to rewrite this result.  Indeed,
$$\begin{align*}Z^{\tau}_{t}(N_{\tau},K_{\tau},T)
&=Z_{t}(N_{0},K_{0},T)+\tilde{Z_{t}}(\Delta N_{0,\tau},K_{\tau},T)\\
&=N_{0}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T)+\Delta N_{0,\tau}\cdot(F_{t}(T)-K_{\tau})\cdot D_{t}(T)\\
&=N_{\tau}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T)+\Delta N_{0,\tau}\Delta K_{0,\tau}\cdot D_{t}(T)\\
&=N_{\tau}\cdot(K_{0}-F_{t}(T))\cdot D_{t}(T)+C,
\end{align*}$$
where now $C:=\Delta N_{0,\tau}\Delta K_{0,\tau}D_{t}(T)$ is the settlement price of the cancelled notional, and the value of the new embedded derivative is the sum of this settlement cost and the original forward contract with reduced notional.  In other words, it is the value of the same forward, but with a revised notional that represents the decreased exposure, plus a constant settlement amount that is carried through the valuation.  This constant is the cost of reducing the exposure at time $\tau$ from $N_{0}$ to $N_{\tau}$.
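The algebra above is easily spot-checked with made-up numbers; the three expressions below are the combined original-plus-offsetting position and the two rewritten forms, and they should agree to within rounding.

```python
# Spot-check of the lease-modification algebra with made-up inputs.
N0, N_tau = 1_000_000.0, 600_000.0   # original and reduced notionals (CUR1)
K0, K_tau = 1.10, 1.02               # strikes fixed at inception and at modification time tau
Ft, Dt = 1.07, 0.98                  # current forward rate and discount factor for term T
dN, dK = N0 - N_tau, K0 - K_tau      # Delta N_{0,tau} and Delta K_{0,tau}

combined = N0 * (K0 - Ft) * Dt + dN * (Ft - K_tau) * Dt   # Z + Z-tilde
form_1   = N_tau * (K_tau - Ft) * Dt + N0 * dK * Dt       # reduced forward at K_tau plus settlement
form_2   = N_tau * (K0 - Ft) * Dt + dN * dK * Dt          # reduced forward at K_0 plus settlement
print(combined, form_1, form_2)                           # all three should agree
```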

The entity can account for this in two ways: either recognize the settlement cost $C$ in P/L on the modification date, or expense it over time by carrying the settlement cost from valuation period to valuation period up to expiration of the lease.  The two are equivalent from an accounting point of view, but from a valuation point of view one will lead to a jump in the value of the embedded derivative and the other will maintain continuity of the valuation.

A similar approach can be taken to deal with lease increases and extensions - these valuations and related accounting issues will be taken up in a subsequent post.


17 March, 2015

Can a Derivative's Value Exceed the Underlying Notional Value?



On a recent project I valued some derivatives, the results of which the client balked at because the values exceeded the notional on which they were written.  So is it ever possible for a derivative's valuation to exceed its underlying notional?


The answer, of course, depends, and first we need to clarify what we mean by a derivative's value exceeding its notional.  A typical derivative (like an option or future) is written on some underlying asset with price $S_{t}$ and some quantity or notional $N$.  The term quantity is frequently used for assets like stocks and the term notional for assets like currencies - so in the latter case, if I have a USD/EUR call option, then I view the asset as the US dollar (that I want to buy a call option on) priced in euros (that is what the USD/EUR exchange rate is - the cost of a US dollar in euros), with a notional (i.e. quantity) equal to (say) $\$100,000,000$ USD.

Notice that the quantity/notional has units in the underlying asset and that the spot price $S_{t}$ has units of value in the numeraire/settlement currency per 1 asset.  Hence, when we ask if a derivative's value can exceed its underlying notional, we are really asking whether the value at time $t$, denoted $V_{t}$, can exceed the quantity $NS_{t}$, which has units in the settlement currency (EUR in the above example).  In other words, we ask whether
$$V_{t}>NS_{t}$$
can hold without introducing an arbitrage.

Essentially, the purpose of the above discussion was to express precisely what we mean for the valuation to exceed the underlying notional and, moreover, to emphasize that a derivative's valuation cannot be directly compared to the notional in order to answer the question, since the units are not the same - the underlying notional $N$ needs to be multiplied by the spot price $S_{t}$ so that each has units in the valuation currency.

The classic counter-example to answering this question affirmatively in all instances comes from considering a call option written on a quantity $N$ of some asset with price $S_{t}$.  If you value this option at time $t$, then it is clear that
$$V_{t}<NS_{t},$$
for otherwise one could short a covered option at no cost and an arbitrage would exist.  But this argument no longer holds for instruments with payoffs that are not artificially bounded by some optionality mechanism.  Indeed, the example from my experience involved an FX forward on CUR1/CUR2 (to be generic).  For such a forward, let $K$ be the strike, $N$ the notional (denominated in CUR1), $D$ the discount factor and $\alpha$ the CUR2/CUR1 exchange rate ($1/S_{t}$).  Then if the entity is in the short position we have
$$-\alpha N\cdot(F-K)\cdot D>N$$
$$(F-K)<-\frac{1}{\alpha D}$$
$$F<K-\frac{1}{\alpha D}.$$
Hence, if the forward rate is sufficiently small (i.e. price of CUR1 declined) with respect to the inception strike, then the value of the forward will exceed the notional as an asset. Conversely,
$$-\alpha N\cdot(F-K)\cdot D<-N$$
$$(F-K)>\frac{1}{\alpha D}$$
$$F>K+\frac{1}{\alpha D}$$
Hence, if the forward rate is sufficiently large (i.e. the price of CUR1 increased) with respect to the inception strike, then the value of the forward will exceed the notional as a liability.
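A made-up numerical example: a short CUR1/CUR2 forward whose value, converted into CUR1 via $\alpha=1/S_{t}$, exceeds the CUR1 notional once the forward rate has fallen sufficiently far below the inception strike.

```python
# Short FX forward: value in CUR1 terms versus the CUR1 notional (all numbers are made up).
N = 1_000_000            # notional in CUR1
K = 1.50                 # strike (CUR2 per CUR1), set at inception
S = 0.60                 # current spot (CUR2 per CUR1), so alpha = 1/S
F = 0.62                 # current forward rate
D = 0.95                 # discount factor

alpha = 1.0 / S
value_in_CUR1 = -alpha * N * (F - K) * D
print(value_in_CUR1, N, value_in_CUR1 > N)      # the value exceeds the notional
print(K - 1.0 / (alpha * D))                    # threshold: value > N once F < K - 1/(alpha*D)
```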

It is not difficult to come up with similar bounds for other basic instruments such as swaps either.