Showing posts with label Stochastic Processes. Show all posts

20 March, 2015

A Rigorous Proof of Ito's Lemma

In this post we state and prove Ito's lemma.  To get directly to the proof, go to II Proof of Ito's Lemma.

For all its importance, Ito's lemma is rarely proved in finance texts, where one often finds only a heuristic justification involving Taylor series and the intuition of the "differential form" of the lemma.  There are various reasons for this.  Ito's lemma is really a statement about integration, not differentiation.  Indeed, differentiation is not even defined in the realm of stochastic processes, due to the non-differentiability of Brownian paths.  Thus, in order to present a proof of Ito's lemma, one must first cover stochastic integrals, and prior to that the basic properties of Brownian motion, topics which for reasons of scope or audience cannot always be covered.  However, even more mathematically inclined texts often provide only a sketch and gloss over the technical details of convergence.  The purpose of this article is to remedy this situation, and we begin with

I. MOTIVATION AND A REVIEW OF ORDINARY CALCULUS

If $f$ is $k+1$ times differentiable then Taylor's theorem asserts
$$(1)\;\;\;\;f(t+h)-f(t)=hf'(t)+\frac{h^{2}}{2}f''(t)+\ldots+\frac{h^{k+1}}{(k+1)!}f^{(k+1)}(t^{*})$$
where $t^{*}\in[t,t+h]$ if $h>0$ and $t^{*}\in[t+h,t]$ if $h<0$.

Fix $T>0$ ($T$ not necessarily small) and consider the difference $f(T)-f(0)$.  This can be computed as a sum of non-overlapping differences, i.e. if $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ is a partition of $[0,T]$, then with the aid of (1) using $h=t_{i+1}-t_{i}$, we get

$$\begin{align*}
(2)\;\;\;\;f(T)-f(0)&=\sum_{i=0}^{n-1}f(t_{i+1})-f(t_{i})\\
&=\sum_{i=0}^{n-1}f'(t_{i})(t_{i+1}-t_{i})+\frac{1}{2}\sum_{i=0}^{n-1}f''(t_{i})(t_{i+1}-t_{i})^{2}+\sum_{i=0}^{n-1}o\left(||\Pi||^{2}\right).\end{align*}$$

As $n\to\infty$ (or $||\Pi||\to0$, i.e. $\max_{i}(t_{i+1}-t_{i})\to0$), we get
$$\sum_{i=0}^{n-1}f'(t_{i})(t_{i+1}-t_{i})\to\int_{0}^{T}f'(s)\;ds$$
and for $k\geq2$
$$\left|\frac{1}{k!}\sum_{i=0}^{n-1}f^{(k)}(t_{i})(t_{i+1}-t_{i})^{k}\right|\leq||\Pi||^{k-1}\sum_{i=0}^{n-1}|f^{(k)}(t_{i})|(t_{i+1}-t_{i})\to0\cdot\int_{0}^{T}|f^{(k)}(s)|\;ds=0.$$

That is, $f(T)-f(0)=\int_{0}^{T}f'(s)\;ds$, which is the second fundamental theorem of calculus.  Now suppose $f$ and $g$ are smooth functions with $k+1$ derivatives and consider the composition $h=f\circ g$.  The familiar chain rule implies $h$ is differentiable and that
$$(3)\;\;\;\;h'(t)=f'(g(t))g'(t).$$

By substituting $h$ into (2) and computing $h^{(k)}$ iteratively according to (3), we get
$$(4)\;\;\;\;f(g(T))-f(g(0))=\int_{0}^{T}f'(g(x))g'(x)\;dx.$$

We shall now see what happens when $g$ is not differentiable.  In that case, $h$ is not differentiable, and (1) through (4) are no longer valid.  However, we can write (4) instead as
$$(5)\;\;\;\;f(g(T))-f(g(0))=\int_{0}^{T}f'(g(x))\;dg$$
where the integral is now taken as a Riemann-Stieltjes integral.  If $g$ is differentiable, then (5) reduces to (4), but (5) still makes sense even if $g$ is merely continuous (continuity is needed since $\int h\;dg$ is not well-defined if $h$ and $g$ share a common discontinuity, and $h=f(g(t))$ will in general be discontinuous wherever $g$ is).  Moreover, since $f$ is smooth, we may rewrite (2) as
$$\begin{align*}
(6)\;\;\;\;f(g(T))-f(g(0))&=\sum_{i=0}^{n-1}f(g(t_{i+1}))-f(g(t_{i}))\\
&=\sum_{i=0}^{n-1}f'(g(t_{i}))(g(t_{i+1})-g(t_{i})) + \frac{1}{2}\sum_{i=0}^{n-1}f''(g(t_{i}))(g(t_{i+1})-g(t_{i}))^{2}+\ldots\end{align*}$$

Despite $g$ being non-differentiable, if it is sufficiently "nice" then the terms converge to the same values as in (2) and we will recover (5).  A useful sufficient condition is that $g$ be continuous and of bounded variation.  This means
$$[g](T)=\sup_{\Pi}\sum_{i\in\Pi}|g(t_{i+1})-g(t_{i})|<\infty.$$
It is easy to prove that if $g$ is continuously differentiable, then it is of bounded variation, since an application of the above (or the mean-value theorem) gives (for a norm-decreasing sequence of partitions $\Pi_{1},\Pi_{2},\ldots$)
$$[g](T)=\lim_{n\to\infty}\sum_{j\in\Pi_{n}}|g(t_{j+1})-g(t_{j})|=\int_{0}^{T}|g'(t)|\;dt<\infty.$$
For $\int f\;dg$, $g$ being of bounded variation and not sharing common discontinuities with $f$ is usually the most general sufficient condition used when considering existence, though this is not strictly necessary.  When $g$ is not of bounded variation, then $\int f\;dg$ may or may not exist, and it may even exist conditionally on the particular sample point used in the approximating sums, as we shall see below.
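
To make the bounded-variation condition concrete, here is a minimal Python sketch.  The test function $g=\sin$ on $[0,2\pi]$, the grid size, and the helper name `variation_sum` are illustrative choices and not part of the argument above.  It evaluates the first- and second-order sums on a uniform partition; the first should be close to $\int_{0}^{2\pi}|\cos t|\;dt=4$, while the second should be small and shrink with the mesh.

```python
import math

def variation_sum(g, horizon, n_steps, power):
    """Sum of |g(t_{i+1}) - g(t_i)|**power over a uniform partition of [0, horizon]."""
    grid = [horizon * i / n_steps for i in range(n_steps + 1)]
    return sum(abs(g(b) - g(a)) ** power for a, b in zip(grid, grid[1:]))

horizon = 2 * math.pi
first_order = variation_sum(math.sin, horizon, 1000, 1)   # approximates the total variation
second_order = variation_sum(math.sin, horizon, 1000, 2)  # shrinks like the mesh size
print(first_order, second_order)
```

For a smooth integrator only the first-order sum survives in the limit, which is why (6) collapses to (5) in that case.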

Now, the Ito lemma deals with the special case $g(t)=W(t)$ where $W$ is a Brownian motion sample path.  It turns out that, for almost every $\omega$,
$$[W](T)=\infty,$$
and
$$[W,W](T):=\sup_{\Pi}\sum_{i\in\Pi}|W(t_{i+1})-W(t_{i})|^{2}=\infty.$$
The latter quantity is called the quadratic (or second) variation of $W$.  For continuously differentiable functions $g$, by contrast, the corresponding sums $\sum_{i\in\Pi}|g(t_{i+1})-g(t_{i})|^{2}$ tend to $0$ as $||\Pi||\to0$ (this follows from estimating the higher order terms in (2)).  Moreover, the third-order sums of $W$ vanish in the limit:
$$[W]^{(3)}(T):=\lim_{||\Pi||\to0}\sum_{i\in\Pi}|W(t_{i+1})-W(t_{i})|^{3}=0.$$
In fact, we have $[W]^{(\alpha)}(T)=\infty$ for $\alpha\leq2$ (with the supremum definition) and $[W]^{(\alpha)}(T)=0$ for $\alpha>2$ (in the limit $||\Pi||\to0$), a.s. $\omega$.  It would seem that the regularity on which integration theory depends so directly (i.e. variation of the integrator) is not tractable for $W$.  It turns out, though, that we can obtain something useful by weakening the definition slightly.  Let $\Pi_{1},\Pi_{2},\ldots$ be a sequence of partitions with $||\Pi_{n}||\to0$ as $n\to\infty$.  Then we can redefine the quadratic variation as
$$[W,W](T):=\lim_{n\to\infty}\sum_{i\in\Pi_{n}}|W(t_{i+1})-W(t_{i})|^{2}.$$
Unfortunately, even this is not well-defined without further qualification.  The reason that the supremum definition for the quadratic variation is a.s. infinite is that, for a fixed sample path $\omega$ and any $C>0$, it is possible to find a sequence of partitions $\{\Pi^{C}_{n}\}_{n}$ so that the above limit is equal to $C$.  However, the limit converges to $T$ in $L^{2}(\Omega)$ (and hence in probability, if you prefer).  That is to say, it converges in the $L^{2}$ norm to some random variable $Q(\omega)$ so that $Q(\omega)=T$ a.s. $\omega$ (recall that $L^{2}$ limits are defined only up to a set of measure $0$).  It turns out that if we make the further restriction that the partitions are nested, $\Pi_{1}\subset \Pi_{2}\subset \Pi_{3}\subset\ldots$, and that $\sum_{n=1}^{\infty}||\Pi_{n}||<\infty$, then the limit also holds $\omega$ a.s. pointwise (Borel-Cantelli).  In the remainder of this post we will not distinguish between these modes of convergence and state freely that $[W,W](T)=T$, without further reference to any technicalities with this claim.
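
The different behaviour of the first, second and third order sums can be observed numerically.  The following is only a sketch under stated assumptions: the path is sampled on a uniform grid from independent Gaussian increments, and the seed, the grid size and the helper names `brownian_path` and `variation_sum` are hypothetical choices made for illustration.  On a fine grid one would expect the first-order sum to be large, the second-order sum to be close to $T$, and the third-order sum to be close to $0$.

```python
import math
import random

def brownian_path(n_steps, horizon, rng):
    """Sample W on a uniform grid with n_steps intervals over [0, horizon]."""
    dt = horizon / n_steps
    path = [0.0]
    for _ in range(n_steps):
        # Independent N(0, dt) increments.
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

def variation_sum(path, power):
    """Sum of |W(t_{i+1}) - W(t_i)|**power over the grid."""
    return sum(abs(b - a) ** power for a, b in zip(path, path[1:]))

rng = random.Random(0)  # fixed seed, chosen arbitrarily
T = 1.0
path = brownian_path(100_000, T, rng)
first = variation_sum(path, 1)
second = variation_sum(path, 2)
third = variation_sum(path, 3)
print(first, second, third)
```

A single simulated path on one grid proves nothing about the supremum over all partitions, of course; it only illustrates the limit along uniform partitions.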

Since $[W](T)=\infty$ and $[W,W](T)=T$, we must take care in computing the various limits appearing in
$$\begin{align*}(7)\;\;\;\;f(W(T))-f(W(0))&=\sum_{i=0}^{n-1}f(W(t_{i+1}))-f(W(t_{i}))\\
&=\sum_{i=0}^{n-1}f'(W(t_{i}))(W(t_{i+1})-W(t_{i})) + \frac{1}{2}\sum_{i=0}^{n-1}f''(W(t_{i}))(W(t_{i+1})-W(t_{i}))^{2}+\ldots\end{align*}$$

Since $[W,W](T)=T<\infty$, it follows that $[W]^{(k)}(T)=0$ for all $k\geq3$ by a simple estimate, as has been done several times above.  Thus the $\ldots$ terms can safely be ignored.  And since $\sup_{t\in[0,T]}|f''(W(t))|<\infty$, the second sum converges.  We shall see that it converges to
$$\frac{1}{2}\int_{0}^{T}f''(W(s))\;ds.$$
(Incidentally, this is where the commonly used, though mathematically meaningless, notation $dWdW=dt$ comes from).  The first term also converges, though this is not immediately obvious since the Riemann-Stieltjes theory does not apply to it as the integrator $W(t)$ is not of bounded variation.  It turns out that it converges to
$$\int_{0}^{T}f'(W(s))\;dW$$
where the integral is what is known as an Ito integral.  This integral is constructed exactly like a Riemann-Stieltjes integral, except that the sample point used in the approximating sums must always be the left-hand point of the interval.  Different approximation schemes (i.e. mid-point, right-point, etc.) lead to different limiting values.  If the mid-point is used, the result is referred to as the Stratonovich integral.  We shall not need this integral here.  The reason that the Ito integral is used (i.e. left-hand point approximation) is that $f(W(t_{i}))$ is interpreted as the position we take in a stock at time $t_{i}$ with the information available at time $t_{i}$, and the capital gain on the stock over $[t_{i},t_{i+1}]$ is then $f(W(t_{i}))(W(t_{i+1})-W(t_{i}))$ if we assume the stock price follows a Brownian motion (which strictly speaking it doesn't, but we shall ignore this fact here since it can be corrected by replacing $W$ with geometric Brownian motion $X$).  Summing the individual gains and then taking the limit as $\max|t_{i+1}-t_{i}|\to0$ gives us the net capital gains on a portfolio resulting from taking positions $f(W(t))$ in continuous time.
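
The dependence on the sample point can be seen in a small simulation of $\int_{0}^{T}W\;dW$.  This is a sketch only: the seed and grid size are arbitrary, and the average of the two endpoint values is used here as a stand-in for the mid-point rule (an assumption made for simplicity).  The left-point sums should be close to the known Ito value $\frac{1}{2}(W(T)^{2}-T)$, the averaged sums telescope to $\frac{1}{2}W(T)^{2}$, and the right-point and left-point sums differ by the sum of squared increments, which is close to $T$.

```python
import math
import random

rng = random.Random(1)  # arbitrary fixed seed
T, n_steps = 1.0, 100_000
dt = T / n_steps

# Brownian path on a uniform grid, built from independent N(0, dt) increments.
W = [0.0]
for _ in range(n_steps):
    W.append(W[-1] + rng.gauss(0.0, math.sqrt(dt)))

pairs = list(zip(W, W[1:]))
left_sum = sum(a * (b - a) for a, b in pairs)                 # Ito: left endpoint
average_sum = sum(0.5 * (a + b) * (b - a) for a, b in pairs)  # Stratonovich-type: endpoint average
right_sum = sum(b * (b - a) for a, b in pairs)                # right endpoint

W_T = W[-1]
print(left_sum, (W_T**2 - T) / 2)
print(average_sum, W_T**2 / 2)
print(right_sum - left_sum)  # equals the sum of squared increments
```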

In light of the above, we conclude that
$$(8)\;\;\;\;f(W(T))-f(W(0))=\int_{0}^{T}f'(W(s))\;dW(s)+\frac{1}{2}\int_{0}^{T}f''(W(s))\;ds.$$
Compare this to (4), and we see that we obtain one, and only one, extra term $\frac{1}{2}\int_{0}^{T}f''(W(s))\;ds$, which can be traced back to the fact that $[W,W](T)=T$ and $[W]^{(k)}(T)=0$ for $k\geq3.$  This is often recast in differential notation (which again, is mathematically meaningless)
$$(9)\;\;\;\;df=f'dW+\frac{1}{2}f''dt.$$

The mathematically meaningful form is (8), though (9) is used more often for calculations since it is accompanied by what is known as a "box" calculus that facilitates computations.  This will be discussed in more detail below.
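
Formula (8) can be checked numerically on a single simulated path.  The sketch below takes $f=\sin$ (an illustrative choice with bounded derivatives), a uniform grid and an arbitrary seed, and replaces both integrals in (8) by their left-point sums.  The residual with the extra term included should be small, while the residual of the "naive" chain rule shows what is lost when the term $\frac{1}{2}\int_{0}^{T}f''(W(s))\;ds$ is dropped.

```python
import math
import random

rng = random.Random(2)  # arbitrary fixed seed
T, n_steps = 1.0, 100_000
dt = T / n_steps

W = 0.0
ito_sum = 0.0     # sum of f'(W(t_i)) * (W(t_{i+1}) - W(t_i)) with f = sin
correction = 0.0  # (1/2) * sum of f''(W(t_i)) * (t_{i+1} - t_i)
for _ in range(n_steps):
    dW = rng.gauss(0.0, math.sqrt(dt))
    ito_sum += math.cos(W) * dW
    correction += 0.5 * (-math.sin(W)) * dt
    W += dW

lhs = math.sin(W) - math.sin(0.0)
residual_ito = lhs - (ito_sum + correction)
residual_naive = lhs - ito_sum  # what is left if the extra term in (8) is dropped
print(residual_ito, residual_naive)
```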


II. PROOF OF ITO'S LEMMA

Let $\{W(t)\}_{t\geq0}$ be a standard Brownian motion with the natural filtration $\{\mathcal{F}_{t}\}_{t\geq0}$, and $f(x,t)\in\mathcal{C}^{2}(\mathbb{R}\times[0,T])$ jointly in $(x,t)$.  We will consider the stochastic process $\Delta(t)=f(W(t),t)$, which is clearly adapted to $\{\mathcal{F}_{t}\}_{t\geq0}.$

We take the following preliminary facts for granted, and defer to previous blog posts covering Brownian motion and stochastic integration for proofs.
  1. Almost surely, we have the variation formulas $[W]^{1}(t)=+\infty,[W]^{2}(t)=t$ and $[W]^{k}(t)=0$ for $k\geq3$.
  2. For any continuous and adapted process $\Delta(t)$, the sums $\sum_{i=0}^{n-1}\Delta(t_{i})(W(t_{i+1})-W(t_{i}))$ converge as $||\Pi_{[0,T]}||\to0$.  We denote this limit by $\int_{0}^{T}\Delta(t)\;dW(t)$ and refer to it as the Ito integral of $\Delta$.  The limit is taken in $L^{2}(\Omega).$

Theorem (Ito's Lemma).  With the notation above, we have for all $T>0$
$$\begin{align*}&f(W(T),T)-f(W(0),0)\\&=\int_{0}^{T}f_{t}(W(t),t)\;dt+\int_{0}^{T}f_{x}(W(t),t)\;dW(t)+\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt.\end{align*}$$
We sometimes write for $f=f(W(t),t)$
$$df=f_{t}dt+f_{x}dW+\frac{1}{2}f_{xx}dt.$$

Proof.  Fix $T>0$ and let $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ be a partition of $[0,T]$ and compute using Taylor's expansion
$$\begin{align*}
f(W(T),T)-f(W(0),0)&=\sum_{i=0}^{n-1}(f(W(t_{i+1}),t_{i+1})-f(W(t_{i}),t_{i}))\\
&=\sum_{i=0}^{n-1}f_{t}(W(t_{i}),t_{i})(t_{i+1}-t_{i})\\
&+\sum_{i=0}^{n-1}f_{x}(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))\\
&+\frac{1}{2}\sum_{i=0}^{n-1}f_{xx}(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}\\
&+\sum_{i=0}^{n-1}O((t_{i+1}-t_{i})(W(t_{i+1})-W(t_{i})))\\
&+\sum_{i=0}^{n-1}O((t_{i+1}-t_{i})^{2})\\
&+\sum_{i=0}^{n-1}O((W(t_{i+1})-W(t_{i}))^{3})\\
&:= A+B+C+D+E+F.\end{align*}$$

The left hand side is unaffected by taking limits as $||\Pi||\to0$, and so we may do so in computing the right hand side terms.  Without loss of generality we assume $\Pi$ is uniform, so we consider equivalently $n\to\infty.$

The regularity of $f$ implies that
$$A\to\int_{0}^{T}f_{t}(W(t),t)\;dt\;\text{as}\;n\to\infty,$$
the integral being an ordinary Lebesgue (Riemann) integral.  By item 2 above we have
$$B\to\int_{0}^{T}f_{x}(W(t),t)\;dW(t)\;\text{as}\;n\to\infty,$$
the integral being an Ito integral as discussed above.  To deal with $D$, $E$ and $F$ we estimate (writing $\ll_{\beta}$ for an inequality up to a constant $\beta$ depending only on bounds for the derivatives of $f$ along the path)
$$|D|\ll_{\beta}\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|\sum_{i=0}^{n-1}(t_{i+1}-t_{i})\ll_{\beta}T\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|,$$
$$|E|\ll_{\beta}\sup_{0\leq i\leq n-1}|t_{i+1}-t_{i}|\sum_{i=0}^{n-1}(t_{i+1}-t_{i})\ll_{\beta}T\sup_{0\leq i\leq n-1}|t_{i+1}-t_{i}|,$$
and
$$|F|\ll_{\beta}\sup_{0\leq i\leq n-1}|W(t_{i+1})-W(t_{i})|\sum_{i=0}^{n-1}(W(t_{i+1})-W(t_{i}))^{2}.$$
Appealing to item 1 above we then conclude (since the maps $t\mapsto t$ and $t\mapsto W(t)$ are uniformly continuous on $[0,T]$, so both suprema tend to $0$, while $\sum_{i=0}^{n-1}(W(t_{i+1})-W(t_{i}))^{2}\to T$) that
$$D,E,F\to0\;\text{as}\;n\to\infty.$$
It remains to establish the limit
$$C\to\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt\;\text{as}\;n\to\infty.$$
Intuitively this should be true since $[W]^{2}(T)=T,$ a fact that we sometimes write as $dWdW=dt.$  However, a rigorous proof requires some effort, and this is precisely the point in the proof (assuming Brownian motion and stochastic integration are covered) that almost every mathematical finance text skips over.  (Note that the theorem has already been proved in the special case that $f=p(x,t)$ is a second degree polynomial; as an example, consider the special case $f(x,t)=\frac{1}{2}x^{2}$ in order to compute the Ito integral $\int_{0}^{T}W(t)\;dW(t)$.)
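
As a worked instance of the special case just mentioned (a sketch that uses only the statement of the theorem): take $f(x,t)=\frac{1}{2}x^{2}$, so that $f_{t}=0$, $f_{x}=x$ and $f_{xx}=1$.  Since $W(0)=0$, the theorem reads
$$\frac{1}{2}W(T)^{2}=\int_{0}^{T}W(t)\;dW(t)+\frac{1}{2}\int_{0}^{T}1\;dt,$$
so that
$$\int_{0}^{T}W(t)\;dW(t)=\frac{1}{2}W(T)^{2}-\frac{T}{2}.$$
The $-\frac{T}{2}$ is exactly the contribution of the term $C$; ordinary calculus would give $\frac{1}{2}W(T)^{2}$ alone.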

Because this fact is of interest in and of itself, we isolate the proof that $C\to\frac{1}{2}\int_{0}^{T}f_{xx}(W(t),t)\;dt\;\text{as}\;n\to\infty$ in the following lemma.

Lemma.  Let $f$ be a bounded continuous function on $\mathbb{R}$ and $\{W(t)\}_{t \geq 0}$ a standard one-dimensional Brownian motion. Then, in $L^{2}(\Omega)$ (see Section III for the relation to almost sure convergence), $$\sum_{i=0}^{n-1} f(W(t_{i}))(W(t_{i+1})-W(t_{i}))^{2}\to\int_{0}^{T}f(W(t))\;dt\;\text{as}\;n\to\infty$$ where $n\to\infty$ means (WLOG) $\Pi=\{t_{0}=0,t_{1},\ldots,t_{n}=T\}$ is a uniform partition of $[0,T]$ and $|\Pi| := \max_j |t_j-t_{j-1}|\to0$.

Proof.  Since $t \mapsto f(W(t))$ is (almost surely) continuous, $$\sum_{i=0}^{n-1} f(W(t_{i}))(t_{i+1}-t_{i}) \to \int_0^T f(W(t))\;dt\;\text{as}\;n\to\infty.$$
Therefore, it suffices to show
$$I_n := \sum_{i=0}^{n-1} f(W(t_{i})) \bigg[ (W(t_{i+1})-W(t_{i}))^2 - (t_{i+1}-t_{i}) \bigg] \to 0\;\text{as}\;n\to\infty.$$

At this point it is convenient to define $\Delta_{i} := t_{i+1}-t_{i}$ and $\Delta W_i := W(t_{i+1})-W(t_{i})$.  Recalling that $\{W(t)^2-t\}_{t \geq 0}$ is a martingale with respect to the canonical filtration $(\mathcal{F}_t)_{t \geq 0}$, we compute, for indices $j<i$ (so that $f(W(t_{j}))$, $f(W(t_{i}))$ and $\Delta W_j$ are all $\mathcal{F}_{t_{i}}$-measurable),

$$\begin{align*} &\quad \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))[\Delta W_j^2 - \Delta_j][\Delta W_i^2-\Delta_i]\bigg)\\ &= \mathbb{E} \bigg( \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))  [\Delta W_j^2 - \Delta_j]  [\Delta W_i^2-\Delta_i] \mid \mathcal{F}_{t_{i}} \bigg) \bigg) \\ &= \mathbb{E} \bigg( f(W(t_{i})) f(W(t_{j}))  [\Delta W_j^2-\Delta_j]  \underbrace{\mathbb{E} \bigg( \Delta W_i^2 - \Delta_i \mid \mathcal{F}_{t_{i}} \bigg)}_{=\mathbb{E}(\Delta W_i^2-\Delta_i)=0} \bigg) = 0, \end{align*}$$

and thus

$$\mathbb{E}(I_n^2) = \mathbb{E}\left(\sum_{i=0}^{n-1} f(W(t_{i}))^2 (\Delta W_i^2-\Delta_i)^2 \right).$$

(Observe that the cross-terms vanish.)  Using that $f$ is bounded and $W(t)-W(s) \sim W(t-s) \sim \sqrt{t-s} W(1)$ we find

$$\begin{align*} \mathbb{E}(I_n^2) &\leq \|f\|_{\infty}^2 \sum_{i=0}^{n-1} \mathbb{E}\bigg[(\Delta W_i^2-\Delta_i)^2\bigg] \\ &= \|f\|_{\infty}^2 \sum_{i=0}^{n-1} \Delta_i^2\,  \mathbb{E}\bigg[(W_1^2-1)^2\bigg] \\ &\leq C |\Pi| \sum_{i=0}^{n-1} \Delta_i = C |\Pi| T \end{align*}$$

for $C := \|f\|_{\infty}^2\, \mathbb{E}\big[(W_1^2-1)^2\big] = 2\|f\|_{\infty}^2$. Letting $|\Pi| \to 0$, the claim follows.
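
The final estimate can be probed by Monte Carlo.  The sketch below is illustrative only: it takes $f=\cos$ (bounded by $1$), uniform partitions with $n=50$ and $n=200$, an arbitrary seed, and hypothetical helper names `sample_I_n` and `second_moment`.  It uses the standard fact that $\mathbb{E}[(Z^{2}-1)^{2}]=2$ for a standard normal $Z$, so the bound being compared against is $2\|f\|_{\infty}^{2}|\Pi|T$.

```python
import math
import random

def sample_I_n(n_steps, horizon, f, rng):
    """One sample of I_n = sum f(W(t_i)) * ((dW_i)^2 - dt) on a uniform partition."""
    dt = horizon / n_steps
    W, total = 0.0, 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        total += f(W) * (dW * dW - dt)
        W += dW
    return total

def second_moment(n_steps, horizon, f, n_paths, rng):
    """Monte Carlo estimate of E(I_n^2)."""
    return sum(sample_I_n(n_steps, horizon, f, rng) ** 2 for _ in range(n_paths)) / n_paths

rng = random.Random(3)  # arbitrary fixed seed
T = 1.0
f = math.cos            # bounded and continuous, sup norm 1 (illustrative choice)
results = {}
for n in (50, 200):
    estimate = second_moment(n, T, f, 2000, rng)
    bound = 2.0 * 1.0**2 * (T / n) * T  # 2 * ||f||^2 * |Pi| * T
    results[n] = (estimate, bound)
    print(n, estimate, bound)
```

The estimates should sit below the bound and shrink roughly in proportion to the mesh size as $n$ grows.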



III.  CLARIFICATION OF "ALMOST SURE" CONVERGENCE

We assume the reader is familiar with the various modes of convergence in real analysis: pointwise, uniform, almost uniform, in measure/probability, $L^{p}$, etc.  This short section is just to help clarify what is meant by almost sure convergence in the context of this and related topics.

Statements of convergence involving Brownian motion are almost always established in $L^{2}(\Omega,P)$, which in turn implies convergence in probability because Chebyshev's inequality states for a sequence of random variables $X_{n}$ and proposed limit $X$ that
$$P(|X_{n}-X|\geq\epsilon)\leq\frac{1}{\epsilon^{2}}\mathbb{E}\left[|X_{n}-X|^{2}\right]\to0\;\text{as}\;n\to\infty\;\text{for all}\;\epsilon>0\;\text{fixed}.$$

For example, in the proof of Ito's lemma we really proved that $$\lim_{n\to\infty}\sum_{i=0}^{n-1}f(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}=\int_{0}^{T}f(W(t),t)\;dt$$ in $L^{2}(\Omega)$, and by consequence in probability, and almost surely along a suitable subsequence $n_{k}\to\infty$.  To clarify, the last statement means that for almost every sample path, or outcome $\omega\in\Omega$, we have
$$\lim_{k\to\infty}X_{n_{k}}(\omega):=\lim_{k\to\infty}\sum_{i=0}^{n_{k}-1}f(W(t_{i}),t_{i})(W(t_{i+1})-W(t_{i}))^{2}=\int_{0}^{T}f(W(t),t)\;dt.$$

The case is similar to proving things like almost surely $[W,W](t)=t$ and almost surely $\int f(t)\;dW(t)$ exists in the Ito sense.
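
To see how an $L^{2}$ estimate can be upgraded to an almost sure statement, here is a sketch for the quadratic variation on uniform partitions (the notation $Q_{n}$ is introduced here only for illustration).  Put $Q_{n}=\sum_{i=0}^{n-1}(W(t_{i+1})-W(t_{i}))^{2}$ with $t_{i}=iT/n$.  The squared increments are independent with mean $T/n$ and variance $2(T/n)^{2}$, so
$$\mathbb{E}\left[|Q_{n}-T|^{2}\right]=\sum_{i=0}^{n-1}2\left(\frac{T}{n}\right)^{2}=\frac{2T^{2}}{n},\qquad P(|Q_{n}-T|\geq\epsilon)\leq\frac{2T^{2}}{n\epsilon^{2}}.$$
Along the dyadic subsequence $n=2^{k}$ the right-hand side is summable in $k$, so by the Borel-Cantelli lemma, almost surely $|Q_{2^{k}}-T|\geq\epsilon$ for only finitely many $k$.  Applying this with $\epsilon=1/m$ for $m=1,2,\ldots$ gives $Q_{2^{k}}\to T$ almost surely.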


23 February, 2015

Why is Brownian Motion Almost Surely Continuous?



A user from the Quant StackExchange recently asked why the regularity condition of Brownian motion, namely almost sure continuity, is what it is: why only almost sure?  Why can't this be upgraded to Brownian motion being surely continuous?

The answer to the latter question is that, actually, it can and very often is.  The answer to the former question is that stipulating almost sure continuity is required in order to make the defining conditions of Brownian motion axiomatic, while still encompassing all of the methods of construction.

Indeed, the most common construction of Brownian motion (or at least the most direct) is through an application of Kolmogorov's extension theorem (the details of this approach can be found in Durrett).  But due to technical issues arising from measure theory (which are actually quite natural), the resulting construction leads to realizations of Brownian motions that are discontinuous.

On the other hand, the approach to constructing Brownian motion from the limit of scaled random walks actually leads to surely continuous realizations.  There are two available routes one can go when having this approach in mind: (a) construct Brownian motion paths directly (i.e. pointwise) from scaled random walks (one common way to do this is by appropriately specifying Brownian motion on the dyadic intervals, interpolating between, and taking limits) or (b) construct Brownian motion by obtaining the Wiener measure on the space of continuous functions beginning at the origin from the induced measures on this space obtained from the scaled random walks on $\mathbb{Z}^{\infty}_{2},$ the space of sequences with values of $-1$ or $1$.
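
For orientation, here is a minimal sketch of the scaled random walks that both routes start from.  The step sizes (time step $T/N$, space step $\sqrt{T/N}$), the seed, the sample sizes and the helper name `scaled_walk_endpoint` are assumptions made for illustration; the code only checks that the endpoint of the walk has mean near $0$ and variance near $T$, as the limiting Brownian motion would.  Joining the grid values by straight lines would give a path that is continuous by construction.

```python
import math
import random

def scaled_walk_endpoint(n_steps, horizon, rng):
    """Value at time `horizon` of a +/-1 random walk with time step horizon/n_steps
    and space step sqrt(horizon/n_steps)."""
    step = math.sqrt(horizon / n_steps)
    position = 0.0
    for _ in range(n_steps):
        position += step if rng.random() < 0.5 else -step
    return position

rng = random.Random(4)  # arbitrary fixed seed
T, n_steps, n_walks = 1.0, 400, 2000
endpoints = [scaled_walk_endpoint(n_steps, T, rng) for _ in range(n_walks)]
mean = sum(endpoints) / n_walks
variance = sum((x - mean) ** 2 for x in endpoints) / (n_walks - 1)
print(mean, variance)
```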

The user also asked whether an explicit example of a discontinuous Brownian motion path could be exhibited.  The following is my complete answer to this and the above questions.

____________________________

Exhibiting a counter-example is straight-forward enough.  For example, let $B_{t}(\omega)$ be a Brownian motion and $\mathcal{T}(\omega)$ a stopping time on $(\Omega,\mathbb{P})$ with a continuous distribution.
Then with
$$B'_{t}(\omega)=\left\{\begin{array}{ll}B_{t}(\omega),&t\neq\mathcal{T}(\omega)\\B_{t}(\omega)+1,&t=\mathcal{T}(\omega),\end{array}\right.$$
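$B'_{t}(\omega)$ satisfies (1) and (2) below, but is discontinuous precisely when $t=\mathcal{T}(\omega)$.  (Since $\mathcal{T}$ has a continuous distribution, $\mathbb{P}(\mathcal{T}=t)=0$ for each fixed $t$, so $B'_{t}=B_{t}$ almost surely for each $t$ and the finite-dimensional distributions are unchanged.)  Therefore, $B'_{t}(\omega)$ is a particular realization of Brownian motion that is not everywhere continuous.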

There are lots of other ways to obtain a "bad" Brownian motion.  Another example is
$$B'_{t}(\omega)=B_{t}(\omega)\mathbb{1}_{\{B_{t}(\omega)\;\text{irrational}\}},$$
but this is less straight-forward to prove.

____________________________

The reason for stipulating almost sure continuity has to do with the way one constructs Brownian motion, and the issue can be completely dispensed with dependent on one's approach.
The usual presentation in finance texts is the abstract one, namely given a probability space $(\Omega,\mathbb{P})$, one has a Brownian motion $B_{t}(\omega)$ on this space if
  1. For every set of times $0\leq t_{1}<t_{2}<\ldots<t_{n}$ the increments $B_{t_{1}},B_{t_{2}}-B_{t_{1}},\ldots,B_{t_{n}}-B_{t_{n-1}}$ form a mutually independent set of random variables on $(\Omega,\mathbb{P}).$
  2. The increments above are normally distributed with mean $0$ and variance equal to the length $\Delta t$ of the corresponding time interval (e.g. $t_{j}-t_{j-1}$ for $B_{t_{j}}-B_{t_{j-1}}$).
  3. For almost every $\omega\in\Omega$ the path $t\mapsto B_{t}(\omega)$ is continuous.
Most texts also include a section that sketches a concrete realization of Brownian motion as the limit of scaled random walks.  If one does this rigorously, one sees that (3) upgrades to "for every $\omega\in\Omega$".

Indeed, if we start with $(\Omega,\mathbb{P})$ satisfying the above and let $\mathcal{P}$ denote the collection of continuous functions $[0,\infty)\to\mathbb{R}$ with $p(0)=0$, then we get from (3) the inclusion map
$$i:\Omega\to\mathcal{P},$$
defined on a set $\Omega'\subset\Omega$ of full measure, and the push-forward measure of $\mathbb{P}$ onto $\mathcal{P}$ under this inclusion map turns out to be equal to the Wiener measure $\mathbb{W}$ on $\mathcal{P}$, which is unique.

Conversely, one can construct $(\mathcal{P},\mathbb{W})$ directly by starting with the set $\mathcal{P}$ (where every element of this set is continuous a priori) and demonstrating that the measures $\mu_{N}$ on $\mathbb{Z}^{\infty}_{2}$ arising from the appropriately scaled random walks $S_{t}^{N}(\omega)$ ($\omega\in\mathbb{Z}^{\infty}_{2})$ induce a collection of tight measures on $\mathcal{P}$ which converge weakly to $\mathbb{W}$:
$$\mu_{N}\Longrightarrow\mathbb{W}\;\text{(weakly)}$$
One then defines
$$\tilde{B}_{t}(\omega):=p(t)\in\mathcal{P}$$
and readily shows that under $\mathbb{W}$, $\tilde{B}_{t}$ satisfies (1)-(3) and that therefore
$$\tilde{B}_{t}(\omega)=B_{t}(\omega),$$
but that now *every* Brownian motion is continuous.

The equivalence established above shows that the existence of Brownian motion is essentially tantamount to the existence of a Wiener measure $\mathbb{W}$ on $\mathcal{P}$ arising from the sequence of measures obtained naturally from the scaled random walks.  If one starts from the goal of obtaining this measure, one gets continuity for *every* Brownian motion path $p(t)=B_{t}(\omega)$.

____________________________

Other constructions of Brownian motion require us to stipulate almost sure continuity due to technicalities arising from measure theory on product spaces.  The quickest construction of Brownian motion in this direction is by applying Kolmogorov's extension theorem to a suitable class of processes; details can be found in Durrett.