24 October, 2015

Analyzing the Definition of Independence

One of the most fundamental concepts of probability theory is that of independence.  The concept is intuitive and captures the idea that two experiments are independent if the outcome of one does not affect the outcome of the other.  What we mean here by experiment is a measurable space $\Omega$ (called the sample space) consisting of points $\omega\in\Omega$ that represent all the possible outcomes of our experiment, and a $\sigma$-algebra $\mathcal{F}$ consisting of all admissible combinations of outcomes (events) represented by subsets $A\subseteq\Omega$.  Additionally, there is a measure $\mathbb{P}$ that assigns $1$ to the entire sample space $\Omega$ and that is countably additive on $\mathcal{F}$ in the sense that whenever $\{A_{n}\}_{n\geq0}$ is an at most countable collection of pairwise disjoint events in $\mathcal{F}$ we have $\mathbb{P}(\cup_{n} A_{n})=\sum_{n}\mathbb{P}(A_{n})$.
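As a minimal sketch of this setup, here is a toy finite probability space in Python (a hypothetical stand-in for the abstract triple $(\Omega,\mathcal{F},\mathbb{P})$; with finitely many outcomes, countable additivity reduces to finite additivity):

```python
from itertools import product
from fractions import Fraction

# Toy finite probability space: two tosses of a fair coin.
Omega = set(product("HT", repeat=2))        # sample space of all 4 outcomes
P = lambda E: Fraction(len(E), len(Omega))  # uniform measure; events are subsets of Omega

assert P(Omega) == 1                        # P assigns 1 to the whole sample space

A = {w for w in Omega if w[0] == "H"}       # "first toss is heads"
B = {("T", "T")}                            # "both tosses are tails"
assert A.isdisjoint(B)
assert P(A | B) == P(A) + P(B)              # additivity on disjoint events
```

We then have the following formal definition of independence: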




Definition (Independence).  Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space and let $A,B\in\mathcal{F}$ be two events.  We say $A$ and $B$ are independent if $\mathbb{P}(A\cap B)=\mathbb{P}(A)\mathbb{P}(B)$.
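For a concrete check of the definition, we can reuse the two-coin space from above (again a hypothetical finite example):

```python
from itertools import product
from fractions import Fraction

Omega = set(product("HT", repeat=2))        # two tosses of a fair coin, as above
P = lambda E: Fraction(len(E), len(Omega))  # uniform measure

A = {w for w in Omega if w[0] == "H"}       # "first toss is heads"
B = {w for w in Omega if w[1] == "H"}       # "second toss is heads"

# The definition holds: P(A ∩ B) = 1/4 = (1/2) * (1/2) = P(A) * P(B).
assert P(A & B) == P(A) * P(B)
```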

While the precise definition is simple, it does not clearly connect back to the intuitive definition of independence.  A common justification for the definition is the identity $\mathbb{P}(A\cap B)=\mathbb{P}(A)\mathbb{P}(B|A)=\mathbb{P}(B)\mathbb{P}(A|B)$ (valid when the conditional probabilities are defined), together with the observation that both expressions reduce to $\mathbb{P}(A)\mathbb{P}(B)$ when $A$ and $B$ are "independent" of each other.  But this reasoning is essentially circular with respect to our definition above and does not help resolve the apparent lack of connection between our given mathematical model of independence and our intuitive notion of it.

In order to clarify the definition, let's begin by trying to understand how independence can occur within our model.  It should be reasonably clear that if $\Omega$ is "one-dimensional" (a concept we shall not attempt to define rigorously here) then $A$ and $B$ will generically fail to be independent.  To convince ourselves of this, imagine $\Omega$ being a bounded and connected subset of $\mathbb{R}^{2}$ (think Venn diagrams) and imagine two measurable events $A$ and $B$.  Then unless one of the events has probability $0$ of occurring, learning that one of them has occurred ($A$, say) will typically change our estimate of the other occurring:  $\mathbb{P}(B|A)\neq \mathbb{P}(B)$ in general.  In other words, if $A$ and $B$ are events composed of outcomes $\omega$ from the same sample space $\Omega$, then, apart from special coincidences, they will not be independent unless one has probability zero.
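As a small numerical illustration of this heuristic (a hypothetical example, not part of the argument itself), take a single roll of a fair die and two overlapping events of positive probability:

```python
from fractions import Fraction

Omega = set(range(1, 7))                    # one roll of a fair die
P = lambda E: Fraction(len(E), len(Omega))

A = {1, 2, 3}                               # "roll is at most 3"
B = {2, 3, 4}                               # "roll is between 2 and 4"

print(P(B), P(A & B) / P(A))                # P(B) = 1/2, but P(B | A) = 2/3
assert P(A & B) != P(A) * P(B)              # so A and B are not independent
```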

This would seem to preclude our proposed definition from consideration.  To dispense with this conclusion, let's start by imagining our sample space $\Omega\subset\mathbb{R}^{2}$ as being the collection of outcomes from repeating an experiment twice.  In particular, we suppose the existence of two sets $\Omega_{1}\subset\mathbb{R}$ and $\Omega_{2}\subset\mathbb{R}$ so that $\omega=(\omega_{1},\omega_{2})\in\Omega_{1}\times\Omega_{2}=\Omega.$  Our experimental outcomes are now being modeled as "two-dimensional" in the sense that each outcome relies on two coordinates.  We make the further assumption that $\mathbb{P}=\mathbb{P}_{1}\times\mathbb{P}_{2}$ in the sense that if $E_{1}\subseteq\Omega_{1}$ and $E_{2}\subseteq\Omega_{2}$ are measurable, then $\mathbb{P}(E_{1}\times E_{2})=\mathbb{P}_{1}(E_{1})\cdot\mathbb{P}_{2}(E_{2}).$  In other words, $\mathbb{P}$ is the product measure on $\Omega$ obtained from the "marginal" measures $\mathbb{P}_{1}$ and $\mathbb{P}_{2}$ on $\Omega_{1}$ and $\Omega_{2}$, respectively.
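Here is a minimal sketch of this product construction for two small finite marginals (the specific spaces and weights are made up purely for illustration):

```python
from itertools import product
from fractions import Fraction

# Hypothetical finite marginals standing in for (Omega_1, P_1) and (Omega_2, P_2).
Omega1, P1 = {1, 2, 3}, {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
Omega2, P2 = {"a", "b"}, {"a": Fraction(1, 3), "b": Fraction(2, 3)}

# The product measure P = P_1 x P_2 on Omega = Omega_1 x Omega_2.
Omega = set(product(Omega1, Omega2))
P = lambda E: sum(P1[w1] * P2[w2] for (w1, w2) in E)

# On "rectangles" E_1 x E_2 the product measure factors into the marginals.
E1, E2 = {1, 3}, {"b"}
E = {(w1, w2) for w1 in E1 for w2 in E2}
assert P(E) == sum(P1[w] for w in E1) * sum(P2[w] for w in E2)   # 1/2 == 3/4 * 2/3
```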

Now returning to our events $A$ and $B$, suppose that $A$ consists only of outcomes $\omega$ whose occurrence is determined entirely by $\omega_{1}$, and similarly for $B$ with $\omega_{2}$.  More precisely, suppose $A'\subseteq\Omega_{1}$ and $B'\subseteq\Omega_{2}$ are measurable, and set $A=A'\times\Omega_{2}$ and $B=\Omega_{1}\times B'$.  We then have
$$\mathbb{P}(A\cap B)=\mathbb{P}\big((A'\times\Omega_{2})\cap(\Omega_{1}\times B')\big)=\mathbb{P}\big((A'\cap\Omega_{1})\times(\Omega_{2}\cap B')\big)=\mathbb{P}(A'\times B')=\mathbb{P}_{1}(A')\cdot\mathbb{P}_{2}(B')$$
Observing that $\mathbb{P}_{1}(A')=\mathbb{P}_{1}(A')\cdot\mathbb{P}_{2}(\Omega_{2})=\mathbb{P}(A'\times\Omega_{2})=\mathbb{P}(A)$, and similarly for $B'$ and $B$, we get
$$\mathbb{P}(A\cap B)=\mathbb{P}(A)\cdot\mathbb{P}(B)$$
In particular, the events $A$ and $B$ are independent with respect to our mathematical definition, and this conforms to our intuition: $A$ is determined entirely by the outcome of the first experiment and $B$ entirely by the second, so the occurrence of one tells us nothing about the other.
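To see this computation in miniature, we can reuse the finite product space sketched earlier (again a hypothetical example): events that depend only on the first coordinate come out independent of events that depend only on the second.

```python
from itertools import product
from fractions import Fraction

Omega1, P1 = {1, 2, 3}, {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
Omega2, P2 = {"a", "b"}, {"a": Fraction(1, 3), "b": Fraction(2, 3)}
Omega = set(product(Omega1, Omega2))
P = lambda E: sum(P1[w1] * P2[w2] for (w1, w2) in E)

# Cylinder events: A is determined by the first coordinate, B by the second.
A_prime, B_prime = {1, 2}, {"a"}
A = {w for w in Omega if w[0] in A_prime}   # A = A' x Omega_2
B = {w for w in Omega if w[1] in B_prime}   # B = Omega_1 x B'

# A ∩ B = A' x B', and the product measure factors, so A and B are independent.
assert A & B == {(w1, w2) for w1 in A_prime for w2 in B_prime}
assert P(A & B) == P(A) * P(B)              # 1/4 == (3/4) * (1/3)
```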