2  Potential outcomes

2.1 Potential outcomes

The notation used here is grounded in the potential outcomes framework, initiated by Neyman in 1923 for Randomized Controlled Trials (RCT) and popularized in the 1970s by Rubin.

Consider an individual \(i\). We denote by \(T_{i}\) the random variable corresponding to the treatment assignment, which takes the value 1 if the individual is treated and 0 otherwise, that is, \(T_{i} \in \{0,1\}\). We also define \(Y_{i}^{(1)}\) (resp. \(Y_{i}^{(0)}\)) as the potential outcome of this individual if the treatment is given (resp. not given). The observed outcome \(Y_{i}^{(\text{\tiny obs})}\) is the potential outcome corresponding to the treatment actually received:

\[ Y_{i}^{(\text{\tiny obs})}={Y_{i}^{(T_i)}}=\left\{\begin{array}{ll} {Y_{i}^{(0)}} & {\text { if } T_{i}=0} \\ {Y_{i}^{(1)}} & {\text { if } T_{i}=1} \end{array}\right. \]

As a consequence, this individual also has one missing potential outcome, denoted by \(Y_{i}^{(\text{\tiny mis})}\): \[ Y_{i}^{(\text{\tiny mis})}={Y_{i}^{(1-T_i)}}=\left\{\begin{array}{ll} {Y_{i}^{(1)}} & {\text { if } T_{i}=0} \\ {Y_{i}^{(0)}} & {\text { if } T_{i}=1} \end{array}\right. \]

To know whether the treatment is effective for this individual, we are interested in \(\delta_{i} = Y_{i}^{(1)} - Y_{i}^{(0)}\), the individual treatment effect (ITE). However, by allocating treatment (\(T_{i}=1\)) or control (\(T_{i}=0\)) to this individual, one can only observe one of the two potential outcomes, either \(Y_{i}^{(1)}\) or \(Y_{i}^{(0)}\). This is known as the fundamental problem of causal inference.
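The definitions above can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration: the sample size, the Gaussian distribution of the potential outcomes, the constant effect of 2, and the variable names (`y0`, `y1`, `t`, `y_obs`, `y_mis`). In a simulation both potential outcomes are known, which is exactly what real data never provides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # hypothetical number of individuals

# Hypothetical potential outcomes: in real data, only one per individual is seen.
y0 = rng.normal(loc=0.0, scale=1.0, size=n)  # Y_i^(0), outcome without treatment
y1 = y0 + 2.0                                # Y_i^(1), outcome with treatment

# Treatment assignment T_i in {0, 1}.
t = rng.integers(0, 2, size=n)

# Observed outcome Y_i^(obs) = Y_i^(T_i) and missing outcome Y_i^(mis) = Y_i^(1 - T_i).
y_obs = np.where(t == 1, y1, y0)
y_mis = np.where(t == 1, y0, y1)

# Individual treatment effect: computable here only because we simulated both outcomes.
ite = y1 - y0

for i in range(n):
    print(f"i={i} T={t[i]} Y_obs={y_obs[i]:+.2f} Y_mis={y_mis[i]:+.2f} ITE={ite[i]:+.2f}")
```

With real data, the `y_mis` column is unavailable, so `ite` cannot be computed for any individual.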

2.2 Identification of the ATE

Although we cannot compute the individual treatment effect, randomization allows us to estimate the Average Treatment Effect (ATE), which is the average causal effect of the treatment over the population.

Assume we observe \(n\) individuals, indexed by \(i \in \mathcal{I}\). For each unit \(i\) and each treatment level, there is a corresponding potential outcome, \(Y_i^{(0)}\) or \(Y_i^{(1)}\). For \(t \in \{0, 1\}\), we define the random variables \(Y^{(t)}\) and \(T\) that map each \(i \in \mathcal{I}\) to \(Y_i^{(t)}\) and \(T_i\) respectively.

Definition 2.1: Average Treatment Effect (ATE)
We define the Average Treatment Effect as: \[\tau = \mathbb{E}\left[Y^{(1)} - Y^{(0)}\right]\]

In general, \(\mathbb{E}[Y^{(t)}]\) cannot be estimated by \(\mathbb{E}[Y^{(\text{\tiny obs})}|T=t]\), the mean of \(Y^{(\text{\tiny obs})}\) over the sub-population that received treatment \(t\), because that sub-population typically differs from the one that did not receive it (for instance, it may be older or sicker). Instead, \(\mathbb{E}[Y^{(t)}]\) is the expectation of the outcome had everybody received treatment \(T=t\). This is often written with the \(do\) operator as \(\mathbb{E}[Y^{(\text{\tiny obs})} | do(T=t)]\). This operator is interventional: it lets us view the ATE as the expected difference in outcome between a world where everybody received the treatment and one where nobody did. Because of the fundamental problem of causal inference, we cannot observe both \(Y_i^{(0)}\) and \(Y_i^{(1)}\) for the same individual, so to properly measure the causal effect we need to build two similar groups of people, one treated and one not.
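The gap between \(\mathbb{E}[Y^{(\text{\tiny obs})}|T=t]\) and \(\mathbb{E}[Y^{(t)}]\) can be seen in a small simulation. The sketch below is illustrative only: the covariate `age`, the logistic assignment rule, the constant effect of 1, and the helper `diff_in_means` are all assumptions, not part of the text. It compares the difference in observed group means when older individuals are more likely to be treated with the same quantity when treatment is assigned by a fair coin flip.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000  # hypothetical population size

# Hypothetical standardized covariate that influences the outcome.
age = rng.normal(size=n)

# Hypothetical potential outcomes with a constant effect, so the true ATE is 1.
y0 = age + rng.normal(size=n)
y1 = y0 + 1.0
true_ate = np.mean(y1 - y0)


def diff_in_means(t, y0, y1):
    """Difference between the observed outcome means of treated and control groups."""
    y_obs = np.where(t == 1, y1, y0)
    return y_obs[t == 1].mean() - y_obs[t == 0].mean()


# Observational setting: older individuals are more likely to be treated,
# so the treated and control groups are not comparable.
p_treated = 1.0 / (1.0 + np.exp(-2.0 * age))
t_confounded = rng.binomial(1, p_treated)
naive_estimate = diff_in_means(t_confounded, y0, y1)

# Randomized setting: treatment is independent of age and of the potential outcomes.
t_randomized = rng.binomial(1, 0.5, size=n)
rct_estimate = diff_in_means(t_randomized, y0, y1)

print(f"true ATE:                  {true_ate:.3f}")
print(f"naive (confounded) value:  {naive_estimate:.3f}")
print(f"randomized value:          {rct_estimate:.3f}")
```

Under these assumptions, the confounded comparison overstates the effect because the treated group is older on average, whereas the randomized comparison stays close to the true ATE.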