Exposing Selection Bias

Selection Bias Severity

$\text{SBS} = [P(y_x) - P(y_{x'})] - [P(y_x|s) - P(y_{x'}|s)]$
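As a minimal sketch, the SBS definition can be evaluated directly once its four counterfactual probabilities are known. The function name `selection_bias_severity` and the input values below are hypothetical, chosen only to illustrate the arithmetic; in practice these quantities are usually not observed and can often only be bounded.

```python
def selection_bias_severity(p_yx, p_yx_prime, p_yx_given_s, p_yx_prime_given_s):
    """Evaluate SBS = [P(y_x) - P(y_x')] - [P(y_x|s) - P(y_x'|s)].

    All four arguments are counterfactual probabilities in [0, 1].
    """
    # Causal effect in the whole population.
    population_effect = p_yx - p_yx_prime
    # Causal effect within the selected subpopulation (S = s).
    selected_effect = p_yx_given_s - p_yx_prime_given_s
    return population_effect - selected_effect


# Hypothetical inputs (not taken from the page): a population effect of 0.5
# that vanishes entirely among the selected units.
sbs_example = selection_bias_severity(0.75, 0.25, 0.5, 0.5)
print(sbs_example)  # 0.5
```

An SBS of 0 means the selected sample shows the same effect as the population; the further it is from 0, the more the selected sample misstates the effect.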

Selection Biased

$P(x) = 0.5, P(x') = 0.5$
$P(y|x) = 0.5$
$P(y|x') = 0.5$
$P(x,y) = 0.25, P(x,y') = 0.25, P(x',y) = 0.25, P(x',y') = 0.25$
$0.25 \leqslant P(y_x) \leqslant 0.75$
$0.25 \leqslant P(y_{x'}) \leqslant 0.75$
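The two intervals above are consistent with the standard bounds on a counterfactual probability from observational data alone, $P(x,y) \leqslant P(y_x) \leqslant 1 - P(x,y')$. The sketch below assumes that this is how they were obtained; the helper name `natural_bounds` and the dictionary layout are illustrative choices, not part of the original page.

```python
def natural_bounds(joint, x):
    """Bound P(y_x) using only the observational joint distribution P(X, Y).

    Assumed bounds: P(x, y) <= P(y_x) <= 1 - P(x, y').
    `joint` maps (treatment, outcome) pairs to probabilities.
    """
    lower = joint[(x, "y")]          # units observed with X = x and Y = y
    upper = 1.0 - joint[(x, "y'")]   # everything except X = x with Y = y'
    return lower, upper


# Joint distribution listed above: every cell has probability 0.25.
joint = {
    ("x", "y"): 0.25,
    ("x", "y'"): 0.25,
    ("x'", "y"): 0.25,
    ("x'", "y'"): 0.25,
}

bounds_x = natural_bounds(joint, "x")
bounds_x_prime = natural_bounds(joint, "x'")
print(bounds_x)        # (0.25, 0.75)
print(bounds_x_prime)  # (0.25, 0.75)
```

Both intervals have width 0.5, so with this joint distribution the observational data alone leave $P(y_x)$ and $P(y_{x'})$ only loosely constrained.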
DAG with confounding and a selection node