A probability game
Assume that two positive random numbers \(X\) and \(Y\) are given, with unknown joint probability distribution \(P\) and \(X\neq Y\) a.s.
A player randomly draws one of the numbers and has to guess whether it is smaller or larger than the other, unrevealed number. Formally, let \(U\sim \text{Bernoulli}(\tfrac{1}{2})\) be independent of \((X, Y)\); the player then sees \(Z_{1} = UX + (1-U)Y,\) while \(Z_{2} = (1-U)X + UY\) remains unseen.
A random guess (a coin flip) is, due to the random sampling of \(U\) and independently of \(P\), correct with probability \(\tfrac{1}{2}\). The question is whether we can find a better strategy.
Assume we can sample \(\widetilde{Z}\) from a distribution whose support covers the supports of \(X\) and \(Y\). The player can then guess that \(Z_{1}>Z_{2}\) iff \(Z_{1}>\widetilde{Z}\). With this strategy the player guesses correctly with probability
\begin{align*}
&\mathbb{P}(Z_{1}>\widetilde{Z}, Z_{1}>Z_{2}) + \mathbb{P}(Z_{1}<\widetilde{Z}, Z_{2}>Z_{1}) = \\
&\qquad \mathbb{P}(Z_{1}>\widetilde{Z}, Z_{1}>Z_{2}, U=1) +
\mathbb{P}(Z_{1}>\widetilde{Z}, Z_{1}>Z_{2}, U=0) \\
&\qquad + \mathbb{P}(Z_{1}<\widetilde{Z}, Z_{2}>Z_{1}, U=1) +
\mathbb{P}(Z_{1}<\widetilde{Z}, Z_{2}>Z_{1}, U=0) = \\
&\qquad \tfrac{1}{2}\mathbb{P}(X>\widetilde{Z}, X>Y) + \tfrac{1}{2}\mathbb{P}(Y>\widetilde{Z}, Y>X) +
\tfrac{1}{2}\mathbb{P}(X<\widetilde{Z}, X<Y) + \tfrac{1}{2}\mathbb{P}(Y<\widetilde{Z}, Y<X).
\end{align*}
Applying \(\mathbb{P}(A)+\mathbb{P}(B) = \mathbb{P}(A\cup B)+\mathbb{P}(A\cap B)\) to the pairs of events \(\{X>\widetilde{Z}, X>Y\}, \{Y<\widetilde{Z}, Y<X\}\) and \(\{Y>\widetilde{Z}, Y>X\}, \{X<\widetilde{Z}, X<Y\}\), whose unions are \(\{X>Y\}\) and \(\{X<Y\}\) respectively, we have
\begin{align*}
&\tfrac{1}{2}\Big[\mathbb{P}(X>Y)+\mathbb{P}(X>Y, X>\widetilde{Z}>Y) + \\
&\qquad \mathbb{P}(X<Y) + \mathbb{P}(X<Y, X<\widetilde{Z}<Y) \Big] \\
&\qquad= \tfrac{1}{2}\left\{1+\mathbb{P}(\widetilde{Z}
\text{ between } X \text{ and } Y)\right\},
\end{align*}
which under the given assumptions is strictly larger than \(\tfrac{1}{2}\). Intuitively, the player should try to choose the distribution of \(\widetilde{Z}\) so that it “separates” \(X\) and \(Y\), i.e., places as much probability mass as possible between them.
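For instance, if \(X\), \(Y\) and \(\widetilde{Z}\) are all taken i.i.d. uniform on \((0,1)\), then \(\mathbb{P}(\widetilde{Z}\text{ between } X \text{ and } Y)=\tfrac{1}{3}\) and the strategy is correct with probability \(\tfrac{1}{2}(1+\tfrac{1}{3})=\tfrac{2}{3}\). A minimal simulation sketch of this special case (the uniform choice is purely illustrative):
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x, y, zt = rng.uniform(size=(3, n))  # X, Y and the auxiliary draw Z~
u = rng.integers(0, 2, size=n)       # which of the two numbers is revealed
z1 = np.where(u == 1, x, y)          # revealed number
z2 = np.where(u == 1, y, x)          # hidden number
guess = z1 > zt                      # guess "revealed is larger" iff Z1 > Z~
print(np.mean(guess == (z1 > z2)))   # close to 1/2*(1 + 1/3) = 2/3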
Example in Python
import sys
import numpy as np
import scipy.linalg as linalg
from scipy.stats import norm, bernoulli, uniform

if sys.version_info < (3, 6):
    raise Exception("Must be using Python >= 3.6")

def rexp(n: int):
    """Standard exponential variables via the inverse probability transform"""
    return -np.log(uniform.rvs(size=n))

class game:
    """Simulation of simple probability game"""
    def __init__(self,
                 n: int,
                 gen=rexp,
                 rho=.5):
        # (X, Y): standard normal pairs with correlation rho
        R = np.matrix([[1, rho], [rho, 1]])
        L = linalg.cholesky(R)
        xy = np.matrix(np.resize(norm.rvs(size=2*n), (n, 2)))*L
        # U decides which of the two numbers is revealed
        self.u = bernoulli.rvs(0.5, size=n)
        self.w = np.array(xy[range(len(xy)), self.u])    # revealed number
        self.z = np.array(xy[range(len(xy)), 1-self.u])  # hidden number
        self.x = np.array(xy[:, [0]])
        self.y = np.array(xy[:, [1]])
        self.g = gen(n)                                  # auxiliary draw Z~
        self.guess = self.w > self.g   # guess "revealed is larger" iff it exceeds Z~
        self.true = self.w > self.z    # the revealed number actually is the larger

n = int(1e5)
sim = game(n)
print(np.mean(sim.true == sim.guess))
0.60851
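As a check, the empirical guess rate can be compared with the theoretical value \(\tfrac{1}{2}\{1+\mathbb{P}(\widetilde{Z}\text{ between } X \text{ and } Y)\}\), estimated from the columns stored on the game object (a small additional sketch, reusing sim from above):
x, y, g = sim.x.ravel(), sim.y.ravel(), sim.g
between = (g > np.minimum(x, y)) & (g < np.maximum(x, y))  # Z~ falls between X and Y
print(0.5*(1 + np.mean(between)))  # should be close to the estimate above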
Example in R
sim <- function(n, delta=0, ...) {
    ## (X, Y): log-normal pairs; '...' is passed on to mets::rmvn (e.g. mu, rho)
    xy <- exp(mets::rmvn(n, ...)) + delta
    ## U decides which of the two numbers is revealed
    u <- rbinom(n, 1, 0.5)
    w <- xy[cbind(seq(n), u+1)]   # revealed number
    z <- xy[cbind(seq(n), 2-u)]   # hidden number
    wmax <- w > z                 # the revealed number is the larger one
    cbind(w=w, z=z, wmax=wmax, x=xy[,1], y=xy[,2])
}
n <- 1e5
val <- sim(n=n, rho=0.5)
z0 <- rexp(n, 1)
guess <- val[, "w"]>z0
mean(guess==val[, "wmax"])
[1] 0.6037
val <- sim(n=1e5, rho=0.5, mu=c(-3,3))
z0 <- rexp(n, 1)
guess <- val[, "w"]>z0
mean(guess==val[, "wmax"])
[1] 0.96077
val <- sim(n=1e5, rho=0.5, mu=c(10,10))
z0 <- rexp(n, 1)
guess <- val[, "w"]>z0
mean(guess==val[, "wmax"])
[1] 0.49956
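The three runs illustrate the role of “separation”: with \(\mu=(-3,3)\) the two log-normal numbers are typically far apart and \(\widetilde{Z}\sim\text{Exp}(1)\) falls between them most of the time, so the guess is correct in about 96% of cases, whereas with \(\mu=(10,10)\) both numbers lie far above essentially all the mass of \(\widetilde{Z}\), which therefore almost never separates them, and the strategy is reduced to a coin flip.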