The article https://online.wsj.com/public/resources/documents/VirtuOverview.pdf is a neat little illustration of a simple asymptotic toy distribution given an initial probability of a win or loss per-trade. It is used as an example to illustrate the basic methodology behind the working market-maker business – develop a small edge and scale this up as cheaply as possible to maximise the probability of overall profit.
Take $p=0.51$ as the probability of a win per trade. After $n$ trades we will have a number of ‘wins’ $k$ that can range from 0 to $n$. We model each trade as an independent 0-1 (Bernoulli) trial, so $k$ follows a binomial distribution.
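As a quick illustration of this set-up, the sketch below simulates a single run of trades in R. The seed, the variable names and the use of `rbinom` are assumptions made for the example, not something taken from the article.

```r
set.seed(42)                               # arbitrary seed, only for reproducibility
p <- 0.51                                  # win probability per trade
n <- 100                                   # number of trades in the run
trades <- rbinom(n, size = 1, prob = p)    # 1 = win, 0 = loss
k <- sum(trades)                           # number of wins in this run
cat("wins:", k, "out of", n, "\n")
```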
To break even or better (assuming wins and losses are of equal size), the number of wins $k$ needs to be at least $\frac{n}{2}$. Using the binomial distribution this can be modelled as:
$P\left(k \ge \frac{n}{2}\right) = \sum_{k=\lceil n/2 \rceil}^{n} \frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}$
As the binomial distribution converges to a normal $\mathcal{N}(np, np(1-p))$ as n gets large, we can use the distribution below to model the win/loss probability over n:
$ \int_{\frac{n}{2}}^\infty \mathcal{N}\left(np, np(1-p) \right) dx $
Which is
$ \int_{\frac{n}{2}}^\infty \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$
where $\mu=np$ and $\sigma^2=np(1-p)$.
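To make the dependence on $n$ explicit, the integral can be standardised with $z = (x-\mu)/\sigma$. The step below follows from the definitions of $\mu$ and $\sigma$; the symbol $\Phi$ (the standard normal CDF) is notation introduced here for convenience.

$ \int_{\frac{n}{2}}^\infty \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 1 - \Phi\left(\frac{\frac{n}{2}-np}{\sqrt{np(1-p)}}\right) = \Phi\left(\sqrt{n}\,\frac{p-\frac{1}{2}}{\sqrt{p(1-p)}}\right) $

For $p=0.51$ and $n=100$ the argument is $10 \times 0.01/\sqrt{0.2499} \approx 0.20$, and it grows like $\sqrt{n}$ whenever $p > 0.5$.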
This can be computed in R:
p <- 0.51
n <- 100
1 - pnorm(q = n/2, mean = n*p, sd = sqrt(n*p*(1-p)))
# [1] 0.5792754
n <- 1000
1 - pnorm(q = n/2, mean = n*p, sd = sqrt(n*p*(1-p)))
# [1] 0.7364967
This shows that with a win probability of 51%, 100 trades give about a 58% probability of breakeven or better, and 1000 trades give about a 74% probability of breakeven or better.
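Since the number of wins is really binomial, the normal figures can be checked against the exact tail probability. The following is a minimal sketch using `pbinom`; the helper names `exact_breakeven` and `normal_breakeven` are hypothetical, and the continuity-corrected variant is an addition that the calculation above does not use.

```r
# Exact probability of at least n/2 wins: P(K >= ceiling(n/2)), K ~ Binomial(n, p)
exact_breakeven <- function(n, p) {
  pbinom(ceiling(n / 2) - 1, size = n, prob = p, lower.tail = FALSE)
}

# Normal approximation; corrected = TRUE applies a continuity correction by
# integrating from half a unit below the smallest qualifying number of wins
normal_breakeven <- function(n, p, corrected = FALSE) {
  threshold <- if (corrected) ceiling(n / 2) - 0.5 else n / 2
  pnorm(threshold, mean = n * p, sd = sqrt(n * p * (1 - p)), lower.tail = FALSE)
}

p <- 0.51
for (n in c(100, 1000)) {
  cat(sprintf("n = %4d  exact = %.4f  normal = %.4f  normal (corrected) = %.4f\n",
              n, exact_breakeven(n, p), normal_breakeven(n, p),
              normal_breakeven(n, p, corrected = TRUE)))
}
```

For even $n$ the uncorrected approximation leaves out roughly half of the probability of exactly $n/2$ wins, so the exact figure should come out somewhat higher than the normal one.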
We can plot the probability of breakeven holding p constant and changing n from 1 to 1000:
n <- seq(1, 1000)
y <- 1 - pnorm(q = n/2, mean = n*p, sd = sqrt(n*p*(1-p)))
library(ggplot2)
library(scales)
# 'labels' is the full argument name; 'percent' comes from the scales package
qplot(n, y) + scale_y_continuous(labels = percent)
Which produces the following graph
The graph shows the probability of breakeven or better converging towards 100% as $n$ gets large.
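Under the normal approximation the breakeven probability equals $\Phi\left(\sqrt{n}\,\frac{p-1/2}{\sqrt{p(1-p)}}\right)$, where $\Phi$ is the standard normal CDF, so for $p > 0.5$ it can be inverted to estimate how many trades are needed to reach a given probability. The sketch below does this; the function names `prob_breakeven` and `trades_needed` and the 95% target are hypothetical choices for illustration.

```r
# Normal-approximation probability of breakeven or better after n trades
prob_breakeven <- function(n, p) {
  1 - pnorm(q = n / 2, mean = n * p, sd = sqrt(n * p * (1 - p)))
}

# Smallest n with prob_breakeven(n, p) >= target; only meaningful for p > 0.5
trades_needed <- function(p, target) {
  stopifnot(p > 0.5, p < 1, target > 0.5, target < 1)
  z <- qnorm(target)                        # required standardised distance
  ceiling((z * sqrt(p * (1 - p)) / (p - 0.5))^2)
}

n_req <- trades_needed(p = 0.51, target = 0.95)
cat("trades needed:", n_req, "\n")
cat("probability at that n:", prob_breakeven(n_req, 0.51), "\n")
```

Because the required $n$ scales with $1/(p-0.5)^2$, halving the edge roughly quadruples the number of trades needed for the same level of confidence.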
To make it more interesting we can generate paths for $n$ from 1 to 10000 while also varying the win probability from, say, 45% to 55%, and look at how the paths change with $n$ and $p$:
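n <- seq(1, 10000)
p <- 0.5
y <- 1 - pnorm(q = n/2, mean = n*p, sd = sqrt(n*p*(1-p)))
plot(n, y, type = 'l', ylim = c(0, 1))   # baseline path for p = 0.5, in black
probs <- seq(0.45, 0.55, length.out = 100)
for (p in probs) {
  y <- 1 - pnorm(q = n/2, mean = n*p, sd = sqrt(n*p*(1-p)))
  # lines() uses a single colour per path, so choose it from p directly:
  # red for a losing edge (p < 0.5), green for a winning edge (p > 0.5)
  lines(x = n, y = y, col = if (p < 0.5) rgb(1, 0, 0, .5) else rgb(0, 1, 0, .5))
}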
This shows the probabilities of breakeven or better for a range of starting win/loss probabilities and a varying number of trades. Paths with $p<0.5$ are drawn in red and paths with $p>0.5$ in green; the path with $p=0.5$ is shown in black.