A Simple Model with Major Insights on Selection Bias

I mentioned earlier that I have been reading (and rereading) Scott Page’s masterful book “The Model Thinker”. If you are serious about understanding models and data as either a leader or as an analyst, it’s a valuable and accessible read.

Today I want to take an example from Chapter 3 about biased selection processes and how even somewhat moderate biases can quickly compound to yield major differences in long-term outcomes. The example domain is the number of male v. female CEOs but the principles apply broadly.

If you learn something in the following, give credit to Scott Page, and if you see any errors, blame me.

Female CEOs and Selection Processes

Here is the basic setup:

  1. Suppose that to reach the level of CEO, one must be promoted 15 times over the course of 30 years.

  2. Further suppose that the probability of a male being promoted in a given two-year period (\(P_m\)) is .5 while that of a female (\(P_f\)) is .4.

Males in this hypothetical process would therefore enjoy a 25% advantage in promotion probability (.5/.4 = 1.25), a non-trivial leg up but perhaps not seen as absurdly high on the surface.

But then we think about the setup some more, realizing that this 50%/40% advantage plays out over and over, once at each of the 15 promotion points. If we play out this simplified probabilistic process 15 times over 30 years, we would end up with nearly 30 times as many male CEOs as female CEOs.

Really? Nearly 30:1 from a 10-percentage-point difference?

It turns out we can frame this as an \(X^n\) problem, where \(X\) is the probability of promotion and \(n\) is the number of promotion steps. Applied to our hypothetical case we get the following probabilities for reaching CEO:

  1. Males: \(.5^{15}\)
  2. Females: \(.4^{15}\)

For reference, this is just like the probability of flipping a coin and getting a string of heads: we have a 50% chance of getting heads on the first flip, a 25% chance of getting two heads on the first two flips (\(.5^2\)), a 12.5% chance of getting three heads on the first three flips (\(.5^3\)) and so on.

We are just applying this same idea to our respective male and female promotion rates of .5 and .4.

What we get is that the chances of a male becoming a CEO are \(.5^{15}\) v. females at \(.4^{15}\).

Independently, those numbers are both tiny (there are very few CEOs after all), but their ratio is roughly 28:1 in favor of males (\(.5^{15}/.4^{15}\)).
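For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The variable names are my own; the only inputs are the hypothetical promotion rates (.5 and .4) and the 15 steps from the setup above.

```python
# Per-step promotion probabilities from the hypothetical setup
p_male = 0.5
p_female = 0.4
n_steps = 15  # promotions required to reach CEO

# Probability of being promoted at every one of the n steps (X^n)
p_ceo_male = p_male ** n_steps
p_ceo_female = p_female ** n_steps

# Ratio of the two endpoint probabilities
ratio = p_ceo_male / p_ceo_female

print(f"P(male reaches CEO)   = {p_ceo_male:.3e}")
print(f"P(female reaches CEO) = {p_ceo_female:.3e}")
print(f"Male-to-female ratio  = {ratio:.1f} : 1")
```

Note that the ratio can also be written as \((.5/.4)^{15} = 1.25^{15}\), which makes it clear that the 25% per-step advantage is what gets compounded.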

The figure below represents this simple process and the emerging differences through eight promotion steps (where area corresponds to the proportion promoted). I stopped at eight rounds to keep the circles visible.

As you can see, a repeated 50% to 40% advantage quickly shifts the total balance of promotions.

In hard numbers, think of it like the following:

  • If you start with 1 million males in this process, you’ll end up with 30.5 CEOs at the end of 30 years (\(1,000,000 \times .5^{15} = 30.5\))
  • If you start with 1 million females, you’ll end up with 1.07 CEOs (\(1,000,000 \times .4^{15} = 1.07\))
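To see how the gap opens up step by step, the sketch below tabulates the expected number of people remaining after each promotion round. These are expected values under the simple model, not a simulation of real careers, and the function name `expected_survivors` is just an illustrative choice of mine.

```python
def expected_survivors(cohort_size, p_promote, n_steps):
    """Expected number still in the pipeline after 0, 1, ..., n_steps promotions."""
    return [cohort_size * p_promote ** step for step in range(n_steps + 1)]

# 1 million starters per group, 15 promotion rounds, rates from the setup
males = expected_survivors(1_000_000, 0.5, 15)
females = expected_survivors(1_000_000, 0.4, 15)

print(f"{'step':>4} {'males':>12} {'females':>12} {'ratio':>7}")
for step, (m, f) in enumerate(zip(males, females)):
    print(f"{step:>4} {m:>12.1f} {f:>12.1f} {m / f:>7.2f}")
```

The last row of the table corresponds to the two bullet points above, and the ratio column grows by a factor of 1.25 at every step.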

Time, Hierarchies, and Process

The lesson above is that small differences, biases, or preferences can compound over time to create major differences in endpoint outcomes. The challenge of human cognition is that our daily intuitions and smaller-scale thinking make it very hard for us to immediately apprehend that a 28:1 difference can emerge from a repeated 50% to 40% selection advantage.

This is a strong reminder that we need thoughtful caution when trying to understand and address large differences at process endpoints. Said differently, big differences in output don’t necessarily imply big differences in input.

To further our understanding (based on another Page example) consider now a different subset of organizations in which the male-to-female CEO ratio is now 3:1, roughly one tenth of that in the above example.

Clearly the promotion process for this second group is moving toward something more fair/less imbalanced, right?

Not necessarily.

If we still require 15 promotions to be CEO then yes, this 3:1 ratio could only arise from a much smaller difference in promotion rates (here \((.5/.465) = 1.075\)) which, when raised to the 15th power, gives us roughly 3 (i.e. 3:1 males to females).

But what if our seemingly “fairer” 3:1 organizations are just flatter and therefore require fewer promotions to reach the CEO slot?

What if we keep the same .5/.4 promotion probability advantage for males but require only 5 promotions to become CEO instead of 15?

The math is direct: \(.5^{5}/.4^{5} = 3.05\).

In this case then, our second set of organizations has a less skewed endpoint outcome (3:1 instead of 28:1) but it’s not because they have reduced the bias at each decision step. Rather, the same biased process is just played out fewer times. The results are better, but not for the reasons we expect.
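Both routes to a 3:1 outcome can be checked with a couple of small helper functions. This is a sketch under the same simple model; the function names are hypothetical, and the second one just inverts the \(X^n\) relationship to ask what female promotion rate would produce a given endpoint ratio.

```python
def endpoint_ratio(p_m, p_f, n):
    """Male-to-female ratio at the endpoint after n promotion steps."""
    return (p_m / p_f) ** n

def female_rate_for_ratio(p_m, target_ratio, n):
    """Female per-step rate that yields target_ratio after n steps, given p_m."""
    return p_m / target_ratio ** (1 / n)

# Same bias, tall hierarchy (15 steps) vs. flat hierarchy (5 steps)
print(f"15 steps at .5 vs .4: {endpoint_ratio(0.5, 0.4, 15):.2f} : 1")
print(f" 5 steps at .5 vs .4: {endpoint_ratio(0.5, 0.4, 5):.2f} : 1")

# Smaller bias needed to reach 3:1 when 15 steps are still required
p_f_needed = female_rate_for_ratio(0.5, 3, 15)
print(f"Female rate for 3:1 over 15 steps: {p_f_needed:.3f}")
```

The two 3:1 scenarios are indistinguishable at the endpoint, which is why knowing \(n\) matters as much as knowing \(X\).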


Granting numerous caveats, I would suggest at least the following two lessons from our \(X^n\) model in the context of possible selection bias/preference analysis:

  1. Consider the possible role of compounded \(X^n\)-type processes when evaluating endpoint differences of interest at your organization.
  2. Consider how the same basic process can play out differently in different organizations or even in different business units. Start with the basic \(X^n\) model, asking about the \(X\) value for our groups of interest as well as the \(n\) value. Remember too that organizations can differ by both \(X\) and \(n\) among many other differences.

To this I would add that correctly detecting, let alone solving, any actual problem with a genuinely biased selection process won’t come down to just estimating \(X\) and \(n\) and calling it a day.

Your data will be noisy.

It will likely be limited in scale.

It will also be subject to real-world contexts, constraints, and caveats.

All of this is absolutely guaranteed to muddy the waters.

And yet our basic \(X^n\) model is still incredibly useful. Why?

First, it’s a tractable tool that forces us to think concretely about process, measurement, and outcome.

Second, as a result, we can apply the model to our HR data and get some rough insights on any differences in promotion rates, frequency, and steps required to reach a given level.
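As a rough sketch of what that first pass might look like, the snippet below estimates a per-step promotion rate for two groups and the endpoint ratio those rates would imply. Everything here is an assumption for illustration: the counts are made up (not real data), the group labels and the choice of 5 steps are hypothetical, and the standard error uses the usual normal approximation for a proportion.

```python
import math

# Hypothetical counts for illustration only; NOT real HR data.
promotion_counts = {
    "group_a": {"eligible": 400, "promoted": 200},
    "group_b": {"eligible": 380, "promoted": 152},
}
n_steps = 5  # assumed number of promotions needed to reach the level of interest

def estimate_rate(eligible, promoted):
    """Observed promotion rate and its approximate standard error."""
    p = promoted / eligible
    se = math.sqrt(p * (1 - p) / eligible)
    return p, se

rates = {}
for group, counts in promotion_counts.items():
    p, se = estimate_rate(counts["eligible"], counts["promoted"])
    rates[group] = p
    print(f"{group}: rate = {p:.3f} (approx. s.e. {se:.3f})")

# Endpoint ratio implied by the X^n model if these rates held at every step
implied_ratio = (rates["group_a"] / rates["group_b"]) ** n_steps
print(f"Implied endpoint ratio after {n_steps} steps: {implied_ratio:.2f} : 1")
```

With small samples the standard errors will be wide, so the implied ratio should be read as a conversation starter rather than a finding.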

This then serves as a concrete starting point for discussions with managers, directors, and other leaders about consistency, talent processes, and talent process improvement.

Together this represents a major step towards developing mature HR analytics processes and improving human capital decisions.
