The median is not the message

Sonnie Bailey

3 September 2018

One of my best friends was recently diagnosed with lung cancer.

Among other things (including reevaluating my own life and my values and priorities), this prompted me to re-visit an article by Stephen Jay Gould, titled “The Median is Not the Message”.

When I first read the article, I promised myself that I’d re-read it if I was ever diagnosed with cancer, or someone I loved was diagnosed with cancer.

Gould tells the story of being diagnosed with “abdominal mesothelioma, a rare and serious cancer usually associated with exposure to asbestos”. One of the first things he read was that “mesothelioma is incurable, with a median mortality of only eight months after discovery”.

How would you respond if you’d been diagnosed with an illness and told that the median mortality was eight months?

Gould says:

“I suspect that most people, without training in statistics, would read such a statement as ‘I will probably be dead in eight months'”.

Before I read this article, I may have been the same.

But that’s not the smart way to think about it:

“When I learned about the eight-month median, my first intellectual reaction was: fine, half the people will live longer; now what are my chances of being in that half. I read for a furious and nervous hour and concluded, with relief: ****ed good. I possessed every one of the characteristics conferring a probability of longer life: I was young; my disease had been recognized in a relatively early stage; I would receive the nation’s best medical treatment; I had the world to live for; I knew how to read the data properly and not despair.”

He then talks about the distribution of variation:

“I immediately recognized that the distribution of variation about the eight-month median would almost surely be what statisticians call “right skewed.” … After all, the left of the distribution contains an irrevocable lower boundary of zero (since mesothelioma can only be identified at death or before). Thus, there isn’t much room for the distribution’s lower (or left) half – it must be scrunched up between zero and eight months. But the upper (or right) half can extend out for years and years, even if nobody ultimately survives. The distribution must be right skewed, and I needed to know how long the extended tail ran – for I had already concluded that my favorable profile made me a good candidate for that part of the curve.

The distribution was indeed, strongly right skewed, with a long tail (however small) that extended for several years above the eight month median. I saw no reason why I shouldn’t be in that small tail, and I breathed a very long sigh of relief.”

In other words – when we hear someone talk about an “average” figure, it often doesn’t tell us much. We often think of a normal distribution, where there is a central figure, and everything surrounds the mean/median in a symmetrical fashion.

But many distributions aren’t like this. Gould talks about a right-skewed distribution:
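To get a feel for what a right-skewed distribution with an eight-month median can look like, here is a minimal sketch in Python. It assumes survival times follow a log-normal distribution with a spread parameter (`sigma`) of 1.0. Both the distribution and the parameter are my own illustrative assumptions, not Gould's data or real mesothelioma statistics.

```python
import math
import random
import statistics


def simulate_survival_months(n=100_000, median_months=8.0, sigma=1.0, seed=0):
    """Draw hypothetical survival times (in months) from a log-normal distribution.

    A log-normal distribution has median exp(mu), so mu = ln(median_months).
    sigma is an assumed spread, chosen only for illustration.
    """
    rng = random.Random(seed)
    mu = math.log(median_months)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def summarize(samples):
    """Return the median, the mean, the 90th percentile and the share beyond 24 months."""
    ordered = sorted(samples)
    n = len(ordered)
    return {
        "median": statistics.median(ordered),
        "mean": statistics.fmean(ordered),
        "p90": ordered[int(0.9 * n)],
        "share_beyond_24_months": sum(x > 24 for x in ordered) / n,
    }


if __name__ == "__main__":
    for name, value in summarize(simulate_survival_months()).items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the median stays close to eight months, but the mean is noticeably higher and a meaningful minority of the simulated cases (somewhere above one in ten) are still alive after two years. The single figure "eight months" hides all of that.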

And there are other variations, including, for example, bimodal distributions:

One example is peak restaurant hours (lunch and dinner). Perhaps the classic example is the heights of a random sample of adults. The heights of women will peak around one point, and the heights of men will peak around another. If we separated the two, each distribution would be unimodal and would resemble a normal distribution. Other examples include book prices (paperback and hardcover; in fact, book prices may even be tri-modal now, with electronic copies being a separate price point). In some cases, income distributions within certain professions can also be multi-modal.
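The restaurant example can be sketched in a few lines of Python. The peak times (12:30 and 19:00) and the one-hour spread around each are made-up numbers, purely for illustration. The point is that the "average" arrival time lands in the quiet part of the afternoon, where almost nobody actually arrives.

```python
import random
import statistics


def simulate_arrival_hours(n_per_peak=5_000, seed=0):
    """Simulate arrival times (hours on a 24-hour clock) around two assumed peaks."""
    rng = random.Random(seed)
    lunch = [rng.gauss(12.5, 1.0) for _ in range(n_per_peak)]   # peak at 12:30
    dinner = [rng.gauss(19.0, 1.0) for _ in range(n_per_peak)]  # peak at 19:00
    return lunch + dinner


def count_within(samples, centre, half_width=0.5):
    """Count the samples that fall within half_width hours of centre."""
    return sum(abs(x - centre) <= half_width for x in samples)


if __name__ == "__main__":
    arrivals = simulate_arrival_hours()
    mean_hour = statistics.fmean(arrivals)
    print(f"mean arrival hour: {mean_hour:.2f}")
    print("arrivals within 30 min of the mean:", count_within(arrivals, mean_hour))
    print("arrivals within 30 min of 12:30:", count_within(arrivals, 12.5))
    print("arrivals within 30 min of 19:00:", count_within(arrivals, 19.0))
```

A restaurant that staffed up for the "average" arrival time would be fully staffed at the emptiest point of the day.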

Alternatively, the distribution could be all over the place, with no real rhyme or reason.

Knowing a single data point doesn’t tell us much about the bigger picture. Sometimes, we’re not even given that much information – for example, when someone says “average”, do they mean “mean” or “median”?
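The difference between "mean" and "median" is easy to see with a tiny, made-up example. The six income figures below are hypothetical and chosen only to show how one large value pulls the mean away from the median.

```python
import statistics

# Hypothetical annual incomes, in thousands of dollars. Five are similar; one is an outlier.
incomes = [40, 45, 50, 55, 60, 500]

print(statistics.mean(incomes))    # 125
print(statistics.median(incomes))  # 52.5
```

Both numbers could honestly be reported as the "average" income of this group, yet one is more than double the other, and neither tells you that five of the six people earn 60 or less.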

Of course, this doesn’t just relate to medical diagnoses. Some examples that relate to money or retirement include:

  • There’s a common statistic that the six months after retirement are the most “dangerous” period in a person’s life, because the mortality rate spikes significantly after people retire. We can read lots of things into this: perhaps it’s dangerous because people lack purpose, social identity, and structure in their day. But the most likely explanation is that many people retire because they are terminally ill. The spike doesn’t necessarily relate to healthy people who retire – it relates to unhealthy people who have to retire for health reasons.
  • When it comes to retirement expenditure guidelines, it’s great to have a rough figure for what people in “retirement” spend, but you also need to consider that these figures capture a wide range of people in very different circumstances at different stages of their retirements. It would be better if we could get a feel for what expenditure looked like for people at a similar age, with a similar health status, with a similar housing status, and similar employment status (since the guidelines include people who still generate an income to supplement their NZ Super – and not surprisingly, these people tend to be at the upper end of spenders). Single figures representing what people might “need” to save at retirement are a starting point, but by no means should they be the final word in terms of what you need to save.
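The retirement-mortality point in the first bullet is a matter of simple arithmetic, sketched below. Every number is hypothetical: I am assuming 1,000 new retirees, 50 of whom retired because of a terminal illness, and I have invented the six-month mortality rates for each group.

```python
def pooled_mortality(groups):
    """Return the overall mortality rate for a list of (headcount, mortality_rate) groups."""
    total_people = sum(count for count, _ in groups)
    expected_deaths = sum(count * rate for count, rate in groups)
    return expected_deaths / total_people


# Hypothetical figures, for illustration only.
healthy_retirees = (950, 0.01)  # 1% die within six months
ill_retirees = (50, 0.40)       # 40% die within six months

overall = pooled_mortality([healthy_retirees, ill_retirees])
print(f"overall six-month mortality: {overall:.2%}")
print(f"healthy retirees only: {healthy_retirees[1]:.2%}")
```

With these made-up inputs, the pooled rate is roughly three times the rate for healthy retirees, even though nothing about retirement itself has made anyone less healthy. The headline figure describes the mix of people retiring, not the risk facing any one of them.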

If there’s one thing I’d like you to take from this article, it’s to remember that the median is not the message. If you’re ever given a simple figure, like a prognosis for an illness that you or a loved one has been diagnosed with, step back and give some thought to what your own situation might be.

In my friend’s situation, for example, the five-year mortality rate for someone at his stage of lung cancer isn’t great. But consider:

  • The mortality rates are drawn from the entire sample of people who are diagnosed with the same illness. Most of these people are much older than 37. A 70-year-old with other health issues isn’t going to be able to fight any given form of cancer as effectively as someone who is otherwise healthy. A 37-year-old is likely to be able to withstand more aggressive treatment, which might increase the probability of treatment success.
  • Five-year mortality rates are based on findings from people who were diagnosed at least five years ago – in many cases, much longer than five years ago. Treatments change and hopefully improve all the time. All else being equal, the survival rate should improve as time goes on, thanks to medical research and technological innovation.
  • A single figure, like a median, says nothing about the distribution of mortality. As with Gould, there is likely to be a skew to the right, with some people who receive this diagnosis living for many years.

The key isn’t to focus on the single figure, because the median isn’t the message. The key is to ask where you might be within the distribution. And where possible, what factors are under your control – to increase the likelihood of good outcomes, and decrease the likelihood of bad outcomes.
