What a 19th Century Cholera Outbreak Can Teach Us About 21st Century Analytics

[Image: John Snow's 1854 map of cholera deaths in Soho (Snow-cholera-map-1)]

Between August 31st and September 2nd, 1854, a cholera epidemic in the Soho district of London claimed the lives of over 100 men, women, and children. By the end of the outbreak a month later, over 600 people were dead. What makes this outbreak noteworthy, however, is neither its severity nor its location.

Crowded 19th century cities with questionable waste practices were certainly no strangers to city-wide bouts of mass sickness. Rather, what distinguishes this well-documented episode is the insightful work of epidemiologist John Snow and the lessons it holds for 21st century analytics.

Unlike many of his peers, Snow did not subscribe to the “miasma” theory (i.e. bad air) of illness and disease. Although the germ theory was not fully established at this time, his work and outlook nonetheless emphasized patterns of movement, geography, and common sources of possible contamination.

Through painstaking footwork, numerous interviews, and a simple but incredibly clever data visualization, Snow eventually identified the source of the outbreak: a hand pump on Broad Street with an unfortunate proximity to a leaky cesspit that was dumping bacteria into this popular water source.

While this incident is widely regarded as a seminal event in the history of modern epidemiology and public health, it also holds relevance for us today because it embodies three critical components of impactful analytics practice: Description, Relation, and Prediction.

Description 

This is the single most important step in the hierarchy of analytics and yet it is strangely overlooked and underappreciated. In one sense, the description of the problem was obvious: people are dying of cholera. But Snow dug deeper, identifying common linkages across the ages and sexes of those who died during the outbreak.

The map featured at the top of this post beautifully captures his systematic and thorough dedication to descriptive analytics. Snow drew a little line on the map at the residence of each cholera victim. What emerges from this series of little marks is a clear clustering of those deaths around the Broad Street pump and the critical insight that the water pump, not bad air, might be the culprit.

Without the aid of computers, machine learning, or modern visualization tools, Snow found a way to represent and clearly communicate a key common link among the victims. It clarified his thinking in 1854 and provides a timeless model of descriptive analytics today.
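For readers who want to try the same idea on their own data, here is a minimal sketch of what Snow's marks amount to in code: assign each death to its nearest pump and count. Everything here is an assumption for illustration. The coordinates are made up on an arbitrary grid (they are not Snow's actual data), "Pump B" and "Pump C" are hypothetical names, and straight-line distance is a simplification.

```python
from collections import Counter
from math import hypot

# Hypothetical pump locations on an arbitrary grid (NOT Snow's real coordinates).
PUMPS = {
    "Broad Street": (0.0, 0.0),
    "Pump B": (5.0, 1.0),
    "Pump C": (-4.0, 4.0),
}

# Made-up victim residences, one (x, y) pair per death, for illustration only.
DEATHS = [
    (0.5, 0.2), (-0.3, 0.8), (1.0, -0.6), (0.1, -1.2),
    (4.6, 1.3), (-3.5, 3.0), (0.9, 0.9),
]


def nearest_pump(point, pumps):
    """Return the name of the pump closest to point (straight-line distance)."""
    x, y = point
    return min(pumps, key=lambda name: hypot(x - pumps[name][0], y - pumps[name][1]))


def deaths_per_pump(deaths, pumps):
    """Count deaths by nearest pump: a tabular version of the marks on the map."""
    return Counter(nearest_pump(d, pumps) for d in deaths)


counts = deaths_per_pump(DEATHS, PUMPS)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

A simple count like this is pure description. It makes no claim about cause, but it tells you where to look next.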

Relation

By crisply representing who lived where, Snow was able to identify a likely source of the outbreak. But just as importantly, the map also provided a means of tying together seemingly unrelated cases from other parts of the city. For example, one sickened resident did not live near the pump and had no obvious source of exposure. This would seem to disconfirm any pump-based theory…until it was discovered that having once lived near Broad Street, this unfortunate soul fell in love with the taste of the water there, going so far as to pay a fellow resident to bring her a regular supply.

Similarly, a group of school children who succumbed to cholera also lived some distance from the guilty pump but tragically sampled its water on the way to school. These exceptions helped prove the rule, with Snow’s map again playing the key role.

Prediction

Quality analytics don’t always involve prediction, but if one claims to truly understand an underlying process, reasonable and reasonably accurate predictions should follow. In this instance, Snow’s prediction is clear: take off the handle and the outbreak should wither.

Although officials were skeptical, they reluctantly removed the handle and the number of deaths dropped precipitously thereafter. It is of course possible to attribute the decline of the outbreak to other causes (and Snow himself by some reports was skeptical of the impact of removing the handle). Nonetheless, the collective, systematic, and logical picture that he painted strongly suggests that this intervention likely saved the lives of fellow Londoners.

“A problem well put is half solved.”

                                       -John Dewey

Machine learning and big data are the hot topics in human capital analytics and with good reason. The wealth of available data coupled with new, increasingly powerful tools is continually opening new avenues for intense and productive data exploration.

Yet, the saga of John Snow is a powerful reminder that the right data for the right question trumps processing, volume, velocity, and variety every time. Snow’s efforts were obviously critical to the denizens of London in 1854 but his story holds a continuing and surprising relevance for today’s analytics practitioners.



Photo Credit:

“Snow-cholera-map-1” by John Snow – Published by C.F. Cheffins, Lith, Southhampton Buildings, London, England, 1854 in Snow, John. On the Mode of Communication of Cholera, 2nd Ed, John Churchill, New Burlington Street, London, England, 1855. Licensed under Public Domain via Commons – https://commons.wikimedia.org/wiki/File:Snow-cholera-map-1.jpg#/media/File:Snow-cholera-map-1.jpg
