We get a lot of news each day, both public and personal. How should that news change our views – our predictions – about the world and our lives? (More)

**The Signal and the Noise, Part II: How Should The News Change Your Views?**

This week *Morning Feature* considers Nate Silver’s new book *The Signal and the Noise: Why So Many Predictions Fail – But Some Don’t*. Yesterday we looked at four common reasons for weak predictions. Today we see the two most common methods for scientific predictions and why scientists – and the rest of us – should adopt the better method. Tomorrow we’ll conclude with why we need to make better predictions, and how to better sift through the predictions we see and hear.

Nate Silver is a mathematician who gained his reputation as a baseball statistical analyst before shifting to politics in 2008, when he correctly predicted the presidential winner in 49 states and all 35 U.S. Senate races. His FiveThirtyEight blog at the *New York Times* is widely cited by campaigns and media sources, and in 2009 *Time* magazine included him in their 100 Most Influential People listing. He has a B.A. in economics from the University of Chicago and has written for *Sports Illustrated*, *Newsweek*, *Slate*, *Vanity Fair*, and many other publications.

**Once Upon a Mammogram**

Most women over age 50 have had at least one mammogram. Many of us had our first in our 40s and, for many, it was not only painful but frightening … and unnecessary.

Roughly one-in-eight women will develop breast cancer at some point in their lives, making it a serious health risk. As with most cancers, early detection greatly increases the survival rate, and a mammogram will detect about 80% of breast cancers. Thus, women were told to get mammograms every year or two, starting at age 40.

Yet that medical advice changed in 2009, as the U.S. Preventive Services Task Force recommended that women under age 50 *should not* get a mammogram unless suggested by a doctor based on the woman’s personal and family medical history. Why did the Preventive Services Task Force change their recommendation?

**Meet Thomas Bayes**

The reason has to do with someone you’ve probably never heard of, unless you’re a mathematician: Thomas Bayes. A minister and mathematician in London during the 18th century, Bayes developed a theory to predict the likelihood of an event, based on a prior estimate and new information. His work is called Bayes Theorem, and it’s a simple equation. To solve it you need three values:

* *Prior* – How likely was this event, *before* you found any new information?
* *Signal* – How often would the new information predict the event, if the event *is* happening (a “true positive”)?
* *Noise* – How often would the new information predict the event, if the event *is not* happening (a “false positive”)?

Each of those numbers is expressed as a probability, where 1.00 = 100% (certain) and 0.00 = 0% (impossible). The probability of a 50-50 coin flip is 0.50. Then plug the numbers into this equation:

*Probability* = (*Prior* × *Signal*) ÷ [(*Prior* × *Signal*) + ((1 − *Prior*) × *Noise*)]

Okay, I know, your eyes just glazed over. But trust me and walk through an example with me. Like … mammograms for women under 50.

**Twice Upon a Mammogram**

We know from the link above that a mammogram will detect 80% of breast cancers, so our *Signal* value is 0.80. But what is our *Prior*? How likely are women ages 40-50 to have breast cancer? A review by the American Cancer Society found that only about 1% of women under 50 have breast cancer. And what about the *Noise*: women who do not have breast cancer but would still get a ‘positive’ mammogram? A University of California San Francisco paper found that false-positive mammograms are not rare; Silver estimates their probability at about 7%. So we get these values:

* *Prior* = 0.01
* *Signal* = 0.80
* *Noise* = 0.07

Plug those values into the Bayes Theorem equation and you get (0.008)/(0.008 + 0.069) = 0.103 … or about 10%. In other words, if a woman under 50 gets a ‘positive’ mammogram, *there’s only a 10% chance that she really has breast cancer*.

But how can that be, if mammograms will detect 80% of cancers? The answer *feels* wrong, but think of it this way. Assume 1000 typical women under age 50 go in for mammograms. On average, only 10 (1% *Prior* estimate) of those women actually have breast cancer, and the mammogram will detect it in 8 cases (80% *Signal* of true positives). Of the other 990 women – who do not have breast cancer – mammograms will falsely ‘detect’ cancer in 69 cases (7% *Noise* of false positives). Thus, only 8 of the 77 ‘positive’ mammograms (about 10%) are cases where women actually have breast cancer. The rest are women who are frightened, and probably sent for more tests and perhaps even biopsies … unnecessarily.
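The walkthrough above can be checked in a few lines of Python — a sketch using the article’s 1%/80%/7% values; the `bayes` function name is just for illustration:

```python
def bayes(prior, signal, noise):
    """Probability the event is real, given one 'positive' result."""
    true_positive = prior * signal          # event happening AND detected
    false_positive = (1 - prior) * noise    # event absent but 'detected' anyway
    return true_positive / (true_positive + false_positive)

# Mammograms for women under 50: Prior = 0.01, Signal = 0.80, Noise = 0.07
print(round(bayes(0.01, 0.80, 0.07), 3))  # 0.103 -- about a 10% chance

# The same answer, counted out of 1,000 typical women:
have_cancer = 1000 * 0.01                        # 10 women with breast cancer
true_positives = have_cancer * 0.80              # 8 correct 'positive' results
false_positives = (1000 - have_cancer) * 0.07    # about 69 false 'positives'
print(round(true_positives / (true_positives + false_positives), 3))  # 0.103
```

Both routes give the same ~10%, because counting out of 1,000 women is just the equation with every term multiplied by 1,000.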

That’s why the Preventive Services Task Force said a woman under 50 should not get a mammogram unless her doctor suggests it based on her personal and family medical history. That personal or family history may make her much more likely to have breast cancer. In Bayesian terms: her own history may raise the *Prior* enough that the *Signal* of a ‘positive’ mammogram, for her, would outweigh the *Noise*.

**Once Upon a Debate**

Yes, new information should change our views of the world and our lives. But we often weigh new information poorly in light of what we already had good reason to believe. Take the presidential debate Wednesday night, which left most Republicans aglow with delight and many Democrats downcast with fear. Yes, most mainstream pundits and instant polls agree that Mitt Romney won the debate. But how much should that event change our views on President Obama’s electoral chances?

Not much, actually.

Before the debate, the *New York Times*‘ Nate Silver had President Obama as an 85% favorite to win on November 6th, so that’s a reasonable *Prior*. That article also includes instant polling data on prior presidential debates. Because this was the first presidential debate of 2012, let’s look at only first debate outcomes in each election since 1984 (when instant polling began):

* *Winning Election* – The winners of the first debates won only 3 of those 7 elections. On the other hand….
* *Gaining in Polls* – The winners of the first debates gained in the polls in 4 of those 7 elections. In the other 3, the first debate winner *dropped* in the polls.

We can use that data to estimate the *Signal* and *Noise* values for winning a first debate. In terms of winning the election, the *Signal* is 0.43 (3-of-7) and the *Noise* is 0.57 (4-of-7). In terms of gaining at least something in the polls, those values are reversed. Of course, seven is a very tiny sample set, so we might boost the *Signal* a bit because debates seem like they should matter and there’s too little data to prove they don’t. On the other hand, the polls show there are few undecided voters this year and early voting has already begun in several states, so we might lower the *Signal* a bit because the debates can’t change as many votes. Let’s be generous and say winning the debate was 51% *Signal* and 49% *Noise*.

We then plug those numbers into Bayes Theorem and find President Obama should be about an 84% favorite to win the election, after having lost the first debate. In other words, that debate outcome shouldn’t change our prediction much at all. In fact Silver now projects President Obama an 87% favorite, because of other new polls.
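Run through the equation, the debate update looks like this — a sketch using the 0.85 prior and the generous 0.49/0.51 split estimated above:

```python
def bayes(prior, signal, noise):
    """Probability the event is real, given the new information."""
    return (prior * signal) / (prior * signal + (1 - prior) * noise)

# Event: President Obama wins the election. Prior = 0.85.
# New information: he lost the first debate.
#   P(lose first debate | wins election anyway) = 0.49
#   P(lose first debate | loses election)       = 0.51
posterior = bayes(0.85, 0.49, 0.51)
print(round(posterior, 2))  # 0.84 -- barely moved from the 0.85 prior
```

Because the *Signal* and *Noise* are nearly equal (0.51 vs. 0.49), the new information carries almost no weight, and the posterior stays close to the prior.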

Does this mean you have to grind through Bayes Theorem every time you ponder a medical test or news story? No. Tomorrow we’ll discuss a quick-and-dirty substitute that is usually ‘close enough,’ especially when you recognize that your values for *Prior*, *Signal*, and *Noise* will often be broad estimates.

In the meantime, Bayes Theorem says you should weigh news more cautiously, that you should not completely disregard previous data, and that you must be aware of the personal biases that shape your estimates of the *Prior*, the *Signal*, and the *Noise*.

+++++

Happy Friday!

“But but but” … (I hear you thinking) … “if early detection is so important for cancer survival, shouldn’t we just retest the women who get ‘positive’ mammograms?”

Well, we could. If we retest all 77 ‘positive’ mammograms, we would expect to get about 11 more ‘positive’ outcomes: 6 of the 8 cases (80%) of women who do have breast cancer, and 5 of the 69 cases (7%) of women who don’t. So after two rounds of (quite painful) testing, our results are:

* 6 women with two ‘positive’ results who do have breast cancer (true positives);
* 5 women with two ‘positive’ results who don’t have breast cancer (false positives); and,
* 4 women who do have breast cancer but did not get two ‘positive’ results (false negatives).

The base rate for breast cancer in women under 50 – the *Prior* – is just too low to justify routine mammograms.

Good morning! ::hugggggs::

I didn’t have space to include the other common statistical prediction method, which attempts to estimate “statistical significance” based on normal curve analysis. This is the method currently employed by most correlation studies, where a study found X and also found Y some percentage of the time. The usual method is to look at a statistical value called *standard deviation*. In a normal curve, 95% of the sample lies within two standard deviations of the mean, so a result outside that should happen less than 5% of the time (1 in 20 cases) … and that is the most common test for a “statistically significant” outcome.

There are three problems with that method:

1) The 5% threshold is arbitrary. It’s commonly used, but that does not mean it’s scientifically meaningful.

2) Not all data fit a normal distribution curve, so results that are two standard deviations off the mean may be more (or less) common in a given setting. Rigorous scientists can try to account for that, but often it’s hard to know the actual distribution in Realworldia (see yesterday’s *Morning Feature* for four common reasons).

3) Even assuming that 5% figure is accurate, that means 1-in-20 such results should be merely random coincidence. With tens of thousands of correlation studies published every year, that adds up to a lot of noise.
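Problem 3 is easy to see with a quick simulation — a sketch, with the 10,000-studies figure invented purely for illustration:

```python
import random

random.seed(2012)  # make the illustration repeatable

# Under a true null hypothesis (no real effect), p-values are uniform on
# [0, 1], so a fraction equal to the threshold crosses it by pure chance.
threshold = 0.05
null_studies = 10_000  # illustrative count of studies of no real effect

flukes = sum(1 for _ in range(null_studies) if random.random() < threshold)
print(flukes)  # close to 10_000 * 0.05 = 500 spurious 'significant' results
```

Hundreds of “statistically significant” findings, and every one of them pure coincidence — that’s the noise Silver is warning about.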

Silver argues – and many scientists are starting to agree – that correlation studies should use Bayes Theorem, with the *Prior* based on previous data and/or scientists’ confidence in a theory that predicts some relationship between the events being studied. If such a theory has shown good results when applied elsewhere, or we have previous data that support this application, that new *Signal* should make us more confident of such predictions. But without a well-tested theory or previous data, the *Noise* should make us wary of mere correlation results.

Good morning! ::hugggggs::

This was a great discussion. Yes, my eyes glazed at the formula, but not at the discussion.

Mammograms are a great example. It seems we have cancers growing in our bodies all the time. Some extraordinary percentage of people of all ages, when autopsied, have cancers. (I think it was in the 90% range.) Most of those will be beaten by the body in various ways.

The thing with mammograms is two-fold. Breast tissue, it has been discovered, is extraordinarily sensitive to radiation. Hence the development of so-called “low-dose” mammography. Then you have another problem: Many cancers are safely encysted… until you break the cyst open and the cancer metastasizes, all too likely with a mammogram.

Medical recommendations have thus been rapidly changing. Most doctors recommend self-exam, a baseline mammogram at 50, and not another until 60. Most breast cancers occur in women who are 70 or older. So how much harm do we do with “preventive testing?” That’s still being argued, of course, but caution is rapidly replacing the insistence that every woman over 40 should be having an annual test.

The same thing has happened with PAP smears. They used to be an annual thing until it was realized that A) most cervical cancers grow extremely slowly B) they were getting too many false positives and C) once every 5-15 years is more than sufficient.

This kind of thinking, as illustrated here, is very important. I’m glad to know you’ll give us an easy method we can use ourselves.

I’m sorry your eyes glazed over. On the other hand, I encourage all readers to pick an example (or just make up numbers) and work through Bayes Theorem on your own. The equation *looks* more difficult than it is.

Once you’ve played with the numbers a couple of times, you can then use this online Bayes Theorem calculator to do the math. For the debate example, I used 0.85 as the *Prior* for Hyp 1 (President Obama will win election), 0.15 as the *Prior* for Hyp 2 (President Obama will lose election), 0.49 as the *Noise* under Hyp 1 (losing debate predicts President Obama will win election) and 0.51 as the *Signal* under Hyp 2 (losing debate predicts President Obama will lose election). Click “Compute” and you get the revised probabilities for each hypothesis.

That calculator also allows you to test multiple hypotheses (e.g.: 20% chance I’ll have a great day, 35% chance I’ll have a good day, 30% chance I’ll have a ho-hum day, 15% chance I shouldn’t have gotten out of bed) as well as multiple new pieces of information (e.g.: woke up on time, spilled coffee, hit all green lights on the way to work, boss is not at work today, my inbox is empty).
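What such a calculator does for multiple hypotheses is the same equation normalized across all of them. A sketch, using the made-up “kind of day” priors from the comment above and likelihoods invented here for “woke up on time”:

```python
def bayes_multi(priors, likelihoods):
    """Revise competing hypotheses given one piece of new information.

    priors      -- P(hypothesis), summing to 1 across all hypotheses
    likelihoods -- P(new information | hypothesis), one per hypothesis
    """
    weighted = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Great / good / ho-hum / shouldn't-have-gotten-out-of-bed day
priors = [0.20, 0.35, 0.30, 0.15]
likelihoods = [0.90, 0.70, 0.50, 0.20]  # invented: P('woke up on time' | day)
posteriors = bayes_multi(priors, likelihoods)
print([round(p, 2) for p in posteriors])  # good-day odds rise, bad-day odds fall
```

With only two hypotheses this collapses back to the single-equation form used for the debate example, so the two approaches always agree.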

Or you could just trust your Bippiescope…. 😉

Good morning! ::hugggggs::

No, the formula doesn’t look difficult. 😉 Really. I just skipped it because I couldn’t imagine how to apply it. My life isn’t very full of data I could plug in. Understanding how it works, though, is essential. It will make me think differently about things when I’m making judgments.

You (and Nate Silver) gave me that.

Fascinating stuff. I remember the change in the mammograms and some people who were convinced that it was all about saving money or that it was one of those “panels.”

When mammograms began, we didn’t have any data to plug into the formula. And since early detection was important we wanted more women to have them more often. I also think that getting paid by the test added to the frequency.

I think this is really interesting. I like that Obama’s chances are essentially the same. I hope this whole debate discussion gets more people to feeling like their vote is really needed.

Your last point is very important, addisnana:

We should read any prediction about Realworldia to include the phrase “if other events happen as they usually have.”

E.g.: Silver’s prediction of President Obama’s likelihood to win in November assumes that no other big events intervene, and that both parties do their historical levels of GOTV work. If we grassroots Democratic activists think the election is already won and let Republicans outwork us on GOTV, Silver’s prediction no longer applies.

As for mammograms, I agree that it made sense to suggest them for women starting at age 40, before we had enough data to know that mammograms for women under 50 produced too many false positives (*Noise*) to be reliable. In fact, that policy change illustrates the most important point about Bayes Theorem: *We must update our predictions – properly weighted – as we get new information.* We’ll explore that point in depth tomorrow.

Good morning! ::hugggggs::