Photo credit: Facebook
Saturday, October 26, 2024
Chatting with "Ann"
Friday, October 25, 2024
Fibbers of the Fourth Estate
Patrick Soon-Shiong: Is a newspaper a $500 million toy?
Photo credit: David Paul Morris, Bloomberg
The Wall Street Journal writes: "Donald Trump has opened a narrow lead in the presidential race....." In the next paragraph, the newspaper notes that Trump's lead is "within the poll['s] margi[n] of error, meaning that either candidate could actually be ahead."
The Journal is lying. It knows full well that the race is too close to call. A poll is only a sample, subject to error. One cannot ignore errors that may exceed the estimated margin of victory.
The Journal also knows that most readers don't understand the margin of error, especially when it won't explain this simple statistical concept in the main story. So The Journal can safely assume that readers will ignore its weasel words in the second paragraph as just some confusing nonsense. The Journal has its Presidential cake and eats it, too. Most readers will accept its lie that Trump is winning. If any reader challenges this, it can always point to the weasel words.
Political pros who should know better argue to me that this lying really doesn't matter. After all, we're just talking about a couple of percentage points between the candidates. Well, it matters a hell of a lot if The CBS Evening News leads with "New polls show that the White House race is a dead heat" or with "New polls show that Trump leads."
The Journal, The Washington Post, and The New York Times -- and therefore the news media in general -- have lied about the Presidential polls throughout the race, by ignoring the margin of error. Probably this is because "It's a close race" is not as thrilling a headline as "X is winning." But I cannot dismiss the possibility that the reporters, editors, or executives of the newspaper skew the headlines in favor of their candidate. At The Los Angeles Times, the editor of editorials resigned a few days ago because the owner, Patrick Soon-Shiong, refused to let the newspaper endorse a Presidential candidate. It is a small step from interfering in a newspaper's editorials to interfering with its headlines, although I have no evidence of such headline-management.
One major story in this tight race is how the media's fibbing with statistics has affected the donations, strategizing, and voter choices that will determine the wee hours of November 6. One thing for sure: We won't read that story in The Wall Street Journal, The Washington Post, or The New York Times. -- Leon Taylor, Seymour, Indiana, tayloralmaty@gmail.com
References
Katie Robertson. L.A. Times Editorial Chief Quits After Owner Blocks Harris Endorsement. The New York Times. October 23, 2024.
Aaron Zitner. Trump Takes Narrow Lead Over Harris in Closing Weeks of Race, WSJ Poll Shows. The Wall Street Journal. October 23, 2024.
Wednesday, October 23, 2024
Boys and girls together, not
Danny Lopez: Learning from The Donald
Photo credit: Danny Lopez for Indiana
Always something brewing in Indiana politics.
In the 39th State District, longtime Republican incumbent Jerry Torr won't run for re-election to the Statehouse. The Democrat candidate is Matt McNally, who got 48% of the vote when he ran against Torr two years ago.
The Republican candidate is Vice President of Community Affairs and Corporate Relations for the Indiana Pacers organization, Danny Lopez, basically a spokesman and a tenderfoot in politics. What's interesting is that Lopez opposes transgender boys in female sports. The Pacers, a professional male basketball team, want good relationships with the LGBTQ+ community and are between the ol' rock and hard place. They issued a non-endorsement endorsement of Lopez.
A Lopez ad features a Hoosier sweetheart right out of the pages of Booth Tarkington. Call her Katie. "I play volleyball for my school. I love being on the team with my girlfriends. If Matt McNally has his way, me and my friends will be taking turns on the bench."
Don't cry, Katie. In Indiana, transgender boys haven't been able to play in K-12 girls' sports for at least 12 years. Two years ago, the legislature banned such activity and overrode Governor Eric Holcomb's veto. Before then, the Indiana High School Athletic Association had enforced a trans-youth policy for a decade. Not that it took much work: Only two transgender students had applied to play on teams, reported IndyStar.
Lopez is beating a dead mare. But a lot of people like to watch. In less than three months, Republicans have spent more than $65 million on ads attacking transgender-friendly policies, reports The New York Times. Asked last week in a Fox Town Hall what to do about transgender athletes in women's sports, Trump said it was "such an easy question": "You just ban it."
McNally's own TV ads attack Lopez for his "radical" opposition to "reproductive rights." This would play like a charm in New York. But welcome to the Hoosier State, where then-President Donald Trump beat now-President Joe Biden 57% to 41% in 2020. Trump need not break into a cold sweat about Indiana this year, either.
The 39th District is in Carmel, just north of Indianapolis (population 103,000): White (80%), affluent (78% home ownership rate, median household income $133,000), educated (74% college degree-holders), slightly female (52%). It might normally back reproductive rights like abortion, but McNally's Doomsday ads will backfire. Well, probably: There is no good political polling in this part of Indiana. And Trump took Hamilton County, where Carmel is, 52% to 45% in 2020. Don't touch that dial. -- Leon Taylor, Seymour, Indiana tayloralmaty@gmail.com
References
Caitlin Doornbos. Trump pledges to end transgender athletes playing women's sports. New York Post. October 16, 2024.
Gregg Doyel. Indiana state rep candidate Danny Lopez's ad could hurt Pacers, Fever. IndyStar. October 15, 2024.
Shane Goldmacher. Trump and Republicans Bet Big on Anti-Trans Ads Across the Country. The New York Times. October 8, 2024.
Leslie Bonilla Muñiz. Checking out the key Indiana House races up for grabs this year. Inside INdiana Business. October 23, 2024.
Even more games newspeople play
What, me worry about margins of error?
The Wall Street Journal's poll of the seven swing states finds that either former President Donald Trump or Vice President Kamala Harris leads by 2% or less, except in Nevada, where Trump is up by 6%. The Journal says the margin of error in each state is +/- 4%. Although Trump leads by 6% in Nevada, the Journal says this lead too is "within the margin of error."
Say what? In the technical notes, we read:
"A candidate’s lead—the difference between two candidates’ percentages in a poll—has its own margin of error. This is because the margin of error for a lead is calculated to account for the margins of error around both candidates’ percentages.
"In most cases, a candidate’s apparent lead must be at least two times the poll’s basic margin of error to say a candidate is actually in the lead. In this case, the poll’s basic margin of error of 4 percentage points would require a lead of 8 percentage points to clearly show a lead."
This is a misunderstanding. Trump really is winning in Nevada.
A simple example may clear up the confusion. Suppose that I toss a fair coin. The chances of a head are one-half. And the chances of a tail are one half.
Suppose that we observe a head. What was the probability of a head?
The Journal would reason something like this: "Well, the chance of a head was one-half, and the chance of a tail was one-half. Either outcome could occur, and their probabilities are independent. That is, the chance of a head does not affect the chance of a tail. The probability of two independent events is the product of their probabilities. Therefore the chance of a head is one-half times one-half, or one-fourth."
Uh, no. The probability of a head is one-half. There are two outcomes, heads and tails. If a head occurs, a tail cannot. Given the head, the probability of a tail is not one-half. It is zero.
The same holds in political polls. We ask the interviewee if she would vote for Trump or Harris. If she chooses Trump, she cannot simultaneously choose Harris. So the only margin of error -- which measures the dispersion in responses for Harris in the sample -- that matters is for Harris. Once Harris is chosen, the choice of Trump is no longer a random variable. Its standard error, which determines the margin of error, is zero.
The margin of error for a Trump victory is 4%, not 8%. The +/- 4%, which The Journal incorrectly calls a margin of error, describes the probability that either Trump leads by up to 4% or Harris leads by up to 4% in the poll sample, given that the race is actually a tie.
What The Journal probably has in mind is something like the difference in votes for Harris in two periods. For example, we may observe that Harris took 50% of the sample this month and only 45% last month. We want to know whether the difference, 5%, is more than a fluke. In this case, we can reasonably treat the two events -- a Harris win last month and a Harris win this month -- as independent. That Harris won last month need not affect her chances of winning this month. So, in determining whether the two Harris shares differ significantly, we should consider the probability of each share independently.
But that is not the case for the event in which an interviewee says this month that she favors Harris. There is not an independent probability that she favors Trump. The Trump probability is zero.
In practical terms, The Journal's error matters little, this time. But in a race this close, one must take care to interpret future poll results correctly.
The major newspapers -- The New York Times, The Washington Post, and The Wall Street Journal -- are abominable at reporting political poll statistics. They have misreported the Presidential race at every stage. And their mistakes have probably changed the race, by misleading donors and campaign coordinators. -- Leon Taylor, Seymour, Indiana tayloralmaty@gmail.com
Friday, September 13, 2024
More games newspeople play
Do the voices add up? Photo credit: Unsplash
As always, The Washington Post offers, er, interesting political math.
It tells us that in the seven battleground states that would probably determine the electoral vote in the Presidential election, Vice President Kamala Harris leads in three, former President Donald Trump leads in two, and the other two are ties. A tie is defined as a margin of a quarter of a percentage point. No explanation.
But digging into the story, we read that "every state is within a normal-sized polling error of 3.5 points and could go either way." In other words, all seven states are too close to call. Neither Harris nor Trump leads in any of them merely because they lead in the sample. The sample is never a perfect reflection of all voters, and one must consider how imperfect the reflection may be before basing conclusions on the sample.
For example, suppose that in Pennsylvania, Harris led in the sample by one vote. Would we conclude that she is winning the Pennsylvania race? Surely not. A one-vote margin is tiny. It is very likely that one winning vote results from an error, such as voters who misunderstand the question. So we would not put much faith in the conclusion that Harris is truly ahead.
How large must the margin be, then, for us to conclude that it gives us good information? The answer to that question is a statistic called the "margin of error."
The Post tells us that the margin of error is 3.5%. The usual interpretation is that if the margin exceeds 3.5%, then the probability is 95% that the leader in the sample is truly winning.
But here is where the Post math gets really interesting. The explanatory notes say that the 3.5% estimate is based on a calculation that in "the last few presidential cycles...the average modeled polling error in competitive states was 3.5 percentage points." Which presidential cycles? Which competitive states? Were they the same as this year's seven battleground states? Who knows?
Well, OK, 3.5% is the "average" polling error. I presume that this means that chances are 50% that the candidate ahead by 3.5% in the sample is actually winning the race. I presume wrong. Reading on: "...To account for this [3.5% polling error], our averages factor in the 90th percentile possible error (i.e., how bad would the error be in the worst 10% of cases)." In other words, chances are 90% -- not 95%, not 50% -- that the candidate ahead by 3.5% in the sample is truly winning. Feel free to scratch your head.
Freaky fractions
Sports fans, here's the score. Usually, the margin of error is based on the probability distribution. This is the range of probabilities for particular outcomes. For example, a probability distribution for the outcome of one coin toss is 50% no head (that is, a tail) and 50% one head (no tail). The distribution for the outcome of 100 coin tosses can give us the probability of zero heads (or 100 tails), the probability of one head (99 tails), and so forth. All distinct probabilities sum to 100%. For example, on a coin toss, there is a 50% chance of no head and a 50% chance of one head, adding up to 100%.
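For readers who like to check the arithmetic, here is a minimal sketch of the coin-toss distribution described above. It assumes a fair coin and independent tosses; the function name is my own invention for illustration.

```python
from math import comb


def heads_distribution(tosses: int) -> list[float]:
    """Probability of 0, 1, ..., `tosses` heads, assuming a fair coin."""
    total_outcomes = 2 ** tosses
    # comb(tosses, k) counts the sequences that contain exactly k heads.
    return [comb(tosses, k) / total_outcomes for k in range(tosses + 1)]


one_toss = heads_distribution(1)    # no head, one head
hundred = heads_distribution(100)   # zero heads through 100 heads

print(one_toss)
print(f"P(0 heads in 100 tosses)  = {hundred[0]:.3e}")
print(f"P(50 heads in 100 tosses) = {hundred[50]:.4f}")
print(f"Sum of all probabilities  = {sum(hundred):.6f}")
```

The extremes (zero heads, 100 heads) get tiny probabilities, the middle gets the largest ones, and the whole list sums to 100%, give or take floating-point rounding.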
The distribution of a Harris margin may be the probability of minus 100% (that is, she got no votes), the probability of minus 99% (she got 1% of the vote), etc. We could also look at fractions like minus 99.9%.
The most common distribution used is the normal. This has a bell shape: Small probabilities at the extremes (like minus 100% of the vote for Harris, or plus 100%) and large probabilities in the middle (like a zero margin for Harris, that is, both candidates get the same vote).
The probability distribution is a theory. But it leads to accurate conclusions when correctly handled. For example, if we observe that Harris loses 100% of a well-executed and large poll sample, we may confidently conclude that she is not winning the race. To calculate the precise margin of error, one fits out the probability distribution by using information from the poll sample.
But The Post derives its margin of error not from a probability distribution but from recent actual errors. Its information came not from the current sample but from past performance. How it gets from this estimate based on past empirics to the present theoretical one is beyond me. Perhaps it assumes the same probability distribution for past election cycles as for the present poll samples, but The Post says nothing about this. It looks to me as if it arrived at its estimates essentially by playing pin-the-tail-on-the-donkey.
To recap: Harris is winning in three states! No, wait a minute. We're not sure. It could be an error. No, wait. We're not sure how to calculate the possible error. No, wait....
This matter is serious, and not just for nerds like me. The Post is wrong about how close the race is. It's too close to call not only across the battleground states on average, but in every battleground state. The Post's nonchalance would lead campaigns to understate the need for staff, volunteers, ads, and money in most battleground states.
The Post's FAQ asks: "Are you going to release the code of your model?" The newspaper replies: "We really want to and are working on that." Outstanding. Shouldn't The Post have released the code when it published the results? One delays code publication to clean up confusion and error. Why didn't The Post clean up the code first?
Continuing: "When we release the code, we're also hoping to publish a more technical explanation." In other words, The Post did not think through its assumptions, since one does so by writing out their justification. The Post winged it.
Democracy dies in darkness. And The Post is smashing the lamps. -- Leon Taylor, Seymour, Indiana, tayloralmaty@gmail.com
References
Lenny Bronner, Diane Napolitano, Kati Perry, and Luis Melgar. Harris vs. Trump 2024 presidential polls: Who is ahead? The Washington Post. September 13, 2024.
Thursday, September 12, 2024
The misshape of things to come
Going south? Photo credit: NBC News.
The New York Times writes: "With the Kursk incursion, Mr. [Rustem] Umerov [Ukraine's defense minister] argued, Ukraine has demonstrated it can invade, and even occupy, Russian territory without igniting World War III, according to two officials.
"But American officials say it is too early to reach that conclusion, because there are many ways for Mr. [Vladimir] Putin [Russia's president] to retaliate."
November 5, for example -- the day of the Presidential election in the United States. If Putin seeks to win his war with Ukraine, his cheapest means may be to ensure, by hook or crook, the election of former President Donald Trump. In the Republican candidate's debate Tuesday with the Democrat candidate, Vice President Kamala Harris, Trump refused repeatedly to say he wanted Ukraine to win the war. Instead, he said he wanted to end the war and if elected would do so in 24 hours by phoning Putin and Ukrainian President Volodymyr Zelensky. The implication, as Harris said, was that Trump would force Zelensky to concede the war by threatening to cut off military aid to Ukraine.
However, even Harris did not seem to understand how a Russian victory would affect Central Asia. Harris said Putin would next target Poland. This, I think, is ludicrous. Poland has belonged to NATO since 1999. An invasion of Warsaw would activate the NATO requirement that all members defend the one under attack. That would mean World War III, and Putin is not so stupid as to risk it. More likely he would target a nation that does not belong to NATO and that has relatively little strategic interest for the US and Europe -- Kazakhstan. -- Leon Taylor, Seymour, Indiana tayloralmaty@gmail.com
References
David Sanger, Helene Cooper, and Eric Schmitt. Biden Poised to Approve Ukraine’s Use of Long-Range Western Weapons in Russia. The New York Times. September 12, 2024.
Sunday, August 18, 2024
The games newspeople play
A winner or just a statistic? Photo credit: Britannica
Is Kamala Harris winning? The Washington Post and The New York Times would love to tell you. What they won't tell you is that they are either hopelessly confused or lying.
Let's start with today's Post. "Vice President Kamala Harris holds a narrow lead over former President Donald Trump in the presidential election, a notable improvement for Democrats in a contest that a little more than a month ago showed President Joe Biden and Trump in a dead heat, according to a Washington Post-ABC News-Ipsos poll....Given the margin of error in this poll, which tests only national support, Harris's lead among registered voters is not considered statistically significant."
Big news, sports fans! Harris is winning! Our poll says so! But...wait a minute...it's not statistically significant, which means...um...hmm.
Friends and neighbors, you can't have it both ways. Either Harris is winning, or she isn't. The rule of thumb is this: If the poll margin is within the margin of error, you cannot deduce with 95% confidence that either Harris or Trump is winning. The race is a dead heat.
But.
Digging into the story, we learn from a graph that the poll has a margin of error of plus or minus 2.5%. Harris's lead in the poll is 4%, or 49% to 45%. So Harris is winning. The poll margin, 4%, is larger than the margin of error, 2.5%. It is statistically significant: In other words, the result is very likely to hold as well outside of the sample, for the country in general. The Post is breathtakingly mucked up.
Its confusion probably arises from its misunderstanding of the plus-or-minus designation. The idea is this: We want to test the hypothesis that the race is a tie. That would happen if Harris is neither winning nor losing. Our usual criterion is that we will accept that the race is a dead heat unless we are 95% confident that Harris is either winning or losing. Well, if Harris's lead exceeds 2.5%, it is not a dead heat. And had Trump's lead exceeded 2.5%, that is, had Harris's lead been -2.5%, it would not have been a dead heat. It is not the case that Harris's margin must exceed 2 times 2.5% for us to conclude that the poll result is statistically significant, that is, that the race is not a dead heat.
The further conclusion, that Harris is winning, is easy to confirm once we see that the race cannot be a tie. The intuition is this: If Harris is so far in the lead that we can reject the possibility that the race is a tie, then we can also reject the possibility that Trump is winning.
In this case, The Post lucked out. Its headline correctly said the poll indicated that Harris was winning. But there is a more important point: The Post doesn't know what the hell it is doing.
Grrr. Now hear this: The margin of error is calculated under the assumption that the polling was perfect. Even when the polling sample is an accurate mirror of likely voters, the outcome of the poll may not be accurate. There is still a good chance that too many Harris supporters were interviewed. There is also a good chance that too many Trump supporters were interviewed.
A simple example will show what I mean. Suppose that we have a class of 100 students: half receive As, and half receive Bs. (Welcome to grade inflation.) We take a sample of 10 students. Even if the sampling is utterly fair, there is still a chance that at least 6 of the 10 students sampled received As. Based on the sample, we wrongly conclude that the majority of students received As. In reality, only half did.
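The classroom example can be worked out exactly. The sketch below assumes that the 10 students are drawn at random without replacement, so that every group of 10 is equally likely; the function and variable names are mine, chosen for illustration.

```python
from math import comb


def prob_at_least(k_min, sample_size=10, class_size=100, a_students=50):
    """Chance that a fair sample (drawn without replacement) holds at least k_min A students."""
    b_students = class_size - a_students
    all_samples = comb(class_size, sample_size)
    # Count the samples with exactly k A students and (sample_size - k) B students.
    favorable = sum(
        comb(a_students, k) * comb(b_students, sample_size - k)
        for k in range(k_min, sample_size + 1)
    )
    return favorable / all_samples


p_majority_as = prob_at_least(6)
print(f"P(at least 6 of the 10 sampled students got As) = {p_majority_as:.3f}")
```

The result is far from negligible, which is the point: a perfectly fair sample of 10 can easily show a "majority" that does not exist in the class.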
Because of such possibilities, statisticians test the idea that of all likely voters, half favor Trump and half favor Harris. We can dismiss this hypothesis of a dead heat if a large-enough share of the sample favors either Harris or Trump. The "margin of error" reflects how large the share of respondents backing one candidate must be if we are to dismiss the possibility of a dead heat, assuming perfect polling. If the pollsters made mistakes, and they usually do, the actual margin of error is even larger than the one usually reported.
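As a rough illustration, here is the textbook margin of error for one candidate's share of the sample under the dead-heat hypothesis. It is a sketch under strong assumptions: a simple random sample, a true share of 50%, and the normal approximation. The sample sizes are hypothetical, and real pollsters adjust for weighting and design, so their published margins will differ.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_size, confidence=0.95, share=0.5):
    """Half-width of the confidence interval for a sample share (simple random sampling)."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(share * (1 - share) / sample_size)


for n in (400, 1000, 1500):  # hypothetical sample sizes
    print(f"n = {n:>4}: margin of error = +/- {margin_of_error(n):.1%}")
```

The margin shrinks as the sample grows, but only with the square root of the sample size. And it covers sampling luck alone, not the pollsters' mistakes.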
Moving right along…
Remarks later in the story indicate what The Times is really up to: "As the Harris and Trump campaigns rush to define each other in the remade race, voters see a choice between strength and compassion. Voters were about equally likely to see each candidate as qualified and change-makers, but significantly more voters viewed Mr. Trump as a strong leader....When voters were asked who 'cares about people like me,' Ms. Harris had a slight edge over Mr. Trump: 52 percent compared to 48 percent."
The Times seems to think that it can report results from a sample as if they held for the population, as long as it labels results that hold with 95% confidence as "significant." (The poll margin for the question about strong leaders was 8%, much larger than the margin of error.) The Old Grey Lady is playing word games. Most readers are not statisticians. When The Times says Harris has "a slight edge," they think that The Times is talking about the population, that is, the world outside of the sample. They view the word "significantly" as redundant.
So The Times has it both ways. It tricks readers into believing that it is breaking news by declaring a winner. And, if challenged, it can always point out that, after all, it did refer to significance, sort of.
A simple example will show why this is a cardinal sin. Suppose that the president of General Motors misstated stock earnings for years. The Securities and Exchange Commission would investigate and probably force the president to resign. He might even face charges of fraud. Shouldn't we hold the nation's leading newspapers to standards at least as high? In a close race like this, newspaper reports of the polls affect donors. The money is sloshing towards Harris because of her perceived momentum. How real is that perception?
Another common dodge of newspapers: Well, do we really need 95% confidence that Candidate X is winning? Surely 90% is good enough. In that case, the margin of error is smaller, and the poll margin may now exceed it. We can then conclude that X is winning.
It is true that 95% is an arbitrary standard. But it is also a universal one, especially in political polling. To apply 95% in some cases and (quietly) 90% to others, which The Post is fond of doing, amounts to moving the goalposts at halftime for some games but not for others. It makes it hard to compare poll results. Is Harris ahead in certain polls because she is winning the race, or because those polls adjusted the margin of error until she was "winning"?
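How far do the goalposts move? A minimal sketch, assuming the normal approximation and assuming that the plus-or-minus 2.5% in the Post's graph is a 95% margin (the Post does not say so in the passage quoted here). The function names are mine.

```python
from statistics import NormalDist


def z_score(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def rescale_margin(margin, from_confidence=0.95, to_confidence=0.90):
    """Convert a margin of error from one confidence level to another."""
    return margin * z_score(to_confidence) / z_score(from_confidence)


post_margin_95 = 2.5  # percentage points; assumed to be a 95% margin
post_margin_90 = rescale_margin(post_margin_95)

print(f"95% confidence: +/- {post_margin_95:.1f} points")
print(f"90% confidence: +/- {post_margin_90:.1f} points")
```

Dropping from 95% to 90% trims the margin by roughly a sixth, whatever the sample size. That is enough to turn some "dead heats" into "leads" without a single voter changing her mind.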
Noncardinal sins....
There's more, much more. The Times writes: "The polls show some risk for Ms. Harris as she rallies Democrats to her cause, including that more registered voters view her as too liberal (43 percent) than those who say Mr. Trump is too conservative (33 percent)."
It is not clear why The Times says this shows "some risk" for Harris. Perhaps it means that voters are significantly more likely to view Harris as liberal than to view Trump as conservative. If that's what it means, it should provide evidence. The margin of error applies here, too. But even if it is true that voters are more likely to judge Harris as liberal than to judge Trump as conservative, why would this be a risk for Harris?
The Times continues, "For now, [Harris] is edging ahead of [Trump] among critical independent voters." Evidence?
The Times also writes, "Mr. Trump and Ms. Harris are tied at 48 percent across an average of the four Sun Belt states in surveys conducted Aug. 8 to 15. That marks a significant improvement for Democrats compared with May, when Mr. Trump led Mr. Biden 50 percent to 41 percent across Arizona, Georgia, and Nevada in the previous set of Times/Siena Sun Belt polls, which did not include North Carolina."
The Times is comparing apples to oranges. One sample includes North Carolina, the other doesn't. And it is not clear why it bothers to make this top-level comparison. The polls differ from state to state, and The Times has specific information about each. Why not stick to reporting the results for each state poll rather than aggregate them into an inferior measure?
If The Times simply must have a nation-level conclusion, it can view each state poll as an observation in a sample of all state polls and compute the probability that Harris is leading nationwide, given the sample of state polls. Do the same for May.
Finally, a puzzle. In Arizona, The Times/Siena poll reports Harris ahead in the sample by 5% among likely voters and 4% among registered voters. But the poll by the Competitiveness Coalition/Public Opinion Strategies about two weeks earlier reported Trump ahead by 5% among likely voters. And The Hill/Emerson College poll about three weeks before The Times/Siena poll reported Trump ahead among registered voters by 5%. Were these differences due to Harris's momentum? Or were they due to differences in how the polls were conducted?
Source: The New York Times
Anyway, these conundrums pertain to Central Asia because the political polling in the region, or at least the reporting of it, is anything but perfect. The margin of error is correspondingly higher. In short, politics pervade political statistics. --Leon Taylor, Seymour, Indiana, tayloralmaty@gmail.com
Notes
For helpful comments, I thank but do not implicate Annabel Benson, Richard Green, and Mark Kennet. Parts of this post draw upon my earlier articles on my Facebook page.
References
Dan Balz, Scott Clement, and Emily Guskin. Kamala Harris holds slight national lead over Donald Trump, Post-ABC-Ipsos poll finds. The Washington Post. August 18, 2024.
Shane Goldmacher and Ruth Igielnik. Kamala Harris Puts Four Sun Belt States Back in Play, Times/Siena Polls Find. The New York Times. August 17, 2024.
Lisa Lerer and Ruth Igielnik. Harris Leads Trump in Three Key States, Times/Siena Polls Find. The New York Times. August 10, 2024.