Does this man need a friend in the political polling industry?
The headline of the hour! Former UN Ambassador Nikki Haley leads Florida Governor Ron DeSantis for second place in the Iowa caucuses, according to a poll by The Des Moines Register, coordinated with NBC News and Mediacom. Or so says The Wall Street Journal.
But not the poll itself. Selzer & Company reports a margin of error of plus or minus 3.7%. For some reason, the Selzer results are rounded off to the nearest integer. They list, as the first choice of respondents among Republicans for President, Nikki Haley (20%) and Ron DeSantis (16%). Since Haley is up by a reported 4% over DeSantis, The Journal concludes that she is in second place.
But it is possible, for example, that DeSantis has 16.4% and Haley 19.5%. In that case, the margin is 3.1%, which is well within the reported margin of error. Other combinations are possible. In other words, The Journal cannot conclude that Haley leads DeSantis without looking at the results specified to a tenth of a percent.
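The arithmetic can be checked with a short sketch. It assumes conventional rounding to the nearest integer, so that each published figure hides a true value within half a point of it; the poll report does not confirm the rounding rule, and the function name is only illustrative.

```python
def gap_range(leader_pct, trailer_pct, half_width=0.5):
    """Return the (smallest, largest) unrounded gap consistent with two rounded shares.

    Assumption: each published integer hides a true value within
    +/- half_width of it (conventional rounding to the nearest integer).
    """
    smallest = (leader_pct - half_width) - (trailer_pct + half_width)
    largest = (leader_pct + half_width) - (trailer_pct - half_width)
    return smallest, largest


# Published first-choice shares: Haley 20%, DeSantis 16%.
low, high = gap_range(20, 16)
print(f"Unrounded gap lies roughly between {low:.1f} and {high:.1f} points")
```

The reported margin of error of 3.7% falls inside that range, so the rounded figures alone cannot show on which side of it the true gap lies.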
The Journal was not the only one to err. The New York Times reported that Haley was "narrowly leading" DeSantis. In fact, The Times, always eager to go one step beyond the competition, made not one glaring error but two. It implied that Trump had a larger share of likely voters than all his rivals. "Mr. Trump’s support in the survey eclipsed that of all the other candidates combined, 48 percent to 45 percent." But 3% is well within the margin of error of 3.7%.
NBC News said Haley was "narrowly edging" (a redundancy) past DeSantis..."although the gap is within the poll's margin of error," which is as senseless as it looks. Any poll is just a sample of the target population -- in this case, likely voters in the caucuses. The margin of error indicates how far a result in the sample may plausibly stray from the true figure in the population. If Haley's edge in the sample is within the margin of error, we cannot conclude that she has an edge in the population. But the point is that we can't conclude even this. We need to look at the poll results specified to a tenth of a percent. Who knows? Maybe Haley really is leading DeSantis by more than 3.7%.
Two plus two plus the news
A little-known fact is that The Journal, The Times, and NBC News require applicants to flunk a third-grade arithmetic test before hiring them as political reporters. But why on earth did the "highly-regarded" (cough, cough) polltaker, Selzer and Company, round off the published results to the nearest integer? Were I DeSantis, I'd be talking to a lawyer. This gaffe could ruin his chances at the Vice Presidency.
Wait, there's more. Selzer writes: "Responses for all contacts were adjusted by age, sex, and congressional district to reflect their proportions among voters in the list." So, this was not even a random sample. Randomness means that you pick an observation at random from the population -- in this case, from all likely voters. It does not mean that you weed out certain observations according to an ad hoc rule. By definition, the use of a rule destroys randomness.
Advocates of shaping a sample usually protest that they just want to match the population as closely as possible. But they miss the point. The population of "all likely voters in Iowa caucuses" is not the same as the population of "all likely voters in Iowa caucuses with 50.1% males," even if that happens to be the proportion of males at this particular moment. The proportion of males in the general population changes over time and must be treated as random, not fixed.
In addition, the sample shaper has the problem of deciding which characteristics to shape. Selzer chose "age, sex, and congressional district." Why these factors and not others? Once one starts shaping a sample, one starts slipping down a rabbit hole, farther and farther from randomness.
The margin of error is usually calculated under the assumption that the sample is purely random. When it is not, how much larger should the margin of error be?
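One common rule of thumb for that question is Kish's approximate design effect, which inflates the margin of error according to how unequal the adjustment weights are. The sketch below is only an illustration. The weights and the sample size of 700 are made up (700 was picked because it gives a simple-random-sample margin close to the reported 3.7%), and the 95% confidence level and worst-case 50% share are also assumptions. Selzer's actual weights are not given in the material discussed here.

```python
import math


def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2. Equals 1 for equal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def adjusted_margin(weights, p=0.5, z=1.96):
    """Simple-random-sample margin of error, inflated by sqrt(design effect)."""
    n = len(weights)
    srs_margin = z * math.sqrt(p * (1 - p) / n)
    return srs_margin * math.sqrt(kish_design_effect(weights))


# Hypothetical weights: half the respondents counted at 0.5, half at 1.5.
hypothetical_weights = [0.5] * 350 + [1.5] * 350
equal_weights = [1.0] * 700

print(f"Design effect: {kish_design_effect(hypothetical_weights):.2f}")
print(f"Margin with equal weights:        +/- {100 * adjusted_margin(equal_weights):.1f} points")
print(f"Margin with hypothetical weights: +/- {100 * adjusted_margin(hypothetical_weights):.1f} points")
```

The more the weights vary, the larger the design effect and the wider the margin. With equal weights the adjustment disappears.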
Moreover, many results reported in news stories dealt not with the total sample but with such sub-samples as college-educated respondents. The margin of error varies inversely with the square root of the sample size: The smaller the sample, the larger the margin of error. This is because a smaller sample holds less information. Yet Selzer reported only the margin of error for the total sample (3.7%), implying that it also applied to the sub-samples. It does not.
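To see how quickly the margin grows, here is a minimal sketch using the textbook formula for a simple random sample, z * sqrt(p(1 - p) / n). The sample sizes are hypothetical, since the sub-sample sizes are not reported here. The 95% confidence level (z = 1.96) and the worst-case share p = 0.5 are also assumptions.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a share p in a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical sizes: a full sample, then sub-samples of half and a quarter of it.
for n in (700, 350, 175):
    print(f"n = {n:3d}: +/- {100 * margin_of_error(n):.1f} points")
```

Under these assumptions, a sub-sample one quarter the size of the full sample carries twice the margin of error.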
On top of that, the fact that Selzer fiddled with the total sample in terms of subpopulations, like gender and Congressional district, implies that the lack of randomness might have been severe for these sub-samples.
Unfortunately, these problems might also attach to other polls used by the media, such as the Main Street poll of Suffolk University, from which USA Today mines so many articles.
Given their careless statistical thinking, what are some of these polls really worth? Is it any wonder that many political polls predict poorly? Inquiring minds want to know.
As I write, the 538 blog reports that 34% of the expected vote in Iowa is in. DeSantis leads Haley 20% to 19%. This is essentially a tie for second place. There are good reasons for caution in interpreting the results of a poll in a close race, especially when the polling company appears as careless as Selzer. -- Leon Taylor, Anne Arundel County, Maryland tayloralmaty@gmail.com
References
Lisa Lerer and Michael Gold. Election Live Updates: Latest Iowa Caucus, Trump and 2024 News. The New York Times. January 14, 2024.
John McCormick and Eliza Collins. As Trump Dominates in Iowa, Haley Has Eye on New Hampshire. The Wall Street Journal. January 14, 2024.
Mark Murray, Alex Tabet, and Sarah Dean. Final Iowa poll: Trump maintains dominant lead before caucuses. NBC News. January 13, 2024. Updates of NBC's coverage played with the wording but still insisted that Haley led DeSantis.