
Abraham Lincoln once expressed the desire, in a time of civil war, to preserve a government that was “of the people, by the people, for the people.” What he did not say was that such government has also always been of the data, by the data, and sometimes for the data. Democratic governance has been fundamentally data-driven for a very long time. Representation in the US depends on a constitutional requirement, instituted at the founding, for an “actual enumeration” of the population every 10 years: a census designed to ensure that the people are represented accurately, in their proper places, and in proportion to their relative numbers.

A complete national count is always a monumental task, but the most recent actual enumeration faced unprecedented challenges. The 2020 census had first to overcome the Trump administration’s ill-conceived effort to add a citizenship question. Then it spent half the year in the field straining to count every person during a pandemic that made knocking on strangers’ doors particularly difficult. A series of devastating hurricanes and wildfires added to the challenge. And yet, in late April 2021, the professional staff of the US Census Bureau managed to fulfill the Constitution’s mandate and revealed state-level population totals, translating those into an apportionment of the 435 seats of the US House and a corresponding number of votes in the electoral college. (The apportionment occurred automatically according to an algorithm, called “equal proportions” or “Huntington-Hill,” that is prescribed by law.) Now, just last month, we learned that some of those numbers were, most likely, wrong.
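For readers curious how the equal proportions method turns population totals into seats, here is a minimal sketch in Python. It follows the standard description of the method: every state starts with one seat, and each remaining seat goes to the state with the highest priority value, its population divided by the square root of n(n + 1), where n is the number of seats it already holds. The function name and the three-state populations are hypothetical, chosen only for illustration; this is not the Census Bureau's own code.

```python
import heapq
from math import sqrt


def huntington_hill(populations, seats):
    """Apportion `seats` among states using the equal proportions method.

    `populations` maps a state name to its population. The name of this
    function and its inputs are illustrative assumptions.
    """
    if seats < len(populations):
        raise ValueError("need at least one seat per state")

    # Every state is guaranteed one seat before any priorities are compared.
    allocation = {state: 1 for state in populations}

    # Max-heap (via negated values) of each state's priority for its next seat.
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)

    # Hand out the remaining seats one at a time to the highest priority.
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        allocation[state] += 1
        n = allocation[state]
        next_priority = populations[state] / sqrt(n * (n + 1))
        heapq.heappush(heap, (-next_priority, state))

    return allocation


# Hypothetical populations, not real census figures.
example = huntington_hill({"A": 6000, "B": 3000, "C": 1000}, seats=10)
print(example)
```

Because seats are awarded one at a time by rank, a small change in one state's count can reorder the priorities near the bottom of the list, which is why the final seat can turn on a handful of people.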

The Census Bureau’s Post-Enumeration Survey (PES) went back out into the field, reinterviewing a sample of people from throughout the country, and then compared the new, more in-depth survey to the results of the census. Analyzing this comparison, the bureau now estimates that the 2020 census overcounted in eight states and undercounted in six. To give a sense of the scale of these errors, the PES reported with 90 percent confidence that New York’s state population was overcounted by anywhere from 400,000 to over 1 million people, or 1.89 to 4.99 percent of the population. Considering the circumstances of the count, such low error rates should be considered impressive, and yet such differences can have big consequences when the last seat in the US House has, since 1940, been decided by as few as 89 people and no more than 17,000. Much of the initial commentary on the PES results has focused on the horse-race implications of the errors, pointing out that more of the states that were overcounted were blue states, while more of those undercounted were red. The errors, apparently favoring one party over another, have even been labeled “a scandal” and the census written off as “a bust.”

These are overreactions, and yet the question remains: What should we do about these errors, which are small but both statistically and politically significant?

This is a conundrum that our nation’s leaders have wrestled with since the founding. Over the course of the last century, two distinct approaches have dominated. One depends on funneling money and energy into mobilizing more census takers and into other systemic reforms that preemptively reduce error. The other relies on statisticians, who have worked to develop techniques that can measure error precisely and then correct the census counts. Both of these approaches remain important, and yet the scale of the 2020 miscounts suggests that an older method for dealing with census error should be revived: We should expand the House and the electoral college, so that few or no states lose representation in the face of an uncertain count. We should try to count better and fix what errors we can, but our democracy will be more robust if we also lower the stakes of each census. Representation need not be a zero-sum game.

The earliest known reference to a census undercount came from Thomas Jefferson, then secretary of state, who wrote in 1791 about the prior year’s census, the nation’s first. Jefferson wrote his correspondents in Europe, assuring them that the American population was a few percentage points larger than officially declared. It’s hard to say if this was indeed the case, but the story makes clear that concerns about omissions and undercounts began more than two centuries ago. In subsequent decades, disasters and administrative failures caused serious omissions, such as when the official charged with counting Alabama’s residents died in office before completing his work on the 1820 census, or when many of California’s records (including the entirety of San Francisco County) burned after the 1850 census.