When some of the most valuable datasets in human history briefly vanished from U.S. government websites, it felt like watching the Library of Alexandria go up in smoke.
To those of us who have gone on record describing the Census Bureau’s American Community Survey as a wonder of the modern world, watching its files disappear from a federal FTP server felt like watching the Library of Alexandria go up in smoke.
Texas Christian University geographer Kyle Walker found himself at the center of the conflagration. He created and maintains two nerd-famous software packages that make it fun and easy to access ACS data and Census Bureau maps. When users couldn’t get the data upon which they had built their workflows and livelihoods, they sent up alarms.
That gave him a high-definition picture of just how much of our nation’s infrastructure balances on a few irreplaceable federal databases.
“Any disruption to the ACS — or Census data more broadly — would be massively disruptive to the U.S. economy. I hope that people understand that,” Walker told us. “These data power insights in every corner of the U.S. economy. … Even industry users who are purchasing their demographic data, that demographic data is modeled based on Census and ACS!”
We here at the Department of Data are dedicated to exploring the weird and wondrous power of the data that defines our world.
Within a matter of days, the worst-case scenario seemed to have been averted. Officials told data users the files had been taken down to comply with an executive order and would be restored after they were reviewed and approved. Soon, the entire ACS seemed to be available.
Amid the whirlwind of uncertainty, David Van Riper at IPUMS, the Minnesota heroes who collect, harmonize and distribute flagship federal datasets, told us the organization has been working with its peers to “figure out who has what so that we can patch together a backup of the federal statistical system” — especially its more obscure datasets and priceless bits of documentation.
Speaking of which, when our friend Federica Cocco, The Washington Post’s econ and business data reporter, asked whether we were backing anything up, we thought of a lesser-known Census Bureau effort: the Household Pulse.
In early 2020, as the all-too-novel coronavirus and its attendant shutdowns shredded the economy, traditional American stats couldn’t keep up. The virus’s exponential infection curve demanded reactions in the space of days, or even hours, but the best federal measures of, say, food or housing insecurity take months or years to arrive.
Our friends at the Census Bureau, who can spend years or decades refining their questions, found a higher gear. By April 2020, they had launched the Household Pulse, an online survey that provided week-by-week data on Americans’ income losses, economic struggles and precarious mental health.
Since then, the Pulse budget has been regularly renewed. It refined its methods and evolved with the news, experimenting with questions about school disruptions, vaccination plans, long covid, stimulus payments, baby formula shortages, gas price surges, natural disasters and inflation struggles.
But we knew it best as one of the few federal sources that asked about sexual orientation and gender identity — questions added early in the Biden administration. And that’s why we worried.
On Inauguration Day, President Donald Trump issued an executive order on “gender ideology,” triggering a government-wide purge of jobs, initiatives or programs featuring words such as “gender,” “female,” “transgender,” “LGBT” and “nonbinary,” according to our friend Carolyn Y. Johnson. Versions of all those words appear in the Pulse.
So when Pulse files disappeared, we feared the worst.
The Pulse isn’t the only federal survey to ask about gender identity. The first was probably the National Crime Victimization Survey. That’s understandable. As the Bureau of Justice Statistics wrote in a 2019 manual, “Research has shown that sexual orientation and gender identity are correlated with crime victimization.”
But at 5:10 p.m. on Jan. 31, orders went out to at least one regional Census Bureau office — where the survey is administered — to stop asking people their gender identity or whether they had faced prejudice or bigotry because of it, according to a recent scoop from Roger Hannigan Gilson at the (Albany) Times Union.
We’d been planning to chart the similar Pulse questions for a couple of years. But when we checked in early February, the files were already missing.
Luckily, we had most of the Pulse on our hard drive, and we found the rest on the nonprofit Internet Archive’s systematic backups of federal data. Even more encouraging, less than a week after they disappeared and we started hassling the Census Bureau about it, the Pulse files returned.
So, what does the Pulse actually say, and why do we care about it so much? At first blush, it matches other polls. Recent Pulses show about 10 percent of American adults fit under the banner of LGBTQ, including about 1 percent who are transgender, 2 percent who are nonbinary, 4 percent who are bisexual and 3 percent who are gay or lesbian.
But the Pulse’s secret weapon, the reason we’re so grateful the government produced it, lies in its detail. For the weeks the Pulse asked these questions, we have data for about 2.6 million adults, compared with a thousand or two in many top polls.
Just as importantly, public servants publish the (anonymized!) Pulse responses online, so researchers, columnists and curious nerds can slice and dice them at will. For example, we can split the data by age and gender.
When we do that, the data cleaves wide open: Fewer than half of those under age 20 who were assigned female at birth currently identify as female and straight. This compares with about 77 percent of their peers who were assigned male at birth. Meanwhile, more than 90 percent of men and women age 50 or older identify as straight and haven’t changed their gender identity.
The generation gap makes intuitive sense. While older generations faced homophobia and the AIDS crisis, young adults have lived most of their lives in a country where same-sex marriage was legal and where, since 2010, most people have believed homosexuality is morally acceptable, according to Gallup.
Discussions of LGBTQ identity tend to unify gender identity and sexual orientation, but with this data we can separate the two. When we do, we see that young women are about three times as likely to identify as bisexual as young men. They are also more likely to say that their sexual orientation is something else or that they don’t know what it is.
The pattern changes when we focus only on homosexuality. Regardless of their sex assigned at birth, college-age folks are about equally likely to describe themselves as homosexual. But starting in their mid- to late 20s, gay men outnumber lesbians by about 2 to 1, a gap that persists at various levels for the rest of their lives.
A similar effect emerges when we turn our focus to gender identity. Young people assigned female at birth are more likely to identify as transgender, while the gap closes or reverses for older Americans.
More-educated folks are more likely to identify as homosexual, especially among Gen X and boomer women. The gaps along racial lines aren’t as wide, but especially among younger generations, White people are more likely to identify as homosexual or bisexual, while their Asian friends are among the least likely.
The ACS already tracks same-sex marriages and cohabitation, of course, but the Pulse expands our knowledge by also asking about people’s orientation, whether or not they’re partnered.
The Pulse’s state data for homosexual men and women correlates strongly with census data on same-sex households. Both show that gay and lesbian Americans are most common on the West Coast, in the Southwest, in the Northeast, and in places such as Florida and Hawaii, but rarer in the Plains, the Midwest and the Deep South.
We’re neither epidemiologists nor sociologists, but it’s pretty evident that LGBTQ Americans tend to face a tougher time than their straight friends. Among American adults under age 30, straight folks were much more likely to report never facing anxiety or depression in the past week than their homosexual or bisexual friends, and almost twice as likely to say they’re never lonely and always have enough social support. Young transgender Americans typically show even wider gaps in well-being.
But 2024 could go down as the last year for which we’ll be able to measure LGBTQ Americans on that scale. The latest Pulse release, published just as we were finishing up this column, doesn’t include data on either gender identity or sexual orientation.
To be sure, the data in this release was collected by the Biden administration in December and early January, but given the executive order and issues with the crime victimization survey, it seems likely that even if the questions were asked, their results won’t be published.
And as the Pulse transitions into its new life as the Household Trends and Outlook Pulse Survey, a panel that follows a smaller group of Americans for a longer period, we’re guessing gender identity and sexual orientation won’t be part of it.
We can’t say for sure, though. The only reply we’ve had so far from the usually obliging public servants at the Census Bureau public affairs office was: “Good Evening Andrew, Here are some helpful links,” followed by a few links that had already turned purple in our browser.
If you work at the Census Bureau or any other federal data outfit, fire off an email or find us on Signal at andrewvandam.01. We’ll follow The Post’s best security practices and honor requests for confidentiality. We care deeply about the integrity of federal data and would love to learn what datasets have truly vanished, what questions are changing and what’s being axed entirely.