Monday, February 10, 2020

Census Bureau's attempt to keep data anonymous could mess with small-town counts

Small towns might end up with inaccurate population counts because of the Census Bureau's attempts to keep individual data from the decennial census private, Gus Wezerek and David Van Riper report for The New York Times. Wezerek is a writer and graphics editor for the Times' opinion section; Van Riper is a population data scientist at the University of Minnesota.

"The law requires individual census records to be kept confidential for 72 years," they explain. "Fearing that data brokers using new statistical techniques could de-anonymize the published population totals, the bureau is testing an algorithm that will scramble the final numbers. Imaginary people will be added to some locations and real people will be removed from others. The more the algorithm muddles the results, the more difficult it will be, for example, for a data scientist to combine a set of addresses and credit scores with census results to learn the age and race of people living on a certain block."

However, a test of the algorithm on 2010 census data produced "wildly inaccurate numbers" for rural areas and minority populations, Wezerek and Van Riper write. "To size up the threat of so-called re-identification attacks, the Census Bureau tried to reverse-engineer the 2010 census results. Officials were able to correctly identify just 17 percent of the original 309 million records."

Congressional apportionment will not be affected, since Census officials have exempted state population totals from the algorithm's effects, but there could be other issues: Localities shortchanged by number scrambling could have a hard time accessing their fair share of state and federal spending. "There is still time to modify the algorithm," Wezerek and Van Riper write. "The bureau has more than a year before it releases results to the states for redistricting."
