Changes to Census Bureau Data Products

IPUMS Differential Privacy Analysis

A team of IPUMS research scientists, led by David Van Riper, analyzed the Census Bureau's implementation of differential privacy and its impact on the accuracy of summary data tables. Using the differentially private 1940 data published by the Census Bureau, the team compared county and enumeration district counts against the IPUMS complete-count 1940 data. At a workshop on August 16, 2019, Van Riper presented the results of this analysis, an overview of the required policy decisions, and a step-by-step description of differential privacy implementation.

A video of the presentation and the slides are available for those who would like to learn more about differential privacy and its implications for summary file data.
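For readers who want to try a similar comparison on their own extracts, the following is a minimal sketch of how published counts might be compared against complete-count figures for the same geographic units. It is not the IPUMS team's code. The function name `compare_counts`, the dictionary layout, and the county labels and numbers are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def compare_counts(true_counts, dp_counts):
    """Compare counts for geographic units present in both inputs.

    Both arguments are dicts mapping a unit identifier (for example a
    county or enumeration district code) to a population count.
    """
    shared = sorted(set(true_counts) & set(dp_counts))
    abs_err = {k: abs(dp_counts[k] - true_counts[k]) for k in shared}
    # Relative error is undefined where the complete count is zero,
    # so those units are left out of the relative-error summary.
    rel_err = {k: abs_err[k] / true_counts[k]
               for k in shared if true_counts[k] > 0}
    return {
        "n_units": len(shared),
        "mean_abs_error": mean(abs_err.values()) if abs_err else None,
        "mean_rel_error": mean(rel_err.values()) if rel_err else None,
        "abs_error": abs_err,
    }


# Made-up numbers for illustration only; these are not 1940 census values.
true_counts = {"county_A": 1000, "county_B": 200, "county_C": 50}
dp_counts = {"county_A": 1004, "county_B": 197, "county_C": 55}

summary = compare_counts(true_counts, dp_counts)
print(summary["n_units"], summary["mean_abs_error"], summary["mean_rel_error"])
```

Reporting both absolute and relative error is a deliberate choice in this sketch: a difference of a few people is negligible for a large county but can be substantial for a small enumeration district.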

Census Bureau Announces ACS Timeline

The Census Bureau has announced that the earliest date for implementation of differential privacy for the American Community Survey (ACS) will be 2025. They also state that “the solutions will be thoroughly vetted by the scientific and user communities.” In the meantime, they will focus their efforts on using differential privacy for the 2020 decennial census.

We are relieved at this news, which follows our request, signed by 4,407 academics, planners, journalists, and researchers from the government, non-profits, and the private sector.

We will continue to monitor Census Bureau publications and presentations on this issue and we will direct IPUMS users toward any opportunities to engage with the Census Bureau in their planning and evaluation of disclosure avoidance methods.

Public Documentation and Test Data

The Census Bureau has released code and documentation from the 2018 end-to-end test. The Census Bureau is also testing their methods on 1940 census data and those data are available from IPUMS.

Census Bureau Disclosure Control

In September 2018, the Census Bureau announced a new set of methods for disclosure control in public use data products, including aggregate-level tabular data and microdata derived from the decennial census and the American Community Survey (ACS). The new approach, known as differential privacy, “marks a sea change for the way that official statistics are produced and published” (Garfinkel, Abowd, and Powazek 2018).

The Census Bureau claims that the new system will be more open and transparent to users. But the new system will come with a significant trade-off in data accuracy, making the public data useless for many applications. Indeed, in their pure form, differential privacy techniques could make the release of scientifically useful microdata impossible and severely limit the utility of tabular small-area data.
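To illustrate why small-area tabulations are especially exposed to this trade-off, here is a minimal sketch using the textbook Laplace mechanism for a single count. This is an assumption made for illustration and is not the Census Bureau's production algorithm. The function names, the privacy-loss value `epsilon=0.25`, and the two population sizes are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean
    # `scale` follows a Laplace distribution centered at zero.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    # A counting query changes by at most 1 when one person is added or
    # removed, so the textbook Laplace mechanism uses scale 1 / epsilon.
    # The result is not rounded and may be non-integer or negative.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_relative_error(true_count, epsilon, trials=10000, seed=0):
    # Average |noisy - true| / true over repeated simulated releases.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_count(true_count, epsilon, rng) - true_count) / true_count
    return total / trials


# Hypothetical settings: the same epsilon applied to a small and a large area.
small_area = mean_relative_error(true_count=20, epsilon=0.25)
large_area = mean_relative_error(true_count=20000, epsilon=0.25)
print(f"small area: {small_area:.4f}  large area: {large_area:.6f}")
```

In this sketch the expected absolute noise is the same for both areas (1 / epsilon people), so the relative error is far larger for the area with 20 residents than for the area with 20,000.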

We are following this conversation, and we will share relevant information from the Census Bureau and others as it is released.

We will continue to gather relevant information for the IPUMS user community, post it here, and share it via IPUMS Twitter.