Changes to Census Bureau Data Products
New Census Bureau Demonstration Data Products
The Census Bureau has released differentially private 2010 demonstration data that allow researchers to investigate how differential privacy may affect 2020 census data products. Users can compare data produced under the previous disclosure avoidance system (the original 2010 data products) with data prepared under differential privacy (the new demonstration data) and provide feedback to the Census Bureau. To facilitate these comparisons, the IPUMS NHGIS team has assembled a subset of the demonstration data linked to the original data.
The Committee on National Statistics has also released a Call for Input on Census Bureau 2020 Data Products. The workshop on 2020 Census Data Products: Data Needs and Privacy Considerations will be held December 11-12, 2019.
We encourage all IPUMS users, particularly those who use decennial census data products, to compare research results from the original 2010 data to those from the demonstration data in order to provide feedback to the Census Bureau and input for the workshop. If you send input by email, you can also email a copy to IPUMS at NHGISemail@example.com.
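As a starting point for such a comparison, the sketch below summarizes how far demonstration counts fall from the original counts for a set of geographic units. It is a minimal illustration, not an IPUMS or Census Bureau tool: the field names (`geoid`, `original`, `demo`) and all of the counts are hypothetical and made up for the example.

```python
from statistics import mean

# Hypothetical records, one per geographic unit. "original" stands in for the
# published 2010 count and "demo" for the demonstration (differentially
# private) count. These values are invented, not actual census figures.
records = [
    {"geoid": "A", "original": 1200, "demo": 1187},
    {"geoid": "B", "original": 45, "demo": 52},
    {"geoid": "C", "original": 0, "demo": 3},
    {"geoid": "D", "original": 310, "demo": 310},
]


def summarize_differences(rows):
    """Return simple accuracy summaries for demo counts versus original counts."""
    abs_errors = [abs(r["demo"] - r["original"]) for r in rows]
    # Relative error is undefined when the original count is zero,
    # so those units are left out of the percentage summary.
    rel_errors = [
        abs(r["demo"] - r["original"]) / r["original"]
        for r in rows
        if r["original"] > 0
    ]
    return {
        "mean_abs_error": mean(abs_errors),
        "max_abs_error": max(abs_errors),
        "mean_abs_pct_error": 100 * mean(rel_errors),
        "units_changed": sum(1 for e in abs_errors if e > 0),
    }


summary = summarize_differences(records)
print(summary)
```

Small units deserve particular attention in a comparison like this, because a difference of a few persons is a much larger share of a small count than of a large one.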
IPUMS Differential Privacy Analysis
A team of IPUMS research scientists, led by David Van Riper, analyzed the Census Bureau's implementation of differential privacy and its impact on the accuracy of summary data tables. Using the differentially private 1940 data published by the Census Bureau, the team compared county and enumeration district counts against the IPUMS complete-count 1940 data. At a workshop on August 16, 2019, Van Riper presented the results of this analysis, an overview of the required policy decisions, and a step-by-step description of the differential privacy implementation.
Changes to Census Bureau Data Products
The Census Bureau has announced that the earliest date for implementation of differential privacy for the American Community Survey (ACS) will be 2025. The Bureau also states that “the solutions will be thoroughly vetted by the scientific and user communities.” In the meantime, it will focus its efforts on using differential privacy for the 2020 decennial census.
We are relieved at this news, which follows our request, signed by 4,407 academics, planners, journalists, and researchers from government, non-profits, and the private sector.
We will continue to monitor Census Bureau publications and presentations on this issue and we will direct IPUMS users toward any opportunities to engage with the Census Bureau in their planning and evaluation of disclosure avoidance methods.
Public Documentation and Test Data
Census Bureau Disclosure Control
In September 2018, the Census Bureau announced a new set of methods for disclosure control in public use data products, including aggregate-level tabular data and microdata derived from the decennial census and the American Community Survey (ACS). The new approach, known as differential privacy, “marks a sea change for the way that official statistics are produced and published” (Garfinkel, Abowd, and Powazek 2018).
The Census Bureau claims that the new system will be more open and transparent to users. But the new system will come with a significant trade-off in data accuracy, making the public data useless for many applications. Indeed, in their pure form, differential privacy techniques could make the release of scientifically useful microdata impossible and severely limit the utility of tabular small-area data.
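The accuracy trade-off can be illustrated with the textbook Laplace mechanism, one standard way of making a single count differentially private. This is a minimal sketch under stated assumptions, not the Census Bureau's production algorithm: it assumes a counting query with sensitivity 1, and the function names and the example count of 50 are hypothetical. The privacy-loss parameter `epsilon` controls the noise, with smaller values giving stronger privacy and larger error.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws, each with mean
    # `scale`, follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    # Laplace mechanism: add noise with scale = sensitivity / epsilon.
    # sensitivity=1.0 assumes one person changes the count by at most 1.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=20000, seed=0):
    # Average absolute error over repeated noisy releases of the same count.
    rng = random.Random(seed)
    total = sum(
        abs(noisy_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    )
    return total / trials


# Hypothetical small-area count of 50 persons at three privacy-loss settings.
for eps in (0.1, 1.0, 10.0):
    print(eps, round(mean_abs_error(50, eps), 2))
```

Under these assumptions the expected absolute error equals 1/epsilon regardless of the size of the true count, which is why a fixed amount of noise weighs most heavily on small-area tabulations.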
We are following this conversation, and we will share relevant information from the Census Bureau and others as it is released.
- NEW: David Van Riper, "Differential Privacy and the Decennial Census"
- NEW: David Van Riper, Tracy Kugler, José Pacas, and Jonathan Schroeder, "Differential Privacy and the Decennial Census"
- Ruggles, Steven, Catherine Fitch, Diana Magnuson, and Jonathan Schroeder. 2019. "Differential Privacy and Census Data: Implications for Social and Economic Research." AEA Papers and Proceedings, 109: 403-08.
- Task Force on Differential Privacy for Census Data, Implications of Differential Privacy for Census Bureau Data and Research
- John M. Abowd and Ian M. Schmutte, An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
- Simson L. Garfinkel, John M. Abowd, and Sarah Powazek, Issues Encountered Deploying Differential Privacy
- John Abowd presentation to Census Scientific Advisory Committee: Disclosure Avoidance for Block Level Data and Protection of Confidentiality in Public Tabulations
We will continue to gather relevant information for the IPUMS user community, post it here, and share it via IPUMS Twitter.