CCRDC Research Projects

Research Opportunities
The Research Data Centers provide restricted access to non-public Census Bureau data in a secure laboratory. Because of disclosure risks and the costs and procedural requirements associated with operating an RDC, access is restricted to projects emphasizing model estimation and requiring use of non-public Census Bureau data. Projects that can be completed with public use data are not appropriate for the RDCs. In addition, the RDCs are not appropriate for research projects whose output consists primarily of tabulations of data.

A wide range of data collected by the Bureau of the Census are potentially available for research projects at the CCRDC. For a summary, go to the Data page.

Economic Data: Firms and Establishments
Data from many censuses and surveys of business establishments and firms are available for use in research projects. These data sets include the Economic Census data from the manufacturing, wholesale, retail, and other sectors, as well as surveys of business owners, research and development expenditures, pollution abatement expenditures, energy consumption in manufacturing, advanced technology use, and others. Microdata from these studies are almost never released as public use data files; the RDC program provides access to these data sets for researchers with approved projects.

Demographic Data: Households and Individuals
In addition to the decennial Census of Population and Housing, the Census Bureau regularly collects information through more than 30 surveys of households and individuals. Topics covered in these surveys include housing, crime victimization, schooling, employment, income, program participation, the careers of scientists and engineers, and health and medical care, among others. Most of these data sets are released as public use files, allowing a wide range of research projects. However, to protect the confidentiality of survey participants, the Census Bureau must restrict the information provided in the public use data sets. These restrictions may include reporting geographic identifiers only for relatively large areas, topcoding variables such as income, and other recoding. Through the RDC program, researchers with approved projects can gain access to "internal" versions of the data that have not been modified for public use. In many cases, the additional information in these files allows researchers to perform innovative research. The following are typical reasons that researchers want to access internal versions of Census Bureau data sets:

  • Geographic precision: Microdata made available for public use from the demographic censuses and surveys generally code geography only for areas with a minimum population of 100,000 persons. This limits the study of the effects of small-area characteristics. Researchers with approved projects at an RDC can access detailed geographic codes from the internal microdata files. An important advantage is the ability to link data on individuals to more detailed "contextual" characteristics of the areas in which they live.
  • Examination of recoded variables: In order to minimize disclosure risks, public use microdata files often contain topcoded, bracketed, or otherwise recoded variables. These recodes sometimes mask variation or make model specification difficult. Again, approved projects at an RDC can access internal microdata files that contain the variable values prior to recoding.
  • Studies of small populations: While the Public Use Microdata Samples from the decennial censuses generally provide samples large enough for most research, some populations of interest are exceptionally small. Larger samples can be drawn from the complete decennial censuses, including the 1-in-6 long-form data.
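
As a rough illustration of how topcoding can mask variation, the following minimal sketch caps a handful of invented income values at an assumed threshold and compares the spread before and after. The values, the threshold, and the helper name `topcode` are hypothetical and are not taken from any Census Bureau file or procedure.

```python
from statistics import pstdev


def topcode(values, cap):
    """Replace every value above `cap` with `cap` itself."""
    return [min(v, cap) for v in values]


# Invented incomes for illustration only; not real survey data.
incomes = [30_000, 45_000, 60_000, 150_000, 400_000, 900_000]
CAP = 100_000  # assumed topcode threshold, not an actual Census Bureau value

public_use = topcode(incomes, CAP)

# After topcoding, the three highest incomes are indistinguishable,
# so the measured spread of the variable shrinks.
print(public_use)
print(pstdev(public_use) < pstdev(incomes))
```

In this toy case the upper tail collapses to a single value, which is the kind of lost variation that can complicate model specification when only the recoded variable is available.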

Note: Use of these data files may result in significant disclosure risks. This is especially true for studies of small populations (even with the increased sample sizes that may be available), and even more so if the project studies small populations classified by geography and by population characteristics such as age, race, or sex. Moreover, the addition of contextual data may also increase disclosure risks. Researchers should keep these risks in mind when writing their proposals. To reduce the disclosure risks, proposed research projects should emphasize models, not tabulations.

Combining Economic and Demographic Data
The Census Bureau is developing data sets that combine data from the Bureau's Economic and Demographic programs. The first data set of this kind is the Worker-Establishment Characteristics Database, which provides 1990 Decennial Census data on manufacturing workers together with Longitudinal Research Database (LRD) data on the establishments at which they work. In addition, projects at the RDCs have combined economic and demographic data, or matched demographic data from different surveys and censuses, based on geographic identifiers.

Combining Census Bureau Data with Non-Census Bureau Data
Researchers with outside data such as administrative records may seek to enrich the information available to them by linking their data with Census Bureau data files. The CCRDC supports this kind of data development and innovation. However, such projects are subject to additional scrutiny, and the review process is likely to require more time, because it is necessary to carefully assess possible disclosure risks, to obtain any permissions required to use the outside data and link the data sets, and to assess the costs and feasibility of data set construction.
