Research Opportunities
The Research Data Centers provide restricted access to non-public
Census Bureau data in a secure laboratory. Because of disclosure risks
and the costs and procedural requirements associated with operating
an RDC, access is restricted to projects emphasizing model estimation
and requiring use of non-public Census Bureau data. Projects that
can be completed with public use data are not appropriate for the
RDCs. In addition, the RDCs are not appropriate for research projects
whose output consists primarily of tabulations of data.
A wide range of data collected by the Bureau of the
Census are potentially available for research projects at the CCRDC.
For a summary, go to the Data page.
Economic Data: Firms and Establishments
Data from many censuses and surveys of business establishments and
firms are available for use in research projects. These data sets
include the Economic Census data from the manufacturing, wholesale,
retail, and other sectors, as well as surveys of business owners,
research and development expenditures, pollution abatement expenditures,
energy consumption in manufacturing, advanced technology use, and
others. Microdata from these studies are almost never released as
public use data files; the RDC program provides access to these data
sets for researchers with approved projects.
Demographic Data: Households and Individuals
In addition to the decennial Census of Population and Housing, the
Census Bureau regularly collects information through more than 30
surveys of households and individuals. Topics covered in these surveys
include housing, crime victimization, schooling, employment, income,
program participation, the careers of scientists and engineers, and health
and medical care, among others. Most of these data sets are released
as public use files, allowing a wide range of research projects. However,
to protect the confidentiality of survey participants, the Census
Bureau must restrict the information provided in the public use data
sets. These restrictions may include limiting geographic identifiers
to relatively large areas, topcoding variables such as income, and
other recoding. Through the RDC program, researchers with approved
projects can gain access to "internal" versions of the data that have
not been modified for public use. In many cases, the additional information
in these files allows researchers to perform innovative research.
The following are typical reasons that researchers want to access
internal versions of Census Bureau data sets:
- Geographic precision: Microdata made available for public
use from the demographic censuses and surveys generally code geography
for areas with a minimum population of 100,000 persons. This limits
study of the effects of small area characteristics. Researchers
with approved projects at an RDC can access detailed geographic
codes from the internal microdata files. An important advantage
is the ability to link data on individuals to more detailed "contextual"
characteristics of the areas in which they live.
- Examination of recoded variables: To minimize disclosure
risks, public use microdata files often contain topcoded, bracketed,
or otherwise recoded variables. These recodes sometimes mask variation
or make model specification difficult. Again, researchers with approved
projects at an RDC can access internal microdata files that contain the
variable values prior to recoding.
- Studies of small populations: While the Public Use Microdata
Samples from the decennial censuses generally provide samples large
enough for most research, some populations of interest are exceptionally
small. Larger samples can be drawn from the complete decennial
censuses, including the 1-in-6 long-form data.
Note: Use of these data files may result in significant
disclosure risks. This is especially true for studies of small populations
(even with the increased sample sizes that may be available), and
even more so if the project classifies small populations by geography
and by population characteristics such as age, race, or sex. Moreover,
the addition of contextual data also may increase disclosure risks.
Researchers should keep these risks in mind in writing their proposals.
To reduce the disclosure risks, proposed research projects should
emphasize models, not tabulations.
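As a rough illustration of the recoding issue described above, the following Python sketch shows how topcoding hides variation among high values. The income figures, the 100,000 cap, and the topcode function are all hypothetical and chosen only for illustration; they do not reflect actual Census Bureau thresholds or procedures.

```python
from statistics import mean


def topcode(values, cap):
    """Replace every value above `cap` with `cap` itself."""
    return [min(v, cap) for v in values]


# Hypothetical incomes, invented for illustration only.
incomes = [18_000, 42_000, 67_000, 150_000, 480_000]

# Assumed topcode threshold; not an actual Census Bureau value.
CAP = 100_000

public = topcode(incomes, CAP)

print("internal values:", incomes)
print("topcoded values:", public)
print("internal mean:", mean(incomes))
print("topcoded mean:", mean(public))
```

In this invented example, the two highest incomes become indistinguishable and the mean of the topcoded values falls below the mean of the original values, which is the kind of masked variation that can complicate model specification.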
Combining Economic and Demographic Data
The Census Bureau is developing data sets combining data from
the Bureau's Economic and Demographic programs. The first data set
of this kind is the Worker-Establishment Characteristics Database,
which provides 1990 Decennial Census data on manufacturing workers
together with Longitudinal Research Database (LRD) data on the establishments at which they work. In
addition, projects at the RDCs have combined economic and demographic
data or matched demographic data from different surveys and censuses
based on geographic identifiers.
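A minimal sketch of matching on a geographic identifier, in Python, may help make the idea concrete. Everything here is hypothetical: the field names (person_id, area, unemployment_rate), the area codes, and the attach_context helper are invented for illustration and do not correspond to any actual Census Bureau file layout.

```python
# Hypothetical person-level records carrying an assumed area code.
persons = [
    {"person_id": 1, "area": "A01", "income": 31_000},
    {"person_id": 2, "area": "A02", "income": 58_000},
    {"person_id": 3, "area": "A01", "income": 44_000},
]

# Hypothetical "contextual" characteristics keyed by the same area code.
area_context = {
    "A01": {"unemployment_rate": 0.07},
    "A02": {"unemployment_rate": 0.04},
}


def attach_context(records, context, key="area"):
    """Return copies of `records` with the matching area's fields added.

    Records whose area code has no entry in `context` are returned unchanged.
    """
    linked = []
    for rec in records:
        merged = dict(rec)  # copy so the input records are not modified
        merged.update(context.get(rec[key], {}))
        linked.append(merged)
    return linked


linked = attach_context(persons, area_context)
for row in linked:
    print(row)
```

The finer the geographic code used as the key, the more detailed the contextual information that can be attached, and the greater the disclosure risk that must be assessed.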
Combining Census Bureau Data with Non-Census Bureau Data
Researchers with outside data such as administrative records may seek
to enrich the information available to them by linking their data
with Census Bureau data files. The CCRDC supports this kind of data
development and innovation. However, such projects are subject to
additional scrutiny, and the review process is likely to take more
time, because it is necessary to carefully assess possible disclosure
risks, to obtain any permissions required to use the outside data
and link the data sets, and to assess the costs and feasibility of
data set construction.