Research Opportunities
The Research Data Centers provide restricted access
to non-public Census Bureau data in a secure
laboratory. Because of disclosure risks and the costs
and procedural requirements associated with operating an
RDC, access is restricted to projects emphasizing model
estimation and requiring use of non-public Census Bureau
data. Projects that can be completed with public use
data are not appropriate for the RDCs. In addition, the
RDCs are not appropriate for research projects whose
output consists primarily of tabulations of data.
A wide range of data collected by the Bureau of the
Census are potentially available for research projects
at the CCRDC. For a list of datasets currently
available for use at the RDCs, please see the
CES Data page.
Economic Data: Firms and Establishments
Data from many censuses and surveys of business
establishments and firms are available for use in
research projects. Microdata from these studies are
almost never released as public use data files; the RDC
program provides access to these data sets for
researchers with approved projects.
Demographic Data: Households and Individuals
In addition to the decennial Census of Population and
Housing, the Census Bureau regularly collects
information through a number of surveys of households
and individuals. Most of these datasets are released as
public use files, but the versions in the RDC are NOT
the public-use versions. These internal RDC versions
include more complete geography (in many cases down to
the block). Also, items such as income are not topcoded.
PLEASE NOTE: individual identifiers such as name,
address, and Social Security number are NOT included.
In many cases, the additional information in these files
allows researchers to perform innovative research. The
following are typical reasons that researchers want to
access internal versions of Census Bureau data sets:
- Geographic precision
Microdata made available for public use from the
demographic censuses and surveys generally code
geography for areas with a minimum population of
100,000 persons. This limits study of the effects of
small-area characteristics. Researchers with
approved projects at an RDC can access detailed
geographic codes from the internal microdata files.
An important advantage is the ability to link data
on individuals to more detailed "contextual"
characteristics of the areas in which they live.
- Examination of recoded variables
In order to minimize disclosure risks,
public use microdata files often contain topcoded,
bracketed or otherwise recoded variables. These
recodes sometimes mask variation or make model
specification difficult. Again, researchers with approved
projects at an RDC can access internal microdata files
that contain the variable values prior to recoding.
- Studies of small populations
While the Public Use Microdata
Samples from the decennial censuses generally
provide samples large enough for most research, some
populations of interest are exceptionally small.
Larger samples can be drawn from the complete
decennial census files, including the 1-in-6 long-form
data.
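To illustrate the point about recoded variables in the list above, the following is a minimal sketch of how topcoding can mask variation. The income values and the 100,000 threshold are made up for illustration; they are not Census Bureau data or actual Census Bureau topcode rules.

```python
from statistics import pstdev

# Hypothetical incomes (illustrative values only, not Census Bureau data).
incomes = [30_000, 55_000, 80_000, 150_000, 400_000]

# Assumed topcode threshold, chosen only for this illustration.
TOPCODE = 100_000


def topcode(values, cap):
    """Replace every value above `cap` with `cap` itself."""
    return [min(v, cap) for v in values]


# What a public-use style file would show after topcoding.
public_use = topcode(incomes, TOPCODE)

print("internal  :", incomes)
print("public use:", public_use)

# The two highest incomes become indistinguishable, so the measured
# spread of the recoded variable is smaller than that of the original.
print("spread reduced:", pstdev(public_use) < pstdev(incomes))
```

In this toy case the two highest earners are recorded identically after topcoding, which is the kind of lost variation that can complicate model specification.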
Note: Use of these data files may result in
significant disclosure risks. This is especially true
for studies of small populations (even with the
increased sample sizes that may be available), and even
more so if the project studies small populations classified
by geography and by population characteristics such as
age, race, or sex. The addition of contextual
data may also increase disclosure risks. Researchers
should keep these risks in mind in writing their
proposals. To reduce the disclosure risks, proposed
research projects should emphasize models, not
tabulations.
Combining Economic and Demographic Data
Projects at the RDCs have combined economic and
demographic data or matched demographic data from
different surveys and censuses based on geographic
identifiers.
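As a rough sketch of what matching on a geographic identifier can look like, the example below attaches area-level characteristics to person records that share an area code. All record layouts, field names (`tract`, `establishments`), and values are hypothetical and are not drawn from any Census Bureau file.

```python
# Hypothetical person-level records (made-up fields and values).
persons = [
    {"person_id": 1, "tract": "A", "income": 42_000},
    {"person_id": 2, "tract": "B", "income": 58_000},
    {"person_id": 3, "tract": "A", "income": 37_000},
]

# Hypothetical area-level "contextual" data keyed by the same identifier.
area_context = {
    "A": {"establishments": 12},
    "B": {"establishments": 40},
}


def attach_context(people, context, key="tract"):
    """Return copies of the person records with area-level fields merged in.

    Records whose area code has no match are returned unchanged.
    """
    linked = []
    for person in people:
        extra = context.get(person[key], {})
        linked.append({**person, **extra})
    return linked


linked = attach_context(persons, area_context)
for row in linked:
    print(row)
```

Each person record now carries the characteristics of the area in which that person lives, which a model can then use as contextual covariates.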
Combining Census Bureau Data with Non-Census Bureau Data
Researchers with outside data such as administrative
records may seek to enrich the information available to
them by linking their data with Census Bureau data
files. The CCRDC supports this kind of data development
and innovation. However, such projects are subject to
additional scrutiny, and the review process takes
significantly longer because it is necessary to
assess possible disclosure risks carefully, to obtain
any permissions required to use the outside data and
link the data sets, and to assess the costs and
feasibility of constructing the data set. Projects
requesting outside administrative records can expect to
pay extra fees to reimburse the Center for Economic
Studies and the Census Bureau for the extra work
required for such projects.
During 2007, the Census Bureau reached agreements
with several federal agencies to make restricted data
from those agencies available to qualified researchers
through the Census Research Data Center (RDC) network.
In February, CES began accepting proposals to
use many data sets from the National Center for Health
Statistics (NCHS). NCHS will handle all proposal review
and also disclosure avoidance review for their data, and
will charge a small fee for creating the researcher's
data extract. Details are available at
CES.
In July, CES began accepting proposals to use
the restricted Medical Expenditure Panel Survey (MEPS)
data from the Agency for Healthcare Research and Quality
(AHRQ). AHRQ will handle all proposal review and also
disclosure avoidance review for their data; they have
agreed to waive their fee for creating the researcher's
data extract for RDC researchers. Details are available
at CES.
In all cases, researchers will still need to obtain
Special Sworn Status in order to use these data at one of
the Census RDCs.