I. Sample Design
The sample design of the survey has changed
over time, but it has always been representative of the U.S. general population
(since 1991, the civilian noninstitutional population) age 12 and older
and has always oversampled youth and young adults. The 1998 NHSDA employed
a multistage area probability sample of 25,500 persons. The first stage
of selection is a sample of 137 Primary Sampling Units (PSUs), each consisting
of counties (administrative subdivisions of States) or groups of counties
such as metropolitan areas. Within these PSUs, segments (such as city blocks
or enumeration districts) are selected. In 1998, 2,670 segments were selected,
and in each of these segments a listing of all addresses was made, from
which a sample of 94,723 addresses was selected. Of these, 80,866 were
determined to be eligible sample units. In these sample units (which can
be either households or units within group quarters), sample persons were
randomly selected (with unequal probabilities) using a screening procedure
carried out by interviewers.
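The counts reported above imply a few ratios that help in reading the design. The following is a minimal sketch that simply divides the published 1998 figures; the variable names are illustrative, and the per-PSU and per-segment values are averages only, since the actual allocation across PSUs and segments is not given here.

```python
# Counts reported for the 1998 NHSDA sample design.
PSUS = 137
SEGMENTS = 2_670
ADDRESSES_SAMPLED = 94_723
ELIGIBLE_UNITS = 80_866

# Share of sampled addresses found to be eligible sample units.
eligibility_rate = ELIGIBLE_UNITS / ADDRESSES_SAMPLED

# Simple averages; the real distribution across PSUs/segments is not stated.
segments_per_psu = SEGMENTS / PSUS
addresses_per_segment = ADDRESSES_SAMPLED / SEGMENTS

print(f"Eligible share of sampled addresses: {eligibility_rate:.1%}")
print(f"Average segments per PSU: {segments_per_psu:.1f}")
print(f"Average sampled addresses per segment: {addresses_per_segment:.1f}")
```

On these figures, roughly 85 percent of sampled addresses were eligible sample units.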
The 1998 NHSDA sampled segments were allocated
equally into four separate samples, one for each three-month period of
the year, so that the survey is essentially continuous in the field. By
assigning the appropriate selection probabilities at the PSU, segment,
and person levels, oversampling of certain subpopulations of interest was
accomplished. In 1998, these subpopulations included younger individuals
(age 12-34), blacks, Hispanics, and residents of Arizona and California.
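The allocation of segments to quarterly samples can be sketched as a random, balanced assignment. This is an illustration under stated assumptions, not the procedure actually used: the function name `allocate_to_quarters`, the integer segment identifiers, and the shuffle-then-deal approach are all hypothetical. Because 2,670 is not a multiple of four, the quarterly counts in this sketch differ by one; how the survey handled this is not stated here.

```python
import random

def allocate_to_quarters(segment_ids, seed=0):
    """Randomly deal segments into four quarterly samples of near-equal size.

    Hypothetical sketch: shuffle the segments, then assign them to
    quarters 1-4 in rotation so the sizes differ by at most one.
    """
    rng = random.Random(seed)
    shuffled = list(segment_ids)
    rng.shuffle(shuffled)
    quarters = {q: [] for q in range(1, 5)}
    for i, segment in enumerate(shuffled):
        quarters[i % 4 + 1].append(segment)
    return quarters

# 2,670 segments were selected in 1998; the identifiers here are made up.
quarters = allocate_to_quarters(range(2670))
for quarter, segments in quarters.items():
    print(f"Quarter {quarter}: {len(segments)} segments")
```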
II. Data Collection Methodology
The NHSDA collects data through in-person
interviews with sample persons, using procedures designed to
maximize respondents' cooperation and
willingness to report honestly about their illicit drug use. Introductory
letters are sent to sampled addresses, followed by an interviewer visit.
A five-minute screening procedure involves listing all household members
along with their basic demographic data and selecting 0-2 sample person(s),
depending on the composition of the household. This selection process is
designed to provide the necessary sample sizes for specified population
groups.
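A screening step of this kind can be sketched as follows. This is not the NHSDA selection algorithm, which is not described here: the age groups, the selection probabilities in `SELECTION_PROB`, and the rule of independent draws capped at two persons are all assumptions made for illustration. Note that the cap lowers the effective selection probability in large households, which a production design would account for in the weights.

```python
import random

# Hypothetical selection probabilities by age group (not NHSDA values).
SELECTION_PROB = {"12-17": 0.9, "18-25": 0.6, "26-34": 0.4, "35+": 0.2}
MAX_SELECTED = 2  # the screening selects 0-2 sample persons per household

def age_group(age):
    """Map an age to a hypothetical sampling group; under 12 is ineligible."""
    if age < 12:
        return None
    if age <= 17:
        return "12-17"
    if age <= 25:
        return "18-25"
    if age <= 34:
        return "26-34"
    return "35+"

def screen_household(roster, rng):
    """Select 0-2 persons from a roster of (person_id, age) pairs.

    Each eligible member is drawn independently with a probability that
    depends on age group; if more than two are drawn, two are kept at random.
    """
    drawn = []
    for person_id, age in roster:
        group = age_group(age)
        if group is not None and rng.random() < SELECTION_PROB[group]:
            drawn.append(person_id)
    if len(drawn) > MAX_SELECTED:
        drawn = rng.sample(drawn, MAX_SELECTED)
    return drawn

rng = random.Random(1998)
roster = [("A", 44), ("B", 41), ("C", 16), ("D", 9)]
print(screen_household(roster, rng))
```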
Interviewers attempt to conduct interviews
in a private place, away from other household members. The interview averages
about an hour, and includes a combination of interviewer-administered and
self-administered questions. For the self-administered portion, the answers to sensitive
questions (such as those on illicit drug use) are recorded by the respondent
on answer sheets and are not seen or reviewed by the interviewer. After these answer sheets
are completed, the respondent places them in an envelope, which
is sealed and mailed to the contractor, Research Triangle Institute, with
no personal identifying information attached.
III. Data Processing
Upon receipt, questionnaires are checked
for critical identification and demographic data, then keyed to disk. This
creates a file consisting of one record for each completed interview. Machine
editing routines, called logical imputation, perform extensive within-record
consistency checks and resolve most inconsistencies and missing data.
For some key variables that still have missing values after
the application of logical imputation, statistical imputation is used to
replace the missing data with appropriate valid response codes. Two types
of statistical imputation procedures are used. Hot-deck imputation involves
the replacement of a missing value with a valid code taken from another
respondent who is "similar" and has complete data. Logistic regression
models are also used to determine replacement values for some variables.
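The hot-deck idea can be illustrated with a short sketch. It assumes that "similar" means sharing the same values on a set of classing variables and that a donor is chosen at random within that class; the function name `hot_deck_impute`, the variable names, and the donor rule are hypothetical and are not taken from the NHSDA processing system.

```python
import random

def hot_deck_impute(records, target, class_vars, seed=0):
    """Fill missing `target` values from a random donor in the same class.

    records: list of dicts; a missing value is represented by None.
    class_vars: variables that define which respondents count as "similar".
    Records with no available donor in their class are left missing.
    """
    rng = random.Random(seed)

    # Collect donor values (complete data) by imputation class.
    donors = {}
    for rec in records:
        if rec.get(target) is not None:
            key = tuple(rec[v] for v in class_vars)
            donors.setdefault(key, []).append(rec[target])

    # Copy each record, replacing a missing value with a donor's value.
    imputed = []
    for rec in records:
        new_rec = dict(rec)
        if new_rec.get(target) is None:
            pool = donors.get(tuple(new_rec[v] for v in class_vars))
            if pool:
                new_rec[target] = rng.choice(pool)
        imputed.append(new_rec)
    return imputed

# Made-up example records; "used_past_year" is a hypothetical variable.
sample = [
    {"age_group": "12-17", "gender": "F", "used_past_year": 0},
    {"age_group": "12-17", "gender": "F", "used_past_year": None},
    {"age_group": "26-34", "gender": "M", "used_past_year": 1},
    {"age_group": "35+", "gender": "M", "used_past_year": None},
]
result = hot_deck_impute(sample, "used_past_year", ["age_group", "gender"])
```

In this example the second record receives the value of the first, its only donor, while the fourth record stays missing because its class has no complete respondent.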
Each record (i.e., respondent) is assigned
an analysis weight which incorporates:
a. The inverse of the selection probability
for the respondent. This is the product of the inverses of selection probabilities
at each stage of sampling.
b. Adjustments for household and person-level
nonresponse.
c. Poststratification adjustment to Census
projections (of the civilian noninstitutionalized population of the total
U.S.) for the midpoint of each NHSDA data collection period. Adjustments
are made to age, gender, and race/ethnicity distributions (see Appendix
2 for a discussion of the poststratification adjustment).
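The three components above combine multiplicatively. The sketch below assumes the nonresponse and poststratification adjustments are each expressed as a single factor per respondent, and that a poststratification factor is the ratio of a control total to the sum of weights in a cell; the function names and all numeric values are hypothetical, and the actual adjustment procedures are those described in Appendix 2.

```python
def base_weight(stage_probs):
    """Product of the inverses of the selection probabilities at each stage."""
    weight = 1.0
    for p in stage_probs:
        if not 0 < p <= 1:
            raise ValueError("selection probabilities must be in (0, 1]")
        weight /= p
    return weight

def poststratification_factors(weights_by_cell, control_totals):
    """Ratio of each cell's control total to the sum of its current weights."""
    return {
        cell: control_totals[cell] / sum(weights)
        for cell, weights in weights_by_cell.items()
    }

def analysis_weight(stage_probs, nonresponse_adj, poststrat_adj):
    """Base weight times the nonresponse and poststratification factors."""
    return base_weight(stage_probs) * nonresponse_adj * poststrat_adj

# Hypothetical PSU, segment, and person-level selection probabilities.
probs = [0.5, 0.25, 0.125]
factors = poststratification_factors(
    {"12-17/F/Hispanic": [100.0, 150.0]},  # nonresponse-adjusted weights
    {"12-17/F/Hispanic": 500.0},           # hypothetical control total
)
weight = analysis_weight(probs, 1.25, factors["12-17/F/Hispanic"])
print(weight)
```

A respondent from an oversampled group has larger selection probabilities and therefore a smaller base weight, which is how the weights offset the oversampling in population estimates.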
Data are generally released to the public
about six months after the end of data collection. Public use data files
are available 1-2 years after completion of data collection.