Frederick F. Stephan
Sampling¹
. . . . . . . . . . . . . .
All sampling problems stem from the limitations that are imposed on
observation. If one could observe directly all that one needs to know,
there would be no occasion to make inferences about what has not been
observed or to generalize one’s knowledge. Science would merely be a
systematic record of data, condensed, perhaps, by some convenient shorthand
but never stretched over any void or extended into areas of ignorance. Of
course, all of the future and most of the past and present lie beyond the
reach of direct observation. Generalization from limited observations is
the rule, not the exception. Observation of a subject in a laboratory
experiment and clinical examination of an individual are sampling
operations in as fundamental a sense as are attitude testing or opinion
surveying. When one studies a "unique case" or an organic system of
interaction, the totality is perceived by the observer in terms of a set of
partial and particular observations that fall short of complete knowledge
and must be extended by processes of inference if they are to be more than
historical facts.…
In studying a process of communication or some other complex system of
human behavior and interaction, one is compelled to distribute one’s
efforts either evenly but thinly over the whole or more intensively over
certain parts to the neglect of others. The most effective distribution
will be different from one situation to another. The interrelationships
that bind the parts into a system may likewise be observed through a
selected set of parts or more superficially over the whole system. An
anthropologist studies a culture, a language, or an institution through
selected informants as well as by direct observation of some instances of
behavior. A social psychologist can interview individuals or observe groups
or do both. In any case, there is a sampling of individual and group
expressions for the purpose of inferring the structure and functioning of
the system.
. . . . . . . . . . . . . .
There is no "best" method of sampling that can be followed blindly in
all instances. The most effective sampling methods are those that are
designed specifically to fit the situation in which they are to be used.
They are based on the general theory of sampling derived from mathematical
statistics and economic theory, and they take advantage of what is known in
advance about the population, system of interaction, or process that is to
be sampled. They are designed to achieve the specific purposes of the study
as effectively as is possible under the limitations set by the funds,
personnel, time, and other resources that are available. In a word, they are
tailored to fit the circumstances. In this respect they are like a
manufacturing process that is engineered in terms of specifications set
by the consumer and the technical equipment and cost relations of the
plant. For simple jobs no elaborate planning is necessary; for some
products thousands of dollars of preparatory work will be devoted to
setting up the operation in order to produce the best results, taking into
account many possibilities and considerations.
TWO RULES FOR THE AMATEUR SAMPLER
It would be a mistake to confine the discussion of sampling to the
larger surveys, especially those nation-wide polls and canvasses that
have done so much
to stir up public interest in sampling and in which a careful job of sample
design is feasible. For each such survey there may be scores of studies to
be conducted on a smaller scale, geographically and financially. They may
be more intensive and complicated in the variables they measure and the
relationships they analyze. In the aggregate, and in some individual
instances, they may turn out to be more important for research than some of
the large-scale studies. So one may well ask, "Isn’t there some simple
dependable rule I can use to solve my sampling problem? I’m studying a
limited group, not seeking a national average." Such a question deserves
careful examination and a serious answer.
If it is put just this way, without any information about the population
to be sampled and the resources that can be devoted to sampling, then one
must necessarily sacrifice the advantages that come from fitting the
sampling rule to the particular case. Hence, the answer could be, "Yes. You
can have two choices."
Rule A is: Follow your common sense and select people by what seems to
you to be a good way to get a group similar to the entire group in which
you are interested. Test it with every convenient set of data you can
obtain. Then be prudent in handling the results of your study, for you may
be far off the beam without knowing it. As the airplane pilots say, you are
"flying by the seat of your pants." With good luck you may not be too far
off too often. If other phases of the study are correspondingly uncertain
and not under full control, you are probably not adding a great deal to the
total risks you are taking. If you are badly wrong, perhaps later
experience, bitter though it may be, will set you right. If you are right,
you have saved yourself a bit of trouble. This is the way you handle many
other kinds of problems that you encounter, so perhaps you might as well do
your sampling by the same rule-of-thumb method.
Rule B is for those who do not like to gamble or live dangerously. Such
people may be less likely to make occasional brilliant discoveries or
produce a flood of studies, but, like the tortoise, they may pass the hare
before the race is finished. Rule B counsels: (1) Pin down very
specifically the definition of the population you are studying and the
variables you wish to measure. (This may be no more than the arduous task
of making up your mind about just what you will attempt.) (2) Obtain a list
of all the persons who make up that group or population or, lacking a
list, divide the population into many small parts according to residence,
place of work, or other suitable factors. Do this, however, in a way that
tends to make each part a mixture of different kinds of people rather than
a cluster of persons who are similar with regard to the variables you are
studying. (This is where you may have to go contrary to your intuition.)
(3) Select by some strictly random procedure enough of these parts to give
you the number of persons you think you need after allowing for the loss of
those you will not be able to study successfully. You may even make up sets
of persons or parts, well balanced on the variables you know about
beforehand, until you have every person in the population in the same
number of sets. Then select strictly at random one such set as a sample.
(4) Proceed to study the sample but keep a complete record of persons you
miss and what information you can find out about them. Analyze this
information and the variation shown within the sample by those you do
get. From this arrive at the best estimate you can of the accuracy of the
sample in representing the population on each variable. Do not lean too
heavily on statistical theory unless you have followed very meticulously
the procedure it assumes has been followed. (5) Use every opportunity
to check the sample with other dependable data. Possibly add
certain questions to the study for this purpose. (6) Record and report
your methods and deviations adequately so that others can repeat them or
form a somewhat independent appraisal of the sample.
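Steps (2) and (3) of Rule B can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed procedure: the function name `select_parts`, the dictionary that maps each part to its list of persons, and the `expected_loss` fraction are all hypothetical, and the rule itself names no particular mechanism of random selection. The sketch draws whole parts with equal chances, after enlarging the number of persons sought to allow for those who will be lost.

```python
import math
import random


def select_parts(parts, needed, expected_loss, seed=None):
    """Select whole parts of a population strictly at random.

    parts         -- dict mapping a part label to the list of persons in it
                     (hypothetical layout chosen for this sketch)
    needed        -- number of persons wanted after losses
    expected_loss -- assumed fraction of selected persons who will not be
                     studied successfully (a guess made in advance)
    """
    if not 0 <= expected_loss < 1:
        raise ValueError("expected_loss must be at least 0 and less than 1")
    if not parts:
        raise ValueError("parts must not be empty")

    # Enlarge the target to allow for persons who will be missed.
    target = math.ceil(needed / (1 - expected_loss))

    # Number of parts to draw, judged from the average size of a part.
    mean_size = sum(len(persons) for persons in parts.values()) / len(parts)
    k = min(len(parts), math.ceil(target / mean_size))

    # Every part has the same chance of selection, so every person does too.
    rng = random.Random(seed)
    chosen = rng.sample(sorted(parts), k)
    return chosen, target


# Usage: twenty blocks of ten persons each, 75 persons wanted, and one
# person in four expected to be lost.
blocks = {f"block-{i:02d}": [f"person-{i}-{j}" for j in range(10)]
          for i in range(20)}
chosen_blocks, persons_sought = select_parts(blocks, 75, 0.25, seed=1)
```

Because the parts are drawn in a fixed number rather than until a quota is filled, the count of persons actually obtained will vary with the sizes of the parts that happen to be chosen; recording the seed lets others repeat the draw, in the spirit of step (6).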
So much for Rules A and B. Sometimes they will work quite well. At other
times the results will not be satisfactory unless greater care is taken
with the sampling. This can be done in several ways. First, one can learn
something more about sampling from the reports of previous surveys, though
most reports offer little in the way of useful tests of sampling methods,
and from technical publications in statistics and related fields. Second,
one can get help from advisers who have developed expertness in research
methods. Finally, one may seek to develop new methods appropriate for his
own situation by his own ingenuity and experimentation. How much time and
effort one gives to sampling should be determined by the general strategy
of his research plans, seeking the most effective use of his resources and
the greatest yield of results. In many small or casual studies very little
benefit may result from improvement of the sampling procedure, but in
larger and more important studies, there are great opportunities to
attain a degree of accuracy with a well-designed and well-executed sample
that cannot be attained by cruder sampling, no matter how large the
cruder sample may be. Hence, these amateur rules should only be followed
when the circumstances do not warrant a more technical treatment of the
sampling problems.
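The point that size cannot make up for crude selection can be illustrated with a small artificial example in Python. Every figure here is invented for the illustration: a population of 10,000 persons, half of them easy to reach and half hard to reach, with the opinion in question held by 30 per cent of the first group and 60 per cent of the second. The crude sample takes all 5,000 of the easy-to-reach persons; the designed sample takes only 400 persons, strictly at random from the whole population.

```python
import random


def build_population():
    """Return two invented groups of opinions coded 1 (yes) or 0 (no)."""
    easy_to_reach = [1] * 1500 + [0] * 3500   # 30 per cent yes
    hard_to_reach = [1] * 3000 + [0] * 2000   # 60 per cent yes
    return easy_to_reach, hard_to_reach


def proportion(values):
    """Proportion of 1s in a list of 0/1 values."""
    return sum(values) / len(values)


def compare(seed=0, random_n=400):
    """Compare a very large crude sample with a small random one."""
    easy, hard = build_population()
    everyone = easy + hard
    truth = proportion(everyone)

    # Crude sample: every easy-to-reach person, 5,000 in all.
    crude_estimate = proportion(easy)

    # Designed sample: random_n persons drawn strictly at random.
    rng = random.Random(seed)
    designed_estimate = proportion(rng.sample(everyone, random_n))
    return truth, crude_estimate, designed_estimate


truth, crude_estimate, designed_estimate = compare()
print(f"true proportion      : {truth:.3f}")
print(f"crude sample (5000)  : {crude_estimate:.3f}")
print(f"random sample (400)  : {designed_estimate:.3f}")
```

The crude estimate is fixed at the proportion found in the easy-to-reach group, and enlarging that sample cannot move it toward the true figure. The random sample, though less than a tenth the size, is subject only to chance variation that shrinks as the sample grows.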
TECHNICAL PROBLEMS
The technical problems of a major sampling operation are too complex for
detailed discussion in this paper. It is important, however, that their general
nature be widely known among those who produce or receive the results of
research. Hence, a brief and simplified description of the principal kinds
of technical problems is in order.
The first problem is that of the initial specifications. It is
necessary to formulate rather specifically the purposes that are to be
served by the sampling operation: the population of persons or other units
of observation that are to be sampled; the variables that are to be
observed; the accuracy of measurement that can be attained; the degree of
accuracy required in the results; the resources that are available; and the
other principal factors that will restrict the procedure and determine its
suitability. If these conditions are vague, then there will be little basis
for designing, or even for choosing, a particular sampling plan as better
than other possible plans. If they are quite definite, they may establish
automatically the general procedure or over-all plan for the sampling
operation, leaving only the details to be determined.
A second problem is that of design. Within the limits set by the
initial specifications one must shape up a general plan or a series of
alternative plans to form a basis for organizing and solving the other
technical problems. These plans will set forth the procedures by which the
sample will be selected and the observations or interviews actually
obtained for the sample. They may well proceed through to a trial run on
the procedures that will be used to analyze the data and even to
pretests, preliminary surveys, or preparatory runs of experiments.
Concurrently or following this phase of designing, the more particular
problems will be studied and solved, leading, finally, to the preparation
of a complete plan of procedure with the necessary instructions and
materials for its execution.
A third problem is that of costs and resources. In the simplest case the
only cost is the research worker’s time. In larger surveys the costs are those of
a far-flung corps of interviewers with travel, training, supervision,
office expenses, payments to respondents, recording and tabulating
equipment, and many other categories of expense. Sampling affects these
costs by determining the number, location, and type of persons who must be
reached and by regulating other phases of the operation. Very little has
been published on the actual costs of opinion surveys. Even less has been
done in genuine cost analysis to determine how each of many possible
modifications of design would increase or decrease costs. Hence, design may
rest heavily on judgment in attempting to minimize the total cost of a
survey.
It may be necessary to obtain certain equipment and materials and to
train personnel. Lists of persons or dwellings or maps may be needed. Tests
may be given and data assembled preliminary to the selection of subjects
for laboratory or clinical study. These phases of the sampling procedure
affect the design and the effectiveness of the entire operation.
A fourth problem is that of the analysis of accuracy. A
dependable estimate of the accuracy of the results of the study is needed
in advance to guide the planning and after the completion of the study to
guide its use. Frequently, only the roughest guesses about accuracy are
possible. There are several devices by which improved estimates of the
accuracy can be obtained if these devices are incorporated into the survey
operation. The accuracy of the final results will be affected by many
factors, and the estimate of accuracy may be built up from separate
estimates of the effect of each or from over-all measurements such as those
provided by previous surveys.
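One way such an estimate might be built up can be sketched in Python. The sketch rests on assumptions that are not in the text: the sample is taken to be a simple random sample drawn without replacement, so that the textbook standard error of a proportion applies, and the separate sources of error are taken to be independent, so that they may be combined by the square root of the sum of their squares. Where selection was not strictly random, or where the errors are related, these figures should be treated as rough guides at best. The function names are hypothetical.

```python
import math


def proportion_standard_error(successes, n, population_size):
    """Standard error of a sample proportion under simple random sampling
    without replacement (an assumption of this sketch)."""
    if not 1 < n <= population_size:
        raise ValueError("need 1 < n <= population_size")
    if not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n")
    p = successes / n
    # Finite population correction: no sampling error remains in a census.
    fpc = 1 - n / population_size
    return math.sqrt(fpc * p * (1 - p) / (n - 1))


def combined_error(components):
    """Combine separate error estimates, assumed independent, into one."""
    return math.sqrt(sum(c * c for c in components))


# Usage: 200 of 401 sampled persons answer yes in a very large population;
# the 0.02 allowance for measurement error is an invented figure.
sampling_error = proportion_standard_error(200, 401, 1_000_000)
total_error = combined_error([sampling_error, 0.02])
```

The combination makes plain that reducing the sampling error alone gains little once another source of error dominates the total.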
A fifth problem is that of operation, the actual execution of the
procedures and plans that have been designed. This introduces many
practical problems, some of which are rarely described or even mentioned in
the textbooks on research methods. One of the most important is that of
controlling the loss of part of the sample as a consequence of various
difficulties and accidents in completing the observations and carrying them
through the analysis. Subjects may fall ill, respondents may refuse to
answer questions, lists and maps may have inaccuracies, records may be
damaged or lost, severe storms or disasters may make field work
impossible, and many other influences may disturb the performance of the
operation. The resulting deviations from plan will vary in their
seriousness, but, unless they are checked and controlled, they may
invalidate the results of the most thorough planning.
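A complete record of the persons who were selected but not reached makes it possible to state how far the loss could have distorted a result. The Python sketch below, with a hypothetical function name and invented figures, computes the limits within which a proportion for the whole selected sample must lie, on the extreme suppositions that every missing person would have answered no or that every one would have answered yes. It is simple arithmetic and makes no assumption about why persons were missed.

```python
def nonresponse_bounds(yes_among_respondents, respondents, selected):
    """Lowest and highest possible proportion of yes answers in the full
    selected sample, given that some selected persons were never reached."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    if not 0 <= yes_among_respondents <= respondents <= selected:
        raise ValueError("need 0 <= yes <= respondents <= selected")
    missing = selected - respondents
    lowest = yes_among_respondents / selected               # all missing say no
    highest = (yes_among_respondents + missing) / selected  # all missing say yes
    return lowest, highest


# Usage: 100 persons selected, 80 reached, 60 of those answering yes.
lowest, highest = nonresponse_bounds(60, 80, 100)
```

Among those reached the proportion is 75 per cent, but with one selected person in five missing the figure for the full sample can be placed only between 60 and 80 per cent unless something further is known about the persons who were missed.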
The last problem is that of presentation and use. The ultimate
criterion for appraisal of sampling methods is how well their results
satisfy the purposes for which the research was undertaken. It matters
little how accurately the data were recorded somewhere along the line if,
when the results of their analysis are put into practice, they have been
misunderstood, distorted, inadequately defined, or ineffectively
utilized. This is a point on which there is sure to be disagreement. The
research worker may deny that he has any responsibility for what happens
after he completes his report. However, he cannot escape taking some
account of the use of his results, since otherwise he could just as well
design his study as a game to be played for its own sake. When one starts
seriously to design a procedure, all efforts are directed by the ultimate
objective. They fail to the degree that it is not attained. Hence, errors
of application and use must be considered in judging the appropriateness of
a sample design or in comparing two or more methods. There is a great job
to be done, and it is only barely started, of explaining sampling to the
potential users of research results. It is very difficult to correct
misconceptions and to instruct research workers in effective methods of
interpreting and using the research product. An attitude of restrained
confidence is not learned easily; investigators may err in the direction
of blind faith or excessive caution. They should learn that the results of
sampling are just as trustworthy as other
information if one takes account in each instance of the degree of accuracy
that has been attained.
. . . . . . . . . . . . . .
1 From , 1950, 55:371–375. By
permission of The University of Chicago Press.