Frederick F. Stephan

Sampling1

. . . . . . . . . . . . . .

All sampling problems stem from the limitations that are imposed on observation. If one could observe directly all that one needs to know, there would be no occasion to make inferences about what has not been observed or to generalize one’s knowledge. Science would merely be a systematic record of data, condensed, perhaps, by some convenient shorthand but never stretched over any void or extended into areas of ignorance. Of course, all the future and most of the past and present are beyond the reach of direct observation. Generalization from limited observations is the rule, not the exception. Observation of a subject in a laboratory experiment and clinical examination of an individual are sampling operations in as fundamental a sense as are attitude testing or opinion surveying. When one studies a "unique case" or an organic system of interaction, the totality is perceived by the observer in terms of a set of partial and particular observations that fall short of complete knowledge and must be extended by processes of inference if they are to be more than historical facts.…

In studying a process of communication or some other complex system of human behavior and interaction, one is compelled to distribute one’s efforts either evenly but thinly over the whole or more intensively over certain parts to the neglect of others. The most effective distribution will be different from one situation to another. The interrelationships that bind the parts into a system may likewise be observed through a selected set of parts or more superficially over the whole system. An anthropologist studies a culture, a language, or an institution through selected informants as well as by direct observation of some instances of behavior. A social psychologist can interview individuals or observe groups or do both. In any case, there is a sampling of individual and group expressions for the purpose of inferring the structure and functioning of the system.

. . . . . . . . . . . . . .

There is no "best" method of sampling that can be followed blindly in all instances. The most effective sampling methods are those that are designed specifically to fit the situation in which they are to be used. They are based on the general theory of sampling derived from mathematical statistics and economic theory, and they take advantage of what is known in advance about the population, system of interaction, or process that is to be sampled. They are designed to achieve the specific purposes of the study as effectively as is possible under the limitations set by the funds, personnel time, and other resources that are available. In a word, they are tailored to fit the circumstances. In this respect they are like a manufacturing process that is engineered in terms of specifications set by the consumer and the technical equipment and cost relations of the plant. For simple jobs no elaborate planning is necessary; for some products thousands of dollars of preparatory work will be devoted to setting up the operation in order to produce the best results, taking into account many possibilities and considerations.

TWO RULES FOR THE AMATEUR SAMPLER

It would be a mistake to confine the discussion of sampling to the larger surveys, especially those nation-wide polls and canvasses that have done so much to stir up public interest in sampling and in which a careful job of sample design is feasible. For each such survey there may be scores of studies to be conducted on a smaller scale, geographically and financially. They may be more intensive and complicated in the variables they measure and the relationships they analyze. In the aggregate, and in some individual instances, they may turn out to be more important for research than some of the large-scale studies. So one may well ask, "Isn’t there some simple dependable rule I can use to solve my sampling problem? I’m studying a limited group, not seeking a national average." Such a question deserves careful examination and a serious answer.

If it is put just this way, without any information about the population to be sampled and the resources that can be devoted to sampling, then one must necessarily sacrifice the advantages that come from fitting the sampling rule to the particular case. Hence, the answer could be, "Yes. You can have two choices."

Rule A is: Follow your common sense and select people by what seems to you to be a good way to get a group similar to the entire group in which you are interested. Test it with every convenient set of data you can obtain. Then be prudent in handling the results of your study, for you may be far off the beam without knowing it. As the airplane pilots say, you are "flying by the seat of your pants." With good luck you may not be too far off too often. If other phases of the study are correspondingly uncertain and not under full control, you are probably not adding a great deal to the total risks you are taking. If you are badly wrong, perhaps later experience, bitter though it may be, will set you right. If you are right, you have saved yourself a bit of trouble. This is the way you handle many other kinds of problems that you encounter, so perhaps you might as well do your sampling by the same rule-of-thumb method.

Rule B is for those who do not like to gamble or live dangerously. Such people may be less likely to make occasional brilliant discoveries or produce a flood of studies, but, like the tortoise, they may pass the hare before the race is finished. Rule B counsels: (1) Pin down very specifically the definition of the population you are studying and the variables you wish to measure. (This may be no more than the arduous task of making up your mind about just what you will attempt.) (2) Obtain a list of all the persons who make up that group or population or, lacking a list, divide the population into many small parts according to residence, place of work, or other suitable factors. Do this, however, in a way that tends to make each part a mixture of different kinds of people rather than a cluster of persons who are similar with regard to the variables you are studying. (This is where you may have to go contrary to your intuition.) (3) Select by some strictly random procedure enough of these parts to give you the number of persons you think you need after allowing for the loss of those you will not be able to study successfully. You may even make up sets of persons or parts, well balanced on the variables you know about beforehand, until you have every person in the population in the same number of sets. Then select strictly at random one such set as a sample. (4) Proceed to study the sample but keep a complete record of persons you miss and what information you can find out about them. Analyze this information and the variation shown within the sample by those you do get. From this arrive at the best estimate you can of the accuracy of the sample in representing the population on each variable. Do not lean too heavily on statistical theory unless you have followed very meticulously the procedure it assumes has been followed. (5) Use every opportunity to check the sample with other dependable data. Possibly add certain questions to the study for this purpose. (6) Record and report your methods and deviations adequately so that others can repeat them or form a somewhat independent appraisal of the sample.
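
Steps (2) through (4) of Rule B can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure given in the text: the function names `select_parts` and `loss_record`, the dictionary of parts, and the assumed loss rate are all hypothetical. The sketch estimates from the average part size how many parts are needed, inflates the target to allow for persons who will be missed, and draws whole parts strictly at random.

```python
import math
import random


def select_parts(parts, persons_needed, expected_loss_rate, seed=None):
    """Draw whole parts at random (step 3 of Rule B).

    parts: dict mapping a part label (hypothetical, e.g. a city block)
           to the list of persons in that part.
    persons_needed: number of completed cases wanted.
    expected_loss_rate: assumed fraction of selected persons who will
           not be studied successfully (an assumption, not a known value).
    """
    if not 0 <= expected_loss_rate < 1:
        raise ValueError("expected_loss_rate must be in [0, 1)")
    total_persons = sum(len(members) for members in parts.values())
    if total_persons == 0:
        raise ValueError("the parts contain no persons")

    # Allow for the loss of persons who cannot be studied.
    target = persons_needed / (1 - expected_loss_rate)

    # Number of parts needed, judged from the average part size.
    mean_size = total_persons / len(parts)
    n_parts = min(len(parts), math.ceil(target / mean_size))

    # Every part has the same chance of selection.
    rng = random.Random(seed)
    return rng.sample(sorted(parts), n_parts)


def loss_record(selected_persons, completed_persons):
    """Keep a record of the persons missed (step 4 of Rule B).

    Returns the list of missed persons and the completion rate.
    """
    if not selected_persons:
        raise ValueError("no persons were selected")
    completed = set(completed_persons)
    missed = [p for p in selected_persons if p not in completed]
    rate = (len(selected_persons) - len(missed)) / len(selected_persons)
    return missed, rate
```

Because every part is drawn with the same probability, parts of very unequal size would call for a more careful design than this sketch attempts.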

So much for Rules A and B. Sometimes they will work quite well. At other times the results will not be satisfactory unless greater care is taken with the sampling. This can be done in several ways. First, one can learn something more about sampling from the reports of previous surveys, though most reports offer little in the way of useful tests of sampling methods, and from technical publications in statistics and related fields. Second, one can get help from advisers who have developed expertness in research methods. Finally, one may seek to develop new methods appropriate for his own situation by his own ingenuity and experimentation. How much time and effort one gives to sampling should be determined by the general strategy of his research plans, seeking the most effective use of his resources and the greatest yield of results. In many small or casual studies very little benefit may result from improvement of the sampling procedure, but in larger and more important studies, there are great opportunities to attain a degree of accuracy with a well-designed and well-executed sample that cannot be attained by cruder sampling, no matter how large the cruder sample may be. Hence, these amateur rules should only be followed when the circumstances do not warrant a more technical treatment of the sampling problems.
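
The point that size cannot make up for crude selection can be illustrated with a small simulation. Everything in it is an invented assumption for the sake of the sketch: a population of 10,000 persons of whom 30 per cent hold a given opinion, and a crude method under which persons holding the opinion are twice as easy to reach. The function name `compare_samples` is likewise hypothetical.

```python
import random


def compare_samples(seed=0):
    """Compare a small random sample with a large crude one.

    Returns (true_share, random_estimate, crude_estimate).
    """
    rng = random.Random(seed)

    # Hypothetical population: 1 = holds the opinion, 0 = does not.
    population = [1] * 3000 + [0] * 7000
    true_share = sum(population) / len(population)

    # Small sample of 400 drawn strictly at random, without replacement.
    random_sample = rng.sample(population, 400)
    random_estimate = sum(random_sample) / len(random_sample)

    # Large crude sample of 5,000: persons holding the opinion are
    # assumed to be twice as likely to be reached. Drawn with
    # replacement for simplicity.
    weights = [2 if person == 1 else 1 for person in population]
    crude_sample = rng.choices(population, weights=weights, k=5000)
    crude_estimate = sum(crude_sample) / len(crude_sample)

    return true_share, random_estimate, crude_estimate


true_share, random_estimate, crude_estimate = compare_samples()
print(f"true share:            {true_share:.3f}")
print(f"random sample (400):   {random_estimate:.3f}")
print(f"crude sample (5,000):  {crude_estimate:.3f}")
```

Under these assumptions the crude sample settles near 6,000 / 13,000, or about 0.46, however many persons it includes, while the random sample scatters around the true 0.30. Enlarging the crude sample only reproduces its bias more exactly.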

TECHNICAL PROBLEMS

The technical problems of a major sampling operation are too complex for detailed discussion in this paper. It is important that their general nature be widely known among those who produce or receive the results of research. Hence, a brief and simplified description of the principal kinds of technical problems is in order.

The first problem is that of the initial specifications. It is necessary to formulate rather specifically the purposes that are to be served by the sampling operation: the population of persons or other units of observation that are to be sampled; the variables that are to be observed; the accuracy of measurement that can be attained; the degree of accuracy required in the results; the resources that are available; and the other principal factors that will restrict the procedure and determine its suitability. If these conditions are vague, then there will be little basis for designing, or even for choosing, a particular sampling plan as better than other possible plans. If they are quite definite, they may establish automatically the general procedure or over-all plan for the sampling operation, leaving only the details to be determined.

A second problem is that of design. Within the limits set by the initial specifications one must shape up a general plan or a series of alternative plans to form a basis for organizing and solving the other technical problems. These plans will set forth the procedures by which the sample will be selected and the observations or interviews actually obtained for the sample. They may well proceed through to a trial run on the procedures that will be used to analyze the data and even to pretests, preliminary surveys, or preparatory runs of experiments. Concurrently or following this phase of designing, the more particular problems will be studied and solved, leading, finally, to the preparation of a complete plan of procedure with the necessary instructions and materials for its execution.

A third problem is that of costs and resources. In the simplest case the only cost is the research worker’s time. In larger surveys the costs are those of a far-flung corps of interviewers with travel, training, supervision, office expenses, payments to respondents, recording and tabulating equipment, and many other categories of expense. Sampling affects these costs by determining the number, location, and type of persons who must be reached and by regulating other phases of the operation. Very little has been published on the actual costs of opinion surveys. Even less has been done in genuine cost analysis to determine how each of many possible modifications of design would increase or decrease costs. Hence, design may rest heavily on judgment in attempting to minimize the total cost of a survey.

It may be necessary to obtain certain equipment and materials and to train personnel. Lists of persons or dwellings or maps may be needed. Tests may be given and data assembled preliminary to the selection of subjects for laboratory or clinical study. These phases of the sampling procedure affect the design and the effectiveness of the entire operation.

A fourth problem is that of the analysis of accuracy. A dependable estimate of the accuracy of the results of the study is needed in advance to guide the planning and after the completion of the study to guide its use. Frequently, only the roughest guesses about accuracy are possible. There are several devices by which improved estimates of the accuracy can be obtained if these devices are incorporated into the survey operation. The accuracy of the final results will be affected by many factors, and the estimate of accuracy may be built up from separate estimates of the effect of each or from over-all measurements such as those provided by previous surveys.
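
For the simplest case, a rough estimate of accuracy can be computed directly from the sample. The sketch below uses the standard textbook formula for the standard error of a proportion in a simple random sample drawn without replacement; it is not taken from this paper, and the function name `proportion_standard_error` is hypothetical. It applies only when selection was in fact strictly random. Samples made up of whole parts or clusters generally need a different calculation, and losses from the sample are not reflected in it at all.

```python
import math


def proportion_standard_error(p, n, population_size=None):
    """Estimated standard error of a sample proportion.

    p: proportion observed in the sample (between 0 and 1).
    n: number of persons in the sample (at least 2).
    population_size: if given, apply the finite population correction
        for sampling without replacement.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    if n < 2:
        raise ValueError("n must be at least 2")
    if population_size is not None and n > population_size:
        raise ValueError("the sample cannot exceed the population")

    variance = p * (1 - p) / (n - 1)
    if population_size is not None:
        # A sample that covers much of the population leaves less
        # to be inferred, so the variance shrinks accordingly.
        variance *= 1 - n / population_size
    return math.sqrt(variance)


# Example: 50 per cent observed in a random sample of 401 persons.
print(proportion_standard_error(0.5, 401))
```

Used in advance of a study, the same function gives a rough idea of how the sampling error would change with the number of persons, provided a plausible value of the proportion can be guessed beforehand.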

A fifth problem is that of operation, the actual execution of the procedures and plans that have been designed. This introduces many practical problems, some of which are rarely described or even mentioned in the textbooks on research methods. One of the most important is that of controlling the loss of part of the sample as a consequence of various difficulties and accidents in completing the observations and carrying them through the analysis. Subjects may fall ill, respondents may refuse to answer questions, lists and maps may have inaccuracies, records may be damaged or lost, severe storms or disasters may make field work impossible, and many other influences may disturb the performance of the operation. The resulting deviations from plan will vary in their seriousness, but, unless they are checked and controlled, they may invalidate the results of the most thorough planning.

The last problem is that of presentation and use. The ultimate criterion for appraisal of sampling methods is how well their results satisfy the purposes for which the research was undertaken. It matters little how accurately the data were recorded somewhere along the line if, when the results of their analysis are put into practice, they have been misunderstood, distorted, inadequately defined, or ineffectively utilized. This is a point on which there is sure to be disagreement. The research worker may deny that he has any responsibility for what happens after he completes his report. However, he cannot escape taking some account of the use of his results, since otherwise he could just as well design his study as a game to be played for its own sake. When one starts seriously to design a procedure, all efforts are directed by the ultimate objective. They fail to the degree that it is not attained. Hence, errors of application and use must be considered in judging the appropriateness of a sample design or in comparing two or more methods. There is a great job to be done, and it is only barely started, of explaining sampling to the potential users of research results. It is very difficult to correct misconceptions and to instruct research workers in effective methods of interpreting and using the research product. An attitude of restrained confidence is not learned easily; investigators may err in the direction of blind faith or excessive caution. They should learn that the results of sampling are just as trustworthy as other information if one takes account in each instance of the degree of accuracy that has been attained.

. . . . . . . . . . . . . .

1 From , 1950, 55:371–375. By permission of The University of Chicago Press.