Statistical Methods in Medical Research - part 9 pptx

Statistical methods in epidemiology

19.2 The planning of surveys

$$\hat{p} = \frac{\sum N_i p_i}{N},$$

where $p_i$ is the estimated prevalence in the $i$th stratum. The variance of $\hat{p}$ is given by the equivalent of (19.2), in which we shall drop the terms $f_i$ as being small:

$$\operatorname{var}(\hat{p}) = \frac{1}{N^2} \sum \frac{N_i^2 \, p_i (1 - p_i)}{n_i}.$$

The values of $\operatorname{var}(\hat{p})$ for the three allocations are as follows:

Allocation    var(p-hat)
A             0.000714
B             0.000779
C             0.000739

As would be expected, the lowest variance is for A and the highest for B, the latter being only a little lower than the variance for a random sample.

Multistage sampling

In this method the sampling frame is divided into a population of 'first-stage sampling units', of which a 'first-stage' sample is taken. This will usually be a simple random sample, but may be a systematic or stratified sample; it may also, as we shall see, be a random sample in which some first-stage units are allowed to have a higher probability of selection than others. Each sampled first-stage unit is subdivided into 'second-stage sampling units', which are sampled. The process can continue through as many stages as are appropriate.

There are two main advantages of multistage sampling. First, it enables the resources to be concentrated in a limited number of portions of the whole sampling frame with a consequent reduction in cost. Secondly, it is convenient for situations in which a complete sampling frame is not available before the investigation starts. A list of first-stage units is required, but the second-stage units need be listed only within the first-stage units selected in the sample.

Consider as an example a health survey of men working in a certain industry. There would probably not exist a complete index of all men in the industry, but it might be easy to obtain a list of factories, which could be first-stage units. From each factory selected in the first-stage sample, a list of men could be obtained and a second-stage sample selected from this list.
Apart from the advantage of having to make lists of men only within the factories selected at the first stage, this procedure would result in an appreciable saving in cost by enabling the investigation to be concentrated at selected factories instead of necessitating the examination of a sample of men all in different parts of the country.

The economy in cost and resources is unfortunately accompanied by a loss of precision as compared with simple random sampling. Suppose that in the example discussed above we take a sample of 20 factories and second-stage samples of 50 men in each of the 20 factories. If there is systematic variation between the factories, due perhaps to variation in health conditions in different parts of the country or to differing occupational hazards, this variation will be represented by a sample of only 20 first-stage units. A random sample of 1000 men, on the other hand, would represent 1000 random choices of first-stage units (some of which may, of course, be chosen more than once) and would consequently provide a better estimate of the national mean.

A useful device in two-stage sampling is called self-weighting. Each first-stage unit is given a probability of selection which is proportional to the number of second-stage units it contains. Second-stage samples are then chosen to have equal size. It follows that each second-stage unit in the whole population has an equal chance of being selected and the formulae needed for estimation are somewhat simplified.

Cluster sampling

Sometimes, in the final stage of multistage sampling, complete enumeration of the available units is undertaken. In the industrial example, once a survey team has installed itself in a factory, it may cost little extra to examine all the men in the factory; it may indeed be useful to avoid the embarrassment that might be caused by inviting some men but not others to participate.
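The self-weighting device described above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the factory sizes are hypothetical, and it assumes that a first-stage unit's chance of inclusion is (number of first-stage draws) x (unit size) / (total size), which requires that no unit is large enough to push this above 1.

```python
from fractions import Fraction


def inclusion_probabilities(unit_sizes, n_first_stage, n_second_stage):
    """Chance that any one person in each first-stage unit enters the sample.

    Assumes first-stage inclusion probability proportional to unit size and
    an equal-sized second-stage sample within every selected unit.
    """
    total = sum(unit_sizes)
    probs = []
    for size in unit_sizes:
        p_unit = Fraction(n_first_stage * size, total)
        if p_unit > 1 or n_second_stage > size:
            raise ValueError("unit too large or too small for this design")
        # Within a selected unit, each person has chance n_second_stage / size.
        p_person_given_unit = Fraction(n_second_stage, size)
        probs.append(p_unit * p_person_given_unit)
    return probs


# Hypothetical factory sizes (numbers of men); not taken from the text.
factory_sizes = [200, 300, 500, 1000, 1500, 2500]
for size, p in zip(factory_sizes, inclusion_probabilities(factory_sizes, 2, 50)):
    print(size, p)
```

The unit size cancels in the product, so every man has the same overall chance of selection whatever the size of his factory, which is the self-weighting property.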
When there is no sampling at the final stage, the method is referred to as cluster sampling. The investigator has no control over the number of sampling units in the clusters and this means that the loss of precision, compared with simple random sampling, is even greater than that in multistage sampling.

Design effect

The ratio of the variance of an estimator from a sampling scheme to the variance of the estimator from simple random sampling with the same total number of sampling units is known as the design effect, often abbreviated to Deff. In Example 19.1 the Deff for allocation C is 0.000739/0.000803 = 0.92. Another way of looking at the Deff is in terms of sample size. The same precision could have been achieved with a stratified sample of 92 people with allocation C as for a simple random sample of 100 people. The stratified sample is more efficient (Deff < 1) than simple random sampling and this will occur generally provided that there is a component of variation between strata. The efficiency of stratified sampling increases with the increasing heterogeneity between strata and the consequent greater homogeneity within strata.

In contrast, multistage and cluster sampling will usually have a Deff > 1. That is, a larger sample size will be required than with simple random sampling. For cluster sampling with $m$ members per cluster and a correlation within clusters of $\rho$ for the variable under study, the Deff is given by

$$\text{Deff} = 1 + (m - 1)\rho,$$

which will always exceed 1 (except in the unlikely scenario of negative correlations within clusters). A similar expression was given in §18.9 for sample size calculations in a cluster randomized trial. If $m$ differs between clusters, then, in the above formula, $m$ is replaced by $\sum m^2 / \sum m$, which is approximately $\bar{m}$, provided that the coefficient of variation of cluster size is small.
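The Deff formula for cluster sampling, including the adjustment for unequal cluster sizes, can be sketched as follows. The cluster sizes and correlation in the usage lines are hypothetical; only the ratio 0.000739/0.000803 comes from the text.

```python
def design_effect(cluster_sizes, rho):
    """Deff = 1 + (m - 1) * rho, with m replaced by sum(m^2) / sum(m).

    For equal cluster sizes the replacement value is just the common size m.
    """
    total = sum(cluster_sizes)
    effective_m = sum(m * m for m in cluster_sizes) / total
    return 1 + (effective_m - 1) * rho


# Hypothetical designs: ten clusters of 10, and three unequal clusters.
print(design_effect([10] * 10, 0.05))
print(design_effect([5, 10, 15], 0.05))

# Deff for the stratified allocation C quoted in the text.
print(round(0.000739 / 0.000803, 2))
```

With unequal clusters the effective size sum(m^2)/sum(m) is never smaller than the mean cluster size, so ignoring the variation in size would understate the Deff.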
The above discussion of the efficiency of different sampling schemes relative to simple random sampling is in terms of sample size. As we noted when discussing multistage sampling, one of the advantages of this method is a reduction in cost and the overall efficiency of a sampling scheme should be assessed in terms of the cost of carrying out the sampling, not just in terms of total sample size. If the cost of a sampling scheme is $c$ per sample member, relative to the cost per member of a simple random sample, then the overall efficiency of the sampling scheme is

$$\frac{100\%}{c \times \text{Deff}}.$$

Thus, a sampling scheme is more, or less, efficient than simple random sampling according as $c \times \text{Deff}$ is less, or greater, than unity, respectively. Thus, a cluster sampling scheme with six members per cluster and a correlation within clusters of 0.2 (Deff = 2.0) will be more efficient than simple random sampling, provided that the average cost of conducting the survey per sample member is less than a half of the cost per member of a simple random sample. In this case the increase in sample size required to achieve a specified precision has been more than offset by a reduction in costs.

Mark-recapture sampling

The name of this technique is derived from its application to the estimation of the number of animals or birds in an area. Traps are set and the captured animals form a sample of the population. They are marked and released. A second set of trappings yields a second sample consisting of unmarked animals and some that were marked on the first trapping occasion. Again, animals are marked and released and the process is continued for several sets of trappings. The numbers of animals captured on each occasion, together with the numbers of recaptures of previously marked animals, can be used to estimate the total population.
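For the cost-adjusted efficiency discussed earlier in this passage, a short sketch reproduces the six-member cluster example. The relative costs tried below (0.4, 0.5 and 0.6) are illustrative values chosen to sit either side of the break-even point; they are not from the text.

```python
def overall_efficiency(relative_cost, deff):
    """Overall efficiency (per cent) relative to simple random sampling."""
    return 100.0 / (relative_cost * deff)


# Six members per cluster, within-cluster correlation 0.2, as in the text.
deff = 1 + (6 - 1) * 0.2

for relative_cost in (0.4, 0.5, 0.6):
    print(relative_cost, round(overall_efficiency(relative_cost, deff), 1))
```

A relative cost of exactly one half gives 100%, the break-even point; anything cheaper makes the cluster design the more efficient choice.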
Assumptions must be made on whether the population is closed (no gains or losses over the total sampling period) or not, on whether the probability of capture depends on the previous capture history, and on heterogeneity between animals in probability of capture. The more information that is available (for example, more trapping occasions and a marking system that allows identification of the occasions on which each animal was captured), the more readily can assumptions be checked and modified.

This technique may be applied for the purpose of estimating the size of a population of individuals with some health-related characteristic. It is particularly useful for rare characteristics, which would require a very large sample using more traditional methods; for habits which people may be reluctant to disclose on a questionnaire, such as intravenous drug use; or for groups which may be differentially omitted from a sampling frame, such as homeless people in a city. Lists containing members of such a group may be available in agencies that provide services for the group, and if several lists are available then each may be considered to correspond to a 'trapping' occasion. If there is sufficient information on identity, then it can be established whether a person is on more than one list ('recaptures'). A difference from the animal trapping situation is that there may be no time sequence involved and the lists may be treated symmetrically in respect of each other.

If there are $k$ lists, then the observations can be set out as a $2^k$ table, denoting presence or absence on each list. The number in the cell corresponding to absence on all $k$ lists is, of course, unobserved and the aim of the method is to estimate this number and hence the total number. It will usually be reasonable to anticipate that the probability of being on a list is dependent on whether or not a person is on some of the other lists.
That is, there will be list dependency (corresponding to capture being dependent on previous capture history). A method of proceeding is to fit a log-linear model to the $2^k - 1$ observed cells of the $2^k$ table and use this model to estimate the number in the unobserved cell. Dependencies between the lists can be included as interaction terms but the highest-order interaction between all $k$ lists is set to zero, since it cannot be estimated from the incomplete table. A text on the methodology is given by Seber (1982), and Cormack (1989) describes the use of the log-linear model. A brief review is given by Chao (1998).

The method gives an estimate of the total population size and its standard error. A problem is that the estimate may be dependent on non-verifiable assumptions, so that the standard error does not adequately describe the uncertainty. Nevertheless, in situations where more robust methods, based on random sampling, are infeasible, the method does allow some estimation.

Imputation

Sample surveys usually have missing data through some people failing to answer some of the questions, either inadvertently or by refusing to answer particular questions. The analysis of the whole data set is facilitated if the missing data are replaced by imputed values. Imputed values may be obtained using a model based on the observed values. For example, a multiple regression of $x_1$ on $x_2, x_3, \ldots, x_p$ might be fitted on the complete data and used to estimate values of $x_1$ for those individuals with this variable missing but with observed values of $x_2, x_3, \ldots, x_p$. This gives the best predicted values but it is unsatisfactory to substitute these values because to do so excludes variability, and any analysis of the augmented data would appear more accurate than is justified.
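Returning briefly to the list-based population estimate described under mark-recapture sampling: in the simplest case of $k = 2$ lists, setting the single two-list interaction to zero leaves an independence model, and the fitted value for the unobserved cell has a closed form. The sketch below covers only that special case; the counts are hypothetical, and with three or more lists a log-linear model would have to be fitted properly.

```python
def two_list_estimate(on_both, first_only, second_only):
    """Estimate the unobserved cell and the total for two independent lists.

    Under independence the odds ratio of the 2 x 2 table is 1, so the
    missing cell is (first_only * second_only) / on_both.
    """
    if on_both == 0:
        raise ValueError("no overlap between lists; estimate is undefined")
    unobserved = first_only * second_only / on_both
    total = on_both + first_only + second_only + unobserved
    return unobserved, total


# Hypothetical counts from two agency lists.
unobserved, total = two_list_estimate(on_both=20, first_only=80, second_only=60)
print(unobserved, total)
```

The total agrees with the familiar ratio form (size of first list x size of second list / overlap), here 100 x 80 / 20. If the lists are positively dependent, this independence estimate will tend to be too low, which is the list dependency problem raised in the text.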
Instead, the random variability has to be built into the imputed values, not only the variability about the fitted regression line but also the variability due to uncertainty in estimating the regression coefficients. Thus imputed values contain random components and there is no unique best set of imputed values. Even for an imputed set of data with random variation incorporated, a single analysis will give standard errors of estimated parameters that are too small. This can be avoided by multiple imputation; that is, the missing data are imputed several times independently, and each imputed set is analysed in the same way. Variances of estimated parameters can then be produced by combining the within-imputation variance with the between-imputation variance. For further discussion of missing data and the importance of considering whether the fact that data are missing is informative in some way see §12.6.

Multiple imputation has generally been used with large sample surveys but may be more widely applied as suitable software becomes available. Readers wishing to learn more about this technique are referred to Rubin (1987) and Barnard et al. (1998). Barnard et al. (1998) also describe applications of multiple imputation to a wider range of problems than non-response. These include imputation of the true ages of children when the collected data on ages were insufficiently precise, and the imputation of dates of acquired immune deficiency syndrome (AIDS) diagnosis in people with human immunodeficiency virus (HIV) infection who had not yet contracted AIDS, using a model based on a set of covariates (Taylor et al., 1990).

Other considerations

The planning, conduct and analysis of sample surveys give rise to many problems that cannot be discussed here. The books by Moser and Kalton (1979) and Yates (1981) contain excellent discussions of the practical aspects of sampling. The books by Cochran (1977) and Yates (1981) may be consulted for the main theoretical results.
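The combination of within-imputation and between-imputation variance mentioned under multiple imputation can be illustrated with a small sketch. The text does not spell out the formula; the version below uses the usual combining rule associated with Rubin (1987), total = within + (1 + 1/m) x between, and the estimates and variances fed to it are hypothetical.

```python
from statistics import mean, variance


def combine_imputations(estimates, variances):
    """Pool one parameter over m independently imputed data sets.

    estimates: the m point estimates, one per imputed data set.
    variances: the m squared standard errors from the same analyses.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical results from three imputed data sets.
pooled, total_var = combine_imputations([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The between-imputation term is what a single imputed data set cannot supply, which is why its standard errors are too small.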
The statistical theory of sample surveys is concerned largely with the measurement of sampling error. This emphasis may lead the investigator to overlook the importance of non-sampling errors. In a large survey the sampling errors may be so small that systematic non-sampling errors may be much the more important. Indeed, in a complete enumeration, such as a complete population census, sampling errors disappear altogether, but there may be very serious non-sampling errors.

Some non-sampling errors are non-systematic, causing no bias in the average. An example would be random inaccuracy in the reading of a test instrument. These errors merely contribute to the variability of the observation in question and therefore diminish the precision of the survey. Other errors are systematic, causing a bias in a mean value which does not decrease with increasing sample size; for example, in a health survey certain types of illness may be systematically under-reported.

One of the most important types of systematic error is that due to inadequate coverage of the sampling frame, either because of non-cooperation by the individual or because the investigator finds it difficult to make the correct observations. For example, in an interview survey some people may refuse to be interviewed and others may be hard to find, may have moved away from the supposed address or even have died. Individuals who are missed for any of these reasons are likely to be atypical of the population in various relevant respects. Every effort must therefore be made to include as many as possible of the chosen individuals in the enquiry, by persistent attempts to make the relevant observations on all the non-responders or by concentrating on a subsample of them so that the characteristics of the non-responders can at least be estimated.
This allows the possibility of weighting the subsample of initial non-responders to represent all non-responders (Levy & Lemeshow, 1991, p. 308), or, when background data are available on all subjects, to use multiple imputation to estimate other missing variables (Glynn et al., 1993).

Reference must finally be made to another important type of study, the longitudinal survey. Many investigations are concerned with the changes in certain measurements over a period of time: for example, the growth and development of children over a 10-year period, or the changes in blood pressure during pregnancy. It is desirable where possible to study each individual over the relevant period of time rather than to take different samples of individuals at different points of time. The statistical problems arising in this type of study are discussed in §12.6.

19.3 Rates and standardization

Since epidemiology is concerned with the distribution of disease in populations, summary measures are required to describe the amount of disease in a population. There are two basic measures, incidence and prevalence.

Incidence is a measure of the rate at which new cases of disease occur in a population previously without disease. Thus, the incidence, denoted by $I$, is defined as

$$I = \frac{\text{number of new cases in period of time}}{\text{population at risk}}.$$

The period of time is specified in the units in which the rate is expressed. Often the rate is multiplied by a base such as 1000 or 1 000 000 to avoid small decimal fractions. For example, there were 280 new cases of cancer of the pancreas in men in New South Wales in 1997 out of a population of 3.115 million males. The incidence was 280/3.115 = 90 per million per year.
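The incidence calculation in the example above is simple enough to check directly. The function name and the default base are choices made for this sketch; the case count and population are those quoted in the text.

```python
def incidence_rate(new_cases, population_at_risk, base=1_000_000):
    """New cases per `base` persons at risk over the stated period."""
    return new_cases / population_at_risk * base


# Cancer of the pancreas in men, New South Wales, 1997 (figures from the text).
rate = incidence_rate(280, 3_115_000)
print(round(rate))  # per million per year
```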
Prevalence, denoted by $P$, is a measure of the frequency of existing disease at a given time, and is defined as

$$P = \frac{\text{total number of cases at given time}}{\text{total population at that time}}.$$

Both incidence and prevalence usually depend on age, and possibly sex, and sex- and age-specific figures would be calculated.

The prevalence and incidence rates are related, since an incident case is, immediately on occurrence, a prevalent case and remains as such until recovery or death (disregarding emigration and immigration). Provided the situation is stable, the link between the two measures is given by

$$P = It, \tag{19.4}$$

where $t$ is the average duration of disease. For a chronic disease from which there is no recovery, $t$ would be the average survival after occurrence of the disease.

Standardization

Problems due to confounding (Example 15.6) arise frequently in vital statistics and have given rise to a group of methods called standardization. We shall describe briefly one or two of the most well-known methods, and discuss their relationship to the methods described in §15.6.

Mortality in a population is usually measured by an annual death rate: for example, the number of individuals dying during a certain calendar year divided by the estimated population size midway through the year. Frequently this ratio is multiplied by a convenient base, such as 1000, to avoid small decimal fractions; it is then called the annual death rate per 1000 population. If the death rate is calculated for a population covering a wide age range, it is called a crude death rate.

In a comparison of the mortality of two populations, say, those of two different countries, the crude rates may be misleading. Mortality depends strongly on age.
If the two countries have different age structures, this contrast alone may explain a difference in crude rates (just as, in Table 15.6, the contrast between the 'crude' proportions with factor A was strongly affected by the different sex distributions in the disease and control groups). An example is given in Table 19.1 (on p. 664), which shows the numbers of individuals and numbers of deaths separately in different age groups, for two countries: A, typical of highly industrialized countries, with a rather high proportion of individuals at the older ages; and B, a developing country with a small proportion of old people. The death rates at each age (which are called age-specific death rates) are substantially higher for B than for A, and yet the crude death rate is higher for A than for B.

The situation here is precisely the same as that discussed at the beginning of §15.6, in connection with Example 15.6. Sometimes, however, mortality has to be compared for a large number of different populations, and some form of adjustment for age differences is required. For example, the mortality in one country may have to be compared over several different years; different regions of the same country may be under study; or one may wish to compare the mortality for a large number of different occupations. Two obvious generalizations are: (i) in standardizing for factors other than, or in addition to, age (for example, sex, as in Table 15.6); and (ii) in morbidity studies where the criterion studied is the occurrence of a certain illness rather than of death. We shall discuss the usual situation: the standardization of mortality rates for age.

The basic idea in standardization is that we introduce a standard population with a fixed age structure. The mortality for any special population is then adjusted to allow for discrepancies in age structure between the standard and special populations.
There are two main approaches: direct and indirect methods of standardization. The following brief account may be supplemented by reference to Liddell (1960), Kalton (1968) or Hill and Hill (1991).

The following notation will be used.

             Standard                                Special
Age group    (1)          (2)      (3)               (4)          (5)      (6)
             Population   Deaths   Death rate        Population   Deaths   Death rate
                                   (2)/(1)                                 (5)/(4)
1            N_1          R_1      P_1               n_1          r_1      p_1
...
i            N_i          R_i      P_i               n_i          r_i      p_i
...
k            N_k          R_k      P_k               n_k          r_k      p_k

Direct method

In the direct method the death rate is standardized to the age structure of the standard population. The directly standardized death rate for the special population is, therefore,

$$p' = \frac{\sum N_i p_i}{\sum N_i}. \tag{19.5}$$

It is obtained by applying the special death rates, $p_i$, to the standard population sizes, $N_i$. Alternatively, $p'$ can be regarded as a weighted mean of the $p_i$, using the $N_i$ as weights. The variance of $p'$ may be estimated as

$$\operatorname{var}(p') = \frac{\sum (N_i^2 \, p_i q_i / n_i)}{\left(\sum N_i\right)^2}, \tag{19.6}$$

where $q_i = 1 - p_i$; if, as is often the case, the $p_i$ are all small, the binomial variance of $p_i$, $p_i q_i / n_i$, may be replaced by the Poisson term $p_i / n_i$ ($= r_i / n_i^2$), giving

$$\operatorname{var}(p') \simeq \frac{\sum (N_i^2 \, p_i / n_i)}{\left(\sum N_i\right)^2}. \tag{19.7}$$

To compare two special populations, A and B, we could calculate a standardized rate for each ($p'_A$ and $p'_B$), and consider

$$\bar{d} = p'_A - p'_B.$$

From (19.5),

$$\bar{d} = \frac{\sum N_i (p_{Ai} - p_{Bi})}{\sum N_i},$$

which has exactly the same form as (15.15), with $w_i = N_i$, and $d_i = p_{Ai} - p_{Bi}$ as in (15.14). The method differs from that of Cochran's test only in using a different system of weights. The variance is given by

$$\operatorname{var}(\bar{d}) = \frac{\sum N_i^2 \operatorname{var}(d_i)}{\left(\sum N_i\right)^2}, \tag{19.8}$$

with $\operatorname{var}(d_i)$ given by (15.17). Again, when the $p_{0i}$ are small, $q_{0i}$ can be put approximately equal to 1 in (15.17).

If it is required to compare two special populations using the ratio of the standardized rates, $p'_A / p'_B$, then the variance of the ratio may be obtained using (19.6) and (5.12).
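Equations (19.5) and (19.7) translate directly into code. The sketch below is a minimal illustration using the Poisson form of the variance; the three age groups and all counts are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def direct_standardized_rate(std_pop, special_pop, special_deaths):
    """Directly standardized rate (19.5) and its Poisson variance (19.7).

    std_pop:        N_i, standard population by age group
    special_pop:    n_i, special population by age group
    special_deaths: r_i, deaths in the special population by age group
    """
    total_std = sum(std_pop)
    rates = [r / n for r, n in zip(special_deaths, special_pop)]  # p_i
    rate = sum(N * p for N, p in zip(std_pop, rates)) / total_std
    var = sum(
        N ** 2 * p / n for N, p, n in zip(std_pop, rates, special_pop)
    ) / total_std ** 2
    return rate, var


# Hypothetical data for three age groups.
N = [1000, 2000, 1000]
n = [500, 500, 1000]
r = [5, 10, 40]

rate, var = direct_standardized_rate(N, n, r)
crude = sum(r) / sum(n)
print(rate, var, crude)
```

In this made-up example the special population is weighted towards the oldest group, so its crude rate is higher than the rate standardized to the younger standard population.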
The variance given by (19.7) may be unsatisfactory for the construction of confidence limits if the numbers of deaths in the separate age groups are small, since the normal approximation is then unsatisfactory and the Poisson limits are asymmetric (§5.2). The standardized rate (19.5) is a weighted sum of the Poisson counts, $r_i$. Dobson et al. (1991) gave a method of calculating an approximate confidence interval based on the confidence interval of the total number of deaths.

Example 19.2

In Table 19.1 a standardized rate $p'$ could be calculated for each population. What should be taken as the standard population? There is no unique answer to this question. The choice may not greatly affect the comparison of two populations, although it will certainly affect the absolute values of the standardized rates. If the contrast between the age-specific rates is very different at different age groups, we may have to consider whether we wish the standardized rates to reflect particularly the position at certain parts of the age scale; for example, it might be desirable to give less weight to the higher age groups because the purpose of the study is mainly to compare mortality at younger ages, because the information at higher ages is less reliable, or because the death rates at high ages are more affected by sampling error.

At the foot of Table 19.1 we give standardized rates with three choices of standard population: (a) population A, (b) population B, and (c) a hypothetical population, C, whose proportionate distribution is midway between A and B, i.e.

$$N_{Ci} \propto \frac{1}{2}\left(\frac{n_{Ai}}{\sum n_{Ai}} + \frac{n_{Bi}}{\sum n_{Bi}}\right).$$

Note that for method (a) the standardized rate for A is the same as the crude rate; similarly for (b) the standardized rate for B is the same as the crude rate. Although the absolute values of the standardized rates are different for the three choices of standard population, the contrast is broadly the same in each case.
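Two points from Example 19.2 can be verified numerically: that standardizing a population to its own age structure returns the crude rate, and that the midway population C is a proper set of weights. The populations and deaths below are hypothetical and are not the Table 19.1 figures.

```python
def standardized_rate(weights, rates):
    """Weighted mean of age-specific rates, using the standard as weights."""
    return sum(w * p for w, p in zip(weights, rates)) / sum(weights)


def midway_standard(pop_a, pop_b):
    """Proportionate age distribution midway between populations A and B."""
    total_a, total_b = sum(pop_a), sum(pop_b)
    return [0.5 * (a / total_a + b / total_b) for a, b in zip(pop_a, pop_b)]


# Hypothetical populations and deaths in three age groups.
pop_a, deaths_a = [200, 300, 500], [2, 6, 30]
pop_b = [600, 300, 100]

rates_a = [d / n for d, n in zip(deaths_a, pop_a)]
crude_a = sum(deaths_a) / sum(pop_a)

print(standardized_rate(pop_a, rates_a), crude_a)   # standard (a): equal
print(midway_standard(pop_a, pop_b))                # standard (c) weights
print(standardized_rate(midway_standard(pop_a, pop_b), rates_a))
```

Only the relative sizes of the weights matter, which is why C needs to be defined only up to proportionality.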
Indirect method

This method is more conveniently thought of as a comparison of observed and expected deaths than in terms of standardized rates. In the special population the total number of deaths observed is $\sum r_i$. The number of deaths expected if the age-specific death rates were the same as in the standard population is $\sum n_i P_i$. The overall mortality experience of the special population may be expressed in terms of that of the standard population by the ratio of observed to expected deaths:

$$M = \frac{\sum r_i}{\sum n_i P_i}. \tag{19.9}$$

When multiplied by 100 and expressed as a percentage, (19.9) is known as the standardized mortality ratio (SMR).

To obtain the variance of $M$ we can use the result $\operatorname{var}(r_i) = n_i p_i q_i$, and regard the $P_i$ as constants without any sampling fluctuation (since we shall often [...]

[...] cancer in England and Wales are shown in Table 19.6 in 5-year age groups and 10-year periods (in the original application 5-year periods were used).

Table 19.6 Subject-years at risk (y) of asbestos workers and death rates (d) per 100 000 for lung cancer in men in England and Wales, by period (1941-50, 1951-60, [...]) and age group (25-29 to 75+). [Table body not recoverable.]

[...] 0.0901, and approximate 95% confidence limits for $\ln \psi$ are [...]

Table 19.5 Combination of relative risks from 10 retrospective surveys on smoking and lung cancer (Cornfield, 1956; Gart, 1962). Columns: study number; lung cancer patients (smokers $a_i$, non-smokers $b_i$); smokers $c_i$ and non-smokers $d_i$ in the comparison group; Woolf's method, $\ln \hat{\psi}_i$. [Table body truncated.]

[...] dividing by the population incidence gives the population attributable risk,

$$\lambda_P = \frac{I_P - I_{NE}}{I_P}. \tag{19.30}$$

Now suppose a proportion $\theta_E$ of the population are exposed to the factor, then

$$I_P = \theta_E I_E + (1 - \theta_E) I_{NE} = \theta_E \phi I_{NE} + (1 - \theta_E) I_{NE} = I_{NE}\,[1 + \theta_E(\phi - 1)], \tag{19.31}$$

where $\phi$ is the relative risk and the second line is obtained using (19.13). Substituting in (19.30) gives

$$\lambda_P = \frac{\theta_E(\phi - 1)}{1 + \theta_E(\phi - 1)}, \tag{19.32}$$

[...] for by methods of age standardization (see §19.3).

Table 19.3 Recent tobacco consumption of patients with carcinoma of the lung and control patients without carcinoma of the lung (Doll and Hill, 1950).

Daily consumption    Male              Male
of cigarettes        Lung carcinoma    Control
Non-smoker             2                27
1-                    33                55
5-                   250               293
15-                  196               190
25-                  136                71
50-                   32                13
Total                649               649

Female: [...]

[...] characteristics, in particular between a group of individuals exposed to some factor and a group not exposed. A measure of the increased risk (if any) of contracting a particular disease in the exposed compared with the non-exposed is required. The measure usually used is the ratio of the incidences in the groups being compared and is referred to as relative risk ($\phi$). Thus,

$$\phi = I_E / I_{NE}, \tag{19.13}$$

[...]

$$\operatorname{var}(\text{SMR}) = 10^4 \operatorname{var}(M) = \frac{10^4 \times 7678}{11\,005^2} = 0.634,$$

and from (19.11), SE(SMR) = 0.80%. The smallness of the standard error (SE) of the SMR in Example 19.3 is typical of much vital statistical [...]

[...] disease-prevention strategy. The circumstances in which this interpretation is justified are probably few and there are particular difficulties when more than one cause is operating. Suppose $I_P$ is the incidence of the disease in the population and, as in §19.5, $I_E$ and $I_{NE}$ are the incidences in the exposed and non-exposed, respectively. Then the excess incidence attributable to the factor is $I_P - I_{NE}$ and [...]

[...] hospital, of the same sex and within the same 5-year age group. Each patient in each group was interviewed by a social worker, all interviewers using the same questionnaire. The only substantial differences between the case and control groups were in their reported smoking habits. Some of the findings are summarized in Table 19.3. The difference in the proportion of non-smokers in the two groups is clearly [...]

[...] expressed on a 5-year basis. The SMR is $100M$ and

$$\frac{100 \times 7678}{11\,005} = 69.8\%.$$

Table 19.2 Mortality of farmers in England and Wales, 1949-53, in comparison with that of the male population. Source: Registrar General of England and Wales (1958). Columns: (1) age group (20-, 25-, 35-, 45-, 55-64) with annual death rate per 100 000, all males (1949-53), $P_i$; (2) farmers, 1951 census population, $n_i$; (3) deaths of farmers 1949-53, $r_i$; (4) deaths expected in 5 years, $n_i P_i$. [Table body not recoverable; totals: 7678 deaths observed, 11 005 expected.]
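The SMR of (19.9) and its standard error can be reproduced from the two totals quoted in the farmers example (7678 deaths observed, 11 005 expected). The sketch assumes the Poisson form var(M) = (observed deaths) / (expected deaths)^2, which is what the quoted calculation of 0.634 implies; the function names are choices made here.

```python
from math import sqrt


def smr(observed_deaths, expected_deaths):
    """Standardized mortality ratio, as a percentage."""
    return 100.0 * observed_deaths / expected_deaths


def smr_standard_error(observed_deaths, expected_deaths):
    """Approximate SE of the SMR (per cent), treating deaths as Poisson."""
    return 100.0 * sqrt(observed_deaths) / expected_deaths


observed, expected = 7678, 11005
print(round(smr(observed, expected), 1))                 # 69.8
print(round(smr_standard_error(observed, expected), 2))  # 0.8
```

Squaring the standard error recovers the quoted variance of 0.634.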
