THE ESTIMATION OF THE YIELDS OF CEREAL EXPERIMENTS BY SAMPLING FOR THE RATIO OF GRAIN TO TOTAL PRODUCE

BY W. G. COCHRAN
Statistical Department, Rothamsted Experimental Station, Harpenden, Herts

INTRODUCTION

Threshing difficulties constitute a formidable barrier in planning an extensive series of small-plot cereal experiments of modern design, or in studying the residual effects of treatments on the succeeding cereals in a series of experiments on root crops. Failing the provision of a small threshing machine, sampling is at present the only practicable method for obtaining the grain yields of small plots located on commercial farms.

Yates & Zacopanay (1935) summarized the work carried out at Rothamsted and its associated outside centres on the estimation of the yields of cereal crop experiments by sampling. In these experiments a number of small areas (e.g. ½ m of each of four contiguous rows) were selected at random in each plot or subplot. The standing crop in each of these areas was cut close to the ground, bagged, and transported to Rothamsted for threshing, the yields of grain and straw per unit area being estimated entirely from the samples.

The authors suggested that, if the total produce of each plot were weighed on the field, the samples need be used only to determine the ratio of the weight of grain to the weight of total produce. In view of the high correlation which normally exists between grain and straw yields, the sampling errors of this ratio might be expected to be considerably smaller than those of the yield of grain itself, so that less sampling would be required to obtain results of equal precision. They found from the average of nine experiments that the sampling error per metre of row length was 7.14% for the ratio of grain to total produce, compared with 23.9% for the yield of grain. Judging from these figures, only about one-tenth of the number of samples is required to obtain equal information if the total produce is weighed, since (7.14/23.9)² ≈ 0.09.

The estimation of yields by this method has been tried in a number of experiments since 1935. The present paper reviews their results from the point of view of sampling technique.

Since total produce, when weighed on the field, usually contains some moisture, the samples must be weighed on the field as well as before threshing, to enable a correction to be made in the grain and straw figures for the loss of moisture. If the samples are taken from the standing crop, this should be done immediately before the crop is reaped, in order that the samples and the total produce may be weighed on the field at the same time. This may not always be convenient, but with this method the samples can alternatively be taken from the crop while it is lying in the stooks, since there is no need to know the area of the land from which a sample was taken. As the crops usually lie in the stooks for some days, this gives a wider choice in the time during which the sampling must be carried out. In most of the experiments discussed below, the samples were taken from the stooks.

METHOD OF SAMPLING FROM THE STOOKS

Total produce is first weighed on each plot. A spring balance may be used, weighing the sheaves one at a time. This method is rather tedious unless the plot size is small or the crop is a poor one, since a 1/40 acre plot may contain thirty sheaves. If a portable tripod is available with a platform which can hold all the sheaves in one stook, some time will be saved. After weighing, the sheaves should be laid separately on the ground, to facilitate the sampling operations.

The next step is to select the samples. A method which gives reasonably random samples is as follows. Suppose that there are eighteen sheaves on a plot and that each sample is to be approximately 1% of the produce of the plot. A sheaf is first selected at random. The binding tape is cut and the sheaf is divided into six portions of about equal weight. One of these is selected at random and constitutes the sample. The division of the
sheaf is usually most quickly done by successive subdivision into halves, selecting one-half at random at each stage for further subdivision, until a sample of about the required size is reached. This method also has the advantage that it reduces to a minimum the number of small bundles which are scattered about the plot. For selecting the halves at random, a piece of paper bearing a selection of odd and even numbers drawn from a book of random numbers may be used; alternatively, a set of disks containing an equal number of two different colours may be carried in the pocket.

When the samples have been selected, labelled and bagged, they are weighed. For the calculation of the grain yields on each plot, it is necessary to know only the total weight of the samples from the plot, but if a full investigation of sampling errors is required, each sample must be weighed individually. As the samples may weigh less than one pound each, a fairly accurate balance is required, and the weighings should if possible be done indoors whenever there is any appreciable wind. The average weight of a bag, with its label and string, must also be recorded. This completes the experimental operations on the field. The sheaves should be restooked unless they are being carted off immediately.

The taking of random samples from the stooks is a lengthy process. Following a suggestion made by Yates & Zacopanay, samples were also taken by picking a few shoots from each of several sheaves until a sample of about the agreed size had been amassed. These samples, which will be called grab samples, can be taken in about one-third of the time required for random samples, since the sheaves need not even be opened unless they are very tightly bound. It is, however, not clear a priori whether grab samples give unbiased estimates of the grain/total produce ratios or how they compare in accuracy with random samples. In grabbing, no attempt was made to select representative
shoots, as this method is known to be likely to introduce bias. One might, however, expect a tendency to miss the shorter and less vigorous shoots, and possibly also to free the shoots of weeds in pulling them from the sheaves. Both factors would tend to increase the apparent grain/total produce ratio. A comparison of the results with random and grab samples will be given later in this paper.

MATERIAL

A list of the experiments discussed is given in Table I below. The plots were not subdivided for sampling, so that the plot area given is in all cases the area to which the sampling errors apply. The random samples were taken from the stooks in five experiments. The grab samples were taken from the stooks in all cases except in exp. 5, where they were taken from the crop as it lay on the ground immediately after scything.

Table I. List of experiments

No.  Year  Crop    Place       Type              Size of plot  Random samples
                                                 (acres)       taken from
1    1935  Wheat   Rothamsted  [?] x [?] L.S.*   1/40          Stooks
2    1935  Wheat   Woburn      [?] x [?] L.S.    1/100         Stooks
3    1936  Wheat   Woburn      [?] x [?] L.S.    1/100         Standing crop
4    1936  Barley  Rothamsted  4, 16 R.B.†       1/40          Stooks
5    1937  Barley  Wye         [?] x [?] L.S.    1/120         Standing crop
6    1937  Barley  Tunstall    3, [?] R.B.       1/40          Stooks
7    1938  Oats    Rothamsted  4, 18 R.B.        1/60          Stooks

* Latin square.
† I.e. four randomized blocks of sixteen plots each.
[No. of samples per plot (random, grab): entries illegible.]

SAMPLING ERRORS PER CENT PER METRE

The sampling errors per cent per metre of row length for the ratio r of grain to total produce are shown in Table II. Where the samples were taken from the stooks, the average number of metres sampled was estimated from the ratio of the weight of the sample to the weight of the whole crop on the plot. The sampling errors in all cases refer to the ratios of grain to dry total produce, as these were the figures with which Yates & Zacopanay dealt.

Table II. Sampling errors of the ratio of grain to total produce (R = random, G = grab)

Exp.  Method of  Mean yield of grain  Area of plot  Size of sampling  Sampling error
      sampling   (cwt per acre)       (acres)       unit (metres)     (% per metre)
1     R          32.3                 1/40          5.4               16.8
2     R          29.9                 1/100         1.9               10.0
3     R          20.8                 1/100         2.0               15.0
4     R          25.1                 1/40          [5.9]             [29.1]
5     R          14.7                 1/120         3.8               14.0
      G          15.1                               2.5               13.0
6     G          5.6                  1/40          5.6               13.7
7     R          33.5                 1/40          2.6               7.0
      G          33.6                               [?].3             13.1
Mean                                                3.5               12.6

In the first four experiments sampling errors are obtainable only for the random samples, since only one grab sample was taken per plot. In exp. 6 the random samples were unfortunately bulked for threshing.

The sampling errors per cent per metre are considerably higher than Yates & Zacopanay's figure of 7.14%. Exp. 4 may perhaps be omitted in reaching an average figure, since 12% of the samples were reported as damaged by mice during storage. These samples were excluded from the statistical analysis, but five other samples also showed an anomalously low ratio of dry total produce to wet total produce, as well as an anomalously low grain/total produce ratio. These samples might perhaps be regarded as affected by damage which was not reported. The experiment was, however, one in which different leys were growing under barley, and the samples in question all came from plots growing a clover-ryegrass mixture, so that they may have contained a substantial amount of the undergrowth. In any case it is clear that if there is a vigorous and variable undergrowth of ley or weeds, this method is likely to give high sampling errors.

Excluding exp. 4, the average value for the sampling error is 12.6% per metre. There are several reasons which might account, in part at least, for the higher value obtained.

Size of plot. The criterion used, sampling error per cent per metre, is likely to increase as the size of the plot increases. While no correlation is evident in Table II between sampling error and size of plot, the average plot size in these experiments was considerably larger than in Yates & Zacopanay's experiments, in which most of the sampling subplots were only 1/200 acre. Since, however, Yates & Zacopanay used only a fraction of their data for this particular
calculation, some additional information on the effect of plot size was obtained by calculating the sampling error of r for six of their experiments in which the plots were 1/80 acre. The results are shown in Table III.

Table III. Sampling errors of the ratio of grain to total produce (from 1/80 acre plots)

                     Size of sampling   Sampling error % per metre
Crop     Exp.*       unit (metres)      r†        Grain
Barley   [?]         [?]                5.98      23.7
Barley   [?]         [?]                9.06      30.3
Wheat    10          [?]                8.89      28.8
Barley   10          [?]                10.57     32.5
Wheat    11          [?]                9.42      22.8
Barley   11          [?]                6.76      33.5
Mean                                    8.45      28.6

* In Yates & Zacopanay's notation.
† The method by which these figures were obtained is discussed in the Appendix.

The average value, 8.45, is somewhat larger than the previous figure of 7.14 for smaller plots, but is still considerably below 12.6. The average sampling error for the yields of grain in the same experiments was 28.6%, so that the relative efficiency of the two methods works out at almost the same figure as Yates & Zacopanay obtained. It does not appear as if the difference in the size of the plots can account for more than a small part of the increase from 7.1 to 12.6%.

Size and type of sampling unit. The sampling error of r will also depend to some extent on the size and shape of the sampling unit. As a rule, it is to be expected that for the same total percentage sampled, a few large sampling units will be less efficient than a larger number of small sampling units. In the present experiments the average size of the sampling unit was 3.5 m as against 2.0 m in Yates & Zacopanay's experiments, and this difference might partly account for the higher sampling error. In this connexion it would have been instructive to compare the variation in r between samples taken from the same sheaf with that between samples taken from different sheaves, but this is not possible from the way in which the samples were selected. It is also possible that the reaper or scythe gives a less even cut than is obtained when small samples are cut by hand from the
standing crop.

Presence of weeds or undergrowth. This point has already been mentioned in discussing exp. 4 in Table II, but it applies, to a less extent, to all experiments. In Yates & Zacopanay's experiments, the samples were cleared of weeds before determining the weights of grain and straw, whereas in sampling for the grain/total produce ratio it is essential that the sample should not be cleaned of weeds. Thus the presence of weeds, from which few experiments are entirely free, adds to the variability of r, particularly so as weeds compete with the crop and are more likely to abound in poorer patches, where the value of r is already low.

THE CORRECTION FOR LOSS OF MOISTURE

No discussion has so far been given of the correction which must be made for the amount of moisture in the total produce as weighed on the field. Since this correction is made from the samples, it will involve some loss of information, so that the sampling errors given in the preceding section for the ratio of grain to dry total produce do not represent the whole of the sampling error involved in this method.

The yield of dry grain of any plot is most simply obtained by multiplying the yield of wet total produce by the ratio, in samples from that plot, of the total yield of dry grain to the total yield of wet total produce. The percentage sampling variance per plot of the yield of dry grain will be given (with all necessary accuracy) by the percentage sampling variance of the ratio of dry grain to wet total produce, divided by the number of samples taken per plot. This can be calculated if the samples were weighed individually on the field and threshed individually.

Since the sampling errors of the ratio of dry grain to dry total produce have already been discussed, it will be more convenient to discuss here the sampling errors of the ratio of dry total produce to wet total produce, assuming these ratios to be independent. In general, however, the more direct
approach is preferable, since the assumption of independence is not likely always to hold.

Unfortunately, little evidence on the dry/wet ratio is obtainable from these experiments. The samples were weighed individually on the field in only three experiments, nos. 3, 4 and 7, mainly because the accuracy of the spring balance and the external conditions did not appear to justify weighing each sample. Of these experiments, no. 4 has already been noted as exceptionally variable, while in no. 7 there appears to have been a zero error in the spring balance, since almost all the dry weights of total produce were slightly higher than the wet weights.

In exp. 3, the sampling error per cent per plot for the ratio of dry to wet total produce was 7.03, as compared with 7.50 for the ratio r of dry grain to dry total produce. The corresponding figures in exp. 4, omitting the plots undersown with the clover-ryegrass mixture, were 7.45 and 8.50. These figures suggest that almost as much information is being lost in estimating the correction for drying as in estimating the ratio of dry grain to dry total produce. If this is true, the accuracy of the method is only half that indicated by the figures in the last section. There is, however, reason to believe that these results are not representative, since rain fell during the sampling of exp. 3, some samples being wet when weighed, and in both experiments there was an unusual amount of drying-out, the mean values of the ratio of dry to wet total produce being 0.628 and 0.673 respectively.

For the remaining experiments, the experimental error between plots for the ratio of dry to wet total produce of the samples may be used as an upper limit to the corresponding sampling error within plots. It may be mentioned that in exp. 3 the sampling variance of the dry/wet ratio was practically equal to the experimental variance, though there were significant differences between rows, columns and treatments, while in exp. 4 the sampling variance was less
than half the experimental variance. The results per plot for the other experiments are shown in Table IV.

Table IV. Experimental errors per cent per plot of the ratio of dry to wet total produce

Exp.                               1       2       5       6
Mean ratio dry/wet                 0.849   0.707   0.878   0.859
Experimental error % of dry/wet    2.91    4.44    2.26    2.59
Sampling error % of r              5.17    5.17    4.87    4.07

In exps. 1, 5 and 6 the percentage sampling variance of the dry/wet ratio cannot exceed about one-third of the percentage sampling variance of r, and may be substantially less. In exp. 2, in which the amount of drying was much greater, the additional loss of information was probably also greater.

If the dry/wet ratios are very variable the question arises whether the use of some average correction figure will improve matters. Clearly such an average can only be properly employed if the dry/wet ratios are unaffected by the treatments, for if they are so affected the use of an average will distort the treatment differences. Actually four of the six experiments considered showed clear differences between treatments, and exp. 4 also falls into this category if the clover-ryegrass plots are included. The use of an average correction figure is therefore inadvisable.

Such distortion can of course be avoided by using a separate correction factor for each treatment, based on the average dry/wet ratio for all replicates of that treatment. There is no point in following this course, however, since the results will be almost the same as if each plot is corrected separately. The only effect will be to give a spuriously low estimate of experimental error.

COMPARISON OF GRAB WITH RANDOM SAMPLES

Direct comparison of the sampling errors of r for random and grab samples can be made in only two of the experiments in Table II, nos. 5 and 7. In exp. [?] grab sampling was somewhat more accurate, though not significantly so, while in exp. [?] there was little to choose between the two methods.

A less direct comparison may be obtained by calculating
the between-plot errors of the yields of grain given by the two methods (after elimination of treatment and block effects). Some allowance must be made for the difference in the amounts which were sampled under the two methods. In exp. 1, for example, two random samples each of about 954 g total produce were taken, as against one grab sample of 794 g. The sampling and experimental errors per cent per plot for the random samples were 5.15 and 8.67 respectively. The estimated experimental error per cent, if only one random sample of 794 g had been taken, is [formula illegible] and this figure is comparable with the experimental error per cent for grab sampling. The adjustment for the size of the individual sample in the above formula is open to question, since a sample of twice the size, taken from the same sheaf, would probably not be twice as accurate. Since the grab samples were usually the larger, the adjustment possibly favours the random samples slightly.

Table V. Experimental errors per cent per plot of the yields of grain

Method of                      Exp.
sampling    1      2      3      4      5      6      7      Mean
Random      10.8   7.6    15.5   17.2   8.3    14.7   6.8    11.6
Grab        7.9    10.4   12.8   17.1   9.3    14.0   6.9    11.2

Random samples gave a smaller experimental error in two experiments, grab samples in three, while the remaining two experiments showed practically identical results. Thus grab sampling appears to be no less accurate than random sampling.

The mean yields of grain obtained by random and grab sampling are shown in Table VI. The right-hand column shows the difference between the yields from grab and random sampling as a percentage of the yield given by random sampling. Except in exp. 4 the grab samples gave slightly higher yields of grain than the random samples. The biases are, however, in no case large. Both random and grab sampling gave a positive bias in yields as compared with full harvesting in exps. 1 and 3. In exp. [?] the difference is due almost entirely to a greater drying out of the total produce than of the samples.

Table VI. Comparison of mean yields by random and grab sampling

                       Grain: cwt per acre
Exp.  Crop      Full         Random     Grab       % bias
                harvesting   sampling   sampling   in grab
1     Wheat     30.59        32.36      34.33      +6.1
2     Wheat     -            29.88      31.55      +5.6
3     Wheat     18.57        20.83      21.45      +3.0
4     Barley    -            25.07      23.95      -4.5
5     Barley    -            14.74      15.14      +2.7
6     Barley    -            5.44       5.57       +2.4
7     Oats      -            33.50      33.60      +0.3

A more detailed examination of the treatment means in these experiments shows close agreement between results from random and grab sampling.

DISCUSSION OF RESULTS

Owing to the uncertainty about the sampling variance of the ratio of dry to wet total produce, the total sampling error involved in sampling for the ratio of grain to total produce cannot be fixed definitely for these experiments. An average figure of 14.5% per metre of row length for the ratio of dry grain to wet total produce is probably not far wrong. (This represents an increase in the average sampling variance in Table II by one-third to allow for the sampling variance of the dry/wet ratio.) With this figure, a sample of 25 m per plot gives a sampling error of 2.9% per plot (14.5/√25). With an experimental error of 7.5% per plot, the loss of information is about 13%, i.e. an amount which could be more than offset by adding an extra replication to an experiment with between four and seven replications. This amount of sampling represents about [?]% of the total produce in a 1/40 acre plot.

This figure is subject to qualification according to the conditions of the experiment. If the crop is fairly dry and free from weeds or undergrowth when it is being sampled, or if the plot size is only 1/100 acre, some reduction may perhaps be allowed in the number of metres sampled, though it would be advisable to collect more experimental data on this point.

(Journ. Agric. Sci. xxx)

The size of the samples taken in these experiments was probably too large. It might be better to take not more than 2 m of row length for each sample. It is easy to calculate, for any particular experiment, the fraction of
a sheaf necessary to secure such samples. For example, in a 1/60 acre plot, sown at 7 in., there are about 380 m of row, and if there are twenty sheaves per plot, each sample should be (2 × 20)/380 of a sheaf, i.e. about one-tenth.

Apart from the small positive bias in the yield of grain, there appears to be no objection to grab sampling as carried out in these experiments. In practice, one might take first two random samples of about [?] m each, and then a further eight grab samples of about the same size. The random samples would serve as a check on the others, and the whole process would require considerably less time than ten random samples.

Although this method has not proved as accurate as was anticipated, considerable time and labour still appear to be saved as compared with the previous method of random sampling from the standing crop without weighing total produce. If the figure of 28.6% per metre (from Table III) is taken as a comparable value for the sampling error of the yield of grain by the latter method, about one-quarter of the number of samples is needed if total produce is weighed and the samples are taken from the sheaves, since (14.5/28.6)² ≈ 0.26. If most of the sampling is done by grabbing, this can be done in little more than one-eighth of the time (in exp. 5, for example, seventy-two random samples from the standing crop required 10 man-hours, including bagging and labelling, while an equal number of grab samples took [?] man-hours). As far as can be judged, the time taken to weigh the samples and total produce is not more than twice the time required to select, bag and label an equal number of grab samples. Thus the field operations require only about three-eighths of the time taken by the previous method. There is also a considerable gain in time during threshing, which is of importance, as with the machines at present available threshing occupies a large proportion of the total time.

The new method is somewhat more exposed to weather hazards at the time of sampling. For instance, if rain
falls after total produce has been weighed and while the samples are being taken, the samples which have already been drawn must be protected from the rain, while the total produce may have to be weighed again on plots which have not yet been sampled. On the other hand, it is extremely difficult to sample from the standing crop if it is badly lodged, whereas such a crop presents no special difficulty once it is in the sheaves.

SUMMARY

In a number of cereal experiments, three on wheat, three on barley and one on oats, the yields of grain and straw per plot were estimated by weighing the total produce on each plot and taking samples, usually from the sheaves, to estimate the ratio of grain to total produce. This paper discusses the sampling errors of this method.

The method proved considerably less accurate than was anticipated from previous calculations made by Yates & Zacopanay. Amongst the reasons which are suggested to account for this are the larger sizes of plot and sampling unit used in these experiments and the additional variability introduced by the presence of weeds, undergrowth and moisture. Nevertheless, the method appears to be substantially superior to the older method of cutting small areas from the standing crop, without weighing total produce, only about one-quarter of the number of samples being required to obtain results of equal precision.

The samples were taken both by an approximately random process and by grabbing a few shoots haphazardly from each of several sheaves. The grab samples gave on the whole a slightly higher yield of grain, the greatest positive bias being 6%, but were otherwise just as accurate as the random samples. Since the grab samples can be selected and bagged in about one-third of the time required for random samples, their use is recommended for the majority of the samples required in any experiment.

The validity of an approximate formula for calculating the variance of a ratio (in the present instance the ratio of
grain to total produce) is discussed briefly in an appendix.

APPENDIX

The validity of an approximate formula for the variance of a ratio

To avoid the labour of calculating the actual ratios of grain to total produce, Yates & Zacopanay used an approximate formula expressing the variance of the ratio in terms of the variances and covariance of grain and total produce, most of which they had already calculated for the earlier part of their paper. Let g, t denote the grain and total produce yields of a sample and r their ratio, and let ḡ, t̄ and r̄ be the corresponding means over all samples taken. Then as a first approximation [formula illegible].

The most important condition required for this approximation to be satisfactory is that the standard errors of g and t should be small relative to their mean values, though so far as I am aware, the limits within which the formula applies have never been investigated. The standard errors of g and t were nearly all under 20% in the experiments which Yates & Zacopanay used, but in the 1/80 acre plots given in Table III they were sometimes as high as 30%, so that some investigation is needed of the accuracy of the approximation under these conditions.

To proceed to a second approximation, the form of the joint distribution of g and t must be specified. If we assume that they follow the bivariate normal distribution, and write
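
A first approximation of the kind described in this appendix is commonly written as V(r)/r̄² ≈ V(g)/ḡ² + V(t)/t̄² − 2 cov(g, t)/(ḡ t̄). This is the standard first-order form and is assumed here; it is not necessarily the exact expression printed in the paper. The sketch below compares it with the relative variance computed directly from the individual ratios. The function names and the sample weights are hypothetical and chosen only for illustration.

```python
from math import fsum


def mean(xs):
    """Arithmetic mean of a sequence of numbers."""
    return fsum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance (divisor n - 1); cov(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return fsum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def rel_var_ratio_approx(g, t):
    """Assumed first-order approximation to V(r) / rbar^2 for r = g / t."""
    gbar, tbar = mean(g), mean(t)
    return (cov(g, g) / gbar ** 2
            + cov(t, t) / tbar ** 2
            - 2.0 * cov(g, t) / (gbar * tbar))


def rel_var_ratio_direct(g, t):
    """Relative variance of the individual ratios r_i = g_i / t_i."""
    r = [gi / ti for gi, ti in zip(g, t)]
    return cov(r, r) / mean(r) ** 2


if __name__ == "__main__":
    # Hypothetical sample weights in grams, invented for illustration only.
    grain = [310.0, 295.0, 342.0, 280.0, 325.0]
    total = [905.0, 880.0, 990.0, 850.0, 940.0]
    print("approximate:", rel_var_ratio_approx(grain, total))
    print("direct:     ", rel_var_ratio_direct(grain, total))
```

When the coefficients of variation of g and t are small the two figures should agree closely; running the comparison on samples with coefficients of variation near 30% gives a rough check of how far the first approximation can be trusted under the conditions discussed above.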
