Ebook Introductory statistics (9th edition) Part 2


Part 2 of the book Introductory Statistics covers the following topics: inferences for two population means, inferences for population standard deviations, inferences for population proportions, chi-square procedures, descriptive methods in regression and correlation, inferential methods in regression and correlation, and analysis of variance.

CHAPTER 10
Inferences for Two Population Means

CHAPTER OUTLINE
10.1 The Sampling Distribution of the Difference between Two Sample Means for Independent Samples
10.2 Inferences for Two Population Means, Using Independent Samples: Standard Deviations Assumed Equal
10.3 Inferences for Two Population Means, Using Independent Samples: Standard Deviations Not Assumed Equal
10.4 The Mann–Whitney Test*
10.5 Inferences for Two Population Means, Using Paired Samples
10.6 The Paired Wilcoxon Signed-Rank Test*
10.7 Which Procedure Should Be Used?*

CHAPTER OBJECTIVES
In Chapters 8 and 9, you learned how to obtain confidence intervals and perform hypothesis tests for one population mean. Frequently, however, inferential statistics is used to compare the means of two or more populations. For example, we might want to perform a hypothesis test to decide whether the mean age of buyers of new domestic cars is greater than the mean age of buyers of new imported cars, or we might want to find a confidence interval for the difference between the two mean ages.

Broadly speaking, in this chapter we examine two types of inferential procedures for comparing the means of two populations. The first type applies when the samples from the two populations are independent, meaning that the sample selected from one of the populations has no effect or bearing on the sample selected from the other population. The second type applies when the samples from the two populations are paired. A paired sample may be appropriate when there is a natural pairing of the members of the two populations, such as husband and wife.

CASE STUDY: HRT and Cholesterol
Older women most frequently die from coronary heart disease (CHD). Low serum levels of high-density-lipoprotein (HDL) cholesterol and high serum levels of low-density-lipoprotein (LDL) cholesterol are indicative of high risk for death from CHD. Some observational studies of postmenopausal women have shown that women taking hormone replacement therapy (HRT) have a lower occurrence of CHD than women who are not taking HRT.

Researchers at the Washington University School of Medicine and the University of Colorado Health Sciences Center received funding from a Claude D. Pepper Older Americans Independence Center award and from the National Institutes of Health to conduct a 9-month designed experiment to examine the effects of HRT on the serum lipid and lipoprotein levels of women 75 years old or older. The researchers, E. Binder et al., published their results in the paper "Effects of Hormone Replacement Therapy on Serum Lipids in Elderly Women" (Annals of Internal Medicine, Vol. 134, Issue 9, pp. 754–760).

The study was randomized, double blind, and placebo controlled, and consisted of 59 sedentary women. Of these 59 women, 39 were assigned to the HRT group and 20 to the placebo group. Results of the measurements of lipoprotein levels, in milligrams per deciliter (mg/dL), in the two groups are displayed in the following table. The change is between the end-of-study and baseline measurements. After studying the inferential methods discussed in this chapter, you will be able to conduct statistical analyses to examine the effects of HRT on cholesterol levels.

                           HRT group (n = 39)         Placebo group (n = 20)
Variable                   Mean change  Std. dev.     Mean change  Std. dev.
HDL cholesterol level           8.1       10.5             2.4        4.3
LDL cholesterol level         −18.2       26.5            −2.2       12.2

10.1 The Sampling Distribution of the Difference between Two Sample Means for Independent Samples

In this section, we lay the groundwork for making statistical inferences to compare the means of two populations. The methods that we first consider require not only that the samples selected from the two populations be simple random samples, but also that they be independent samples. That is, the sample selected from one of the populations has no effect or bearing on the
sample selected from the other population. With independent simple random samples, each possible pair of samples (one from one population and one from the other) is equally likely to be the pair of samples selected. Example 10.1 provides an unrealistically simple illustration of independent samples, but it will help you understand the concept.

EXAMPLE 10.1 Introducing Independent Random Samples
Males and Females. Let's consider two small populations, one consisting of three men and the other of four women:
Male population: Tom, Dick, Harry
Female population: Cindy, Barbara, Dani, Nancy
Suppose that we take a sample of size 2 from the male population and a sample of size 3 from the female population.
a. List the possible pairs of independent samples.
b. If the samples are selected at random, determine the chance of obtaining any particular pair of independent samples.

Solution. For convenience, we use the first letter of each name as an abbreviation for the actual name.
a. In Table 10.1, the possible samples of size 2 from the male population are listed on the left; the possible samples of size 3 from the female population are listed on the right.

TABLE 10.1 Possible samples of size 2 from the male population and possible samples of size 3 from the female population
Male samples of size 2:   T,D   T,H   D,H
Female samples of size 3: C,B,D   C,B,N   C,D,N   B,D,N

To obtain the possible pairs of independent samples, we list each possible male sample of size 2 with each possible female sample of size 3, as shown in Table 10.2. There are 12 possible pairs of independent samples of two men and three women.

TABLE 10.2 Possible pairs of independent samples of two men and three women
Male sample of size 2   Female sample of size 3
T, D                    C, B, D
T, D                    C, B, N
T, D                    C, D, N
T, D                    B, D, N
T, H                    C, B, D
T, H                    C, B, N
T, H                    C, D, N
T, H                    B, D, N
D, H                    C, B, D
D, H                    C, B, N
D, H                    C, D, N
D, H                    B, D, N

b. For independent simple random samples, each of the 12 possible pairs of samples shown in Table 10.2 is equally likely to be the pair selected. Therefore the chance of obtaining any particular pair of independent samples is 1/12.

The previous example provides a concrete illustration of independent samples and emphasizes that, for independent simple random samples of any given sizes, each possible pair of independent samples is equally likely to be the one selected. In practice, we neither obtain the number of possible pairs of independent samples nor explicitly compute the chance of selecting a particular pair of independent samples. But these concepts underlie the methods we use.

Note: Recall that, when we say random sample, we mean simple random sample unless specifically stated otherwise. Likewise, when we say independent random samples, we mean independent simple random samples, unless specifically stated otherwise.

Comparing Two Population Means, Using Independent Samples
We can now examine the process for comparing the means of two populations based on independent samples.

EXAMPLE 10.2 Comparing Two Population Means, Using Independent Samples
Faculty Salaries. The American Association of University Professors (AAUP) conducts salary studies of college professors and publishes its findings in AAUP Annual Report on the Economic Status of the Profession. Suppose that we want to decide whether the mean salaries of college faculty in private and public institutions are different.
a. Pose the problem as a hypothesis test.
b. Explain the basic idea for carrying out the hypothesis test.
c. Suppose that 35 faculty members from private institutions and 30 faculty members from public institutions are randomly and independently selected and that their salaries are as shown in Table 10.3, in thousands of dollars rounded to the nearest hundred. Discuss the use of these data to make a decision concerning the hypothesis test.

TABLE 10.3 Annual salaries
($1000s) for 35 faculty members in private institutions and 30 faculty members in public institutions

Sample 1 (private institutions):
 87.3  75.9 108.8  83.9  56.6  99.2  54.9
 73.1  90.6  89.3  84.9  84.4 129.3  98.8
148.1 132.4  75.0  98.2 106.3 131.5  41.4
115.6  60.6  64.6  59.9 105.4  74.6  82.0
 87.2  45.1 116.6 106.7  66.0  99.6  53.0

Sample 2 (public institutions):
 49.9 105.7 116.1  40.3 123.1  79.3
 72.5  57.1  50.7  69.9  40.1  71.7
 73.9  92.5  99.9  95.1  57.9  97.5
 44.9  31.5  49.5  55.9  66.9  56.9
 75.9 103.9  60.3  80.1  89.7  86.7

Solution
a. We first note that we have one variable (salary) and two populations (all faculty in private institutions and all faculty in public institutions). Let the two populations in question be designated Populations 1 and 2, respectively:
Population 1: All faculty in private institutions
Population 2: All faculty in public institutions
Next, we denote the means of the variable "salary" for the two populations μ1 and μ2, respectively:
μ1 = mean salary of all faculty in private institutions;
μ2 = mean salary of all faculty in public institutions.
Then we can state the hypothesis test we want to perform as
H0: μ1 = μ2 (mean salaries are the same)
Ha: μ1 ≠ μ2 (mean salaries are different).
b. Roughly speaking, we can carry out the hypothesis test as follows. Independently and randomly take a sample of faculty members from private institutions (Population 1) and a sample of faculty members from public institutions (Population 2). Compute the mean salary, x̄1, of the sample from private institutions and the mean salary, x̄2, of the sample from public institutions. Reject the null hypothesis if the sample means, x̄1 and x̄2, differ by too much; otherwise, do not reject the null hypothesis.
c. This process is depicted in Figure 10.1. The means of the two samples in Table 10.3 are, respectively,
x̄1 = Σxi/n1 = 3086.8/35 = 88.19
and
x̄2 = Σxi/n2 = 2195.4/30 = 73.18.

FIGURE 10.1 Process for comparing two population means, using independent samples: from Population 1 and Population 2, take a sample, compute x̄1 and x̄2, compare the two sample means, and make a decision.
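The computation of the two sample means just described can be sketched in a few lines of Python; the totals 3086.8 and 2195.4 are the sums of the two salary samples as reported in the text.

```python
# Sample totals and sizes from Example 10.2 (salaries in $1000s).
sum_private, n_private = 3086.8, 35
sum_public, n_public = 2195.4, 30

# Compute the two sample means, as in the process of Figure 10.1.
xbar1 = sum_private / n_private
xbar2 = sum_public / n_public

# The observed difference that the hypothesis test must judge.
diff = xbar1 - xbar2

print(round(xbar1, 2))  # 88.19
print(round(xbar2, 2))  # 73.18
print(round(diff, 2))   # 15.01
```

Whether a difference of this size can be attributed to sampling error is exactly the question the rest of the section takes up.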
The question now is, can the difference of 15.01 ($15,010) between these two sample means reasonably be attributed to sampling error, or is the difference large enough to indicate that the two populations have different means? To answer that question, we need to know the distribution of the difference between two sample means, that is, the sampling distribution of the difference between two sample means. We examine that sampling distribution in this section and complete the hypothesis test in the next section.

We can also compare two population means by finding a confidence interval for the difference between them. One important aspect of that inference is the interpretation of the confidence interval. For a variable of two populations, say, Population 1 and Population 2, let μ1 and μ2 denote the means of that variable on those two populations, respectively. To interpret confidence intervals for the difference, μ1 − μ2, between the two population means, considering three cases is helpful.

Case 1: The endpoints of the confidence interval are both positive numbers. To illustrate, suppose that a 95% confidence interval for μ1 − μ2 is from 3 to 5. Then we can be 95% confident that μ1 − μ2 lies somewhere between 3 and 5. Equivalently, we can be 95% confident that μ1 is somewhere between 3 and 5 greater than μ2.

Case 2: The endpoints of the confidence interval are both negative numbers. To illustrate, suppose that a 95% confidence interval for μ1 − μ2 is from −5 to −3. Then we can be 95% confident that μ1 − μ2 lies somewhere between −5 and −3. Equivalently, we can be 95% confident that μ1 is somewhere between 3 and 5 less than μ2.

Case 3: One endpoint of the confidence interval is negative and the other is positive. To illustrate, suppose that a 95% confidence interval for μ1 − μ2 is from −3 to 5. Then we can be 95% confident that μ1 − μ2 lies somewhere between −3 and 5.
Equivalently, we can be 95% confident that μ1 is somewhere between 3 less than and 5 more than μ2.

We present real examples throughout the chapter to further help you understand how to interpret confidence intervals for the difference between two population means. For instance, in the next section, we find and interpret a 95% confidence interval for the difference between the mean salaries of faculty in private and public institutions.

The Sampling Distribution of the Difference between Two Sample Means for Independent Samples
We need to discuss the notation used for parameters and statistics when we are analyzing two populations. Let's call the two populations Population 1 and Population 2. Then, as indicated in the previous example, we use a subscript 1 when referring to parameters or statistics for Population 1 and a subscript 2 when referring to them for Population 2. See Table 10.4.

TABLE 10.4 Notation for parameters and statistics when considering two populations
                               Population 1   Population 2
Population mean                μ1             μ2
Population standard deviation  σ1             σ2
Sample mean                    x̄1             x̄2
Sample standard deviation      s1             s2
Sample size                    n1             n2

Armed with this notation, we describe in Key Fact 10.1 the sampling distribution of the difference between two sample means. Understanding Key Fact 10.1 is aided by recalling Key Fact 7.2 on page 310.

KEY FACT 10.1 The Sampling Distribution of the Difference between Two Sample Means for Independent Samples
Suppose that x is a normally distributed variable on each of two populations. Then, for independent samples of sizes n1 and n2 from the two populations:
• μ_{x̄1−x̄2} = μ1 − μ2,
• σ_{x̄1−x̄2} = √(σ1²/n1 + σ2²/n2), and
• x̄1 − x̄2 is normally distributed.

In words, the first bulleted item says that the mean of all possible differences between the two sample means equals the difference between the two population means (i.e., the difference between sample means is an unbiased estimator of the difference between
population means). The second bulleted item indicates that the standard deviation of all possible differences between the two sample means equals the square root of the sum of the population variances, each divided by the corresponding sample size.

The formulas for the mean and standard deviation of x̄1 − x̄2 given in the first and second bulleted items, respectively, hold regardless of the distributions of the variable on the two populations. The assumption that the variable is normally distributed on each of the two populations is needed only to conclude that x̄1 − x̄2 is normally distributed (third bulleted item) and, because of the central limit theorem, that too holds approximately for large samples, regardless of distribution type.

Under the conditions of Key Fact 10.1, the standardized version of x̄1 − x̄2,
z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2),
has the standard normal distribution. Using this fact, we can develop hypothesis-testing and confidence-interval procedures for comparing two population means when the population standard deviations are known.† However, because population standard deviations are usually unknown, we won't discuss those procedures. Instead, in Sections 10.2 and 10.3, we concentrate on the more usual situation where the population standard deviations are unknown.

† We call these procedures the two-means z-test and the two-means z-interval procedure, respectively. The two-means z-test is also known as the two-sample z-test and the two-variable z-test. Likewise, the two-means z-interval procedure is also known as the two-sample z-interval procedure and the two-variable z-interval procedure.

Exercises 10.1

Understanding the Concepts and Skills

10.1 Give an example of interest to you for comparing two population means. Identify the variable under consideration and the two populations.
10.2 Define the phrase independent samples.
10.3 Consider the quantities μ1, σ1, x̄1, s1, μ2, σ2,
x̄2, and s2.
a. Which quantities represent parameters and which represent statistics?
b. Which quantities are fixed numbers and which are variables?
10.4 Discuss the basic strategy for performing a hypothesis test to compare the means of two populations, based on independent samples.
10.5 Why do you need to know the sampling distribution of the difference between two sample means in order to perform a hypothesis test to compare two population means?
10.6 Identify the assumption for using the two-means z-test and the two-means z-interval procedure that renders those procedures generally impractical.
10.7 Faculty Salaries. Suppose that, in Example 10.2 on page 435, you want to decide whether the mean salary of faculty in private institutions is greater than the mean salary of faculty in public institutions. State the null and alternative hypotheses for that hypothesis test.
10.8 Faculty Salaries. Suppose that, in Example 10.2 on page 435, you want to decide whether the mean salary of faculty in private institutions is less than the mean salary of faculty in public institutions. State the null and alternative hypotheses for that hypothesis test.
In Exercises 10.9–10.14, hypothesis tests are proposed. For each hypothesis test,
a. identify the variable,
b. identify the two populations,
c. determine the null and alternative hypotheses, and
d. classify the hypothesis test as two tailed, left tailed, or right tailed.
10.9 Children of Diabetic Mothers. Samples of adolescent offspring of diabetic mothers (ODM) and nondiabetic mothers (ONM) were taken by N. Cho et al. and evaluated for potential differences in vital measurements, including blood pressure and glucose tolerance. The study was published in the paper "Correlations Between the Intrauterine Metabolic Environment and Blood Pressure in Adolescent Offspring of Diabetic Mothers" (Journal of Pediatrics, Vol. 136, Issue 5, pp. 587–592). A hypothesis test is to be performed to decide whether the mean systolic blood pressure of ODM adolescents exceeds
that of ONM adolescents.
10.10 Spending at the Mall. An issue of USA TODAY discussed the amounts spent by teens and adults at shopping malls. Suppose that we want to perform a hypothesis test to decide whether the mean amount spent by teens is less than the mean amount spent by adults.
10.11 Driving Distances. Data on household vehicle miles of travel (VMT) are compiled annually by the Federal Highway Administration and are published in National Household Travel Survey, Summary of Travel Trends. A hypothesis test is to be performed to decide whether a difference exists in last year's mean VMT for households in the Midwest and South.
10.12 Age of Car Buyers. In the introduction to this chapter, we mentioned comparing the mean age of buyers of new domestic cars to the mean age of buyers of new imported cars. Suppose that we want to perform a hypothesis test to decide whether the mean age of buyers of new domestic cars is greater than the mean age of buyers of new imported cars.
10.13 Neurosurgery Operative Times. An Arizona State University professor, R. Jacobowitz, Ph.D., in consultation with G. Vishteh, M.D., and other neurosurgeons, obtained data on operative times, in minutes, for both a dynamic system (Z-plate) and a static system (ALPS plate). They wanted to perform a hypothesis test to decide whether the mean operative time is less with the dynamic system than with the static system.
10.14 Wing Length. D. Cristol et al. published results of their studies of two subspecies of dark-eyed juncos in the paper "Migratory Dark-Eyed Juncos, Junco hyemalis, Have Better Spatial Memory and Denser Hippocampal Neurons Than Nonmigratory Conspecifics" (Animal Behaviour, Vol. 66, Issue 2, pp. 317–328). One of the subspecies migrates each year, and the other does not migrate. A hypothesis test is to be performed to decide whether the mean wing lengths for the two subspecies (migratory and nonmigratory) are different.
In each of Exercises 10.15–10.20, we have presented a confidence interval (CI) for
the difference, μ1 − μ2, between two population means. Interpret each confidence interval.
10.15 95% CI is from 15 to 20.
10.16 95% CI is from −20 to −15.
10.17 90% CI is from −10 to −5.
10.18 90% CI is from 5 to 10.
10.19 99% CI is from −20 to 15.
10.20 99% CI is from −10 to 10.
10.21 A variable of two populations has a mean of 40 and a standard deviation of 12 for one of the populations and a mean of 40 and a standard deviation of 6 for the other population.
a. For independent samples of sizes 9 and 4, respectively, find the mean and standard deviation of x̄1 − x̄2.
b. Must the variable under consideration be normally distributed on each of the two populations for you to answer part (a)? Explain your answer.
c. Can you conclude that the variable x̄1 − x̄2 is normally distributed? Explain your answer.
10.22 A variable of two populations has a mean of 7.9 and a standard deviation of 5.4 for one of the populations and a mean of 7.1 and a standard deviation of 4.6 for the other population.
a. For independent samples of sizes and 6, respectively, find the mean and standard deviation of x̄1 − x̄2.
b. Must the variable under consideration be normally distributed on each of the two populations for you to answer part (a)? Explain your answer.
c. Can you conclude that the variable x̄1 − x̄2 is normally distributed? Explain your answer.
10.23 A variable of two populations has a mean of 40 and a standard deviation of 12 for one of the populations and a mean of 40 and a standard deviation of 6 for the other population. Moreover, the variable is normally distributed on each of the two populations.
a. For independent samples of sizes 9 and 4, respectively, determine the mean and standard deviation of x̄1 − x̄2.
b. Can you conclude that the variable x̄1 − x̄2 is normally distributed?
Explain your answer.
c. Determine the percentage of all pairs of independent samples of sizes 9 and 4, respectively, from the two populations with the property that the difference x̄1 − x̄2 between the sample means is between −10 and 10.
10.24 A variable of two populations has a mean of 7.9 and a standard deviation of 5.4 for one of the populations and a mean of 7.1 and a standard deviation of 4.6 for the other population. Moreover, the variable is normally distributed on each of the two populations.
a. For independent samples of sizes and 6, respectively, determine the mean and standard deviation of x̄1 − x̄2.
b. Can you conclude that the variable x̄1 − x̄2 is normally distributed? Explain your answer.
c. Determine the percentage of all pairs of independent samples of sizes and 16, respectively, from the two populations with the property that the difference x̄1 − x̄2 between the sample means is between −3 and

Extending the Concepts and Skills

10.25 Simulation. To obtain the sampling distribution of the difference between two sample means for independent samples, as stated in Key Fact 10.1 on page 437, we need to know that, for independent observations, the difference of two normally distributed variables is also a normally distributed variable. In this exercise, you are to perform a computer simulation to make that fact plausible.
a. Simulate 2000 observations from a normally distributed variable with a mean of 100 and a standard deviation of 16.
b. Repeat part (a) for a normally distributed variable with a mean of 120 and a standard deviation of 12.
c. Determine the difference between each pair of observations in parts (a) and (b).
d. Obtain a histogram of the 2000 differences found in part (c). Why is the histogram bell shaped?
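The simulation in Exercise 10.25 can be carried out with a short NumPy sketch (the seed is arbitrary, and any plotting library can be used for the histogram in part (d)):

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed for reproducibility

# Parts (a) and (b): 2000 observations from each normal variable.
x = rng.normal(loc=100, scale=16, size=2000)
y = rng.normal(loc=120, scale=12, size=2000)

# Part (c): the difference between each pair of observations.
d = x - y

# Part (d): the histogram of d is bell shaped because the difference of
# independent normal variables is again normal, here with
# mean 100 - 120 = -20 and SD sqrt(16**2 + 12**2) = 20.
print(round(d.mean(), 1), round(d.std(), 1))
```

With 2000 observations, the sample mean and standard deviation of the differences should land close to the theoretical values of −20 and 20.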
10.26 Simulation. In this exercise, you are to perform a computer simulation to illustrate the sampling distribution of the difference between two sample means for independent samples (Key Fact 10.1 on page 437).
a. Simulate 1000 samples of size 12 from a normally distributed variable with a mean of 640 and a standard deviation of 70. Obtain the sample mean of each of the 1000 samples.
b. Simulate 1000 samples of size 15 from a normally distributed variable with a mean of 715 and a standard deviation of 150. Obtain the sample mean of each of the 1000 samples.
c. Obtain the difference, x̄1 − x̄2, for each of the 1000 pairs of sample means obtained in parts (a) and (b).
d. Obtain the mean, the standard deviation, and a histogram of the 1000 differences found in part (c).
e. Theoretically, what are the mean, standard deviation, and distribution of all possible differences, x̄1 − x̄2?
f. Compare your answers from parts (d) and (e).

10.2 Inferences for Two Population Means, Using Independent Samples: Standard Deviations Assumed Equal†

In Section 10.1, we laid the groundwork for developing inferential methods to compare the means of two populations based on independent samples. In this section, we develop such methods when the two populations have equal standard deviations; in Section 10.3, we develop such methods without that requirement.

† We recommend covering the pooled t-procedures discussed in this section because they provide valuable motivation for one-way ANOVA.

Hypothesis Tests for the Means of Two Populations with Equal Standard Deviations, Using Independent Samples
We now develop a procedure for performing a hypothesis test based on independent samples to compare the means of two populations with equal but unknown standard deviations. We must first find a test statistic for this test. In doing so, we assume that the variable under consideration is normally distributed on each population.

Let's use σ to denote the common standard deviation of the two populations. We know from Key Fact 10.1 on page 437 that, for independent samples, the standardized version of x̄1 − x̄2,
z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2),
has the standard normal distribution. Replacing σ1 and σ2 with their common value σ and using some algebra, we obtain the variable
z = ((x̄1 − x̄2) − (μ1 − μ2)) / (σ√(1/n1 + 1/n2)).   (10.1)
However, we cannot use this variable as a basis for the required test statistic because σ is unknown. Consequently, we need to use sample information to estimate σ, the unknown population standard deviation. We do so by first estimating the unknown population variance, σ². The best way to do that is to regard the sample variances, s1² and s2², as two estimates of σ² and then pool those estimates by weighting them according to sample size (actually by degrees of freedom). Thus our estimate of σ² is
sp² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2),
and hence that of σ is
sp = √(((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)).
The subscript "p" stands for "pooled," and the quantity sp is called the pooled sample standard deviation. Replacing σ in Equation (10.1) with its estimate, sp, we get the variable
((x̄1 − x̄2) − (μ1 − μ2)) / (sp√(1/n1 + 1/n2)),
which we can use as the required test statistic. Although the variable in Equation (10.1) has the standard normal distribution, this one has a t-distribution, with which you are already familiar.

KEY FACT 10.2 Distribution of the Pooled t-Statistic
Suppose that x is a normally distributed variable on each of two populations and that the population standard deviations are equal. Then, for independent samples of sizes n1 and n2 from the two populations, the variable
t = ((x̄1 − x̄2) − (μ1 − μ2)) / (sp√(1/n1 + 1/n2))
has the t-distribution with df = n1 + n2 − 2.

In light of Key Fact 10.2, for a hypothesis test that has null hypothesis H0: μ1 = μ2 (population means are equal), we can use the variable
t = (x̄1 − x̄2) / (sp√(1/n1 + 1/n2))
as the test statistic and obtain the critical value(s) or P-value from the t-table, Table IV in Appendix A. We call this hypothesis-testing procedure the pooled t-test.† Procedure 10.1 provides a step-by-step method for performing a pooled t-test by using either the critical-value approach or the P-value approach.

PROCEDURE 10.1 Pooled t-Test
Purpose: To perform a hypothesis test to compare two population means, μ1 and μ2.
Assumptions:
1. Simple random samples
2. Independent samples
3. Normal populations or large samples
4. Equal population standard deviations
Step 1: The null hypothesis is H0: μ1 = μ2, and the alternative hypothesis is Ha: μ1 ≠ μ2 (two tailed), Ha: μ1 < μ2 (left tailed), or Ha: μ1 > μ2 (right tailed).
Step 2: Decide on the significance level, α.
Step 3: Compute the value of the test statistic
t = (x̄1 − x̄2) / (sp√(1/n1 + 1/n2)),
where
sp = √(((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)).
Denote the value of the test statistic t0.
Step 4 (critical-value approach): The critical value(s) are ±tα/2 (two tailed), −tα (left tailed), or tα (right tailed), with df = n1 + n2 − 2. Use Table IV to find the critical value(s).
Step 4 (P-value approach): The t-statistic has df = n1 + n2 − 2. Use Table IV to estimate the P-value, or obtain it exactly by using technology.
Step 5 (critical-value approach): If the value of the test statistic falls in the rejection region, reject H0; otherwise, do not reject H0.
Step 5 (P-value approach): If P ≤ α, reject H0; otherwise, do not reject H0.
Step 6: Interpret the results of the hypothesis test.
Note: The hypothesis test is exact for normal populations and is approximately correct for large samples from nonnormal populations.

† The pooled t-test is also known as the two-sample t-test with equal variances assumed, the pooled two-variable t-test, and the pooled independent samples t-test.

Indexes for Case Studies & Biographical Sketches

Chapter  Case Study (pages)                                  Biographical Sketch (page)
1        Greatest American Screen Legends (2, 31)            Florence Nightingale (31)
2        25 Highest Paid Women (34, 87)                      Adolphe Quetelet (88)
3        U.S. Presidential Election (89, 142)                John Tukey (142)
4        Texas Hold'em (144, 209)                            Andrei Kolmogorov (210)
5        Aces Wild on the Sixth at Oak Hill (211, 251)       James Bernoulli (252)
6        Chest Sizes of Scottish Militiamen (253, 295)       Carl Friedrich Gauss (295)
7        The Chesapeake and Ohio Freight Study (296, 320)    Pierre-Simon Laplace (320)
8        The "Chips Ahoy! 1,000 Chips Challenge" (322, 357)  William Gosset (357)
9        Gender and Sense of Direction (358, 430)            Jerzy Neyman (431)
10       HRT and Cholesterol (432, 509)                      Gertrude Cox (510)
11       Speaker Woofer Driver Manufacturing (511, 543)      W. Edwards Deming (543)
12       Healthcare in the United States (544, 578)          Abraham de Moivre (578)
13       Eye and Hair Color (580, 625)                       Karl Pearson (625)
14       Shoe Size and Height (628, 666)                     Adrien Legendre (666)
15       Shoe Size and Height (668, 713)                     Sir Francis Galton (714)
16       Partial Ceramic Crowns (715, 758)                   Sir Ronald Fisher (759)

Statistically Significant
Statistical reasoning and critical thinking are two key skills needed to effectively master statistics. Weiss uses detailed explanations, clever features, and a meticulous style to help develop these crucial competencies.

SIGNIFICANT PEDAGOGY
Weiss carefully explains the reasoning behind statistical concepts, skipping no detail to ensure the most thorough and accurate presentation. Procedure boxes aid in the learning of statistical procedures by presenting easy-to-follow, step-by-step methods for carrying them out.

PROCEDURE 9.2 One-Mean t-Test
Purpose: To perform a hypothesis test for a population mean, μ.
Assumptions:
1. Simple random sample
2. Normal population or large sample
3. σ unknown
Step 1: The null hypothesis is H0: μ = μ0, and the alternative hypothesis is Ha: μ ≠ μ0 (two tailed), Ha: μ < μ0 (left tailed), or Ha: μ > μ0 (right tailed).
Step 2: Decide on the significance level, α.
Step 3: Compute the value of the test statistic
t = (x̄ − μ0) / (s/√n)
and denote that value t0.

CRITICAL-VALUE APPROACH

Step 4  The critical value(s) are ±tα/2 (two tailed), −tα (left tailed), or tα (right tailed), with df = n − 1. Use Table IV to find the critical value(s). [Diagrams shade the rejection and nonrejection regions for each case.]

Step 5  If the value of the test statistic falls in the rejection region, reject H0; otherwise, do not reject H0.

OR

P-VALUE APPROACH

Step 4  The t-statistic has df = n − 1. Use Table IV to estimate the P-value, or obtain it exactly by using technology. [Diagrams shade the P-value area for the two-tailed, left-tailed, and right-tailed cases.]

Step 5  If P ≤ α, reject H0; otherwise, do not reject H0.

Step 6  Interpret the results of the hypothesis test.

Note: The hypothesis test is exact for normal populations and is approximately correct for large samples from nonnormal populations.

Parallel Critical-Value/P-Value Presentation allows both the flexibility to concentrate on one approach or the opportunity for greater depth by comparing the two approaches.

page 394

DEFINITION 3.15  z-Score

What Does It Mean? boxes clearly explain the meaning of definitions, formulas, and key facts.

What Does It Mean?
The z-score of an observation tells us the number of standard deviations that the observation is from the mean, that is, how far the observation is from the mean in units of standard deviation.

z-Score

For an observed value of a variable x, the corresponding value of the standardized variable z is called the z-score of the observation. The term standard score is often used instead of z-score.

A negative z-score indicates that the observation is below (less than) the mean, whereas a positive z-score indicates that the observation is above (greater than) the mean. Example 3.27 illustrates the calculation and interpretation of z-scores.

page 133

SIGNIFICANT EXERCISES

With more than 2,600 exercises, most using real data, this text provides a wealth of opportunities to apply knowledge and develop statistical literacy.

page 361

You Try It! accompanies most worked examples, pointing to a similar exercise to immediately check understanding.

Real-World Examples illustrate every concept in the text using detailed, compelling cases based on real-life situations. Many examples include Interpretation sections that explain the meaning and significance of the statistical results.

SIGNIFICANT ANALYSIS

StatCrunch™ integration with this text includes 64 StatCrunch Reports, each corresponding to examples covered in the book.

EXAMPLE 2.7  Pie Charts

FIGURE 2.2  Pie chart of the political party affiliation data in Table 2.1: Republican (45.0%), Democratic (32.5%), Other (22.5%).

Construct a pie chart of the political party affiliations of the students in Professor Weiss's introductory statistics class presented in Table 2.1 on page 40.

Solution  We apply Procedure 2.3.

Step 1  Obtain a relative-frequency distribution of the data by applying Procedure 2.2.

We obtained a relative-frequency distribution of the data in Example 2.6. See the columns of Table 2.3.

Step 2  Divide a disk into wedge-shaped pieces proportional to the relative frequencies.

Referring to the
second column of Table 2.3, we see that, in this case, we need to divide a disk into three wedge-shaped pieces that comprise 32.5%, 45.0%, and 22.5% of the disk. We do so by using a protractor and the fact that there are 360° in a circle. Thus, for instance, the first piece of the disk is obtained by marking off 117° (32.5% of 360°). See the three wedges in Fig. 2.2.

Step 3  Label the slices with the distinct values and their relative frequencies.

Referring again to the relative-frequency distribution in Table 2.3, we label the slices as shown in Fig. 2.2. Notice that we expressed the relative frequencies as percentages. Either method (decimal or percentage) is acceptable.

Report 2.3
Exercise 2.19(c) on page 48

page 43

NEW! StatCrunch Reports replicate example problems from the text, walking through how to use the online statistical software, StatCrunch, to solve these problems. MyStatLab or StatCrunch account required.

Procedure Index

Following is an index that provides page-number references for the various statistical procedures discussed in the book. Note: This index includes only numbered procedures (i.e., Procedure x.x), not all procedures.

Binomial Distribution
  Binomial probability formula, 231
  Normal approximation, 289
  Poisson approximation, 244

Chi-Square Tests
  Goodness-of-fit, 585
  Homogeneity, 615
  Independence, 606

Correlation Inferences
  Correlation t-test, 698
  Correlation test for normality, 704

Generic Hypothesis Tests
  Critical-value approach, 371
  P-value approach, 377

Graphs and Charts
  Bar chart, 44
  Boxplot, 120
  Dotplot, 57
  Histogram, 55
  Pie chart, 43
  Stem-and-leaf diagram, 58

Normally Distributed Variables
  Observations corresponding to a specified percentage or probability, 272
  Percentages or probabilities, 269

One-Mean Inferences
  Confidence intervals
    t-interval procedure, 346
    z-interval procedure, 330
  Hypothesis tests
    t-test, 394
    Wilcoxon signed-rank test, 404
    z-test, 380

Proportion Inferences
  One proportion
    z-interval procedure, 548
    z-test, 558
  Two proportions
    z-interval procedure, 567
    z-test, 565

Regression Inferences
  Estimation and prediction
    Conditional mean t-interval procedure, 689
    Predicted value t-interval procedure, 691
  Slope of the population regression line
    Regression t-interval procedure, 685
    Regression t-test, 682

Sampling
  Cluster sampling, 17
  Stratified random sampling with proportional allocation, 19
  Systematic random sampling, 16

Several-Means Inferences
  Kruskal–Wallis test, 749
  One-way ANOVA test, 727
  Tukey multiple-comparison method, 739

Standard-Deviation Inferences
  One standard deviation
    χ²-interval procedure, 519
    χ²-test, 517
  Two standard deviations
    F-interval procedure, 533
    F-test, 531

Tables
  Frequency distribution, 40
  Relative-frequency distribution, 41

Two-Means Inferences
  Confidence intervals
    Nonpooled t-interval procedure, 456
    Paired t-interval procedure, 483
    Pooled t-interval procedure, 445
  Hypothesis tests
    Mann–Whitney test, 468
    Nonpooled t-test, 453
    Paired t-test, 481
    Paired Wilcoxon signed-rank test, 492
    Pooled t-test, 441

TABLE IV  Values of tα

NOTE: See the version of Table IV in Appendix A for additional values of tα.

df      t0.10   t0.05   t0.025  t0.01   t0.005
1       3.078   6.314   12.706  31.821  63.657
2       1.886   2.920   4.303   6.965   9.925
3       1.638   2.353   3.182   4.541   5.841
4       1.533   2.132   2.776   3.747   4.604
5       1.476   2.015   2.571   3.365   4.032
6       1.440   1.943   2.447   3.143   3.707
7       1.415   1.895   2.365   2.998   3.499
8       1.397   1.860   2.306   2.896   3.355
9       1.383   1.833   2.262   2.821   3.250
10      1.372   1.812   2.228   2.764   3.169
11      1.363   1.796   2.201   2.718   3.106
12      1.356   1.782   2.179   2.681   3.055
13      1.350   1.771   2.160   2.650   3.012
14      1.345   1.761   2.145   2.624   2.977
15      1.341   1.753   2.131   2.602   2.947
16      1.337   1.746   2.120   2.583   2.921
17      1.333   1.740   2.110   2.567   2.898
18      1.330   1.734   2.101   2.552   2.878
19      1.328   1.729   2.093   2.539   2.861
20      1.325   1.725   2.086   2.528   2.845
21      1.323   1.721   2.080   2.518   2.831
22      1.321   1.717   2.074   2.508   2.819
23      1.319   1.714   2.069   2.500   2.807
24      1.318   1.711   2.064   2.492   2.797
25      1.316   1.708   2.060   2.485   2.787
26      1.315   1.706   2.056   2.479   2.779
27      1.314   1.703   2.052   2.473   2.771
28      1.313   1.701   2.048   2.467   2.763
29      1.311   1.699   2.045   2.462   2.756
30      1.310   1.697   2.042   2.457   2.750
35      1.306   1.690   2.030   2.438   2.724
40      1.303   1.684   2.021   2.423   2.704
50      1.299   1.676   2.009   2.403   2.678
60      1.296   1.671   2.000   2.390   2.660
70      1.294   1.667   1.994   2.381   2.648
80      1.292   1.664   1.990   2.374   2.639
90      1.291   1.662   1.987   2.369   2.632
100     1.290   1.660   1.984   2.364   2.626
1000    1.282   1.646   1.962   2.330   2.581
2000    1.282   1.646   1.961   2.328   2.578
z       1.282   1.645   1.960   2.326   2.576
        z0.10   z0.05   z0.025  z0.01   z0.005

TABLE II  Areas under the standard normal curve

The table gives the area under the standard normal curve that lies to the left of z, for z from −3.90 to 3.90 in increments of 0.01 (rows give z to one decimal place; columns give the second decimal place of z). †For z ≤ −3.90 the areas are 0.0000, and for z ≥ 3.90 the areas are 1.0000, to four decimal places. The full table appears in Appendix A.

Formula/Table Card for Weiss's Introductory Statistics, 9/e — Larry R. Griffey

Notation
n = sample size          x̄ = sample mean          s = sample stdev
Qj = jth quartile        N = population size      μ = population mean
σ = population stdev     d = paired difference    p̂ = sample proportion
p = population proportion   O = observed frequency   E = expected frequency

Chapter 3  Descriptive Measures
• Sample mean: x̄ = Σxi/n
• Range: Range = Max − Min
• Sample standard deviation:
  s = √[Σ(xi − x̄)²/(n − 1)]  or  s = √[(Σxi² − (Σxi)²/n)/(n − 1)]
• Interquartile range: IQR = Q3 − Q1
• Lower limit = Q1 − 1.5·IQR; Upper limit = Q3 + 1.5·IQR
• Population mean (mean of a variable): μ = Σxi/N
• Population standard deviation (standard deviation of a variable):
  σ = √[Σ(xi − μ)²/N]  or  σ = √[Σxi²/N − μ²]
• Standardized variable: z = (x − μ)/σ

Chapter 4  Probability Concepts
• Probability for equally likely outcomes: P(E) = f/N, where f denotes the number of ways event E can occur and N denotes the total number of outcomes possible
• Special addition rule: P(A or B or C or …) = P(A) + P(B) + P(C) + …  (A, B, C, … mutually exclusive)
• Complementation rule: P(E) = 1 − P(not E)
• General addition rule: P(A or B)
= P(A) + P(B) − P(A & B)
• Conditional probability rule: P(B | A) = P(A & B)/P(A)
• General multiplication rule: P(A & B) = P(A) · P(B | A)
• Special multiplication rule: P(A & B & C & …) = P(A) · P(B) · P(C) · …  (A, B, C, … independent)
• Rule of total probability: P(B) = Σ_{j=1}^{k} P(Aj) · P(B | Aj)  (A1, A2, …, Ak mutually exclusive and exhaustive)
• Bayes's rule: P(Ai | B) = P(Ai) · P(B | Ai) / Σ_{j=1}^{k} P(Aj) · P(B | Aj)  (A1, A2, …, Ak mutually exclusive and exhaustive)
• Factorial: k! = k(k − 1)···3·2·1
• Permutations rule: mPr = m!/(m − r)!
• Special permutations rule: mPm = m!
• Combinations rule: mCr = m!/[r!(m − r)!]
• Number of possible samples: NCn = N!/[n!(N − n)!]

Chapter 5  Discrete Random Variables
• Mean of a discrete random variable X: μ = Σx P(X = x)
• Standard deviation of a discrete random variable X:
  σ = √[Σ(x − μ)² P(X = x)]  or  σ = √[Σx² P(X = x) − μ²]
• Factorial: k! = k(k − 1)···3·2·1
• Binomial coefficient: (n choose x) = n!/[x!(n − x)!]
• Binomial probability formula: P(X = x) = (n choose x) pˣ(1 − p)ⁿ⁻ˣ, where n denotes the number of trials and p denotes the success probability
• Mean of a binomial random variable: μ = np
• Standard deviation of a binomial random variable: σ = √[np(1 − p)]
• Poisson probability formula: P(X = x) = e^(−λ) λˣ/x!
• Mean of a Poisson random variable: μ = λ
• Standard deviation of a Poisson random variable: σ = √λ

Chapter 6  The Normal Distribution
• z-score for an x-value: z = (x − μ)/σ
• x-value for a z-score: x = μ + z·σ

Chapter 7  The Sampling Distribution of the Sample Mean
• Mean of the variable x̄: μx̄ = μ
• Standard deviation of the variable x̄: σx̄ = σ/√n
• Standardized version of the variable x̄: z = (x̄ − μ)/(σ/√n)
• Studentized version of the variable x̄: t = (x̄ − μ)/(s/√n)

Chapter 8  Confidence Intervals for One Population Mean
• z-interval for μ (σ known, normal population or large sample): x̄ ± zα/2 · σ/√n
• Margin of error for the estimate of μ: E = zα/2 · σ/√n
• Sample size for estimating μ: n = (zα/2 · σ/E)², rounded up to the nearest whole number
• t-interval for μ (σ unknown, normal population or large sample): x̄ ± tα/2 · s/√n, with df = n − 1

Chapter 9  Hypothesis Tests for One Population Mean
• z-test statistic for H0: μ = μ0 (σ known, normal population or large sample): z = (x̄ − μ0)/(σ/√n)
• t-test statistic for H0: μ = μ0 (σ unknown, normal population or large sample): t = (x̄ − μ0)/(s/√n), with df = n − 1
• Symmetry property of a Wilcoxon signed-rank distribution: W(1−A) = n(n + 1)/2 − W(A)
• Wilcoxon signed-rank test statistic for H0: μ = μ0 (symmetric population): W = sum of the positive ranks

Chapter 10  Inferences for Two Population Means
• Pooled sample standard deviation: sp = √{[(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2)}
• Pooled t-test statistic for H0: μ1 = μ2 (independent samples, normal populations or large samples, and equal population standard deviations):
  t = (x̄1 − x̄2)/[sp·√(1/n1 + 1/n2)], with df = n1 + n2 − 2
• Nonpooled t-test statistic for H0: μ1 = μ2 (independent samples, and normal populations or large samples):
  t = (x̄1 − x̄2)/√(s1²/n1 + s2²/n2), with df = Δ
• Nonpooled t-interval for μ1 − μ2 (independent samples, and normal populations or large samples):
  (x̄1 − x̄2) ± tα/2 · √(s1²/n1 + s2²/n2), with df = Δ
• Pooled t-interval for μ1 − μ2 (independent samples, normal populations or large samples, and equal population standard deviations):
  (x̄1 − x̄2) ± tα/2 · sp·√(1/n1 + 1/n2), with df = n1 + n2 − 2
• Symmetry property of a Mann–Whitney distribution: M(1−A) = n1(n1 + n2 + 1) − M(A)
• Mann–Whitney test statistic for H0: μ1 = μ2 (independent samples and same-shape populations): M = sum of the ranks for sample data from Population 1
• Degrees of freedom for nonpooled t-procedures:
  Δ = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1)], rounded down to the nearest integer
• Paired t-test statistic for H0: μ1 = μ2 (paired sample, and normal differences or large sample): t = d̄/(sd/√n), with df = n − 1
• Paired t-interval for μ1 − μ2 (paired sample, and normal differences or large sample): d̄ ± tα/2 · sd/√n, with df = n − 1
• Paired Wilcoxon signed-rank test statistic for H0: μ1 = μ2 (paired sample and symmetric differences): W = sum of the positive ranks

Chapter 11  Inferences for Population Standard Deviations
• χ²-test statistic for H0: σ = σ0 (normal population): χ² = (n − 1)s²/σ0², with df = n − 1
• χ²-interval for σ (normal population): √[(n − 1)/χ²α/2] · s  to  √[(n − 1)/χ²(1−α/2)] · s, with df = n − 1
• F-test statistic for H0: σ1 = σ2 (independent samples and normal populations): F = s1²/s2², with df = (n1 − 1, n2 − 1)
• F-interval for σ1/σ2 (independent samples and normal populations): (1/√Fα/2) · (s1/s2)  to  (1/√F(1−α/2)) · (s1/s2), with df = (n1 − 1, n2 − 1)

Chapter 12  Inferences for Population Proportions
• Sample proportion: p̂ = x/n, where x denotes the number of members in the sample that have the specified attribute
• z-interval for p: p̂ ± zα/2 · √[p̂(1 − p̂)/n]  (Assumption: both x and n − x are 5 or greater)
• Margin of error for the estimate of p: E = zα/2 · √[p̂(1 − p̂)/n]
• Sample size for estimating p: n = 0.25(zα/2/E)²  or  n = p̂g(1 − p̂g)(zα/2/E)², rounded up to the nearest whole number (g = "educated guess")
• z-test statistic for H0: p = p0: z = (p̂ − p0)/√[p0(1 − p0)/n]  (Assumption: both np0 and n(1 − p0) are 5 or greater)
• Pooled sample proportion: p̂p = (x1 + x2)/(n1 + n2)
• z-test statistic for H0: p1 = p2: z = (p̂1 − p̂2)/[√(p̂p(1 − p̂p)) · √(1/n1 + 1/n2)]  (Assumptions: independent samples; x1, n1 − x1, x2, n2 − x2 are all 5 or greater)
• z-interval for p1 − p2: (p̂1 − p̂2) ± zα/2 · √[p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2]  (Assumptions: independent samples; x1, n1 − x1, x2, n2 − x2 are all 5 or greater)
• Margin of error for the estimate of p1 − p2: E = zα/2 · √[p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2]
• Sample size for estimating p1 − p2: n1 = n2 = 0.5(zα/2/E)²  or  n1 = n2 = [p̂1g(1 − p̂1g) + p̂2g(1 − p̂2g)](zα/2/E)², rounded up to the nearest whole number (g = "educated guess")

Chapter 13  Chi-Square Procedures
• Expected frequencies for a chi-square goodness-of-fit test: E = np
• Test statistic for a chi-square goodness-of-fit test: χ² = Σ(O − E)²/E, with df = c − 1, where c is the number of possible values for the variable under consideration
• Expected frequencies for a chi-square independence test or a chi-square homogeneity test: E = R·C/n, where R = row total and C = column total
• Test statistic for a chi-square independence test: χ² = Σ(O − E)²/E, with df = (r − 1)(c − 1), where r and c are the number of possible values for the two variables under consideration
• Test statistic for a chi-square homogeneity test: χ² = Σ(O − E)²/E, with df = (r − 1)(c − 1), where r is the number of populations and c is the number of possible values for the variable under consideration

Chapter 14  Descriptive Methods in Regression and Correlation
• Sxx, Sxy, and Syy:
  Sxx = Σ(xi − x̄)² = Σxi² − (Σxi)²/n
  Sxy = Σ(xi − x̄)(yi − ȳ) = Σxiyi − (Σxi)(Σyi)/n
  Syy = Σ(yi − ȳ)² = Σyi² − (Σyi)²/n
• Regression equation: ŷ = b0 + b1x, where b1 = Sxy/Sxx and b0 = (Σyi − b1Σxi)/n = ȳ − b1x̄
• Total sum of squares: SST = Σ(yi − ȳ)² = Syy
• Regression sum of squares: SSR = Σ(ŷi − ȳ)² = Sxy²/Sxx
• Error sum of squares: SSE = Σ(yi − ŷi)² = Syy − Sxy²/Sxx
• Regression identity: SST = SSR + SSE
• Coefficient of determination: r² = SSR/SST
• Linear correlation coefficient:
  r = [1/(n − 1)] Σ(xi − x̄)(yi − ȳ)/(sx·sy)  or  r = Sxy/√(Sxx·Syy)

Chapter 15  Inferential Methods in Regression and Correlation
• Population regression equation: y = β0 + β1x
• Standard error of the estimate: se = √[SSE/(n − 2)]
• Test statistic for H0: β1 = 0: t = b1/(se/√Sxx), with df = n − 2
• Confidence interval for β1: b1 ± tα/2 · se/√Sxx, with df = n − 2
• Confidence interval for the conditional mean of the response variable corresponding to xp:
  ŷp ± tα/2 · se·√[1/n + (xp − Σxi/n)²/Sxx], with df = n − 2
• Prediction interval for an observed value of the response variable corresponding to xp:
  ŷp ± tα/2 · se·√[1 + 1/n + (xp − Σxi/n)²/Sxx], with df = n − 2
• Test statistic for H0: ρ = 0: t = r/√[(1 − r²)/(n − 2)], with df = n − 2
• Test statistic for a correlation test for normality: Rp = Σxiwi/√(Sxx·Σwi²), where x and w denote observations of the variable and the corresponding normal scores, respectively

Chapter 16  Analysis of Variance (ANOVA)
• Notation in one-way ANOVA:
  k = number of populations; n = total number of observations; x̄ = mean of all n observations; nj = size of sample from Population j; x̄j = mean of sample from Population j; sj² = variance of sample from Population j; Tj = sum of sample data from Population j
• Defining formulas for sums of squares in one-way ANOVA:
  SST = Σ(xi − x̄)²; SSTR = Σnj(x̄j − x̄)²; SSE = Σ(nj − 1)sj²
• One-way ANOVA identity: SST = SSTR + SSE
• Computing formulas for sums of squares in one-way ANOVA:
  SST = Σxi² − (Σxi)²/n; SSTR = Σ(Tj²/nj) − (Σxi)²/n; SSE = SST − SSTR
• Mean squares in one-way ANOVA: MSTR = SSTR/(k − 1); MSE = SSE/(n − k)
• Test statistic for one-way ANOVA (independent samples, normal populations, and equal population standard deviations): F = MSTR/MSE, with df = (k − 1, n − k)
• Confidence interval for μi − μj in the Tukey multiple-comparison method (independent samples, normal populations, and equal population standard deviations):
  (x̄i − x̄j) ± (qα/√2) · s·√(1/ni + 1/nj), where s = √MSE and qα is obtained for a q-curve with parameters k and n − k
• Test statistic for a Kruskal–Wallis test (independent samples, same-shape populations, all sample sizes 5 or greater):
  H = SSTR/[SST/(n − 1)]  or  H = [12/(n(n + 1))] Σ_{j=1}^{k} Rj²/nj − 3(n + 1),
  where SSTR and SST are computed for the ranks of the data, and Rj denotes the sum of the ranks for the sample data from Population j. H has approximately a chi-square distribution with df = k − 1.

Tables on the Formula/Table Card
Table I     Random numbers
Table II    Areas under the standard normal curve
Table III   Normal scores
Table IV    Values of tα
Table V     Values of Wα
Table VI    Values of Mα
Table VII   Values of χ²α
Table VIII  Values of Fα
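The wedge computation in Step 2 of Example 2.7 is easy to check in code. A minimal Python sketch — the counts 13, 18, and 9 are assumed from the stated percentages for a class of 40 and are not taken directly from Table 2.1:

```python
def wedge_degrees(freqs):
    """Convert a frequency distribution to pie-chart wedge angles.

    Each category gets 360 degrees times its relative frequency
    (Procedure 2.3, Step 2).
    """
    total = sum(freqs.values())
    return {category: 360 * count / total for category, count in freqs.items()}

# Counts consistent with the stated relative frequencies (32.5%, 45.0%, 22.5%)
# for a class of 40 students -- an assumption for illustration.
angles = wedge_degrees({"Democratic": 13, "Republican": 18, "Other": 9})
print(angles)  # Democratic: 117.0, Republican: 162.0, Other: 81.0
```

The 117° wedge for the Democratic slice matches the value marked off with the protractor in the example.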
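Definition 3.15 and Table II connect directly: standardize with the z-score, then read off the left-hand area. A small sketch using the Python standard library's NormalDist; the population figures (μ = 100, σ = 15) are hypothetical:

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

std_normal = NormalDist()  # standard normal curve: mean 0, standard deviation 1

# Hypothetical observation x = 130 from a population with mu = 100, sigma = 15.
z = z_score(130, 100, 15)        # 2.0 -> two standard deviations above the mean
area_left = std_normal.cdf(z)    # area under the standard normal curve left of z
```

A z-score of 2.00 gives a left-hand area of about 0.9772, agreeing with the Table II entry for z = 2.00.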
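The binomial entries on the formula card — the probability formula, μ = np, and σ = √[np(1 − p)] — translate line for line into code. The n and p below are arbitrary illustrations, not book data:

```python
from math import comb, sqrt

def binomial_pmf(n, p, x):
    """Binomial probability formula: C(n, x) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_mean(n, p):
    return n * p                  # mu = np

def binomial_sd(n, p):
    return sqrt(n * p * (1 - p))  # sigma = sqrt(np(1 - p))

# Illustration: 4 trials with success probability 0.5.
p_two = binomial_pmf(4, 0.5, 2)   # C(4, 2) * 0.0625 = 6 * 0.0625 = 0.375
```

Summing the formula over x = 0, …, n returns 1, a quick sanity check that the probabilities form a distribution.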
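Procedure 9.2's test statistic, t = (x̄ − μ0)/(s/√n), is a one-liner given the summary functions in the standard library. The sample below is invented to illustrate the arithmetic; the critical value 2.262 for df = 9 comes from Table IV:

```python
import math
from statistics import mean, stdev

def one_mean_t(sample, mu0):
    """t-test statistic for H0: mu = mu0 (Procedure 9.2, Step 3); returns (t, df)."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    return t, n - 1

# Hypothetical sample of n = 10 measurements; test H0: mu = 24.0.
data = [24.1, 25.3, 23.8, 26.0, 24.7, 25.5, 23.9, 24.8, 25.1, 24.6]
t0, df = one_mean_t(data, mu0=24.0)

# Critical-value approach, two tailed, alpha = 0.05: from Table IV,
# t_{0.025} = 2.262 for df = 9, so reject H0 when |t0| >= 2.262.
reject = abs(t0) >= 2.262
```

Here t0 ≈ 3.44, which exceeds 2.262, so H0 would be rejected at the 5% significance level.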
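The Chapter 10 formulas for the pooled t-statistic and the nonpooled degrees of freedom Δ can be checked numerically. All of the summary statistics below are made up for illustration:

```python
import math

def pooled_t(xbar1, s1, n1, xbar2, s2, n2):
    """Pooled t-statistic and df = n1 + n2 - 2 (equal population stdevs assumed)."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = (xbar1 - xbar2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom Delta for the nonpooled t-procedures, rounded down."""
    a, b = s1**2 / n1, s2**2 / n2
    return math.floor((a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1)))

# Hypothetical summary statistics for two independent samples.
t, df = pooled_t(20.0, 4.0, 10, 18.0, 4.0, 12)
delta = welch_df(4.0, 10, 4.0, 12)
```

With equal sample standard deviations, sp here is exactly 4, and Δ (19) comes out slightly smaller than the pooled df (20), as is typical for the nonpooled procedure.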
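The Chapter 14 quantities (Sxx, Sxy, Syy, the regression equation, and the sums of squares) compose naturally, and the regression identity SST = SSR + SSE gives a built-in check. The data set is a toy example, not from the book:

```python
def regression_summary(xs, ys):
    """b0, b1, r**2, and the three sums of squares, from the
    Chapter 14 computing formulas."""
    n = len(xs)
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    b1 = sxy / sxx                        # slope: Sxy / Sxx
    b0 = sum(ys) / n - b1 * sum(xs) / n   # intercept: ybar - b1 * xbar
    sst = syy                             # total sum of squares
    ssr = sxy**2 / sxx                    # regression sum of squares
    sse = syy - ssr                       # error sum of squares
    r2 = ssr / sst                        # coefficient of determination
    return b0, b1, r2, sst, ssr, sse

# Toy data set (hypothetical).
b0, b1, r2, sst, ssr, sse = regression_summary([1, 2, 3], [1, 3, 2])
```

For these three points, r² = 0.25, and sst equals ssr + sse to within rounding, confirming the identity.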
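The one-way ANOVA test statistic F = MSTR/MSE follows directly from the defining formulas for SSTR and SSE in Chapter 16. The three small samples are invented for illustration:

```python
def one_way_anova(samples):
    """F-statistic and (numerator, denominator) degrees of freedom for
    one-way ANOVA, computed from the defining formulas."""
    k = len(samples)                            # number of populations
    n = sum(len(s) for s in samples)            # total number of observations
    grand_mean = sum(sum(s) for s in samples) / n
    means = [sum(s) / len(s) for s in samples]
    sstr = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    sse = sum(sum((x - m) ** 2 for x in s) for s, m in zip(samples, means))
    mstr = sstr / (k - 1)
    mse = sse / (n - k)
    return mstr / mse, (k - 1, n - k)

# Three invented samples.
f, df = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The resulting f would then be compared with the F-curve critical value for df = (2, 6); larger values of f favor rejecting the hypothesis of equal population means.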

Contents

  • Cover

  • Title Page

  • Copyright Page

  • About the Author

  • Contents

  • Preface

  • Acknowledgments

  • Supplements

  • Technology Resources

  • Data Sources

  • PART I: Introduction

    • CHAPTER 1 The Nature of Statistics

      • Case Study: Greatest American Screen Legends

      • 1.1 Statistics Basics

      • 1.2 Simple Random Sampling

      • 1.3 Other Sampling Designs

      • 1.4 Experimental Designs

      • Chapter in Review

      • Review Problems

      • Focusing on Data Analysis

      • Case Study Discussion

      • Biography
