A textbook of Computer Based Numerical and Statistical Techniques part 44 pdf


CURVE FITTING

And, the equation of the line of regression of y on x is

$y - \bar{y} = r\,\dfrac{\sigma_y}{\sigma_x}\,(x - \bar{x})$    (ii)

Let $m_1$ and $m_2$ be the slopes of (i) and (ii) respectively. Then $m_1 = \dfrac{\sigma_y}{r\,\sigma_x}$ and $m_2 = \dfrac{r\,\sigma_y}{\sigma_x}$. Therefore,

$\tan\theta = \dfrac{m_1 - m_2}{1 + m_1 m_2} = \dfrac{\dfrac{\sigma_y}{r\,\sigma_x} - \dfrac{r\,\sigma_y}{\sigma_x}}{1 + \dfrac{\sigma_y^2}{\sigma_x^2}} = \dfrac{(1 - r^2)\,\sigma_x \sigma_y}{r\,(\sigma_x^2 + \sigma_y^2)}$. Proved.

Example 9. The lines of regression of x on y and y on x are respectively $x = 19.13 - 0.87y$ and $y = 11.64 - 0.50x$. Find:
(a) the mean of the x-series;
(b) the mean of the y-series;
(c) the correlation coefficient between x and y.

Sol. Let the mean of the x-series be $\bar{x}$ and that of the y-series be $\bar{y}$. Since the lines of regression pass through $(\bar{x}, \bar{y})$, we have:

$\bar{x} = 19.13 - 0.87\,\bar{y}$ or $\bar{x} + 0.87\,\bar{y} = 19.13$    (1)
and $\bar{y} = 11.64 - 0.50\,\bar{x}$ or $0.50\,\bar{x} + \bar{y} = 11.64$    (2)

On solving (1) and (2), we get $\bar{x} = 15.94$ and $\bar{y} = 3.67$.
Therefore, mean of the x-series = 15.94 and mean of the y-series = 3.67.

Now, the line of regression of y on x is $y = 11.64 - 0.50x$, so $b_{yx} = -0.50$.
Also, the line of regression of x on y is $x = 19.13 - 0.87y$, so $b_{xy} = -0.87$.

$\therefore\; r = -\sqrt{b_{yx}\, b_{xy}} = -\sqrt{(-0.50)(-0.87)} = -\sqrt{0.435} = -0.66$

Clearly, r is taken as negative, since each of $b_{yx}$ and $b_{xy}$ is negative.

Example 10. Out of the following two regression lines, find the line of regression of x on y: $2x + 3y = 7$ and $5x + 4y = 9$.

Sol. Let $2x + 3y = 7$ be the regression line of x on y. Then $5x + 4y = 9$ is the regression line of y on x. Therefore

$2x + 3y = 7$ and $5x + 4y = 9$
$\Rightarrow\; x = -\dfrac{3}{2}y + \dfrac{7}{2}$ and $y = -\dfrac{5}{4}x + \dfrac{9}{4}$
$\Rightarrow\; b_{xy} = -\dfrac{3}{2}$ and $b_{yx} = -\dfrac{5}{4}$
$\Rightarrow\; r = -\sqrt{b_{xy}\, b_{yx}} = -\sqrt{\left(-\dfrac{3}{2}\right)\left(-\dfrac{5}{4}\right)} = -\sqrt{\dfrac{15}{8}} < -1$    [since r, $b_{xy}$, $b_{yx}$ have the same sign]

which is impossible. Therefore our choice of regression line is incorrect. Hence, the regression line of x on y is $5x + 4y = 9$. Ans.

Example 11. Find the correlation coefficient between x and y, when the lines of regression are: $2x - 9y + 6 = 0$ and $x - 2y + 1 = 0$.

Sol.
Let the line of regression of x on y be $2x - 9y + 6 = 0$. Then the line of regression of y on x is $x - 2y + 1 = 0$. Therefore

$2x - 9y + 6 = 0$ and $x - 2y + 1 = 0$
$\Rightarrow\; x = \dfrac{9}{2}y - 3$ and $y = \dfrac{1}{2}x + \dfrac{1}{2}$
$\Rightarrow\; b_{xy} = \dfrac{9}{2}$ and $b_{yx} = \dfrac{1}{2}$
$\Rightarrow\; r = \sqrt{b_{xy}\, b_{yx}} = \sqrt{\dfrac{9}{2} \times \dfrac{1}{2}} = \dfrac{3}{2} > 1$,

which is impossible. So our choice of regression line is incorrect. Therefore, the regression line of x on y is $x - 2y + 1 = 0$, and the regression line of y on x is $2x - 9y + 6 = 0$.

$\Rightarrow\; x = 2y - 1$ and $y = \dfrac{2}{9}x + \dfrac{2}{3}$
$\Rightarrow\; b_{xy} = 2$ and $b_{yx} = \dfrac{2}{9}$
$\Rightarrow\; r = \sqrt{b_{xy}\, b_{yx}} = \sqrt{2 \times \dfrac{2}{9}} = \dfrac{2}{3}$

Hence, the correlation coefficient between x and y is $\dfrac{2}{3}$. Ans.

Example 12. The equations of two lines of regression are: $3x + 12y = 19$ and $3y + 9x = 46$. Find
(i) the mean of the x-series,
(ii) the mean of the y-series,
(iii) the regression coefficients $b_{xy}$ and $b_{yx}$,
(iv) the correlation coefficient between x and y.

Sol. Let the mean of the x-series be $\bar{x}$ and that of the y-series be $\bar{y}$. Then each of the given lines passes through $(\bar{x}, \bar{y})$. Therefore

$3\bar{x} + 12\bar{y} = 19$    (1)
and $9\bar{x} + 3\bar{y} = 46$    (2)

On solving (1) and (2), we get $\bar{x} = 5$ and $\bar{y} = \dfrac{1}{3}$. Therefore the mean of the x-series is 5 and the mean of the y-series is $\dfrac{1}{3}$.

Now, let the line of regression of x on y be $3x + 12y = 19$. Then the line of regression of y on x is $3y + 9x = 46$. Therefore

$3x + 12y = 19$ and $3y + 9x = 46$
$\Rightarrow\; x = -4y + \dfrac{19}{3}$ and $y = -3x + \dfrac{46}{3}$
$\Rightarrow\; b_{xy} = -4$ and $b_{yx} = -3$
$\Rightarrow\; r = -\sqrt{(-4)(-3)} = -2\sqrt{3} < -1$, which is impossible.

$\therefore$ Our choice of regression line is incorrect. Consequently, the regression line of x on y is $3y + 9x = 46$, and the regression line of y on x is $3x + 12y = 19$. Therefore

$3y + 9x = 46$ and $3x + 12y = 19$
$\Rightarrow\; x = -\dfrac{1}{3}y + \dfrac{46}{9}$ and $y = -\dfrac{1}{4}x + \dfrac{19}{12}$
$\Rightarrow\; b_{xy} = -\dfrac{1}{3},\; b_{yx} = -\dfrac{1}{4}$ and $r = -\sqrt{\left(-\dfrac{1}{3}\right)\left(-\dfrac{1}{4}\right)} = -\dfrac{1}{2\sqrt{3}} = -\dfrac{\sqrt{3}}{6}$

(because r, $b_{xy}$ and $b_{yx}$ have the same sign).

Example 13. You are given the following data:

Series                 x      y
Mean                   18     100
Standard deviation     14     20

The correlation coefficient between x and y is 0.8. Find the two regression lines.
Estimate the value of y when x is 70.
Estimate the value of x when y is 90.

Sol. Given that $\bar{x} = 18$, $\bar{y} = 100$, $\sigma_x = 14$, $\sigma_y = 20$ and $r = 0.8$.

Therefore the line of regression of y on x is:

$y - \bar{y} = r\,\dfrac{\sigma_y}{\sigma_x}\,(x - \bar{x})$ or $y - 100 = \dfrac{0.8 \times 20}{14}\,(x - 18)$ or $y = 1.14x + 79.43$

When x = 70, we have $y = 100 + \dfrac{0.8 \times 20}{14}\,(70 - 18) \approx 159.43$.

And, the line of regression of x on y is:

$x - \bar{x} = r\,\dfrac{\sigma_x}{\sigma_y}\,(y - \bar{y})$ or $x - 18 = \dfrac{0.8 \times 14}{20}\,(y - 100)$ or $x = 0.56y - 38$

When y = 90, we have $x = 0.56 \times 90 - 38 = 12.4$. Ans.

To Find $b_{yx}$ and $b_{xy}$ Using Assumed Mean:
Let the assumed means of the x-series and y-series be A and B respectively. Then, taking $dx_i = x_i - A$ and $dy_i = y_i - B$, we have

$b_{yx} = \dfrac{\sum dx_i\, dy_i - \dfrac{\left(\sum dx_i\right)\left(\sum dy_i\right)}{n}}{\sum dx_i^2 - \dfrac{\left(\sum dx_i\right)^2}{n}}$

and

$b_{xy} = \dfrac{\sum dx_i\, dy_i - \dfrac{\left(\sum dx_i\right)\left(\sum dy_i\right)}{n}}{\sum dy_i^2 - \dfrac{\left(\sum dy_i\right)^2}{n}}$

Example 14. Find the regression coefficients and hence the equations of the two lines of regression from the following data:

Age of husband (x):  25  22  28  26  35  20  22  40  20  18
Age of wife (y):     18  15  20  17  22  14  16  21  15  14

Hence estimate
(i) the age of the wife, when the age of the husband is 30;
(ii) the age of the husband, when the age of the wife is 19.

Sol. We have $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{256}{10} = 25.6$ and $\bar{y} = \dfrac{\sum y_i}{n} = \dfrac{172}{10} = 17.2$.

Let the assumed means of the x-series and y-series be 26 and 17 respectively. Then we may prepare the table given below:

x_i    y_i    dx_i = x_i - 26    dy_i = y_i - 17    dx_i^2    dy_i^2    dx_i * dy_i
25     18     -1                 1                  1         1         -1
22     15     -4                 -2                 16        4         8
28     20     2                  3                  4         9         6
26     17     0                  0                  0         0         0
35     22     9                  5                  81        25        45
20     14     -6                 -3                 36        9         18
22     16     -4                 -1                 16        1         4
40     21     14                 4                  196       16        56
20     15     -6                 -2                 36        4         12
18     14     -8                 -3                 64        9         24
Total: 256    172    -4          2                  450       78        172

That is, $\sum x_i = 256$, $\sum y_i = 172$, $\sum dx_i = -4$, $\sum dy_i = 2$, $\sum dx_i^2 = 450$, $\sum dy_i^2 = 78$, $\sum dx_i\, dy_i = 172$.

Therefore,

$b_{yx} = \dfrac{\sum dx_i\, dy_i - \dfrac{\left(\sum dx_i\right)\left(\sum dy_i\right)}{n}}{\sum dx_i^2 - \dfrac{\left(\sum dx_i\right)^2}{n}} = \dfrac{172 - \dfrac{(-4)(2)}{10}}{450 - \dfrac{(-4)^2}{10}} = \dfrac{172 + 0.8}{450 - 1.6} = \dfrac{172.8}{448.4} = 0.385$

$b_{xy} = \dfrac{\sum dx_i\, dy_i - \dfrac{\left(\sum dx_i\right)\left(\sum dy_i\right)}{n}}{\sum dy_i^2 - \dfrac{\left(\sum dy_i\right)^2}{n}} = \dfrac{172 - \dfrac{(-4)(2)}{10}}{78 - \dfrac{(2)^2}{10}} = \dfrac{172 + 0.8}{78 - 0.4} = \dfrac{172.8}{77.6} = 2.23$

Therefore the equation of the line of regression of y on x is:

$y - \bar{y} = b_{yx}\,(x - \bar{x})$ or $y - 17.2 = 0.385\,(x - 25.6)$

Now, when x = 30, we get $y - 17.2 = 0.385\,(30 - 25.6)$ or $y = 19$ (approximately).
$\therefore$ When the age of the husband is 30 years, the estimated age of the wife is 19 years.

Again, the equation of the line of regression of x on y is:

$x - \bar{x} = b_{xy}\,(y - \bar{y})$ or $x - 25.6 = 2.23\,(y - 17.2)$

Thus, when y = 19, we get x = 30 (approximately). So, when the age of the wife is 19 years, the estimated age of the husband is 30 years. Ans.

9.4 ERROR OF PREDICTION
The root-mean-square deviation of the predicted values from the observed values is known as the standard error of prediction. It is given by

$E_{yx} = \sqrt{\dfrac{\sum (y - y_p)^2}{n}}$,

where y is the actual value and $y_p$ the predicted value.

Theorem: Prove that:
(1) $E_{yx} = \sigma_y \sqrt{1 - r^2}$,
(2) $E_{xy} = \sigma_x \sqrt{1 - r^2}$.

Proof: (1) The equation of the line of regression of y on x is

$y - \bar{y} = r\,\dfrac{\sigma_y}{\sigma_x}\,(x - \bar{x})$
$\therefore\; y_p = \bar{y} + r\,\dfrac{\sigma_y}{\sigma_x}\,(x - \bar{x})$    (1)

So,

$E_{yx} = \left[\dfrac{\sum (y - y_p)^2}{n}\right]^{1/2} = \left[\dfrac{1}{n} \sum \left\{ (y - \bar{y}) - r\,\dfrac{\sigma_y}{\sigma_x}\,(x - \bar{x}) \right\}^2 \right]^{1/2}$

$= \left[\dfrac{1}{n} \sum \left\{ (y - \bar{y})^2 + r^2\,\dfrac{\sigma_y^2}{\sigma_x^2}\,(x - \bar{x})^2 - 2r\,\dfrac{\sigma_y}{\sigma_x}\,(x - \bar{x})(y - \bar{y}) \right\} \right]^{1/2}$

$= \left[\dfrac{\sum (y - \bar{y})^2}{n} + r^2\,\dfrac{\sigma_y^2}{\sigma_x^2} \cdot \dfrac{\sum (x - \bar{x})^2}{n} - 2r\,\dfrac{\sigma_y}{\sigma_x} \cdot \dfrac{\sum (x - \bar{x})(y - \bar{y})}{n} \right]^{1/2}$

$= \left[\sigma_y^2 + r^2\,\dfrac{\sigma_y^2}{\sigma_x^2}\,\sigma_x^2 - 2r\,\dfrac{\sigma_y}{\sigma_x} \cdot r\,\sigma_x \sigma_y \right]^{1/2}$    [since $\dfrac{1}{n}\sum (x - \bar{x})(y - \bar{y}) = r\,\sigma_x \sigma_y$]

$= \left(\sigma_y^2 - r^2 \sigma_y^2\right)^{1/2} = \sigma_y \sqrt{1 - r^2}$.

(2) Similarly, (2) may be proved.

Example 15. For the data given below, find the standard error of estimate of y on x.

x:  1  2  3  4  5
y:  2  5  3  8  7

Sol. We leave it to the reader to find the line of regression of y on x. This is: $y = 1.1 + 1.3x$.
So, $y_p = 1.1 + 1.3x$.
Now form the table for the given data:

x    y    y_p = 1.1 + 1.3x    y - y_p    (y - y_p)^2
1    2    2.4                 -0.4       0.16
2    5    3.7                 1.3        1.69
3    3    5.0                 -2.0       4.00
4    8    6.3                 1.7        2.89
5    7    7.6                 -0.6       0.36
                                         $\sum (y - y_p)^2 = 9.10$

Therefore $E_{yx} = \sqrt{\dfrac{\sum (y - y_p)^2}{n}} = \sqrt{\dfrac{9.10}{5}} = \sqrt{1.82} = 1.349$. Ans.

9.5 MULTIPLE LINEAR REGRESSION
There are a number of situations where the dependent variable is a function, linear or non-linear, of two or more independent variables. Here we discuss an approach to fitting experimental data where the variable under consideration is a linear function of two independent variables.

Let us consider a two-variable linear function given by

$y = a + bx + cz$    (1)

The sum of the squares of the errors is given by

$S = \sum_{i=1}^{n} (y_i - a - b x_i - c z_i)^2$    (2)

Differentiating S partially w.r.t. a, b, c, we get

$\dfrac{\partial S}{\partial a} = 0 \;\Rightarrow\; 2 \sum_{i=1}^{n} (y_i - a - b x_i - c z_i)(-1) = 0$

$\dfrac{\partial S}{\partial b} = 0 \;\Rightarrow\; 2 \sum_{i=1}^{n} (y_i - a - b x_i - c z_i)(-x_i) = 0$

and $\dfrac{\partial S}{\partial c} = 0 \;\Rightarrow\; 2 \sum_{i=1}^{n} (y_i - a - b x_i - c z_i)(-z_i) = 0$

which, on simplification and omitting the suffix i, yields

$\sum y = ma + b \sum x + c \sum z$
$\sum xy = a \sum x + b \sum x^2 + c \sum xz$
$\sum yz = a \sum z + b \sum xz + c \sum z^2$

where m is the number of data points (the upper limit n of the sums above). Solving these three equations, we get the values of a, b and c. Consequently, we get the linear function $y = a + bx + cz$, called the regression plane.

Example 16. Obtain a regression plane by using multiple linear regression to fit the data given below:

x:  1   2   3   4
y:  0   1   2   3
z:  12  18  24  30
(U.P.T.U. 2002)

Sol.
Let $y = a + bx + cz$ be the required regression plane, where a, b, c are constants to be determined from the following equations:

$\sum y = ma + b \sum x + c \sum z$
$\sum xy = a \sum x + b \sum x^2 + c \sum xz$    (1)
$\sum yz = a \sum z + b \sum zx + c \sum z^2$

Here, m = 4. The table below takes the row 12, 18, 24, 30 as the dependent variable y and the row 0, 1, 2, 3 as z.

x     z     y     x^2    z^2    xy     xz     yz
1     0     12    1      0      12     0      0
2     1     18    4      1      36     2      18
3     2     24    9      4      72     6      48
4     3     30    16     9      120    12     90
Total: 10   6     84     30     14     240    20     156

That is, $\sum x = 10$, $\sum z = 6$, $\sum y = 84$, $\sum x^2 = 30$, $\sum z^2 = 14$, $\sum xy = 240$, $\sum xz = 20$, $\sum yz = 156$.

From the table, equations (1) can be written as

$84 = 4a + 10b + 6c$
$240 = 10a + 30b + 20c$
and $156 = 6a + 20b + 14c$

Solving, we get a = 10, b = 2, c = 4. (Since z = x - 1 at every data point, the three equations are linearly dependent, so these values are one solution of the system rather than the only one.)

Hence the required regression plane is $y = 10 + 2x + 4z$. Ans.

PROBLEM SET 9.2

1. Find the equations of the lines of regression on the basis of the data:
   x:  4  2  3  4  2
   y:  2  3  2  4  4
   [Ans. $y = 3.75 - 0.25x$, $x = 3.75 - 0.25y$]

2. Find the regression coefficient $b_{yx}$ for the data: $\sum x = 55$, $\sum y = 88$, $\sum x^2 = 385$, $\sum y^2 = 1114$, $\sum xy = 586$ and $n = 10$. [Ans. 1.24]

3. The following data regarding the heights (y) and weights (x) of 100 college students are given: $\sum x = 15000$, $\sum x^2 = 2272500$, $\sum y = 6800$, $\sum y^2 = 463025$ and $\sum xy = 1022250$. Find the equation of the line of regression of y on x. [Ans. $y = 0.1x + 53$]

4. Find the coefficient of correlation when the two regression equations are: $x = -0.2y + 4.2$ and $y = -0.80x + 8.4$. [Ans. $r = -0.4$]

5. Find the standard error of estimate of y on x for the data given below:
   x:  1  3  4  6  8  9  11  14
   y:  1  2  4  4  5  7  8   9
   [Ans. $E_{yx} = 0.564$]

6. If two regression coefficients are 0.8 and 0.2, what would be the value of the coefficient of correlation? [Ans. $r = 0.4$]

7. x and y are two random variables with the same standard deviation and correlation coefficient r. Show that the coefficient of correlation between x and x + y is $\sqrt{\dfrac{1 + r}{2}}$.

8. Show that the geometric mean of the coefficients of regression is the coefficient of correlation.
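The regression formulas used in the worked examples above can be checked numerically. The following is a minimal sketch in Python, which is not part of the textbook; the function names (`regression_coefficients`, `correlation_from_coefficients`, `standard_error_y_on_x`, `normal_equation_residuals`) are hypothetical helpers chosen for illustration. It applies the assumed-mean formulas for $b_{yx}$ and $b_{xy}$, the rule $r = \pm\sqrt{b_{yx}\, b_{xy}}$, the standard error $E_{yx}$, and the three normal equations of the regression plane to the data of Examples 9, 14, 15 and 16.

```python
from math import sqrt


def regression_coefficients(xs, ys, A=0.0, B=0.0):
    """Return (b_yx, b_xy) from deviations about assumed means A and B."""
    n = len(xs)
    dx = [x - A for x in xs]
    dy = [y - B for y in ys]
    s_dx, s_dy = sum(dx), sum(dy)
    s_dxdy = sum(u * v for u, v in zip(dx, dy))
    s_dx2 = sum(u * u for u in dx)
    s_dy2 = sum(v * v for v in dy)
    num = s_dxdy - s_dx * s_dy / n          # common numerator
    b_yx = num / (s_dx2 - s_dx ** 2 / n)
    b_xy = num / (s_dy2 - s_dy ** 2 / n)
    return b_yx, b_xy


def correlation_from_coefficients(b_yx, b_xy):
    """r is the geometric mean of b_yx and b_xy, with their common sign."""
    r = sqrt(b_yx * b_xy)
    return -r if b_yx < 0 else r


def standard_error_y_on_x(xs, ys):
    """E_yx = sqrt(sum((y - y_p)^2) / n) for the line of regression of y on x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b_yx, _ = regression_coefficients(xs, ys)
    residuals = [y - (y_bar + b_yx * (x - x_bar)) for x, y in zip(xs, ys)]
    return sqrt(sum(e * e for e in residuals) / n)


def normal_equation_residuals(xs, zs, ys, a, b, c):
    """Left side minus right side of the normal equations for y = a + b*x + c*z."""
    m = len(ys)
    sx, sz, sy = sum(xs), sum(zs), sum(ys)
    sxx = sum(x * x for x in xs)
    szz = sum(z * z for z in zs)
    sxz = sum(x * z for x, z in zip(xs, zs))
    sxy = sum(x * y for x, y in zip(xs, ys))
    syz = sum(y * z for y, z in zip(ys, zs))
    return (
        sy - (m * a + b * sx + c * sz),
        sxy - (a * sx + b * sxx + c * sxz),
        syz - (a * sz + b * sxz + c * szz),
    )


# Example 14: ages of husband (x) and wife (y), assumed means 26 and 17.
husband = [25, 22, 28, 26, 35, 20, 22, 40, 20, 18]
wife = [18, 15, 20, 17, 22, 14, 16, 21, 15, 14]
b_yx, b_xy = regression_coefficients(husband, wife, A=26, B=17)
print(round(b_yx, 3), round(b_xy, 2))                         # 0.385 2.23

# Example 9: both regression coefficients are negative, so r is negative.
print(round(correlation_from_coefficients(-0.50, -0.87), 2))  # -0.66

# Example 15: standard error of estimate of y on x.
print(round(standard_error_y_on_x([1, 2, 3, 4, 5], [2, 5, 3, 8, 7]), 3))  # 1.349

# Example 16: (a, b, c) = (10, 2, 4) satisfies the normal equations,
# and so does (6, 6, 0), because z = x - 1 makes the system dependent.
xs, zs, ys = [1, 2, 3, 4], [0, 1, 2, 3], [12, 18, 24, 30]
print(normal_equation_residuals(xs, zs, ys, 10, 2, 4))        # (0, 0, 0)
print(normal_equation_residuals(xs, zs, ys, 6, 6, 0))         # (0, 0, 0)
```

The same helpers can be pointed at the data of Problem Set 9.2 to check the stated answers. The sketch avoids a linear solver on purpose: for Example 16 the coefficient matrix of the normal equations is singular, so verifying candidate solutions is more informative than solving.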
CHAPTER 10
Time Series and Forecasting

10.1 INTRODUCTION
Business executives, economists, and government officials often face problems that require forecasts, such as future sales, future revenue and expenditures, and the total business activity for the next decade. Time series analysis is a statistical method that helps the businessman understand the past behaviour of economic variables from a collection of observations taken at different time intervals. Having recognized the behaviour or movements of a time series, the businessman tries to forecast the future of economic variables on the assumption that the time series of such an economic variable will continue to behave in the same fashion as it has in the past. Thus analyzing information for previous time periods is the subject of time series analysis.

Statistical data that are collected, observed or recorded at successive intervals of time, or arranged chronologically, are said to form a time series.

"A time series is a set of observations taken at specified times, usually (but not always) at equal intervals."

Thus a set of data depending on time, which may be a year, quarter, month, week, day, etc., is called a time series.

Examples:
1. The annual production of rice in India over the last 15 years.
2. The daily closing price of a share on the Calcutta Stock Exchange.
3. The monthly sales of an iron industry for the last 6 months.
4. Hourly temperature recorded by the meteorological office in a city.

Mathematically, a time series is defined by the values $y_1, y_2, y_3, \ldots$ of a variable y (closing price of a share, temperature, etc.) at times $t_1, t_2, t_3, \ldots$. Thus y is a function of t, given by

$y = f(t)$

10.2 TIME SERIES GRAPH
A time series involving a variable y is represented pictorially by constructing a graph of y versus t.

Date posted: 04/07/2014, 15:20
