Lesson 1: Business Statistics


BUSINESS STATISTICS
Vu Trung

Statistics
The term "statistics" is used in two senses:
- It refers to numerical statements of facts, or quantitative data pertaining to a phenomenon. In this sense, the term also connotes "descriptive statistics".
- It also refers to the statistical methods or methodology of collecting, compiling, presenting, analyzing and interpreting quantitative data.

Variables and Data
- Variables are characteristics or attributes that can be expected to differ from one individual to another.
- Data are the measurements or observations of a set of variables.

Population, sampling, parameter and statistic
- A population is a collection of persons, objects or items of interest.
- A sample is the part of the population that we actually examine in order to gather information. A sample is a subset of a population.
- A descriptive measure of a population is called a parameter.
- A descriptive measure of a sample is called a statistic.
- What we learn is how to use statistics to make inferences about parameters.

Sampling
Reasons for sampling:
- Saving money
- Saving time
- Broadening the scope of the study
- Impossibility of collecting information on every individual in a population

There are two main types of sampling: random and non-random sampling.
- In random sampling, every unit of the population has a known probability of being selected into the sample. This is called probability-based sampling.

Basic random sampling (probability sampling) techniques:
- Simple random sampling
- Stratified random sampling
- Systematic random sampling
- Cluster random sampling

- In non-random sampling, not every unit of the population has a known probability of being selected into the sample. This type of sampling is known as nonprobability sampling.

Types of
nonprobability sampling:
- Convenience sampling
- Judgment sampling
- Quota sampling
- Snowball sampling

Measures of Variation/Dispersion
Coefficient of variation (relative measure of variation)
The coefficient of variation is a relative measure of variation. It is always expressed as a percentage rather than in the units of the particular data:

\[ CV = \left( \frac{S}{\bar{X}} \right) \times 100\% \]

where \(S\) is the sample standard deviation and \(\bar{X}\) is the sample mean.

Example:
                                Mean    Standard deviation
Sample A (weight in pounds)     152     31
Sample B (height in inches)     69.3    3.86

Question: which one is relatively more variable? The two samples are measured in different units, so we have to use the coefficient of variation to compare them.

Coefficient of variation for sample A = (31/152) x 100% = 20.4%
Coefficient of variation for sample B = (3.86/69.3) x 100% = 5.57%
Therefore, weight is relatively more variable than height.

Measures of Shape
- A distribution is symmetrical if the distributions of data values above and below the mean are identical.
- If not, the distribution is called nonsymmetrical, and it is skewed either to the right (positively skewed) or to the left (negatively skewed).

In a unimodal distribution:
- Mean = median = mode: the distribution is symmetrical
- Mean < median < mode: the distribution is likely to be negatively skewed
- Mean > median > mode: the distribution is likely to be positively skewed

[Figure: three distribution curves. Symmetric: mean = median = mode. Negatively skewed: mean, median, mode from left to right. Positively skewed: mode, median, mean from left to right.]

Covariance and coefficient of correlation
- Covariance is a measure of the linear relationship between two variables.
- A positive value indicates a direct or increasing
linear relationship, and a negative value indicates a decreasing linear relationship.
- A large positive value for the covariance indicates a strong positive linear relationship, and a large negative value indicates a strong negative linear relationship.

Measures of Shape
If a distribution is symmetrical and bell-shaped:
- Approximately 68% of the observations fall within 1 standard deviation of the mean
- Approximately 95% of the observations fall within 2 standard deviations of the mean
- Approximately 99.7% of the observations fall within 3 standard deviations of the mean

Formula for covariance:

\[ \operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \]

- One problem with the above formula is that its value depends on the scales of measurement used, so it is not a standardized measure.
- The sample correlation coefficient, on the other hand, gives us a standardized measure of the linear relationship between two variables.
- The coefficient of correlation measures the relative strength of a linear relationship between two numerical variables.

Formula for the correlation coefficient:

\[ r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y} \]

where \(S_X\) and \(S_Y\) in the denominator are the sample standard deviations of the variables X and Y.

Example:
Table columns: Number of shops (X); Sales volume in $100 (Y); \(X - \bar{X}\); \(Y - \bar{Y}\); \((X - \bar{X})^2\); \((Y - \bar{Y})^2\); \((X - \bar{X})(Y - \bar{Y})\)
Sales volume ($100): 50 57 41 54 54 38 63 48 59 46

Covariance:

\[ \operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} = \frac{99}{10 - 1} = 11 \]

As the covariance is positive, we can conclude that there is a positive linear relationship between X and Y.

\[ S_X = 1.49, \qquad S_Y = 7.93 \]
\[ r = \frac{11}{(1.49)(7.93)} = 0.93 \]

Coefficient of correlation
- As r = 0.93 is very close to 1, we can say that there is a very strong positive linear relationship between X and Y.
- The value of the coefficient of correlation ranges from -1 for a perfect negative linear correlation to +1 for a perfect positive linear correlation.

[...]

Example: Numerical data
Cumulative distribution: provides a way of presenting information about the percentage of values that are less than a specific amount.

Cost per Meal ($)        Percentage (%)   Percentage of Meals Less Than Lower Boundary of Class Interval (%)
20 but less than 30      12               0
30 but less than 40      14               12
40 but less than 50      38               26 = 12 + 14
50 but less than 60      18               64 = 12 + 14 + 38
60 but less than 70      12               82 = 12 + 14 + 38 + 18
70 but less than 80      6                94 = 12 + 14 + 38 + 18 + 12
Total                    100              100 = 12 + 14 + 38 + 18 + 12 + 6

...
... 60                   18
60 but less than 70      12
70 but less than 80      6
Total                    100

...
42 21 40 49 45 54 64 48 41 34 53 27 44 58 68 59 61 59 48 78 65 42

Example: Numerical data
Ordered array method
City Restaurant Meal Cost
21 23 23 27 28 29 32 32 33 34 35 38 39 40 40 40 41 42 42 43 43 43 44 44 44 45 45 46 48 48 49 49 53 54 56 56 57 58 59 59 59 61 62 ...

... population remains unknown

Statistics
- Statistics is broken into two branches: descriptive and inferential.
- Descriptive statistics is used to describe the basic features of the data in a study. It provides simple summaries of the sample and the measures, using frequency counts, ranges, means, modes, median scores and standard deviations.
- Inferential statistics is used when we...
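The coefficient of variation and correlation calculations from this lesson can be checked with a short Python sketch. It is an illustration only: the function names are made up for this example, and the paired data `hyp_x` / `hyp_y` are hypothetical, because the lesson's "number of shops" column did not survive. The summary figures (152, 31, 69.3, 3.86, 11, 1.49, 7.93) and the sales volumes are taken from the slides.

```python
import statistics


def coefficient_of_variation(mean, std_dev):
    """Return the coefficient of variation as a percentage: (S / mean) * 100."""
    return std_dev / mean * 100


def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)


def correlation(xs, ys):
    """Sample correlation: covariance divided by the two sample standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


# Summary figures quoted in the lesson.
cv_weight = coefficient_of_variation(152, 31)     # sample A, pounds
cv_height = coefficient_of_variation(69.3, 3.86)  # sample B, inches
r_from_summary = 11 / (1.49 * 7.93)               # cov / (S_X * S_Y)

# Sales volumes ($100) that survive in the lesson's table.
sales = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]
s_y = statistics.stdev(sales)

# Hypothetical paired data, used only to exercise the two functions.
hyp_x = [1, 2, 3, 4, 5]
hyp_y = [2, 4, 5, 4, 5]

print(f"CV weight: {cv_weight:.1f}%  CV height: {cv_height:.2f}%")
print(f"r from summary figures: {r_from_summary:.2f}")
print(f"S_Y from sales volumes: {s_y:.2f}")
print(f"hypothetical cov: {sample_covariance(hyp_x, hyp_y):.2f}, "
      f"r: {correlation(hyp_x, hyp_y):.3f}")
```

Dividing by both standard deviations is what removes the measurement units, which is why r can be compared across data sets while the covariance cannot.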
... 78 79 79

Example: Stem-and-Leaf Display
2 | 133789
3 | 2234589
4 | 0001223334445568899
5 | 346678999
6 | 124578
7 | 899

Example: Numerical data
Frequency distribution
Determining the class interval width:

\[ \text{Interval width} = \frac{\text{highest value} - \text{lowest value}}{\text{number of classes}} \]

For example:

\[ \text{Interval width} = \frac{58}{10} = 5.8 \]

Example: Numerical data
Frequency distribution
Cost per Meal ($)        City...
... than 50              19
50 but less than 60      9
60 but less than 70      6
70 but less than 80      3
Total                    50

Example: Numerical data
Percentage distribution
Computing the proportion or relative frequency:

\[ \text{Proportion} = \text{relative frequency} = \frac{\text{number of values in each class}}{\text{total number of values}} \]

Example: Numerical data
Percentage distribution
Cost per Meal ($)        Percentage (%)
20 but less than 30      12
30 but less than 40      14
40...

... Categorical Data
Example: types of payments
Forms of payment         Percentage (%)
Cash                     15
Check                    54
Electronic/online        28
Other/don't know         3
Source: Data extracted from "How Adults Pay Monthly Bills", USA Today, 4 Oct 2007

[Figure: bar chart "Types of payments". Vertical axis: Percentage (%), 0 to 60. Categories: Cash, Check, Electronic/online, Other/don't know.]

Presenting Numerical Data
Numerical...

... salary is $1000 gets twice as much as someone who gets $500

Levels...
- Nominal: no ordering (arbitrary labels)
- Ordinal: ordered, but differences between values are not important
- Interval: ordered, constant scale, but no natural zero
- Ratio: ordered, constant scale, natural zero

Levels...
Variables
- Quantitative variable: Interval, Ratio
- Qualitative variable: Nominal, Ordinal

Exercise 1 Identifying
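The frequency, percentage and cumulative distributions in this lesson can be rebuilt from the stem-and-leaf display, which lists all 50 meal costs. The Python sketch below is one way to do that, not the lecturer's method. It assumes classes of width 10 running from 20 to 80, as in the lesson's tables; the variable names are illustrative.

```python
# Stem-and-leaf display from the lesson: stem = tens digit, leaves = units digits.
stem_and_leaf = {
    2: "133789",
    3: "2234589",
    4: "0001223334445568899",
    5: "346678999",
    6: "124578",
    7: "899",
}

# Rebuild the ordered array of meal costs from the display.
costs = sorted(
    10 * stem + int(leaf)
    for stem, leaves in stem_and_leaf.items()
    for leaf in leaves
)

# Class interval width for 10 classes: (highest - lowest) / number of classes.
interval_width = (max(costs) - min(costs)) / 10

# Classes "20 but less than 30", ..., "70 but less than 80".
class_bounds = [(lower, lower + 10) for lower in range(20, 80, 10)]

# Frequency: how many costs fall in each half-open class [lower, upper).
frequencies = [
    sum(lower <= cost < upper for cost in costs)
    for lower, upper in class_bounds
]

# Percentage: relative frequency times 100.
percentages = [100 * freq / len(costs) for freq in frequencies]

# Cumulative percentage of meals costing less than each class's lower boundary.
cumulative = []
running_total = 0.0
for pct in percentages:
    cumulative.append(running_total)
    running_total += pct

print(f"interval width for 10 classes: {interval_width}")
for (lower, upper), freq, pct, cum in zip(
    class_bounds, frequencies, percentages, cumulative
):
    print(f"{lower} but less than {upper}: {freq:2d}  {pct:5.1f}%  {cum:5.1f}%")
```

Half-open classes (lower boundary included, upper excluded) match the "but less than" wording of the tables, so a value such as 40 is counted once, in the 40 to 50 class.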

Date posted: 22/05/2016, 22:08
