Quantitative Methods and Applications in GIS - Chapter 7 ppt

7 Principal Components, Factor, and Cluster Analyses, and Application in Social Area Analysis

This chapter discusses three important multivariate statistical analysis methods: principal components analysis (PCA), factor analysis (FA), and cluster analysis (CA). PCA and FA are often used together for data reduction by structuring many variables into a limited number of components (factors). The techniques are particularly useful for eliminating variable collinearity and uncovering latent variables. Applications of the methods are widely seen in socioeconomic studies (also see case study 8 in Section 8.4). While PCA and FA group variables, CA classifies many observations into categories according to similarity among their attributes. In other words, given a dataset as a table, PCA and FA reduce the number of columns and CA reduces the number of rows.

Social area analysis is used to illustrate the techniques, as it employs all three methods. The interpretation of social area analysis results also leads to a review and comparison of three classic models of urban structure, namely, the concentric zone model, the sector model, and the multinuclei model. The analysis demonstrates how analytical statistical methods synthesize descriptive models into one framework. Beijing, the capital city of China, on the verge of forming its social areas after decades under a socialist regime, is chosen as the study area for a case study. Usage of GIS in this case study is limited to mapping for spatial patterns.

Section 7.1 discusses principal components and factor analysis. Section 7.2 explains cluster analysis. Section 7.3 reviews social area analysis. A case study on the social space in Beijing is presented in Section 7.4 to provide a new perspective on the fast-changing urban structure in China. The chapter concludes with a discussion and brief summary in Section 7.5.
© 2006 by Taylor & Francis Group, LLC

7.1 PRINCIPAL COMPONENTS AND FACTOR ANALYSIS

Principal components and factor analysis are often used together for data reduction. Benefits of this approach include uncovering latent variables for easy interpretation and removing multicollinearity for subsequent regression analysis. In many socioeconomic applications, variables extracted from census data are often correlated with each other, and thus contain duplicated information to some extent. Principal components and factor analysis use fewer factors to represent the original variables, and thus simplify the structure for analysis. Resulting component or factor scores are uncorrelated with each other (if not rotated or orthogonally rotated), and thus can be used as explanatory variables in regression analysis.

Despite the commonalities, principal components and factor analysis are “both conceptually and mathematically very different” (Bailey and Gatrell, 1995, p. 225). Principal components analysis uses the same number of variables (components) to simply transform the original data, and thus is a mathematical transformation (strictly speaking, not a statistical operation). Factor analysis uses fewer variables (factors) to capture most of the variation among the original variables (with error terms), and thus is a statistical analysis process. Principal components analysis attempts to explain the variance of observed variables, whereas factor analysis intends to explain their intercorrelations (Hamilton, 1992, p. 252). In many applications (as in ours), the two methods are used together. In SAS, principal components analysis is offered as an option under the procedure for factor analysis.
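To make the point about uncorrelated component scores concrete, the following is a minimal Python sketch of the two-variable case (it is not the SAS procedure used in this chapter). It assumes population standard deviations for standardizing, the data are made up, and the function names are hypothetical. For two standardized variables with correlation r, the two principal components are proportional to the sum and the difference of the variables, with variances 1 + r and 1 - r.

```python
import math


def standardize(values):
    """Convert raw values to z scores (mean 0, population std 1)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def two_variable_pca(x, y):
    """PCA of two variables after standardization.

    Returns (r, scores1, scores2): the correlation and the scores on the
    sum and difference components. The scores are left unstandardized, so
    their variances are 1 + r and 1 - r. When r is negative, the
    difference component is the one with the larger variance.
    """
    z1, z2 = standardize(x), standardize(y)
    r = covariance(z1, z2)
    s = math.sqrt(2.0)
    scores1 = [(a + b) / s for a, b in zip(z1, z2)]
    scores2 = [(a - b) / s for a, b in zip(z1, z2)]
    return r, scores1, scores2


# Made-up data for illustration: income and education for six areas.
income = [20.0, 25.0, 31.0, 38.0, 44.0, 52.0]
education = [9.0, 11.0, 10.0, 13.0, 15.0, 14.0]

r, f1, f2 = two_variable_pca(income, education)
var1, var2 = covariance(f1, f1), covariance(f2, f2)
print(f"correlation r = {r:.4f}")
print(f"component variances = {var1:.4f}, {var2:.4f} (sum = {var1 + var2:.4f})")
print(f"covariance between components = {covariance(f1, f2):.2e}")
```

The two component variances add up to 2, the number of variables, and the covariance between the components is zero up to rounding error, which is why such scores can enter a regression without multicollinearity.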
7.1.1 PRINCIPAL COMPONENTS FACTOR MODEL

Formally, principal components analysis (PCA) transforms original data on K observed variables $Z_k$ to data on K principal components $F_k$ that are independent from (uncorrelated with) each other:

$$Z_k = l_{k1}F_1 + l_{k2}F_2 + \cdots + l_{kj}F_j + \cdots + l_{kK}F_K \tag{7.1}$$

Retaining only the J largest components (J < K), we have

$$Z_k = l_{k1}F_1 + l_{k2}F_2 + \cdots + l_{kJ}F_J + v_k \tag{7.2}$$

where the discarded components are represented by the residual term $v_k$, such that

$$v_k = l_{k,J+1}F_{J+1} + l_{k,J+2}F_{J+2} + \cdots + l_{kK}F_K \tag{7.3}$$

Equations 7.2 and 7.3 represent a model termed principal components factor analysis (PCFA). The PCFA retains the largest components to capture most of the variance while discarding minor components with small variance. The PCFA is the method used in social area analysis (Cadwallader, 1996, p. 137) and is simply referred to as factor analysis in the remainder of this chapter.

In a true factor analysis (FA), the residual (error) term, denoted as $u_k$ to distinguish it from $v_k$ in a PCFA, is unique to each variable $Z_k$:

$$Z_k = l_{k1}F_1 + l_{k2}F_2 + \cdots + l_{kJ}F_J + u_k$$

The $u_k$ are termed unique factors (in contrast to common factors $F_j$). In the PCFA, the residual $v_k$ is a linear combination of the discarded components ($F_{J+1}, \ldots, F_K$) and thus cannot be uncorrelated like the $u_k$ in a true FA (Hamilton, 1992, p. 252).

7.1.2 FACTOR LOADINGS, FACTOR SCORES, AND EIGENVALUES

For convenience, the original data of observed variables $Z_k$ are first standardized¹ prior to the PCA and FA, and the initial values for components (factors) are also standardized. When both $Z_k$ and $F_j$ are standardized, the $l_{kj}$ in Equations 7.1 and 7.2 are standardized coefficients in the regression of variables $Z_k$ on components (factors) $F_j$, also termed factor loadings.
For example, $l_{k1}$ is the loading of variable $Z_k$ on standardized component $F_1$. Factor loadings reflect the strength of relations between variables and components.

Conversely, the components $F_j$ can be reexpressed as a linear combination of the original variables $Z_k$:

$$F_j = a_{1j}Z_1 + a_{2j}Z_2 + \cdots + a_{Kj}Z_K \tag{7.4}$$

Estimates of these components (factors) are termed factor scores. Estimates of $a_{kj}$ are factor score coefficients, i.e., coefficients in the regression of factors on variables.

The components $F_j$ are constructed to be uncorrelated with each other and are ordered such that the first component $F_1$ has the largest sample variance ($\lambda_1$), $F_2$ the second largest, and so on. The variances $\lambda_j$ corresponding to the various components are termed eigenvalues, and $\lambda_1 > \lambda_2 > \cdots$. Since standardized variables have variances of 1, the total variance of all variables also equals the number of variables, such that

$$\lambda_1 + \lambda_2 + \cdots + \lambda_K = K \tag{7.5}$$

Therefore, the proportion of total variance explained by the jth component is $\lambda_j / K$.

Eigenvalues provide a basis for judging which components (factors) are important and which are not, and thus deciding how many components to retain. One may also follow a rule of thumb that only eigenvalues greater than 1 are important (Griffith and Amrhein, 1997, p. 169). Since the variance of each standardized variable is 1, a component with $\lambda < 1$ accounts for less than an original variable’s variation, and thus does not serve the purpose of data reduction.

The eigenvalue-1 rule is arbitrary. A scree graph plots eigenvalues against component (factor) number and provides more useful guidance (Hamilton, 1992, p. 258). For example, Figure 7.1 shows the scree graph of eigenvalues in a case of 14 components (using the result from case study 7 in Section 7.4). The graph levels off after component 4, indicating that components 5 to 14 account for relatively little additional variance. Therefore, four components may be retained as principal components.
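The bookkeeping behind Equation 7.5 and the eigenvalue-1 rule can be sketched in a few lines of Python (an illustration only, not the SAS output discussed in this chapter). The eigenvalues below are hypothetical values chosen to sum to K = 5, and the function names are assumptions made for this example.

```python
def summarize_eigenvalues(eigenvalues):
    """Return eigenvalues in descending order with variance proportions.

    Assumes the eigenvalues come from standardized variables, so that they
    sum to K, the number of variables (Equation 7.5). The proportion of
    total variance for component j is then eigenvalue_j / K.
    """
    k = len(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    proportions = [lam / k for lam in ordered]
    cumulative = []
    running = 0.0
    for p in proportions:
        running += p
        cumulative.append(running)
    return ordered, proportions, cumulative


def count_by_eigenvalue_one_rule(eigenvalues):
    """Number of components whose eigenvalue is greater than 1."""
    return sum(1 for lam in eigenvalues if lam > 1.0)


# Hypothetical eigenvalues for K = 5 standardized variables (they sum to 5).
example = [2.5, 1.2, 0.7, 0.4, 0.2]
ordered, proportions, cumulative = summarize_eigenvalues(example)
for j, (lam, p, c) in enumerate(zip(ordered, proportions, cumulative), start=1):
    print(f"component {j}: eigenvalue={lam:.2f} "
          f"proportion={p:.2f} cumulative={c:.2f}")
print("retained by the eigenvalue-1 rule:", count_by_eigenvalue_one_rule(example))
```

With these made-up values the first two components have eigenvalues above 1 and together account for 74% of the total variance. A scree graph would plot the same `ordered` list against the component number.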
Outputs from statistical analysis software such as SAS include important information, such as factor loadings, eigenvalues, and proportions (of total variance). Factor scores can be saved in a predefined external file. The factor analysis procedure in SAS also outputs a correlation matrix between the observed variables for analysts to examine their relations.

7.1.3 ROTATION

Initial results from PCFA are often hard to interpret as variables load across factors. While fitting the data equally well, rotation generates simpler structure and more interpretable factors by maximizing the loading (positive or negative) of each variable on one factor and minimizing the loadings on the others. As a result, we can detect which factor (latent variable) captures the information contained in which observed variables, and subsequently label the factors adequately.

Orthogonal rotation generates independent (uncorrelated) factors, an important property for many applications. A widely used orthogonal rotation method is Varimax rotation, which maximizes the variance of the squared loadings for each factor, and thus polarizes loadings (either high or low on factors). Varimax rotation is often the rotation technique used in social area analysis. Oblique rotation (e.g., promax rotation) generates even greater polarization, but allows correlation between factors. In SAS, an option is provided to specify which rotation to use.

As a summary, Figure 7.2 illustrates the process of PCFA:

1. The original dataset of K observed variables with n records is first standardized to a dataset of Z scores with the same number of variables and records.
2. PCA then uses K uncorrelated components to explain all the variance of the K variables.
3. PCFA keeps only J (J < K) principal components to capture most of the variance.
4. A rotation method is used to load each variable strongly on one factor (and near zero on the others) for easier interpretation.

[Figure 7.1. Scree graph for principal components analysis. Horizontal axis: component (1 to 14); vertical axis: variance proportion (0 to 0.4).]

The SAS procedure for factor analysis (FA) is FACTOR, which also reports the principal components analysis (PCA) results preceding those of FA. The following sample SAS statements implement the factor analysis that uses four factors to capture the structure of 14 variables, x1 through x14, and adopts the Varimax rotation technique:

    /* keep four factors and apply Varimax rotation; scores go to FACTSCORE */
    proc factor out=FACTSCORE(replace=yes) nfact=4 rotate=varimax;
      var x1-x14;   /* the 14 observed variables */
    run;

The SAS data set FACTSCORE has the factor scores, which can be saved to an external file. Note that SAS programs are not case sensitive.

7.2 CLUSTER ANALYSIS

Cluster analysis (CA) groups observations according to similarity among their attributes. As a result, the observations within a cluster are more similar to each other than to observations in other clusters, as measured by the clustering criterion. Note the difference between CA and another similar multivariate analysis technique, discriminant function analysis (DFA). Both group observations into categories based on the characteristic variables. Categories are unknown in CA but known in DFA. See Appendix 7A for further discussion on DFA.

Geographers have a long-standing interest in cluster analysis, which has been developed in applications such as regionalization and city classification. In the case of social area analysis, cluster analysis is used to further analyze the results from factor analysis (i.e., factor scores of various components across space) and group areas into different types of social areas.
[Figure 7.2. Data processing steps in principal components factor analysis: (1) the original dataset (n records, K variables) is standardized to Z scores; (2) PCA produces component loadings of the K variables on K components; (3) PCFA produces factor loadings of the K variables on J factors (J < K); (4) rotation maximizes the loading of each variable on one factor.]

A key element in deciding assignment of observations to clusters is distance, measured in various ways. The most commonly used distance measure is Euclidean distance:

$$d_{ij} = \left( \sum_{k=1}^{K} (x_{ik} - x_{jk})^2 \right)^{1/2} \tag{7.6}$$

where $x_{ik}$ and $x_{jk}$ are the kth variable values of the K-dimensional observations for individuals i and j. When K = 2, Euclidean distance is simply the straight-line distance between observations i and j in a two-dimensional space. Like the various distance measures discussed in Chapter 2, distance measures here also include Manhattan (or city block) distance and others (e.g., Minkowski distance, Canberra distance) (Everitt et al., 2001, p. 40).

The most widely used clustering methods are the agglomerative hierarchical methods (AHMs). The methods produce a series of groupings: the first consists of single-member clusters, and the last consists of a single cluster of all members. The results of these algorithms can be summarized with a dendrogram, a tree diagram showing the history of the sequential grouping process. See Figure 7.3 for the example illustrated below. In the diagram, the clusters are nested and each cluster is a member of a larger, higher-level cluster.

For illustration, an example is used to explain a simple AHM, the single-linkage (or nearest-neighbor) method.
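Before turning to that example, the distance measures named above can be sketched in Python (an illustration, not part of the chapter's SAS workflow). The function names and the two sample observations are assumptions made for this example. Minkowski distance with exponent p = 2 gives the Euclidean distance of Equation 7.6, and p = 1 gives Manhattan distance.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two K-dimensional observations."""
    if len(a) != len(b):
        raise ValueError("observations must have the same number of variables")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean_distance(a, b):
    """Straight-line distance (Equation 7.6), i.e., Minkowski with p = 2."""
    return minkowski_distance(a, b, 2)


def manhattan_distance(a, b):
    """City block distance, i.e., Minkowski with p = 1."""
    return minkowski_distance(a, b, 1)


# Two hypothetical observations measured on K = 2 variables.
obs_i = [1.0, 2.0]
obs_j = [4.0, 6.0]
print("Euclidean:", euclidean_distance(obs_i, obs_j))
print("Manhattan:", manhattan_distance(obs_i, obs_j))
```

The two observations differ by 3 and 4 on the two variables, so the Euclidean distance is the hypotenuse of a 3-4-5 triangle and the Manhattan distance is the sum of the two differences.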
Consider a dataset of four observations with the following distance matrix (rows and columns are individuals 1 to 4; only the lower triangle is shown):

$$D_1 = \begin{bmatrix} 0 & & & \\ 3 & 0 & & \\ 6 & 5 & 0 & \\ 9 & 7 & 4 & 0 \end{bmatrix}$$

The smallest nonzero entry in the above matrix $D_1$ is $d_{21} = 3$, and therefore individuals 1 and 2 are grouped together to form the first cluster C1. Distances between this cluster and the other two individuals are defined according to the nearest-neighbor criterion:

$$d_{(12)3} = \min\{d_{13}, d_{23}\} = d_{23} = 5$$

$$d_{(12)4} = \min\{d_{14}, d_{24}\} = d_{24} = 7$$

A new matrix is now obtained with cells representing distances between cluster C1 and individuals 3 and 4, or between individuals 3 and 4 (rows and columns are (12), 3, and 4):

$$D_2 = \begin{bmatrix} 0 & & \\ 5 & 0 & \\ 7 & 4 & 0 \end{bmatrix}$$

The smallest nonzero entry in $D_2$ is $d_{43} = 4$, and thus individuals 3 and 4 are grouped to form a cluster C2. Finally, clusters C1 and C2 are grouped together, with distance equal to 5, to form one cluster C3 containing all four members. The process is summarized in a dendrogram in Figure 7.3, where the height represents the distance at which each fusion is made.

[Figure 7.3. Dendrogram for the clustering analysis example. Horizontal axis: data points 1 to 4; vertical axis: distance (1.0 to 5.0); clusters C1, C2, and C3.]

Similarly, the complete linkage (farthest-neighbor) method uses the maximum distance between pairs of objects (one in one cluster and one in the other); the average linkage method uses the average distance between pairs of objects; and the centroid method uses squared Euclidean distance between individuals and cluster means (centroids).

Another commonly used AHM is Ward’s method. The objective at each stage is to minimize the increase in the total within-cluster error sum of squares given by

$$E = \sum_{c=1}^{C} E_c$$

where

$$E_c = \sum_{i=1}^{n_c} \sum_{k=1}^{K} \left( x_{ck,i} - \bar{x}_{ck} \right)^2$$

in which $x_{ck,i}$ is the value for the kth variable for the ith observation in the cth cluster, and $\bar{x}_{ck}$ is the mean of the kth variable in the cth cluster.

Each clustering method has its advantages and disadvantages.
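The single-linkage example above can be traced with a short Python sketch (an illustration only; the chapter itself relies on the SAS procedure CLUSTER). The function names and data layout are assumptions made for this example. The second function computes the within-cluster error sum of squares that Ward's method tracks, applied here to made-up points.

```python
def single_linkage(dist, labels):
    """Agglomerative clustering with the nearest-neighbor criterion.

    dist maps frozenset({a, b}) to the distance between individuals a and b.
    Returns a list of (cluster_a, cluster_b, fusion_distance) in merge order.
    """
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # nearest-neighbor distance: minimum over all member pairs
                d = min(dist[frozenset([a, b])]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges


def error_sum_of_squares(cluster):
    """Within-cluster error sum of squares for one cluster of observations."""
    n = len(cluster)
    k = len(cluster[0])
    means = [sum(obs[v] for obs in cluster) / n for v in range(k)]
    return sum((obs[v] - means[v]) ** 2 for obs in cluster for v in range(k))


# Lower triangle of the distance matrix D1 from the text.
pairwise = {(1, 2): 3, (1, 3): 6, (2, 3): 5, (1, 4): 9, (2, 4): 7, (3, 4): 4}
dist = {frozenset(pair): d for pair, d in pairwise.items()}
merges = single_linkage(dist, [1, 2, 3, 4])
for left, right, d in merges:
    print(f"merge {left} and {right} at distance {d}")

# Made-up two-variable clusters, used only to illustrate the sum of squares.
cluster_a = [[1.0, 2.0], [3.0, 4.0]]
cluster_b = [[10.0, 10.0]]
total_e = error_sum_of_squares(cluster_a) + error_sum_of_squares(cluster_b)
print("total within-cluster error sum of squares:", total_e)
```

The merges occur at distances 3, 4, and 5, matching the fusion heights described for the dendrogram. A Ward-style procedure would, at each stage, evaluate `error_sum_of_squares` for every candidate merge and pick the one that increases the total the least.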
A desirable clustering should produce clusters of similar size, densely located, compact in shape, and internally homogeneous (Griffith and Amrhein, 1997, p. 217). The single-linkage method tends to produce unbalanced and straggly clusters and should be avoided in most cases. If outliers are a major concern, the centroid method should be used. If compactness of clusters is a primary objective, the complete linkage method should be used. Ward’s method tends to find clusters of similar size and spherical shape and is recommended if no single overriding property is desired (Griffith and Amrhein, 1997, p. 220). The case study in this chapter also uses Ward’s method.

The choice for the number of clusters depends on objectives of specific applications. Similar to the selection of factors based on the eigenvalues in factor analysis, one may also use a scree plot to assist in the decision. In the case of Ward’s method, a graph of $R^2$ vs. the number of clusters helps choose the number, beyond which little more homogeneity is attained by further mergers.

In SAS, the procedure CLUSTER implements the cluster analysis and the procedure TREE generates the dendrogram. The following sample SAS statements use Ward’s method for clustering and cut off the dendrogram at nine clusters:

    proc cluster method=ward outtree=tree;
      id subdist_id;        /* variable for labeling ids */
      var factor1-factor4;  /* variables used */
    run;
    proc tree out=bjcluster ncl=9;
      id subdist_id;
    run;

7.3 SOCIAL AREA ANALYSIS

The social area analysis was developed by Shevky and Williams (1949) in a study of Los Angeles and was later elaborated on by Shevky and Bell (1955) in a study of San Francisco.
The basic thesis is that the changing social differentiation of society leads to residential differentiation within cities. The studies classified census tracts into types of social areas based on three basic constructs: economic status (social rank), family status (urbanization), and segregation (ethnic status). Originally the three constructs were measured by six variables: economic status was captured by occupation and education; family status by fertility, female labor participation, and single-family houses; and ethnic status by percentage of minorities (Cadwallader, 1996, p. 135). In a factor analysis, an idealized factor loadings matrix probably looks like Table 7.1. Subsequent studies using a large number and variety of measures generally confirmed the validity of the three constructs (Berry, 1972, p. 285; Hartshorn, 1992, p. 235).

Geographers made an important advancement in social area analysis by analyzing the spatial patterns associated with these dimensions (e.g., Rees, 1970; Knox, 1987). The socioeconomic status factor tends to exhibit a sector pattern: tracts with high values for variables, such as income and education, form one or more sectors, and low-status tracts form other sectors. The family status factor tends to form concentric zones: inner zones are dominated by tracts with small families with either very young or very old household heads, and tracts in outer zones are mostly occupied by large families with middle-age household heads. The ethnic status factor tends to form clusters, each of which is dominated by a particular ethnic group. Superimposing the three constructs generates a complex urban mosaic, which can be grouped into various social areas by cluster analysis. See Figure 7.4.
By studying the spatial patterns from social area analysis, three classic models for urban structure, namely Burgess’s (1925) concentric zone model, Hoyt’s (1939) sector model, and the Ullman–Harris (Harris and Ullman, 1945) multinuclei model, are synthesized into one framework. In other words, each of the three models reflects one specific dimension of urban structure and is complementary to the others.

There are at least three criticisms of the factorial ecological approach to understanding residential differentiation in cities (Cadwallader, 1996, p. 151). First, the analysis results are sensitive to research design, such as variables selected and measured, analysis units, and factor analysis methods. Second, it is still a descriptive form of analysis and fails to explain the underlying process that causes the patterns. Third, the social areas identified by the studies are merely homogeneous, but not necessarily functional regions or cohesive communities.

Despite the criticisms, social area analysis helps us understand residential differentiation within cities, and serves as an important instrument for studying intraurban social spatial structure. Applications of social area analysis are found for cities in developed countries, most abundantly for cities in North America (see a review by Davies and Herbert, 1993), and also for some cities in developing countries (e.g., Berry and Rees, 1969; Abu-Lughod, 1969).

7.4 CASE STUDY 7: SOCIAL AREA ANALYSIS IN BEIJING

This case study is developed on the basis of a research project reported in Gu et al. (2005). Detailed research design and interpretation of the results can be found in the original paper. This section shows the procedures to implement the study, with emphasis on illustrating the three statistical methods. In addition, the study illustrates how to test the spatial structure of factors by regression models with dummy variables.
Since the 1978 economic reforms in China, and particularly the 1984 urban reforms, including the urban land use reform and the housing reform, the urban landscape in China has changed significantly. Many large cities have been in transition from self-contained work unit neighborhood systems to more differentiated urban space. As the capital city of China, Beijing offers an interesting case to look into this important change in urban structure in China.

TABLE 7.1 Idealized Factor Loadings in Social Area Analysis

                             Economic Status   Family Status   Ethnic Status
Occupation                   I                 O               O
Education                    I                 O               O
Fertility                    O                 I               O
Female labor participation   O                 I               O
Single-family house          O                 I               O
Minorities                   O                 O               I

Note: I denotes a number close to 1 or -1; O denotes a number close to 0.

The study area is the contiguous urbanized area of Beijing, with 107 subdistricts (jiedao), excluding the 2 remote suburban districts (Mentougou and Fangshan) and 23 subdistricts on the periphery of inner suburbs (which are also rural and lack complete data). See Figure 7.5. The study area had a total population of 5.9 million, and the subdistricts had an average population of 55,200 in 1998. The subdistrict has been the basic administrative unit in Beijing for decades, and is also the lowest geographic level reported in government statistical reports accessible to the public. Therefore, it was the analysis unit used in this research.

Because of the lack of socioeconomic data in the national census of population, most of the data used in this research were extracted from the 1998 statistical yearbooks of individual districts in Beijing. Some data, such as personal income and individual living space, were obtained through a survey of households conducted in 1998.

[Figure 7.4. Conceptual model for urban mosaic.]
Figure 7.4 panels: (a) socioeconomic status (SES): high SES, low SES; (b) family status: small families, large families; (c) ethnic status: ethnic enclaves; (d) urban mosaic (overlay).

[...] create and compute the dummy variables x2, x3, x4, y2, y3, and y4 according to Table 7.6, and run regression models in Equations 7.7 and 7.8. The results are presented in Table 7.7. A sample SAS program BJreg.sas is provided in the CD for reference.

7.5 DISCUSSION AND SUMMARY

$R^2$ in Table 7.7 indicates whether the zonal or sectoral model is a good fit, and an individual t statistic (in parentheses) indicates [...]

[Figure 7.7. Social [...] Map of subdistricts labeled by cluster number (1 to 9). Legend entries as extracted: Sub mod-den; Inner city mod-inc; Inner city high-inc; Inner sub mod-inc; Outer sub mod-inc; Inner city ethnic; Outer sub mod-inc; Inner city low-inc; Outer sub float pop.]

[...] as in Figure 7.6c, and factor4 (factor score for “ethnicity”) as in Figure 7.6d.

Mapping social areas in ArcGIS: Similar to step 3, join both cluster9.csv and cluster5.csv to the shapefile bjsa in ArcGIS, and map the social areas as shown in Figure 7.7. The five basic social areas are shown in different area patterns, and the nine detailed social areas are identified by their cluster numbers. For understanding [...]

a. “Land use intensity” is by far the most important factor, explaining 35.16% of the total variance and capturing mainly six variables: three density measures (population density, public service density, and office and retail density), housing price, and two demographic [...]

[Factor loadings table, truncated in the source. Surviving variable labels (end of the label column): [...] ratio; Living space; Income; Natural growth rate; Ethnic enclave; Sex ratio; Industry density. Loadings as extracted, apparently listed column by column with 14 values per factor:
first column: 0.8887, 0.8624, -0.8557, 0.8088, 0.7433, 0.7100, 0.0410, 0.0447, -0.5231, 0.1010, -0.2550, 0.0030, -0.2178, 0.4379
second column: 0.0467, 0.0269, 0.2909, -0.0068, -0.0598, 0.1622, 0.9008, 0.8879, 0.6230, 0.1400, 0.2566, -0.1039, 0.2316, -0.1433
third column: 0.1808, 0.3518, 0.1711, 0.3987, 0.1786, -0.4873, -0.0501, 0.0238, -0.0529, 0.7109, -0.6271, -0.1263, -0.1592, 0.3081
fourth column: 0.0574, 0.0855, [...]]

[...] economic opportunities in the fast-growing Haidian District (cluster 1) and manufacturing jobs in the Shijingshan District (cluster 3). The effects of the third factor (socioeconomic status) can be found in the emergence of the high-income areas in two inner city subdistricts (cluster 8), and the differentiation between middle-income (cluster 1) and low-income areas (clusters 2, 3, and 5) in suburbs. The fourth [...]

[Maps of factor 1 scores (panel a) and factor 2 scores by subdistrict, each with a legend of score classes; caption not included in the source.]

TABLE 7.6 Zones and Sectors Coded by Dummy Variables (truncated in the source)
Zones, by index and location: 1, inside second ring; 2, between second and third rings; 3, between third and fourth rings; 4, outside fourth ring. Codes: x2 = [...]

[Table of cluster averages of factor scores, truncated in the source. Columns: No. of Subdistricts; Averages of Factor Scores for Land Use Intensity, Neighborhood Dynamics, Socioeconomic Status, and Ethnicity. One surviving row label: 6, Inner city low income. Cells as extracted:
21, 23: -0.2060, -0.4921, 0.6730, -0.5159, -0.6932, -0.0522, 0.3583, 0.4143
22, 21: 0.8787, -0.8928, -0.1912, -0.8811, 0.5541, 0.0449, 0.1722, -0.7247
6: -1.4866, 2.0667, 0.3611, 0.1847
1: 0.1041, 5.7968, -0.2505, -1.8765
2, 1, 10: 0.7168, 1.8731, 2.0570, [...] 0.9615, -0.0147, 0.0335, 5.1510, 1.8304, -1.1423, -0.8112, 4.3598, -0.7591]

[...] to clusters 2, 4, and 5 in the nine-cluster scenario. Each cluster represents a social area.

Mapping factor patterns in ArcGIS: Open the shapefile bjsa in ArcGIS and join the text file factscore.csv to it based on the common key ref_id. Map the field factor1 (factor score for “land use intensity”) as shown in Figure 7.6a, factor2 (factor score for “neighborhood dynamics”) as in Figure 7.6b, factor3 (factor [...]

Date posted: 11/08/2014, 17:22

Table of contents

  • Quantitative Methods and Applications in GIS

    • Table of Contents

    • Chapter 7: Principal Components, Factor, and Cluster Analyses, and Application in Social Area Analysis

      • 7.1 PRINCIPAL COMPONENTS AND FACTOR ANALYSIS

        • 7.1.1 PRINCIPAL COMPONENTS FACTOR MODEL

        • 7.1.2 FACTOR LOADINGS, FACTOR SCORES, AND EIGENVALUES

        • 7.1.3 ROTATION

      • 7.2 CLUSTER ANALYSIS

      • 7.3 SOCIAL AREA ANALYSIS

      • 7.4 CASE STUDY 7: SOCIAL AREA ANALYSIS IN BEIJING

      • 7.5 DISCUSSION AND SUMMARY

      • APPENDIX 7A: DISCRIMINANT FUNCTION ANALYSIS

      • APPENDIX 7B: SAMPLE SAS PROGRAM FOR FACTOR AND CLUSTER ANALYSES

      • NOTES

      • References
