Title: Essentials of Multivariate Data Analysis
Publication date: 2014
Publisher: Taylor & Francis
Preface
Why does this book exist? Well, essentially because (1) multivariate statistical methods can be very useful to people doing applied research or learning how to do it (by this I mean exploring data and using it to answer research questions); (2) most other books on multivariate methods are aimed at statisticians or researchers who are comfortable with (and enjoy?) mathematics and formulae. The aim of this book is to explain the usefulness of multivariate methods in a way which is accessible to students and researchers who would not consider themselves statisticians or mathematicians. They may be put off by the whole idea of quantitative analysis or the formulae they see in other books. Here they will find very few formulae, and those that cannot be left out are made to seem less scary than they might look.
"But surely most researchers have been trained in statistics?" I hear you say. This is true. Students may also have had some statistics training before they find this book. However, their training may have concentrated on topics like summary statistics, graphical displays, confidence intervals, hypothesis testing, correlation and regression. Although multivariate data may have had a look-in when studying multiple regression, most of the statistical training will have considered one variable at a time rather than many variables together. An exception to this may be factor analysis, which is widely used in the social sciences, but the other topics discussed in this book are less likely to have been encountered. Most datasets are multivariate (containing a number of variables) and, as a result, multivariate methods are useful to explore them and to use them to answer research questions.
Contents
Preface xiii
1 Frequently Asked Questions 1
1.1 What Questions? 1
1.2 What Analysis Should I Use? 1
1.3 What Data Do I Need? 4
1.4 What Data Is the Author Using in This Book? 6
1.4.1 General Knowledge Scores 7
1.4.2 Opinions about Similarity of Nations' Foreign Policies 7
1.5 What about Missing Data? 7
1.6 What about Other Topics? 8
1.7 What about Computer Packages? 8
2 Graphical Presentation of Multivariate Data 11
2.1 Why Do I Want to Do Graphical Presentations of Multivariate Data? 11
2.2 What Data Do I Need for Graphical Presentations of Multivariate Data? 13
2.3 The Rest of This Chapter 13
2.4 Comparable Histograms 14
2.5 A Step-by-Step Guide to Obtaining Comparable Histograms Using the Excel Add-In 14
2.6 Multiple Box Plots 23
2.7 A Step-by-Step Guide to Obtaining Multiple Box Plots Using the Excel Add-In 24
2.8 Trellis Plot 25
2.9 A Step-by-Step Guide to Obtaining a Trellis Plot Using the Excel Add-In 26
2.10 Star Plots 27
2.11 Chernoff Faces 28
2.12 Andrews' Plots 28
2.13 A Step-by-Step Guide to Obtaining Andrews' Plots Using the Excel Add-In 32
2.14 Principal Components Plot 32
2.15 A Step-by-Step Guide to Obtaining a Principal Components Plot Using the Excel Add-In 35
2.16 More Information 35
3 Multivariate Tests of Significance 37
3.1 Why Do I Want to Do Multivariate Tests of Significance? 37
3.2 What Data Do I Need for Multivariate Tests of Significance? 38
3.3 The Rest of This Chapter 38
3.4 Comparing Two Vectors of Means 39
3.4.1 What Are We Testing? 39
3.4.2 What Is a Vector of Means? 40
3.4.3 Univariate Tests 40
3.4.4 Assumptions Made for Multivariate Test 42
3.4.5 What Is a Covariance Matrix? 44
3.4.6 Hotelling's T² Test 44
3.4.7 A Step-by-Step Guide to Comparing Two Vectors of Means Using the Excel Add-In 47
3.5 Comparing Two Covariance Matrices 48
3.5.1 What Are We Testing and How? 48
3.5.2 Assumptions Made 49
3.5.3 Multivariate Levene's Test 49
3.5.4 A Step-by-Step Guide to Comparing Two Covariance Matrices Using the Excel Add-In 51
3.6 Comparing More than Two Vectors of Means 52
3.6.1 What Are We Testing, and How? 52
3.6.2 Assumptions Made 53
3.6.3 Wilks' Lambda Test 53
3.6.4 A Step-by-Step Guide to Comparing More than Two Vectors of Means Using the Excel Add-In 57
3.7 Comparing More than Two Covariance Matrices 57
3.7.1 What Are We Testing, and How? 57
3.7.2 Assumptions Made 58
3.7.3 Combining Levene's Method and the Likelihood Ratio Test 59
3.7.4 A Step-by-Step Guide to Comparing More than Two Covariance Matrices Using the Excel Add-In 60
3.8 More Information 61
4 Factor Analysis 63
4.1 Why Do I Want to Do Factor Analysis? 63
4.2 What Data Do I Need for Factor Analysis? 65
4.3 The Rest of This Chapter 66
4.4 How Do We Extract the Factors? 66
4.4.1 Using Principal Components Analysis 67
4.4.2 Using Principal Axis Factoring 70
4.5 Interpreting the Results of a PCA Factor Analysis 73
4.6 How Many Factors Are There? 74
4.7 Interpreting the Results of a PAF Factor Analysis 76
4.8 Communalities Briefly Revisited 78
4.9 Rotating Factor Loadings 80
4.9.1 Non-Orthogonal/Oblique Rotations 85
4.10 So Which Solution Do We Believe? 85
4.11 Factor Scores 86
4.12 A Step-by-Step Guide to Factor Analysis Using the Excel Add-In 87
4.13 More Information 88
5 Cluster Analysis 89
5.1 Why Do I Want to Do Cluster Analysis? 89
5.2 What Data Do I Need for Cluster Analysis? 91
5.3 The Rest of This Chapter 91
5.4 How Do We Decide How Close Together Two Cases Are? 91
5.4.1 Distances by Absolute Value 92
5.4.2 Standardising 92
5.4.3 Distances by Absolute Value Using Standardised Data 93
5.4.4 Euclidean Distances 93
5.4.5 Squared Euclidean Distances 95
5.4.6 Distances for Binary Data 95
5.5 How Do We Decide How Close Together Two Clusters Are? 96
5.5.1 Average Linkage between Groups 96
5.5.2 Complete Linkage 97
5.5.3 Single Linkage 98
5.5.4 Forward-Thinking Linkage Methods 98
5.6 How Do We Decide Which Distance Measure and Linkage Method to Use? 100
5.7 How Do We Decide How Many Clusters There Are? 100
5.7.1 Using First Seven Cases in the Dataset 100
5.7.2 Using All Cases in the Dataset 103
5.8 Interpreting Clusters 105
5.9 Non-Hierarchical Cluster Analysis 106
5.10 A Step-by-Step Guide to Cluster Analysis Using the Excel Add-In 107
5.11 More Information 108
6 Discriminant Analysis 109
6.1 Why Do I Want to Do Discriminant Analysis? 109
6.2 What Data Do I Need for Discriminant Analysis? 109
6.3 The Rest of This Chapter 110
6.4 How Do We Decide How Close a Case Is to Different Groups? 110
6.4.1 Linear Discriminant Functions 112
6.4.2 Assumptions Made 114
6.5 Allocating Individual Cases to Groups 115
6.5.1 Creating the Linear Discriminant Functions 115
6.5.2 Allocating Cases to the Groups 118
6.6 Which Variables Discriminate between Groups? 119
6.7 How Accurate Are the Allocations? 120
6.7.1 Assumptions 121
6.7.2 Probabilities 121
6.8 Testing a Discriminant Analysis 122
6.8.1 Splitting the Dataset 123
6.8.2 Cross-Validation 123
6.9 Other Methods of Discriminant Analysis 124
6.9.1 Canonical Discrimination 124
6.9.2 Stepwise Discrimination 125
6.10 A Step-by-Step Guide to Discriminant Analysis Using the Excel Add-In 125
6.11 More Information 126
7 Multidimensional Scaling 127
7.1 Why Do I Want to Do Multidimensional Scaling? 127
7.2 What Data Do I Need for Multidimensional Scaling? 128
7.3 The Rest of This Chapter 130
7.4 Classical Multidimensional Scaling 130
7.4.1 Problems with Classical Multidimensional Scaling 137
7.5 Other Methods of Multidimensional Scaling 137
7.6 A Step-by-Step Guide to Multidimensional Scaling Using the Excel Add-In 139
7.7 More Information 140
8 Correspondence Analysis 141
8.1 Why Do I Want to Do Correspondence Analysis? 141
8.2 What Data Do I Need for Correspondence Analysis? 142
8.3 The Rest of This Chapter 142
8.4 Chi-Square Distances, Inertia and Plots 144
8.4.1 A Chi-Square Test of Independence 144
8.4.2 Inertia 145
8.4.3 Plotting Chi-Square Distances 145
8.4.4 Adding Extreme Categories to the Plots 150
8.5 More Dimensions 152
8.5.1 Reducing a Plot to Two Dimensions 153
8.6 Row, Column and Symmetric Normalisations 155
8.7 Correspondence Analysis with More than Two Variables 157
8.7.1 The Burt Matrix 157
8.7.2 Analysing the Burt Matrix 159
8.8 A Step-by-Step Guide to Correspondence Analysis Using the Excel Add-In 161
8.9 More Information 161
References 163
Index 165
About the author
Dr. Neil H. Spencer is a reader in applied statistics and director of the Statistical Services and Consultancy Unit at the University of Hertfordshire. His research interests include multilevel models, multivariate methods, statistical computing, multiple testing, and testing for randomness.
Holding library
National Science Library, Chinese Academy of Sciences (中科院文献情报中心)