Foreign-Language Science and Technology Book Profiles

Title: R data analysis without programming

Author: David Gerbing

ISBN/ISSN: 9780415641739, 9780415657204

Publication date: 2014

Publisher: Routledge

Classification: Automation technology, computer technology


Abstract

This book prepares readers to analyze data and interpret statistical results using R more quickly than other texts. R is a challenging program to learn because code must be created to get started. To alleviate that challenge, Professor Gerbing developed lessR. The lessR extensions remove the need to program. By introducing R through lessR, readers learn how to organize data for analysis, read the data into R, and produce output without performing numerous functions and programming exercises first. With lessR, readers can select the necessary procedure and change the relevant variables without programming. The text reviews basic statistical procedures with the lessR enhancements added to the standard R environment. Through the use of lessR, R becomes immediately accessible to the novice user and easier to use for the experienced user.
Highlights of the book include:
Quick Starts that introduce readers to the concepts and commands reviewed in the chapters.
Margin notes that highlight, define, illustrate, and cross-reference the key concepts. When readers encounter a term previously discussed, the margin notes identify the page number of the initial introduction.
Scenarios that highlight the use of a specific analysis followed by the corresponding R/lessR input and an interpretation of the resulting output.
Numerous examples of output from psychology, business, education, and other social sciences that demonstrate how to interpret results.
Two data sets, provided on the website and analyzed multiple times in the book, that provide continuity throughout.
End-of-chapter worked problems that help readers test their understanding of the concepts.
A website at www.lessRstats.com that features the lessR program, the book’s data sets referenced in standard text and SPSS formats so readers can practice using R/lessR by working through the text examples and worked problems, PDF slides for each chapter, solutions to the book’s worked problems, links to R/lessR videos to help readers better understand the program, and more.
An ideal supplement for graduate or advanced undergraduate courses in statistics, research methods, or any course in which R is used, taught in departments of psychology, business, education, and other social and health sciences, this book is also appreciated by researchers interested in using R for their data analysis. Prerequisites include basic statistical knowledge. Knowledge of R is not assumed.


Preface

Purpose
This book is addressed to the student, researcher, and/or data analyst who wishes to analyze data using R to answer questions of interest. The structuring of the data for analysis, doing the analysis with the computer, and then interpreting the results are the skills to which this book is directed. Explicit instructions to use the computer for the analysis are provided, but using the computer is the easy part of data analysis, especially with the computer instructions provided in this book. By far, more content of this book is oriented toward the meaning of the analyses and interpretation of the results.
The computational tools discussed in this book are based on the open source and free R system that is becoming the world's standard for data analysis. Unlike commercial packages, R costs no money and can be installed at will on any Windows, Macintosh or Linux system. By itself, however, R presents a rather steep learning curve as it requires the ability to write computer code to do data analysis. To make data analysis with R more accessible, this book features a set of extensions to R called lessR. These extensions ease the transition of the new user into using R for data analysis and for many users contain the only computational tools they will need.
The ultimate goal of this book is to make R accessible to users of SPSS and similar commercial packages, and, indeed, to help R become the preferred choice of any data analysis system. For data analysts with a proclivity for programming, R has already emerged as the preferred choice. The lessR extensions to R remove the need to program and so are designed to provide access to R for all data analysts. To the extent that R becomes as straightforward to use and as comprehensive in its range of analyses as its commercial alternatives, it becomes a viable choice as the preferred data analysis system both in the field and for instruction.
This is the first book that follows the format of the many SPSS-based data analysis texts, with emphasis on both the instructions to produce the analysis and detailed interpretation of the resulting output. This book, however, applies the R/lessR system for conducting the analysis, made possible with the lessR extensions. Traditional R books on data analysis provide some interpretation, but much more of their space is required to explain the programming needed to obtain the output.
Intended Audience
This book is intended to serve as a supplementary text for graduate or advanced undergraduate courses in statistics, research methods, or any course in which R is used, taught in departments of psychology, business, education, and other social and health sciences. The book will also be helpful to researchers interested in using R for their data analysis. Prerequisites include basic statistical knowledge. Knowledge of R is not assumed.
Overview of Content
The first chapter shows how to download R and then lessR from the worldwide network of R servers, provides an example of data analysis, shows how data are structured for analysis, and discusses general issues for the use of R and lessR. Chapter 2 shows how to read the data from a computer file into R. The data often must then be modified before analysis begins, the topic of Chapter 3. This editing can include changing individual data values, assigning missing data codes, transforming, recoding, and sorting the data values for a variable, and also subsetting and merging data tables.
Chapters 4 and 5 show how to obtain the most basic of all data analyses, the counting of how often each data value or group of similar data values occurred. Chapter 4 explains bar charts and related techniques for counting the values of a categorical variable, one with non-numeric data values. Also included is the analysis of two or more variables in the form of bar charts for two variables, mosaic plots for more than two variables, and the associated cross-tabulation tables. Chapter 5 does the same for numeric variables, including the histogram and related analyses, the scatter plot for one variable, the box plot, the density plot and the time plot, plus the associated summary statistics.
Chapter 6 explains the analysis of the mean of a single sample or of a mean difference across two different samples. These analyses are based on the t-test of the mean, the independent-samples t-test and the dependent-samples t-test. The non-parametric alternatives are also provided. Chapter 7 extends the material in Chapter 6 to the comparison of many groups with the analysis of variance. The primary designs considered are the one-way independent groups, randomized blocks and two-way independent groups. Also illustrated are the randomized block factorial design and the split-plot factorial. Effect sizes are an integral part of each discussion.
Chapters 8, 9, and 10 focus on correlation and regression analysis. Chapter 8 introduces the scatter plot and the correlation coefficient to describe the relation between two variables. Both the usual parametric version and some non-parametric correlation coefficients are presented, as well as the correlation matrix. The subject of Chapter 9 is regression analysis of a linear model of a response variable with a single predictor variable. The discussion includes estimation of the model, a consideration of the evaluation of fit, outliers and influential observations, and prediction intervals. Chapter 10 extends this discussion to multiple regression, the analysis of models with multiple predictor variables, and also to logistic regression, the modeling of a response variable that has only two values.
The topic of Chapter 11 is factor analysis, both its exploratory and confirmatory versions. The primary examples are of item analysis, the analysis of items that form a scale such as from an attitude survey, and the corresponding scale reliabilities. For exploratory factor analysis the concepts of factor extraction and factor rotation are presented. Within this context a linkage between exploratory and confirmatory factor analysis for item analysis is provided, as well as a discussion of the covariance structure that underlies the confirmatory analysis. Analysis of scale development from published data on Machiavellianism appears throughout the chapter, with the final development of the subscales with confirmatory factor analysis.
Distinctive Features
Every chapter after the first begins with a brief Quick Start section. The function calls that provide the analyses described in the remainder of the chapter are listed and briefly described. The goal is that the user can immediately invoke the specified analyses and then, as needed, refer to the remainder of the chapter to obtain the details. Each Quick Start section also serves as a convenient summary of the data analysis functions described in that chapter.
This book makes extensive use of margin notes to highlight the concepts discussed within the main text. Each definition is placed in a margin note. The margin notes also provide a concordance, a cross-reference to where in the book a specific concept is first explained and illustrated. When the reader encounters a term previously discussed, the relevant section and page number appear in the margin notes.
The motivation for a specific analysis is presented in what is called a Scenario, highlighted by light gray rules. Following each Scenario is the R/lessR input for a specific analysis, also highlighted by light gray rules. The resulting output is then interpreted. A Listing is a literal copy of computer output, of which there are many throughout this book.
The analysis of many data sets illustrates the concepts discussed in the book. Two data sets, however, are analyzed multiple times to provide continuity throughout. One data set contains some of the typical information found in an employee database, such as each employee's Gender, Salary and so forth. The other data set regards the measurement of the attitudinal components of Machiavellianism, the tendency to manipulate others for one's own perceived personal gain. This data set is from a published study by the author in the Journal of Personality and Social Psychology. This study was one of the first applications of confirmatory factor analysis in the psychological literature, and here the analysis is detailed and illustrated to explain both exploratory and confirmatory factor analysis as an integrated strategy for scale development.
lessR Website
The website to support this book and the lessR package is www.lessRstats.com. The site includes a variety of reference materials to support your learning of data analysis, such as the data sets referenced in this book in both standard text and SPSS formats. This way you can practice using R and lessR as you read this book and work through the examples and the included worked problems. Also included are videos on the use of R and lessR, a slide set for each chapter, and solutions, some of which are available only for instructors. The website also provides the opportunity to give your own suggestions and feedback, with a place to request upgrades, a place to report bugs, and an interactive forum for asking and getting answers to questions.
Personal Acknowledgments
I would like to acknowledge the helpful people at Routledge/Taylor & Francis who have made this work possible. My editor Debra Riegert expressed an interest in this project from the start and has encouraged and guided the development of this book from its initial conceptualization. Miren Alberro managed the production process with good cheer and provided the helpful and needed assistance to turn a manuscript into a book.
I would also like to thank Jason T. Newsom, a colleague and Professor at the Institute on Aging, the School of Community Health, at Portland State University. Already a successful author at Routledge/Taylor & Francis, Jason introduced me to Debra, the introduction from which this book developed. Jason also provided two different forums for presenting my work on lessR at Portland State University: his informal seminars on quantitative topics for faculty and graduate students, and his annual June workshops on quantitative topics.
My students at Portland State University also deserve recognition for their role in the development of lessR and this book. In 2008 some of my students who used Macintosh computers asked me what software they could use for class assignments. I had primarily been using Excel, but at that time Microsoft deleted much of the statistical functionality from the Macintosh version of Excel. My answer was R, but the first classroom experiences with R were not satisfying. Students were frustrated because of all the programming work needed to get anything useful done, and spent more time learning how to use the computer than actually thinking about the meaning of the results. Hence lessR, and four years of undergraduate and graduate students who have used it and contributed much feedback to the project, from the beginning of its development through the current version.
Finally, I would like to thank the three reviewers who provided comprehensive, insightful reviews: J. Patrick Gray, University of Wisconsin-Milwaukee; Agnieszka Kwapisz, Montana State University; and Bertolt Meyer, University of Zurich, Switzerland. The quality of their reviews illustrates how whatever one person does is facilitated by the thoughtful critiques of others who are also knowledgeable in the subject. Their reviews shaped the format of this book.


Table of Contents

BRIEF CONTENTS

Preface xiii

About the Author xvii

CHAPTER 1 R for Data Analysis 1

CHAPTER 2 Read/Write Data 31

CHAPTER 3 Edit Data 53

CHAPTER 4 Categorical Variables 77

CHAPTER 5 Continuous Variables 99

CHAPTER 6 Means, Compare Two Samples 123

CHAPTER 7 Compare Multiple Samples 149

CHAPTER 8 Correlation 181

CHAPTER 9 Regression I 203

CHAPTER 10 Regression II 223

CHAPTER 11 Factor/Item Analysis 251

Appendix: Standard R Code 279

Notes 283

References 285

Index 287

CONTENTS

Preface xiii

About the Author xvii

CHAPTER 1 R for Data Analysis 1

1.1 Introduction 1

1.2 Access R 3

1.3 Use R 6

1.4 R Graphs 16

1.5 Reproducible Code 19

1.6 Data 20

      Worked Problems 28

CHAPTER 2 Read/Write Data 31

2.1 Quick Start 31

2.2 Read Data 32

2.3 More Data Formats 40

2.4 Variable Labels 45

2.5 Write Data 48

      Worked Problems 50

CHAPTER 3 Edit Data 53

3.1 Quick Start 53

3.2 Edit Data 54

3.3 Transform Data 55

3.4 Recode Data 62

3.5 Sort Data 66

3.6 Subset Data 68

3.7 Merge Data 72

      Worked Problems 75

CHAPTER 4 Categorical Variables 77

4.1 Quick Start 77

4.2 One Categorical Variable 79

4.3 Two Categorical Variables 87

4.4 Onward to the Third Dimension 93

      Worked Problems 98

CHAPTER 5 Continuous Variables 99

5.1 Quick Start 99

5.2 Histogram 100

5.3 Summary Statistics 105

5.4 Scatter Plot and Box Plot 110

5.5 Density Plot 114

5.6 Time Plot 118

      Worked Problems 121

CHAPTER 6 Means, Compare Two Samples 123

6.1 Quick Start 123

6.2 Evaluate a Single Group Mean 124

6.3 Compare Two Different Groups 130

6.4 Compare Dependent Samples 142

      Worked Problems 147

CHAPTER 7 Compare Multiple Samples 149

7.1 Quick Start 149

7.2 One-way ANOVA 150

7.3 Randomized Block ANOVA 158

7.4 Two-way ANOVA 166

7.5 More Advanced Designs 173

      Worked Problems 180

CHAPTER 8 Correlation 181

8.1 Quick Start 181

8.2 Relation of Two Numeric Variables 181

8.3 The Correlation Matrix 194

8.4 Non-parametric Correlation Coefficients 200

      Worked Problems 202

CHAPTER 9 Regression I 203

9.1 Quick Start 203

9.2 The Regression Model 204

9.3 Residuals and Model Fit 209

9.4 Prediction Intervals 212

9.5 Outliers and Diagnostics 216

      Worked Problems 221

CHAPTER 10 Regression II 223

10.1 Quick Start 223

10.2 The Multiple Regression Model 224

10.3 Indicator Variables 234

10.4 Logistic Regression 239

      Worked Problems 248

CHAPTER 11 Factor/Item Analysis 251

11.1 Quick Start 251

11.2 Overview of Factor Analysis 252

11.3 Exploratory Factor Analysis 254

11.4 The Scale Score 260

11.5 Confirmatory Factor Analysis 263

11.6 Beyond the Basics 275

      Worked Problems 276

Appendix: Standard R Code 279

Notes 283

References 285

Index 287


About the Author

David W. Gerbing received his B.A. in psychology from what is now Western Washington University in 1974, where he did his first statistical programming on an IBM 360 mainframe. He obtained his Ph.D. in quantitative psychology from Michigan State University in 1979 as a student of John E. Hunter. He has been an Associate Professor of Psychology and Statistics at Baylor University, and is now Professor of Quantitative Methods in the School of Business Administration at Portland State University. He has published many articles on statistical techniques and their application in a variety of journals that span several academic disciplines including psychology, sociology, education, business, and statistics.


Holding Library

National Science Library, Chinese Academy of Sciences