Title: Bayesian disease mapping
Publication date: 2013
Publisher: Taylor & Francis
Preface
Bayesian approaches to biostatistical problems have become commonplace in epidemiological, medical, and public health applications. Indeed, the use of Bayesian methodology has seen great advances since the introduction of, first, BUGS, and then WinBUGS. WinBUGS is a free software package that allows the development and fitting of relatively complex hierarchical Bayesian models. The introduction of fast algorithms for sampling posterior distributions in the 1990s has meant that relatively complex Bayesian models can be fitted in a straightforward manner. This has led to a great increase in the use of Bayesian approaches not only in medical research but also in the field of public health. One area of important practical concern is the analysis of the geographical distribution of health data found commonly in both public health databases and in clinical settings. Often population level data are available via government data sources such as online community health systems (e.g., for the US state of South Carolina this is http://scangis.dhec.sc.gov/scan/, while in the US state of Georgia it is http://oasis.state.ga.us/) or via centrally organized data registries where individual patient records are held. Cancer registry data (such as SEER in the United States) usually include individual diagnosis type and date as well as demographic information and so are at a finer level of resolution.
Because of confidentiality requirements, most government sources make health data publicly accessible only in aggregated form. The resulting count data, usually available at county or postal/census region level, can yield important insights into the general spatial variation of disease in terms of incidence or prevalence. They can also be analyzed with respect to health inequalities or disparities related to health service provision. While this form of data and its analysis are relatively well documented, there are other areas of novel application of spatial methodology that are currently less well recognized. For example, one source of individual level data is disease registries, where notification of a disease case leads to registering the individual and their demographic details. In addition, some diagnostic information is usually held. This is typically found in cancer registries, but other diseases have similar registration processes. In clinical trials, or community-based behavioral intervention trials, individual patient information is often held and disease progression is noted over the duration of the trial. Spatial information may be important or relevant in such applications in two ways. First, the recruitment or dropout process for trials may have a spatial component. Second, there may be unobserved confounding variables that have a spatial expression over the course of the trial. These issues may lead to the consideration of longitudinal or survival analyses where geo-referencing is admitted as a confounding factor. In general, the focus area of this work is, in effect, spatial biostatistics, as the inclusion of clinical and registry-level analysis, as well as population level analyses, lies within the range of applications for the methods covered.
In this work I have tried to provide an overview of the main areas of Bayesian hierarchical modeling in its application to geographical analysis of disease. I have tried to orient the coverage to deal with both population level analyses and individual level analyses resulting from cancer registry data, as well as the possible use of data on health service utilization (disease progression via health practitioner visits, etc.) and designed studies (clinical or otherwise). To this end, as well as including chapters on more conventional topics such as relative risk estimation and clustering, I have included coverage of spatial survival and longitudinal analysis, with a section on repeated event analysis.
There are many people who have helped in the production of this work. In particular, I would like to recognize sources of encouragement from Andrew Cliff, Sudipto Banerjee, Emmanuel Lesaffre, Peter Rogerson, and Allan Clark. In addition, I have to thank a range of postdoctoral fellows and graduate students who have provided help at various times: Hae-Ryoung Song, Ji-in Kim, Huafeng Zhou, Kun Huang, Junlong Wu, Yuan Liu, and Bo Ma. I must also thank those at CRC Press for great help in finalizing the work, in particular Rob Calver for general production support and Shashi Kumar for LaTeX help.
Finally, I would like to acknowledge the continual and patient support of my family, and, in particular, Pat for her understanding during the sometimes fraught activity of book writing.
Andrew Lawson
Charleston, United States, 2008
Contents
List of Tables
Preface
Preface to Second Edition
I Background
1 Introduction
1.1 Datasets
2 Bayesian Inference and Modeling
2.1 Likelihood Models
2.1.1 Spatial Correlation
2.2 Prior Distributions
2.2.1 Propriety
2.2.2 Non-informative Priors
2.3 Posterior Distributions
2.3.1 Conjugacy
2.3.2 Prior Choice
2.4 Predictive Distributions
2.4.1 Poisson-Gamma Example
2.5 Bayesian Hierarchical Modeling
2.6 Hierarchical Models
2.7 Posterior Inference
2.7.1 A Bernoulli and Binomial Example
2.8 Exercises
3 Computational Issues
3.1 Posterior Sampling
3.2 Markov Chain Monte Carlo Methods
3.3 Metropolis and Metropolis-Hastings Algorithms
3.3.1 Metropolis Updates
3.3.2 Metropolis-Hastings Updates
3.3.3 Gibbs Updates
3.3.4 M-H versus Gibbs Algorithms
3.3.5 Special Methods
3.3.6 Convergence
3.3.7 Subsampling and Thinning
3.4 Perfect Sampling
3.5 Posterior and Likelihood Approximations
3.5.1 Pseudo-likelihood and Other Forms
3.5.2 Asymptotic Approximations
3.6 Exercises
4 Residuals and Goodness-of-Fit
4.1 Model GOF Measures
4.1.1 The Deviance Information Criterion
4.1.2 Posterior Predictive Loss
4.2 General Residuals
4.3 Bayesian Residuals
4.4 Predictive Residuals and the Bootstrap
4.4.1 Conditional Predictive Ordinates
4.5 Interpretation of Residuals in a Bayesian Setting
4.6 Pseudo Bayes Factors and Marginal Predictive Likelihood
4.7 Other Diagnostics
4.8 Exceedence Probabilities
4.9 Exercises
II Themes
5 Disease Map Reconstruction and Relative Risk Estimation
5.1 An Introduction to Case Event and Count Likelihoods
5.1.1 The Poisson Process Model
5.1.2 The Conditional Logistic Model
5.1.3 The Binomial Model for Count Data
5.1.4 The Poisson Model for Count Data
5.2 Specification of the Predictor in Case Event and Count Models
5.2.1 The Bayesian Linear Model
5.3 Simple Case and Count Data Models with Uncorrelated Random Effects
5.3.1 Gamma and Beta Models
5.3.2 Log-normal/Logistic-normal Models
5.4 Correlated Heterogeneity Models
5.4.1 Conditional Autoregressive (CAR) Models
5.4.2 Fully-specified Covariance Models
5.5 Convolution Models
5.6 Model Comparison and Goodness-of-Fit Diagnostics
5.6.1 Residual Spatial Autocorrelation
5.7 Alternative Risk Models
5.7.1 Autologistic Models
5.7.2 Spline-based Models
5.7.3 Zip Regression Models
5.7.4 Ordered and Unordered Multi-Category Data
5.7.5 Latent Structure Models
5.8 Edge Effects
5.8.1 Edge Weighting Schemes and McMC Methods
5.8.2 Discussion and Extension to Space-Time
5.9 Exercises
5.9.1 Maximum Likelihood
5.9.2 Poisson-Gamma Model: Posterior and Predictive Inference
5.9.3 Poisson-Gamma Model: Empirical Bayes
6 Disease Cluster Detection
6.1 Cluster Definitions
6.1.1 Hot Spot Clustering
6.1.2 Clusters as Objects or Groupings
6.1.3 Clusters Defined as Residuals
6.2 Cluster Detection using Residuals
6.2.1 Case Event Data
6.2.2 Count Data
6.3 Cluster Detection using Posterior Measures
6.4 Cluster Models
6.4.1 Case Event Data
6.4.2 Count Data
6.4.3 Markov Connected Component Field (MCCF) Models
6.5 Edge Detection and Wombling
7 Regression and Ecological Analysis
7.1 Basic Regression Modeling
7.1.1 Linear Predictor Choice
7.1.2 Covariate Centering
7.1.3 Initial Model Fitting
7.1.4 Contextual Effects
7.2 Missing Data
7.2.1 Missing Outcomes
7.2.2 Missing Covariates
7.3 Non-Linear Predictors
7.4 Confounding and Multi-colinearity
7.5 Geographically Dependent Regression
7.6 Variable Selection
7.7 Ecological Analysis: The General Case of Regression
7.8 Biases and Misclassification Error
7.8.1 Ecological Biases
8 Putative Hazard Modeling
8.1 Case Event Data
8.2 Aggregated Count Data
8.3 Spatio-temporal Effects
8.3.1 Case Event Data
8.3.2 Count Data
9 Multiple Scale Analysis
9.1 Modifiable Areal Unit Problem (MAUP)
9.1.1 Scaling Up
9.1.2 Scaling Down
9.1.3 Multiscale Analysis
9.2 Misaligned Data Problem (MIDP)
9.2.1 Predictor Misalignment
9.2.2 Outcome Misalignment
9.2.3 Misalignment and Edge Effects
10 Multivariate Disease Analysis
10.1 Notation for Multivariate Analysis
10.1.1 Case Event Data
10.1.2 Count Data
10.2 Two Diseases
10.2.1 Case Event Data
10.2.2 Count Data
10.2.3 Georgia County Level Example (3 diseases)
10.3 Multiple Diseases
10.3.1 Case Event Data
10.3.2 Count Data
10.3.3 Multivariate Spatial Correlation and MCAR Models
10.3.4 The Georgia Chronic Ambulatory Care Sensitive Example
11 Spatial Survival and Longitudinal Analysis
11.1 General Issues
11.2 Spatial Survival Analysis
11.2.1 Endpoint Distributions
11.2.2 Censoring
11.2.3 Random Effect Specification
11.2.4 General Hazard Model
11.2.5 Cox Model
11.2.6 Extensions
11.3 Spatial Longitudinal Analysis
11.3.1 A General Model
11.3.2 Seizure Data Example
11.3.3 Missing Data
11.4 Extensions to Repeated Events
11.4.1 Simple Repeated Events
11.4.2 More Complex Repeated Events
11.4.3 Fixed Time Periods
12 Spatio-temporal Disease Mapping
12.1 Case Event Data
12.2 Count Data
12.2.1 Georgia Low Birth Weight Example
12.3 Alternative Models
12.3.1 Autologistic Models
12.3.2 Latent Structure ST Models
12.4 Infectious Diseases
12.4.1 Case Event Data
12.4.2 Count Data
12.4.3 Special Case: Veterinary Disease Mapping
12.4.4 FMD Revisited
13 Disease Map Surveillance
13.1 Surveillance Concepts
13.1.1 Syndromic Surveillance
13.1.2 Process Control Ideas
13.2 Temporal Surveillance
13.2.1 Single Disease Sequence
13.2.2 Multiple Disease Sequences
13.2.3 Infectious Disease Surveillance
13.3 Spatial and Spatio-temporal Surveillance
13.3.1 Components of Pij
13.3.2 Prospective Space-Time Analysis
III Appendices
Appendix A Basic R and WinBUGS
A.1 Basic R Usage
A.2 Use of R in Bayesian Modeling
A.3 WinBUGS
A.4 The R2WinBUGS Function
A.5 OpenBUGS and JAGS
A.6 BRugs
A.7 Maps on R and GeoBUGS
Appendix B Selected WinBUGS Code
B.1 Code for the Convolution model (Chapter 5)
B.2 Code for Spatial Spline model (Chapter 5)
B.3 Code for the spatial autologistic model (Chapter 6)
B.4 Code for logistic spatial case control model (Chapter 6)
B.5 Code for PP residual model (Chapter 6)
B.6 Code for the logistic Spatial case-control model (Chapter 6)
B.7 Code for Poisson residual clustering example (Chapter 6)
B.8 Code for the proper CAR model (Chapter 5)
B.9 Code for the multiscale model for PH and County level data (Chapter 9)
B.10 Code for the shared component model for Georgia asthma and COPD (Chapter 10)
B.11 Code for the Seizure example with spatial effect (Chapter 11)
B.12 Code for the Knorr-Held model for space-time relative risk estimation (Chapter 12)
B.13 Code for the space-time autologistic model (Chapter 12)
Appendix C R Code for Thematic Mapping
Appendix D INLA Examples
D.1 Data Setup
D.2 A simple UH example
D.3 A more complex example
D.4 An example with covariates
D.5 Spatio-temporal example
Appendix E CAR Model Examples
References
Index
Holding library: Institute of Medical Information, Chinese Academy of Medical Sciences (中国医科院医学信息研究所)