Title: Secure data provenance and inference control with semantic web
Authors: Bhavani Thuraisingham | Tyrone Cadenhead | Murat Kantarcioglu | Vaibhav Khadilkar.
ISBN/ISSN: 9781466569430, 1466569433
Abstract
With an ever-increasing amount of information on the web, it is critical to understand the pedigree, quality, and accuracy of your data. Using provenance, you can ascertain the quality of data based on its ancestral data and derivations, track back to sources of errors, allow automatic re-enactment of derivations to update data, and provide attribution of the data source.
Secure Data Provenance and Inference Control with Semantic Web supplies step-by-step instructions on how to secure the provenance of your data to make sure it is safe from inference attacks. It details the design and implementation of a policy engine for provenance of data and presents case studies that illustrate solutions in a typical distributed health care system for hospitals. Although the case studies describe solutions in the health care domain, you can easily apply the methods presented in the book to a range of other domains.
The book describes the design and implementation of a policy engine for provenance and demonstrates the use of Semantic Web technologies and cloud computing technologies to enhance the scalability of solutions. It covers Semantic Web technologies for the representation and reasoning of the provenance of the data and provides a unifying framework for securing provenance that can help to address the various criteria of your information systems.
Illustrating key concepts and practical techniques, the book considers cloud computing technologies that can enhance the scalability of solutions. After reading this book, you will be better prepared to keep up with the ongoing development of prototypes, products, tools, and standards for secure data management, secure Semantic Web, secure web services, and secure cloud computing.
Table of Contents
Preface xvii
Acknowledgments xxv
Authors xxvii
Permissions xxix
1 Introduction 1
1.1 Overview 1
1.2 Background 3
1.3 Motivation 5
1.4 Our Solutions and Contributions 7
1.5 Outline of the Book 9
1.6 Next Steps 11
References 12
SECTION I SUPPORTING TECHNOLOGIES
SECTION I Introduction
2 Security and Provenance 19
2.1 Overview 19
2.2 Scalability and Security of Provenance 21
2.3 Access Control Languages and Provenance 22
2.4 Graph Operations and Provenance 23
2.5 Summary and Directions 24
References 24
3 Access Control and Semantic Web 29
3.1 Overview 29
3.2 Access Control 30
3.3 Semantic Web 31
3.4 Semantic Web and Security 36
3.5 Summary and Directions 39
References 39
4 The Inference Problem 43
4.1 Overview 43
4.2 The Inference Problem 44
4.2.1 Functions of an Inference Controller 44
4.2.2 Inference Strategies 45
4.2.3 Security Constraints 46
4.2.4 Machine Learning and Inference 46
4.3 Our Approach 46
4.4 Historical Perspective 47
4.5 A Note on the Privacy Problem 49
4.6 Summary and Directions 50
References 50
5 Inference Engines 53
5.1 Overview 53
5.2 Concepts for Inference Engines 53
5.3 Software Systems 56
5.4 Summary and Directions 60
References 60
6 Inferencing Examples 63
6.1 Overview 63
6.2 Inference Function 64
6.3 Classification of a Knowledge Base 65
6.4 Inference Strategies and Examples 68
6.5 Approaches to the Inference Problem 74
6.6 Inferences in Provenance 76
6.7 Summary and Directions 77
References 78
7 Cloud Computing Tools and Frameworks 81
7.1 Overview 81
7.2 Cloud Computing Tools 82
7.3 Cloud Computing Framework 84
7.3.1 RDF Integration 84
7.3.2 Provenance Integration 85
7.4 Secure Query Processing in a Cloud Environment 86
7.4.1 The Web Application Layer 86
7.4.2 The ZQL Parser Layer 87
7.4.3 The XACML Policy Layer 88
7.4.4 The Hive Layer 89
7.4.5 HDFS 89
7.5 Summary and Directions 90
References 90
SECTION I Conclusion
SECTION II SECURE DATA PROVENANCE
SECTION II Introduction
8 Scalable and Efficient RBAC for Provenance 99
8.1 Overview 99
8.2 Motivation and Contributions 100
8.3 Unified and Flexible Policies 101
8.4 Supporting Inferences in RBAC 102
8.5 Overview of Our Approach 105
8.6 Extending RBAC to Support Provenance 107
8.7 A Query-Retrieval Process 109
8.7.1 Example of a Policy Query 109
8.7.2 Example of a SWRL Rule 110
8.7.3 Example of a Trace 110
8.7.4 Output of the Trace 111
8.7.5 Comment 111
8.8 Experimental Evaluation 113
8.9 Summary and Directions 115
References 116
9 A Language for Provenance Access Control 119
9.1 Overview 119
9.2 Challenges and Drawbacks 120
9.2.1 Drawbacks of Current Access Control Mechanisms 120
9.3 Policy Language 121
9.4 Solution Based on Regular Expression Queries 124
9.4.1 Data Representation 125
9.4.2 Graph Data Model 126
9.4.3 Provenance Vocabulary 127
9.4.4 Path Queries 128
9.5 Graph Analysis 129
9.5.1 Analysis of Digraphs 129
9.5.2 Composition of Digraphs 130
9.6 Access Control Policy Architecture 131
9.6.1 Modules in Access Control Policy Architecture 132
9.7 Use Case: Medical Example 133
9.7.1 Query Templates 135
9.7.2 Additional Templates 136
9.7.3 Access Control Example 137
9.8 Prototype 138
9.9 Summary and Directions 140
References 141
10 Transforming Provenance Using Redaction 143
10.1 Overview 143
10.2 Graph Grammar 145
10.2.1 An Example Graph Transformation Step 150
10.2.2 Valid Provenance Graph 153
10.2.3 Discussion 155
10.3 Redaction Policy Architecture 156
10.4 Experiments 158
10.5 Summary and Directions 160
References 161
SECTION II Conclusion
SECTION III INFERENCE CONTROL
SECTION III Introduction
11 Architecture for an Inference Controller 169
11.1 Overview 169
11.2 Design of an Inference Controller 170
11.3 Modular Design 172
11.4 Policy Processing 175
11.4.1 Parsing Process 175
11.4.2 High-Level Policy Translation 176
11.4.3 DL Rule Assembler 176
11.4.4 DL Policy Translation 177
11.4.5 Access Control Policy Assembler 178
11.4.6 Redaction Policy Assembler 178
11.5 Explanation Service Layer 179
11.6 Summary and Directions 180
References 181
12 Inference Controller Design 183
12.1 Overview 183
12.2 Design Philosophy 185
12.3 Inference Controller Process 188
12.4 Overview of a Query Process 189
12.5 Summary and Directions 192
References 192
13 Provenance Data Representation for Inference Control 195
13.1 Overview 195
13.2 Data Models for the Inference Controller 196
13.3 Separate Stores for Data and Provenance 197
13.4 Summary and Directions 198
References 199
14 Queries with Regular Path Expressions 201
14.1 Overview 201
14.2 Background 202
14.2.1 Regular Expressions 202
14.3 SPARQL Queries 204
14.4 Summary and Directions 206
References 207
15 Inference Control through Query Modification 209
15.1 Overview 209
15.2 Query Modification with Relational Data 210
15.3 SPARQL Query Modification 211
15.3.1 Query Modification for Enforcing Constraints 212
15.3.2 Overview of Query Modification 214
15.3.3 Graph Transformation of a SPARQL Query BGP 214
15.3.4 Match Pattern/Apply Pattern 215
15.4 Summary and Directions 216
References 217
16 Inference and Provenance 219
16.1 Overview 219
16.2 Invoking Inference Rules 221
16.3 Approaches to the Inference Problem 222
16.4 Inferences in Provenance 224
16.4.1 Implicit Information in Provenance 224
16.5 Use Cases of Provenance 225
16.5.1 Use Case: Who Said That? 226
16.5.2 Use Case: Cheating Dictator 227
16.6 Processing Rules 228
16.7 Summary and Directions 228
References 229
17 Implementing the Inference Controller 231
17.1 Overview 231
17.2 Implementation Architecture 232
17.3 Provenance in a Health Care Domain 233
17.3.1 Populating the Provenance Knowledge Base 233
17.3.2 Generating and Populating the Knowledge Base 234
17.3.3 Generating Workflows 234
17.4 Policy Management 235
17.4.1 Supporting Restrictions 239
17.5 Explanation Service Layer 241
17.6 Generators 242
17.6.1 Selecting Background Information 242
17.6.2 Background Generator Module 243
17.6.3 Annotating the Workflow 247
17.6.4 Generating Workflows 248
17.6.5 Incomplete Information in the Databases 248
17.7 Use Case: Medical Example 249
17.7.1 Semantic Associations in the Workflow 251
17.8 Implementing Constraints 251
17.8.1 Query Modification for Enforcing Constraints 251
17.9 Summary and Directions 252
References 253
SECTION III Conclusion
SECTION IV UNIFYING FRAMEWORK
SECTION IV Introduction
18 Risk and Inference Control 261
18.1 Overview 261
18.2 Risk Model 262
18.2.1 User's System 265
18.2.2 Internal Knowledge Base System 265
18.2.3 Controller 265
18.2.4 Adding Provenance 266
18.3 Semantic Framework for Inferences 267
18.3.1 Ontologies 268
18.3.2 Rules 269
18.3.3 Query Logs 269
18.4 Summary and Directions 270
References 271
19 Novel Approaches to Handle the Inference Problem 273
19.1 Overview 273
19.2 Motivation for Novel Approaches 275
19.3 Inductive Inference 276
19.3.1 Learning by Examples 276
19.3.2 Security Constraints and Inductive Inference 277
19.4 Probabilistic Deduction 278
19.4.1 Formulation of the Inference Problem 278
19.4.2 Probabilistic Calculus 279
19.4.3 Probabilistic Calculus and Database Security 280
19.4.4 A Note on Algorithmic Information Theory 281
19.5 Mathematical Programming 282
19.5.1 Nonmonotonic Reasoning 282
19.5.2 Inferencing in an MP Environment 283
19.5.3 Mathematical Programming and Database Security 285
19.6 Game Theory 285
19.6.1 Noncooperative and Cooperative Games 285
19.6.2 Query Processing as a Noncooperative Game 286
19.6.3 Ehrenfeucht-Fraïssé Game 287
19.6.4 Adversarial Mining and Inference 287
19.7 Summary and Directions 288
References 288
20 A Cloud-Based Policy Manager for Assured Information Sharing 291
20.1 Overview 291
20.2 Architecture 292
20.2.1 Overview 292
20.2.2 Modules in Our Architecture 294
20.2.2.1 User Interface Layer 294
20.2.2.2 Policy Engines 296
20.2.2.3 Data Layer 301
20.2.3 Features of Our Policy Engine Framework 302
20.2.3.1 Develop and Scale Policies 303
20.2.3.2 Justification of Resources 304
20.2.3.3 Policy Specification and Enforcement 304
20.3 Cloud-Based Inference Control 304
20.4 Summary and Directions 306
References 306
21 Security and Privacy with Respect to Inference 309
21.1 Introduction 309
21.2 Trust, Privacy, and Confidentiality 310
21.2.1 Current Successes and Potential Failures 311
21.2.2 Motivation for a Framework 312
21.3 CPT Framework 312
21.3.1 Role of the Server 313
21.3.2 CPT Process 313
21.3.3 Advanced CPT 315
21.3.4 Trust, Privacy, and Confidentiality Inference Engines 316
21.4 Confidentiality Management 317
21.5 Privacy Management 318
21.6 Trust Management 319
21.7 Integrated System 320
21.8 Summary and Directions 322
References 323
22 Big Data Analytics and Inference Control 325
22.1 Overview 325
22.2 Big Data Management and Analytics 326
22.3 Security and Privacy for Big Data 327
22.4 Inference Control for Big Data 330
22.5 Summary and Directions 331
References 332
23 Unifying Framework 333
23.1 Overview 333
23.2 Design of Our Framework 334
23.3 The Global Inference Controller 338
23.3.1 Inference Tools 338
23.4 Summary and Directions 340
References 341
SECTION IV Conclusion
24 Summary and Directions 345
24.1 About This Chapter 345
24.2 Summary of the Book 345
24.3 Directions for Secure Data Provenance and Inference Control 350
24.4 Where Do We Go from Here? 351
Appendix A: Data Management Systems, Developments, and Trends 353
A.1 Overview 353
A.2 Developments in Database Systems 354
A.3 Status, Vision, and Issues 358
A.4 Data Management Systems Framework 359
A.5 Building Information Systems from the Framework 362
A.6 From Data to Big Data 365
A.7 Relationship between the Texts 366
A.8 Summary and Directions 368
References 369
Appendix B: Database Management and Security 371
B.1 Overview 371
B.2 Database Management 372
B.2.1 Overview 372
B.2.2 Relational Data Model 372
B.2.3 Database Management Functions 373
B.2.3.1 Query Processing 374
B.2.3.2 Transaction Management 374
B.2.3.3 Storage Management 376
B.2.3.4 Metadata Management 377
B.2.3.5 Database Integrity 378
B.2.4 Distributed Data Management 378
B.3 Discretionary Security 380
B.3.1 Overview 380
B.3.2 Access Control Policies 381
B.3.2.1 Authorization Policies 381
B.3.2.2 RBAC Policies 382
B.3.3 Administration Policies 384
B.3.4 SQL Extensions for Security 385
B.3.5 Query Modification 386
B.3.6 Other Aspects 387
B.3.6.1 Identification and Authentication 387
B.3.6.2 Auditing a Database System 387
B.3.6.3 Views for Security 387
B.4 MAC 388
B.4.1 Overview 388
B.4.2 MAC Policies 389
B.4.3 Granularity of Classification 390
B.5 Summary and Directions 394
References 395
Appendix C: A Perspective of the Inference Problem 397
C.1 Overview 397
C.2 Statistical Database Inference 399
C.3 Approaches to Handling the Inference Problem in an MLS/DBMS 400
C.4 Complexity of the Inference Problem 403
C.5 Summary and Directions 404
References 405
Appendix D: Design and Implementation of a Database Inference Controller 407
D.1 Overview 407
D.2 Background 408
D.3 Security Constraints 410
D.4 Approach to Security Constraint Processing 414
D.5 Consistency and Completeness of the Constraints 416
D.6 Design of the Query Processor 418
D.6.1 Security Policy 418
D.6.2 Functionality of the Query Processor 418
D.6.2.1 Query Modification 418
D.6.2.2 Response Processing 420
D.7 Design of the Update Processor 420
D.7.1 Security Policy 421
D.7.2 Functionality of the Update Processor 421
D.8 Handling Security Constraints during Database Design 423
D.8.1 Overview 423
D.9 Security Control Processing and Release Control 424
D.10 Distributed Inference Control 426
D.11 Summary and Directions 427
References 428
Index 429
About the Authors
Dr. Bhavani Thuraisingham is the Louis A. Beecherl, Jr. Distinguished Professor of Computer Science and the Executive Director of the Cyber Security Research and Education Institute (CSI) at The University of Texas at Dallas (UTD). She is an elected fellow of multiple organizations, including the Institute of Electrical and Electronics Engineers (IEEE) and the American Association for the Advancement of Science (AAAS). She has received several prestigious awards, including the IEEE Computer Society's 1997 Technical Achievement Award, the 2010 Association for Computing Machinery Special Interest Group on Security, Audit and Control (ACM SIGSAC) Outstanding Contributions Award, and the Society for Design and Process Science (SDPS) Transformative Achievement Medal. Her work has resulted in over 100 journal articles, over 200 conference papers, and over 100 keynote addresses. She has a PhD in theory of computation from the University of Wales, UK, and received the prestigious higher doctorate degree for her published research in secure dependable data management from the University of Bristol in England.
Dr. Tyrone Cadenhead worked in the computer industry for many years before joining UTD for graduate school. His thesis research was on secure data provenance and inference control, and he completed his PhD in 2011. He was a postdoctoral research associate at UTD for two years, conducting research in data security and privacy, and is currently a lead developer with Blue Cross Blue Shield working on semantic web technologies.
Dr. Murat Kantarcioglu is an associate professor in the Computer Science Department and the director of the Data Security and Privacy Lab at UTD. He is also a visiting scholar at the Data Privacy Lab at Harvard University. Dr. Kantarcioglu's research focuses on creating technologies that can efficiently extract useful information from any data without sacrificing privacy or security. He has published over 100 papers in peer-reviewed journals and conferences and has received two best paper awards. He is a recipient of the prestigious NSF CAREER award, and his research has been reported in the media, including the Boston Globe and ABC News. He holds MS and PhD degrees in computer science from Purdue University. He is a senior member of both the IEEE and the ACM.
Dr. Vaibhav Khadilkar completed his MS degree at Lamar University and, after working as a systems administrator for a few years, joined UTD for his PhD. He conducted research in secure semantic web, assured information sharing, and secure social networking and completed his PhD in 2013. He received a scholarship from the CSI for his outstanding contributions. He has published numerous papers in top-tier venues and is currently employed at NutraSpace in Dallas.
Holding Library
National Science Library, Chinese Academy of Sciences (中科院文献情报中心)