ebook img

Anonymity and Trust in Large-Scale Distributed Storage Systems PDF

128 Pages·2013·1.3 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Anonymity and Trust in Large-Scale Distributed Storage Systems

Anonymity and Trust in Large-Scale Distributed Storage Systems Thèse présentée à la Faculté de Sciences Institut d’Informatique Université de Neuchâtel Pour l’obtention du grade de docteur ès science par JOSÉ VALERIO Acceptée sur proposition du jury : Prof. Pascal FELBER Université de Neuchâtel, directeur de thèse Prof. Peter KROPF Université de Neuchâtel Dr. Etienne RIVIÈRE Université de Neuchâtel Prof. Vivien QUÉMA INP, Grenoble, France Dr. Martin RAJMAN EPF Lausanne Soutenu le 3 Juillet 2013 Université de Neuchâtel 2013 ii Abstract Large-Scale Distributed Storage Systems (LS-DSSs) are at the core of several Cloud services. These externalized services may run atop multiple ad- ministrative domains. While a client may trust the organization that provides a given Web service, a single server may belong to another organization that the client does not trust. The design of a Distributed Storage System is itself a challenging task, in particular when scalability, availability and consistency are required. Thisthesisexploresthreeimportantaspectswithinthiscontext: anonymity and trust in LS-DSSs, auditing for Large-Scale Distributed Aggregation Sys- tems (LS-DASs), and the evaluation of Distributed File Storage Service (DFSS) prototypes. The three systems proposed are evaluated using proto- type implementations. We present SPADS, a system that provides publisher anonymization and rate limitation to any generic LS-DSS over a Distributed Hash Table (DHT), running on unreliable and untrustworthy peers. Then, we propose CADA, a system that sits on top of a generic LS-DAS, and detects when one or several servers attempt to bias an aggregation. The system performs probabilistic auditing, and raises suspicions based on the statistical deviation from an expected behavior. Finally, we study DFSSs. Performing a fair comparison of the File System Consistency Levels (FSCLs) as supported by existing DFSSs is difficult because these systems feature different base performance and optimization levels. We make an empirical comparison of these FSCLs by instantiating them into a novel DFSS testbed named FlexiFS. iii iv Contents 1 Introduction 1 1.1 Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.4 Organization of the thesis . . . . . . . . . . . . . . . . . . . . 11 2 Anonymity and Trust in Large-Scale Distributed Storage Sys- tems 13 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.3 Problem definition: Anonymity vs. Trust . . . . . . . . . . . . 21 2.4 SPADS: Publisher Anonymization for DHT Storage . . . . . . 25 2.5 Algorithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.7 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 3 Auditing for Large-Scale Distributed Aggregation Systems 41 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3.2 System model and Problem definition . . . . . . . . . . . . . . 42 3.3 Collaborative Auditing for Distributed Aggregation . . . . . . 46 3.4 Auditing mechanisms for insertion bias . . . . . . . . . . . . . 51 3.5 Auditing mechanisms for aggregation bias . . . . . . . . . . . 56 3.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 3.7 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 v CONTENTS 4 Framework for the evaluation of Distributed File Storage Ser- vice prototypes 67 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.3 FlexiFS design and architecture . . . . . . . . . . . . . . . . . 74 4.4 File System Consistency Levels . . . . . . . . . . . . . . . . . 79 4.5 Algorithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 4.6 Implementation and Deployment . . . . . . . . . . . . . . . . 89 4.7 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 4.8 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 4.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 5 Conclusions 95 5.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.2 Integration of the three systems . . . . . . . . . . . . . . . . . 96 5.3 Perspectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 vi List of Figures 1.1 Principle of an aggregation of <value, counter> pairs at server S for key k. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.2 Buzzaar: general principle and flow of information. . . . . . . 7 1.3 Multidomain data storage system. . . . . . . . . . . . . . . . . 8 2.1 DHT ring example. . . . . . . . . . . . . . . . . . . . . . . . . 14 2.2 Chord fingers. . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.3 CAN two-dimensional example. . . . . . . . . . . . . . . . . . 17 2.4 Example of onion routing with 3 layers and return address. . . 20 2.5 SPADS: general principle. . . . . . . . . . . . . . . . . . . . . 26 2.6 SPADS: one-time legitimation and authentication. . . . . . . . 28 2.7 Computational cost for a_put(). . . . . . . . . . . . . . . . . 34 2.8 Computational cost of the encoding phase at the client side. . 35 2.9 CDF of the total time spent in the anonymizing path. . . . . . 36 3.1 Ways of introducing bias in the aggregation middleware. . . . 46 3.2 General description of the oracles. . . . . . . . . . . . . . . . . 48 3.3 Confidence level vs. variance-to-mean ratio. . . . . . . . . . . 52 3.4 False positives vs. Confidence threshold . . . . . . . . . . . . . 55 3.5 Performance of the insertion bias oracle (without bias). . . . . 60 3.6 Performance of the insertion bias oracle (with bias). . . . . . . 61 3.7 Performance of the aggregation bias oracle (without bias). . . 63 3.8 Performance of the aggregation bias oracle (with bias). . . . . 64 4.1 Failure-free execution of the Basic Paxos protocol. . . . . . . . 71 4.2 General architecture of FlexiFS. . . . . . . . . . . . . . . . . . 74 4.3 Example of a file system structure stored in FlexiFS. . . . . . 78 4.4 File System Consistency Levels. . . . . . . . . . . . . . . . . . 80 vii LIST OF FIGURES 4.5 Close-to-Open (CTO) semantics vs. atomic operations. . . . . 87 4.6 Evaluating consistency at the key/value store level. . . . . . . 91 4.7 Evaluating FSCLs and the Replication Factor by writing a file. 92 5.1 Buzzaar general architecture with SPADS and CADA. . . . . 97 5.2 Architecture of a DFSS with SPADS and FlexiFS. . . . . . . . 99 5.3 Integration of the three systems. . . . . . . . . . . . . . . . . . 100 viii List of Tables 2.1 Server roles in SPADS. . . . . . . . . . . . . . . . . . . . . . . 26 3.1 Notations for the base aggregation system without auditing support. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.2 Additions to the Aggregation Point (AP) state for the insertion bias oracle. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 3.3 Additions to the AP state for the aggregation bias oracle. . . . 50 4.1 Important File System in User Space (FUSE) operations. . . . 75 ix LIST OF TABLES x

Description:
several Cloud services. These externalized services may run atop multiple ad- ministrative domains. While a client may trust the organization systems that use Chaum mixes for anonymization are AP3 [91], Bifrost [71], and Tarzan [46]. Publius [124] allows users to publish anonymously Web content
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.