Towards Practical and Fundamental Limits of Anonymity Protection

Dissertation for the degree of Doctor of Economics (Doktor der Wirtschaftswissenschaft)
Submitted to the Faculty of Business, Economics and Management Information Systems of the University of Regensburg
Presented by Dipl.-Inform. Dang Vinh Pham

First reviewer: Prof. Dr. Doğan Kesdoğan
University of Regensburg, Germany
Faculty of Business, Economics and Management Information Systems

Second reviewer: Prof. RNDr. Václav Matyáš, Ph.D.
Masaryk University, Czech Republic
Faculty of Informatics

Date of defense: 15 November 2013

This thesis is dedicated to my mother for her love and sacrifice.

Acknowledgements

After a long journey, it is my pleasure to thank all the people who supported this thesis. First of all, I would like to express my gratitude to Prof. Dr. Doğan Kesdoğan and Prof. Václav Matyáš, Ph.D., for serving as my thesis committee and evaluating this thesis. I would like to thank my supervisor, Prof. Dr. Doğan Kesdoğan, for the fruitful discussions and criticism during my PhD. I am thankful for being given the opportunity to write the thesis at his chair.

I am indebted to Fatih Karatas and Dr. Lars Fischer for discussions about implementation issues, and in particular for the day they released me from weeks of desperate searching for a single bug specific to parallel programming. The work in this thesis was challenged through exchanges with other colleagues, and I would like to thank all of them for providing this research environment.

Special thanks go to Dr. Benedikt Westermann, Dr. Tobias Mömke and Soujen Chung, who squeezed time out of their busy schedules or spent their vacation reading the thesis. They contributed plenty of comments that helped me improve this thesis.

Last but not least, I am grateful to my parents, my uncle and my girlfriend for their support and understanding. The PhD work took up a great part of my life, leaving little time to spend with them.
Abstract

A common function of anonymity systems is the embedding of subjects that are associated with some attributes in a set of subjects, the anonymity set. Every subject within the anonymity set appears to be possibly associated with the attributes of every other subject within it. The anonymity set thus covers the associations between the subjects and their attributes. The limit of anonymity protection basically depends on the hardness of disclosing those hidden associations from the anonymity sets. This thesis analyses the protection limit provided by anonymity sets by studying a practical and widely deployed anonymity system, the Chaum Mix. A Mix is an anonymous communication system that, in each communication round, embeds the senders of messages in an anonymity set to hide their association with their recipients (i.e., attributes). It is well known that traffic analyses can uniquely identify a user's recipients by evaluating the sets of senders (i.e., the sender anonymity sets) and recipients using the Mix in several rounds. The least number of rounds for that identification represents a fundamental limit of anonymity protection provided by the anonymity sets, similar to Shannon's unicity distance. That identification requires solving NP-complete problems and was believed to be computationally infeasible.

This thesis shows, by means of a new and optimised algorithm, that the unique identification of a user's recipients is computationally feasible in the average case for many realistic Mix configurations. It contributes mathematical estimates of the mean least number of rounds and the mean time-complexity for that unique identification. These measure the fundamental as well as the practical protection limit provided by the anonymity sets of a Mix. They can be applied to systematically identify Mix configurations that lead to weak anonymity of a user's recipients. To the best of our knowledge, this has not been addressed yet, due to the computational infeasibility of past algorithms.
All before-mentioned algorithms and analyses can be adapted to deduce information about a user's recipients, even in cases of incomplete knowledge about the anonymity sets, or a low number of observed anonymity sets.

Contents

Contents
List of Figures
List of Tables
Nomenclature

1 Introduction
  1.1 Anonymous Communication Protection Model
    1.1.1 Attacker Model
    1.1.2 Anonymity Terminology
    1.1.3 Basic Anonymity Techniques
  1.2 The Mix for Anonymous Communication
    1.2.1 Embedding Function
      1.2.1.1 Perfect Mix Concept for Closed Environment
      1.2.1.2 Chaum Mix Concept for Open Environment
      1.2.1.3 Sequence of Mixes
    1.2.2 Group Function
    1.2.3 Mix Variants for Practice-oriented Attacker Models
      1.2.3.1 Stop-and-Go Mix
      1.2.3.2 Pool-Mix
      1.2.3.3 Onion Routing and Non-Disclosing-Method
  1.3 Structure

2 Combinatorial Attack
  2.1 Formal Mix and Attacker Model
    2.1.1 Formal Model of Chaum Mix
    2.1.2 Formal Attacker Model
      2.1.2.1 Attacker's Goal
      2.1.2.2 Attack Scheme
      2.1.2.3 Hitting-Set Attack
      2.1.2.4 Learning Number of Alice's Friends
  2.2 Hitting-Set Attack Based on ExactHS
    2.2.1 ExactHS Algorithm
      2.2.1.1 Identification of Hitting-Sets – Examples
    2.2.2 Soundness and Completeness
      2.2.2.1 Properties of Hitting-Sets
      2.2.2.2 Soundness
      2.2.2.3 Completeness
    2.2.3 Worst Case Complexity
      2.2.3.1 Time-Complexity
      2.2.3.2 Space-Complexity
    2.2.4 Evaluation
      2.2.4.1 Communication Traffic
      2.2.4.2 Simulation
  2.3 Approximation of Hitting-Set-Attack
    2.3.1 Classification of Hitting-Sets and Hypotheses
    2.3.2 Approximation Based on No-Trivial-Disproof
      2.3.2.1 Complexity
      2.3.2.2 Relation to 2×-Exclusivity
    2.3.3 Evaluation
  2.4 Summary

3 Theoretical Limit of Anonymity Protection
  3.1 Mean Number of Observations
    3.1.1 Bounds of Mean Number of Observations for 2×-Exclusivity
      3.1.1.1 Relating 1×-Exclusivity and 2×-Exclusivity
      3.1.1.2 Estimation of 2×-Exclusivity Based on 1×-Exclusivity
    3.1.2 Mean Number of Observations for k×-Exclusivity
      3.1.2.1 Estimation of k×-Exclusivity
      3.1.2.2 Comparison of Estimates for 2×-Exclusivity
      3.1.2.3 Effect of Alice's Traffic Distribution on 2×-Exclusivity
    3.1.3 Relation to Statistical Disclosure Attack
  3.2 Evaluation
    3.2.1 Estimated Number of Observations Required by HS-Attack
    3.2.2 Number of Observations Required by HS-attack and SDA
  3.3 Summary

4 Practical Limit of Anonymity Protection
  4.1 Upper Bound of Mean Time-Complexity
    4.1.1 Potential – Estimate of Number of Observations Hit by a Hypothesis
      4.1.1.1 Potential in Case of no Chosen Recipient
      4.1.1.2 Potential in General Case
      4.1.1.3 Difference Between Potential and Number of Observations
    4.1.2 Mean Difference Between Potential and Number of Observations
      4.1.2.1 Expectation of the Difference
      4.1.2.2 Relation of Mean Difference to Number of Chosen Recipients
      4.1.2.3 Relation of Mean Difference to Order of Recipient Choice
    4.1.3 Maximal Mean Number of Recipient Choices for Disproofs
      4.1.3.1 Local Maximal Mean
      4.1.3.2 Maximal Mean with respect to Hypothesis Class
      4.1.3.3 Global Maximal Mean
    4.1.4 Maximal Mean Time-Complexity
      4.1.4.1 Estimate
    4.1.5 Evaluation
  4.2 Impact of Traffic Distribution on Mean Time-Complexity
    4.2.1 Refined Potential – Estimate of Number of Observations Hit by a Hypothesis
    4.2.2 Mean of Potential
      4.2.2.1 Simplified Analysis
    4.2.3 Minimal Mean Number of Recipient Choices for Disproofs
      4.2.3.1 Deriving Maximal Minimal Conditions
      4.2.3.2 Comparing Uniform and Non-Uniform Distribution
      4.2.3.3 Approaching Optimistic Case Strategy
    4.2.4 Refined Mean Time-Complexity
    4.2.5 Evaluation
      4.2.5.1 Solving Number of Recipient Choices for Disproofs
  4.3 Summary

5 Extension
  5.1 Partial Information
    5.1.1 Quantification of Minimal-Hitting-Sets in a Class
      5.1.1.1 Computing Minimal-Hitting-Sets in a Class
      5.1.1.2 Maximal Number of Minimal-Hitting-Sets in a Class
    5.1.2 Description of Minimal-Hitting-Sets by Extensive-Hypotheses
      5.1.2.1 Evolution of Extensive-Hypotheses
      5.1.2.2 Construction of Extensive-Hypotheses
    5.1.3 Modelling Evolution of Extensive-Hypotheses
      5.1.3.1 Mean Number of Extensive-Hypotheses
      5.1.3.2 Partial Disclosure
      5.1.3.3 Beyond Unambiguous Information
    5.1.4 Evaluation
  5.2 Vague Information
    5.2.1 Application of Hitting-Set Attack on Vague Observations
      5.2.1.1 Vague and Erroneous Observations
      5.2.1.2 Applicability of ExactHS
    5.2.2 Analytical Analyses of Conditions for Unique Identification
      5.2.2.1 Properties of Ordinary Observations for Unique Identification