PRACTICAL ANALYSIS OF ENCRYPTED NETWORK TRAFFIC Andrew M. White A dissertation ... PDF

159 Pages·2015·3.35 MB·English

PRACTICAL ANALYSIS OF ENCRYPTED NETWORK TRAFFIC

Andrew M. White

A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science.

Chapel Hill 2015

Approved by: Fabian Monrose, Michael Bailey, Kevin Jeffay, Phillip Porras, Michael Reiter

© 2015 Andrew M. White ALL RIGHTS RESERVED

ABSTRACT

Andrew M. White: PRACTICAL ANALYSIS OF ENCRYPTED NETWORK TRAFFIC (Under the direction of Fabian Monrose)

The growing use of encryption in network communications is an undoubted boon for user privacy. However, the limitations of real-world encryption schemes are still not well understood, and new side-channel attacks against encrypted communications are disclosed every year. Furthermore, encrypted network communications, by preventing inspection of packet contents, represent a significant challenge from a network security perspective: our existing infrastructure relies on such inspection for threat detection. Both problems are exacerbated by the increasing prevalence of encrypted traffic: recent estimates suggest that 65% or more of downstream Internet traffic will be encrypted by the end of 2016. This work addresses these problems by expanding our understanding of the properties and characteristics of encrypted network traffic and exploring new, specialized techniques for the handling of encrypted traffic by network monitoring systems.

We first demonstrate that opaque traffic, of which encrypted traffic is a subset, can be identified in real-time and how this ability can be leveraged to improve the capabilities of existing IDS systems. To do so, we evaluate and compare multiple methods for rapid identification of opaque packets, ultimately pinpointing a simple hypothesis test (which can be implemented on an FPGA) as an efficient and effective detector of such traffic. In our experiments, using this technique to “winnow”, or filter, opaque packets from the traffic load presented to an IDS system significantly increased the throughput of the system, allowing the identification of many more potential threats than the same system without winnowing.
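As a rough illustration of what a per-packet hypothesis test of this kind might look like, the sketch below applies Wald's sequential probability ratio test to the leading bytes of a payload. It is not the dissertation's detector. The function name sprt_is_opaque, the choice to test whether each byte value is below 128, the alternative-hypothesis probability of 0.85, the error rates, and the 64-byte cap are all assumptions made purely for illustration.

```python
import math
import os


def sprt_is_opaque(payload: bytes, p_low_h1: float = 0.85,
                   alpha: float = 0.005, beta: float = 0.005,
                   max_bytes: int = 64) -> bool:
    """Toy sequential probability ratio test on payload bytes.

    H0 (opaque): bytes are uniform, so P(byte < 128) = 0.5.
    H1 (transparent): bytes skew low, P(byte < 128) = p_low_h1 (assumed value).
    Returns True when H0 is accepted, i.e. the payload looks opaque.
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    step_low = math.log(p_low_h1 / 0.5)          # evidence from a byte < 128
    step_high = math.log((1 - p_low_h1) / 0.5)   # evidence from a byte >= 128

    llr = 0.0  # running log-likelihood ratio of H1 against H0
    for b in payload[:max_bytes]:
        llr += step_low if b < 128 else step_high
        if llr >= upper:
            return False  # confident the payload is transparent
        if llr <= lower:
            return True   # confident the payload is opaque
    # Neither threshold reached: fall back to the sign of the evidence.
    return llr < 0


if __name__ == "__main__":
    print(sprt_is_opaque(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n"))
    print(sprt_is_opaque(os.urandom(64)))  # random bytes stand in for ciphertext
```

Because the test stops as soon as either threshold is crossed, it usually inspects only a handful of bytes per packet, which is the property that makes a test of this general shape attractive for line-rate filtering.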
Second, we show that side channels in encrypted VoIP traffic enable the reconstruction of approximate transcripts of conversations. Our approach leverages techniques from linguistics, machine learning, natural language processing, and machine translation to accomplish this task despite the limited information leaked by such side channels. Our ability to do so underscores both the potential threat to user privacy which such side channels represent and the degree to which this threat has been underestimated.

Finally, we propose and demonstrate the effectiveness of a new paradigm for identifying HTTP resources retrieved over encrypted connections. Our experiments demonstrate how the predominant paradigm from prior work fails to accurately represent real-world situations and how our proposed approach offers significant advantages, including the ability to infer partial information, in comparison. We believe these results represent both an enhanced threat to user privacy and an opportunity for network monitors and analysts to improve their own capabilities with respect to encrypted traffic.

ACKNOWLEDGEMENTS

I have been fortunate to receive aid and encouragement from a great number of people over the past seven years. First and foremost, I would like to thank Stephanie Malone, as well as my parents and family, for their support and understanding. Second, I cannot adequately express my gratitude to Prof. Fabian Monrose, who has been a mentor and colleague throughout and who has supported me in too many ways to count since our first days at UNC.

I would also like to thank the remainder of my committee (Prof. Michael Bailey, Prof. Michael Reiter, Prof. Kevin Jeffay, and Phil Porras) for their advice and their time. Special thanks are also due to my colleagues and collaborators at UNC (particularly Prof. Jan-Michael Frahm, Srinivas Krishnan, Austin Matthews, Prof. Elliott Moreton, Nathan Otternes, Rahul Raguram, Katherine Shaw, Kevin Snow, Teryl Taylor, Jan Werner, and Yi Xu); at SRI International (particularly Vinod Yegneswaran); and at IBM Research (particularly Reiner Sailer, Mihai Christodorescu, and Marc Stoecklin).
In addition, Murray Anderegg, Jake Czyz, Alex Everett, Jim Gogan, Bil Hays, Jodie Turnbull, and Missy Wood provided invaluable assistance during the course of this work.

Finally, I would like to express my eternal gratitude to Todd Anderson, Brian Collier, and David Fortney, for keeping me off the wrong path, and to Prof. Barry Lawson, Prof. Douglas Szajda, and Prof. Jason Owen, for helping me find the right one.

This work was supported in part by the Department of Homeland Security (DHS) under contract number D08PC75388, the U.S. Army Research Office (ARO) under Cyber-TA Grant no. W911NF-06-1-0316, and the National Science Foundation (NSF) under award no. 0831245. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the DHS, NSF, or ARO.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1 INTRODUCTION
  Thesis Statement
  1.1 Real-time Detection of Opaque Network Traffic
  1.2 Reconstructing Transcripts of Encrypted VoIP Conversations
  1.3 Identification of Encrypted Web Resources
  1.4 Contributions
CHAPTER 2 OPAQUE TRAFFIC
  2.1 Introduction
  2.2 Approach
    Likelihood Ratio Test
    Sequential Probability Ratio Test
  2.3 Evaluation
    2.3.1 Offline Analysis
      File Type Identification
      Content Type Matching
      Operator Analysis
    2.3.2 Online Analysis
      Operational Impact
  2.4 Limitations
  2.5 Related Work
  2.6 Discussion
  2.7 Future Work
    Compressed vs Encrypted
    Flow-level Analysis
  2.8 Broader Implications
CHAPTER 3 PHONOTACTIC RECONSTRUCTION OF ENCRYPTED VOIP CONVERSATIONS
  3.1 Introduction
  3.2 Background Information
    3.2.1 Phonetic Models of Speech
    3.2.2 Voice over IP
  3.3 Overview of Our Approach
    3.3.1 Data and Adversarial Assumptions
  3.4 Related Work
  3.5 Methodology
    3.5.1 Finding Phoneme Boundaries (Stage ➊)
    3.5.2 Methodology
    3.5.3 Evaluation
    3.5.4 Classifying Phonemes (Stage ➋)
    3.5.5 Maximum Entropy Discrimination of Phonemes
    3.5.6 HMM Modeling of Phonemes
    3.5.7 Classification
    3.5.8 Enhancing Classification using Language Modeling
    3.5.9 Evaluation
    3.5.10 Segmenting Phoneme Streams into Words (Stage ➌)
    3.5.11 Identifying Words via Phonetic Edit Distance (Stage ➍)
    3.5.12 Measuring the Quality of Our Output
  3.6 Empirical Evaluation
    3.6.1 An Adversarial Point of View (Measuring Confidence)
    3.6.2 Discussion & Mitigation
  3.7 Conclusion
  3.8 Future Work
    Skype
    Conversational Speech
  3.9 Broader Implications
CHAPTER 4 PLAYING HIDE-AND-SEEK
  4.1 Introduction
  4.2 Background & Related Work
    4.2.1 Learning Algorithms
    4.2.2 Features
  4.3 Assumptions and Threat Model
    4.3.1 Networking Model
    4.3.2 Encryption Model
      HTTPS model
      Tunnel Model
      DNS Traffic
    4.3.3 World Models
      Closed-world
      Open-world
        Binary Open-world
      Partial Information
  4.4 Approach
    4.4.1 Classification Scheme: Multi-label
    4.4.2 Classifier Model: Random Forest
      Support Vector Machine Classifiers
      Naïve Bayes Classifiers
      Random Forests
      Suitability for Our Approach
    4.4.3 Abstention and Thresholding
      Post-Classification Thresholding
      Validation and Threshold Selection
    4.4.4 Hyper-parameter Optimization
    4.4.5 Epoch Validation
  4.5 Evaluation
    4.5.1 Data Collection
      URLs
      Scripted Retrievals
    4.5.2 Evaluation Criteria
      Multi-class Metrics
      Multi-label Metrics
    4.5.3 Experimental Setup
      World Models
      Datasets
      Learning Algorithms
        Multi-label Classification
      Feature Sets
      Validation and Data Selection
      Hyper-parameter Optimization
    4.5.4 Implementation
    4.5.5 Results
      Multi-class Comparison with Previous Work
      Multiple URLs per Domain Name
        Hyper-parameter Optimization
      Labeling Traces with URL Components
      Abstaining from Classification
      Summary of Findings
    4.5.6 Limitations and Future Work
  4.6 Broader Implications
CHAPTER 5 DISCUSSION & CONCLUSIONS
A OPAQUE TRAFFIC
  A.1 Comparison of Methods
    A.1.1 Preliminaries
      Discrete Kolmogorov-Smirnov Test
    A.1.2 Parameter Space Exploration

Description:
The growing use of encryption in network communications is an undoubted boon for ... Our investigation into opaque traffic also reveals a startling fact, which ... from the masses of data transmitted across our networks every day.