
Completing the Picture: Fragments and Back Again



Linköping Studies in Science and Technology
Thesis No. 1361

Completing the Picture: Fragments and Back Again

by Martin Karresand

Submitted to Linköping Institute of Technology at Linköping University in partial fulfilment of the requirements for the degree of Licentiate of Engineering

Department of Computer and Information Science
Linköpings universitet
SE-581 83 Linköping, Sweden

Linköping 2008

May 2008
ISBN 978-91-7393-915-7
ISSN 0280-7971
LiU-Tek-Lic-2008:19

ABSTRACT

Better methods and tools are needed in the fight against child pornography. This thesis presents a method for file type categorisation of unknown data fragments, a method for reassembly of JPEG fragments, and the requirements put on an artificial JPEG header for viewing reassembled images. To enable empirical evaluation of the methods, a number of tools based on the methods have been implemented.

The file type categorisation method identifies JPEG fragments with a detection rate of 100% and a false positives rate of 0.1%. The method uses three algorithms: Byte Frequency Distribution (BFD), Rate of Change (RoC), and 2-grams. The algorithms are designed for different situations, depending on the requirements at hand.

The reconnection method correctly reconnects 97% of a Restart (RST) marker enabled JPEG image fragmented into 4 KiB pieces. When dealing with fragments from several images at once, the method is able to correctly connect 70% of the fragments at the first iteration.

Two parameters in a JPEG header are crucial to the quality of the image: the size of the image and the sampling factor (actually factors) of the image. The size can be found using brute force and the sampling factors only take on three different values. Hence it is possible to use an artificial JPEG header to view the whole or parts of an image. The only requirement is that the fragments contain RST markers.

The results of the evaluations of the methods show that it is possible to find, reassemble, and view JPEG image fragments with high certainty.

This work has been supported by The Swedish Defence Research Agency and the Swedish Armed Forces.
Acknowledgements

This licentiate thesis would not have been written without the invaluable support of my supervisor Professor Nahid Shahmehri. I would like to thank her for keeping me and my research on track and having faith in me when the going has been tough. She is a good role model and always gives me support, encouragement, and inspiration to bring my research forward.

Many thanks go to Helena A, Jocke, Jonas, uncle Lars, Limpan, Micke F, Micke W, Mirko, and Mårten. Without hesitation you let me into your homes through the lenses of your cameras. If a picture is worth a thousand words, I owe you more than nine million! I also owe a lot of words to Brittany Shahmehri. Her prompt and thorough proof-reading has indeed increased the readability of my thesis.

I would also like to thank my colleagues at the Swedish Defence Research Agency (FOI), my friends at the National Laboratory of Forensic Science (SKL) and the National Criminal Investigation Department (RKP), and my fellow PhD students at the Laboratory for Intelligent Information Systems (IISLAB) and the Division for Database and Information Techniques (ADIT). You inspired me to embark on this journey. Thank you all, you know who you are!

And last but not least I would like to thank my beloved wife Helena and our lovely newborn daughter. You bring happiness and joy to my life.

Finally I acknowledge the financial support by FOI and the Swedish Armed Forces.

Martin Karresand
Linköping, 14th April 2008

Contents

1 Introduction
  1.1 Motivation
  1.2 Problem Formulation
  1.3 Contributions
  1.4 Scope
  1.5 Outline of Method
  1.6 Outline of Thesis
2 Identifying Fragment Types
  2.1 Common Algorithmic Features
    2.1.1 Centroid
    2.1.2 Length of data atoms
    2.1.3 Measuring Distance
  2.2 Byte Frequency Distribution
  2.3 Rate of Change
  2.4 2-Grams
  2.5 Evaluation
    2.5.1 Microsoft Windows PE files
    2.5.2 Encrypted files
    2.5.3 JPEG files
    2.5.4 MP3 files
    2.5.5 Zip files
    2.5.6 Algorithms
  2.6 Results
    2.6.1 Microsoft Windows PE files
    2.6.2 Encrypted files
    2.6.3 JPEG files
    2.6.4 MP3 files
    2.6.5 Zip files
    2.6.6 Algorithms
3 Putting Fragments Together
  3.1 Background
  3.2 Requirements
  3.3 Parameters Used
    3.3.1 Background
    3.3.2 Correct decoding
    3.3.3 Non-zero frequency values
    3.3.4 Luminance DC value chains
  3.4 Evaluation
    3.4.1 Single image reconnection
    3.4.2 Multiple image reconnection
  3.5 Result
    3.5.1 Single image reconnection
    3.5.2 Multiple image reconnection
4 Viewing Damaged JPEG Images
  4.1 Start of Frame
  4.2 Define Quantization Table
  4.3 Define Huffman Table
  4.4 Define Restart Interval
  4.5 Start of Scan
  4.6 Combined Errors
  4.7 Using an Artificial JPEG Header
  4.8 Viewing Fragments
5 Discussion
  5.1 File Type Categorisation
  5.2 Fragment Reconnection
  5.3 Viewing Fragments
  5.4 Conclusion
6 Related Work
7 Future Work
  7.1 The File Type Categorisation Method
  7.2 The Image Fragment Reconnection Method
  7.3 Artificial JPEG Header
Bibliography
A Acronyms
B Hard Disk Allocation Strategies
C Confusion Matrices

List of Figures

2.1 Byte frequency distribution of .exe
2.2 Byte frequency distribution of GPG
2.3 Byte frequency distribution of JPEG with RST
2.4 Byte frequency distribution of JPEG without RST
2.5 Byte frequency distribution of MP3
2.6 Byte frequency distribution of Zip
2.7 Rate of Change frequency distribution for .exe
2.8 Rate of Change frequency distribution for GPG
2.9 Rate of Change frequency distribution for JPEG with RST
2.10 Rate of Change frequency distribution for MP3
2.11 Rate of Change frequency distribution for Zip
2.12 2-gram frequency distribution for .exe
2.13 Byte frequency distribution of GPG with CAST5
2.14 ROC curves for Windows PE files
2.15 ROC curves for an AES encrypted file
2.16 ROC curves for JPEG files without RST
2.17 ROC curves for JPEG without RST; 2-gram algorithm
2.18 ROC curves for JPEG files with RST
2.19 ROC curves for MP3 files
2.20 ROC curves for MP3 files; 0.5% false positives
2.21 ROC curves for Zip files
2.22 Contour plot for a 2-gram Zip file centroid
3.1 The frequency domain of a data unit
3.2 The zig-zag ordering of a data unit traversal
3.3 The scan part binary format coding
4.1 The original undamaged image
4.2 The Start Of Frame (SOF) marker segment
4.3 Quantization tables with swapped sample rate
4.4 Luminance table with high sample rate
4.5 Luminance table with low sample rate
4.6 Swapped chrominance component identifiers
4.7 Swapped luminance and chrominance component identifiers
4.8 Moderately wrong image width
4.9 The Define Quantization Table (DQT) marker segment
4.10 Luminance DC component set to 0xFF
4.11 Chrominance DC component set to 0xFF
4.12 The Define Huffman Table (DHT) marker segment
4.13 Image with foreign Huffman tables definition
4.14 The Define Restart Interval (DRI) marker segment
4.15 Short restart interval setting
4.16 The Start Of Scan (SOS) marker segment
4.17 Luminance DC Huffman table set to chrominance ditto
4.18 Complete exchange of Huffman table pointers
4.19 A correct sequence of fragments
4.20 An incorrect sequence of fragments
5.1 Possible fragment parts
