Scholars' Mine
Masters Theses Student Theses and Dissertations
Fall 2017
Analysis of outsourcing data to the cloud using autonomous key generation
Mortada Abdulwahed Aman
Follow this and additional works at: https://scholarsmine.mst.edu/masters_theses
Part of the Computer Engineering Commons
Department:
Recommended Citation
Aman, Mortada Abdulwahed, "Analysis of outsourcing data to the cloud using autonomous key
generation" (2017). Masters Theses. 7713.
https://scholarsmine.mst.edu/masters_theses/7713
This thesis is brought to you by Scholars' Mine, a service of the Missouri S&T Library and Learning Resources. This
work is protected by U. S. Copyright Law. Unauthorized use including reproduction for redistribution requires the
permission of the copyright holder. For more information, please contact scholarsmine@mst.edu.
ANALYSIS OF OUTSOURCING DATA TO THE CLOUD USING
AUTONOMOUS KEY GENERATION
by
MORTADA ABDULWAHED AMAN
A THESIS
Presented to the Graduate Faculty of the
MISSOURI UNIVERSITY OF SCIENCE AND TECHNOLOGY
In Partial Fulfillment of the Requirements for the Degree
MASTER OF SCIENCE
in
COMPUTER ENGINEERING
2017
Approved by
Dr. Egemen K. Çetinkaya, Advisor
Dr. Maciej J. Zawodniok
Dr. Sanjay K. Madria
Copyright 2017
MORTADA ABDULWAHED AMAN
All Rights Reserved
ABSTRACT
Cloud computing, a technology that enables users to store and manage their data at
low cost and with high availability, has grown over the past few decades because of the
many services it provides. One of these services is data storage. The majority of the
users of this service remain reluctant to outsource their data because of the integrity
and confidentiality issues, as well as the performance and cost issues, that come with
it. These issues make it necessary to encrypt data prior to outsourcing it to the cloud.
However, encrypting data prior to outsourcing makes searching the data impractical,
lowering the functionality of the cloud. Most existing cloud storage schemes prioritize
security over performance and functionality, or vice versa. In this thesis, the cloud
storage service is explored, and the aspects of security, performance, and functionality
are analyzed in order to investigate the trade-offs of the service. DSB-SEIS, a scheme
that combines encryption intensity selection (an autonomous key generation algorithm
that allows users to control the encryption intensity of their files) with other
features, is developed in order to find a balance between performance, security, and
functionality. These other features are deduplication, assured deletion, and searchable
encryption. The effect of encryption intensity selection on encryption, decryption, and
key generation is explored, and the performance and security of DSB-SEIS are evaluated.
The MapReduce framework is also used to investigate the performance of the DSB-SEIS
algorithm with big data. Analysis demonstrates that the encryption intensity selection
algorithm generates a manageable number of encryption keys based on the confidentiality
of data while not adding significant overhead on encryption or decryption.
ACKNOWLEDGMENTS
I would like to thank Dr. Egemen K. Çetinkaya, Dr. Sanjay K. Madria, and Dr.
Maciej J. Zawodniok for their feedback and continuous help with this work. Furthermore,
I thank the CoNetS research group for listening and giving feedback on the ideas presented
in this document.
This thesis work was supported by the Department of Electrical and Computer
Engineering at Missouri University of Science and Technology, which provided funding
through a graduate teaching and research assistantship.
I would like to thank Missouri University of Science and Technology's IT database
team and Perry Koob for extensive help in providing the required equipment and support
during the experimentation phase of this thesis. I would also like to thank the doctoral
student Katrina Ward for supporting the ideas presented in this work.
Finally, I would like to thank ACM, NSF/CANSec, and NSF/Raytheon BBN Technologies
for providing travel grants to attend DCC 2016, CANSec 2016, and GENI-NICE 2016,
respectively.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGMENTS
LIST OF ILLUSTRATIONS
LIST OF TABLES
SECTION
1. INTRODUCTION AND MOTIVATION
1.1. CONTRIBUTIONS
1.2. PUBLICATIONS
1.3. ORGANIZATION OF THESIS
2. BACKGROUND AND RELATED WORK
2.1. PRELIMINARIES
2.1.1. Cryptography
2.1.2. Deduplication
2.1.3. Searchable Encryption
2.2. CLOUD STORAGE SERVICE
2.3. DEDUPLICATION
2.4. SEARCHING ENCRYPTED DATA
2.5. MAPREDUCE
2.6. CLOUDLAB TESTBED
3. ARCHITECTURE
3.1. WORKFLOW
3.2. ENCRYPTION INTENSITY SELECTION
3.2.1. Low Intensity
3.2.2. Medium Intensity
3.2.3. High Intensity
3.3. DEDUPLICATION
3.4. ASSURED DELETION
3.5. SEARCHING ENCRYPTED DATA
3.5.1. Keyword Search in Encrypted Index
3.5.2. Duplicate Checking of Encrypted Data
3.6. MAPREDUCE APPLICATIONS
3.6.1. Indexing
3.6.2. Index Search
3.6.3. Disk Search
4. RESULTS
4.1. WIKIMEDIA DUMP
4.2. DEDUPLICATION PERFORMANCE
4.3. KEY GENERATION ANALYSIS
4.4. ENCRYPTION PERFORMANCE
4.5. DECRYPTION PERFORMANCE
4.6. INDEXING PERFORMANCE
4.7. SEARCH PERFORMANCE
4.7.1. Index Search
4.7.2. Disk Search
4.8. MAPREDUCE APPLICATIONS
4.8.1. Indexing Performance
4.8.2. Index Search Performance
4.8.3. Disk Search Performance
5. CONCLUSIONS
6. FUTURE WORK
APPENDIX
REFERENCES
VITA
LIST OF ILLUSTRATIONS
Figure
3.1 DSB-SEIS architecture
3.2 Client application flowchart
3.3 Cloud application flowchart
3.4 Encryption intensity selection example
3.5 MapReduce indexing application workflow
3.6 MapReduce index search application workflow
3.7 MapReduce disk search application workflow
4.1 WikiMedia word distribution
4.2 Deduplication overhead
4.3 Size of data
4.4 Key generation
4.5 Encryption performance
4.6 Decryption performance
4.7 Index performance
4.8 Index search performance
4.9 Search hits
4.10 Disk search performance
4.11 MapReduce indexing performance
4.12 MapReduce index search performance
4.13 Index search hits
4.14 MapReduce disk search performance
LIST OF TABLES
Table
2.1 Comparison of cloud storage schemes
4.1 Testing variables