Hardware Acceleration of Face Detection Module for Mobility Assistant for Visually Impaired

Thesis submitted by
Saurabh Suresh Agrawal
2015JVL2553

under the guidance of
Prof. M. Balakrishnan
Dr. Chetan Arora

in partial fulfilment of the requirements for the award of the degree of
Master of Technology
VLSI Design Tools and Technology

Indian Institute of Technology Delhi
June 2017

Abstract

The Mobility Assistant for Visually Impaired (MAVI) system is an assisted-living aid that helps visually impaired people in their day-to-day life. MAVI serves as an outdoor navigation system, provides safety from animals, and supports social inclusion. Face detection and recognition are an integral part of such assistive devices, since they enable social inclusion for the visually impaired community. The existing algorithms for face detection are not fast enough for real-time applications. In this thesis, we propose a novel hardware/software co-design for the popular Viola-Jones algorithm. The proposed design is implemented using high-level synthesis techniques. Further, we discuss in detail the trade-offs and techniques for handling memory bandwidth limitations during the transfer of image data from the processor to the FPGA. We also include various architecture explorations for real-time application of the design.

We used the XC7020 Zynq-based ZedBoard to evaluate the implementation. On the target device, we initially achieved 0.43 frames per second (fps) with the processor running at 667 MHz. After hardware acceleration, we achieved 1.1 fps with the ARM Cortex-A9 processor running at 667 MHz and the programmable logic running at a clock frequency of 147 MHz, a speed-up factor of 2.73x.

Certificate

This is to certify that the thesis titled Hardware Acceleration of Face Detection Module for Mobility Assistant for Visually Impaired, being submitted by Saurabh S. Agrawal for the award of Master of Technology in VLSI Design Tools & Technology, is a record of bona fide work carried out by him under my guidance and supervision at the Program of VLSI Design Tools & Technology. The work presented in this thesis has not been submitted elsewhere, either in part or in full, for the award of any other degree or diploma.

Prof. M. Balakrishnan
Department of Computer Science and Engineering
Indian Institute of Technology, Delhi

Dr. Chetan Arora
Department of Computer Science and Engineering
Indraprastha Institute of Information Technology, Delhi

© 2017, Indian Institute of Technology Delhi

Acknowledgments

I would like to take this opportunity to express my sincere gratitude to my guide and supervisor, Prof. M. Balakrishnan, and my co-supervisor, Dr. Chetan Arora, for their invaluable guidance, constant supervision, continuous encouragement, and the freedom they gave me during all stages of this work. Working with them taught me valuable lessons that will help me in my future career.

I express my sincere gratitude to Mr. Rajesh Kedia and Mr. Anupam Sobti for devoting their precious time to discussing ideas with me and giving their valuable feedback. Their encouragement, guidance, and suggestions have contributed immensely to the evolution of my ideas on the project.

My thanks and appreciation also extend to Mr. Suman Muralikrishnan, every member of the MAVI team, and the people who have willingly helped me with their abilities.

I thank Mr. S. D. Sharma for providing full support in terms of logistics.

I would like to dedicate this thesis to my amazingly loving and supportive parents, who have always given me unconditional love. They have always supported me and have been with me, no matter where I am. My little sister, no matter where you are around the world, you are always with me!

Saurabh Agrawal

Contents

Abstract
Acknowledgments
List of Figures
List of Tables
1 Introduction
  1.1 Motivation
  1.2 MAVI system
  1.3 Objective
  1.4 Thesis organization
2 Literature Survey
  2.1 Face detection approach
    2.1.1 Related work
  2.2 Face recognition approach
3 Tools and Techniques
  3.1 ZedBoard
    3.1.1 Linaro
    3.1.2 Ramdisk
    3.1.3 Xillinux
  3.2 Vivado HLS and SDSoC
4 Software Implementation
  4.1 Overall approach
  4.2 Face detection
    4.2.1 AdaBoost learning algorithm
    4.2.2 Features
    4.2.3 Cascade classifier
    4.2.4 Integral image
    4.2.5 Implementation of face detection
  4.3 Face recognition
    4.3.1 Face database
  4.4 Porting on ZedBoard
5 Hardware/Software Co-Design
  5.1 Flattened algorithm
  5.2 Profiling
    5.2.1 Steps for profiling
  5.3 Challenges in hardware acceleration
6 Hardware Implementation
  6.1 Overall architecture
  6.2 Integral image
  6.3 Normalization
  6.4 Integration
7 Design Decisions
  7.1 Data type
  7.2 Data structures
  7.3 Integral window buffering
  7.4 Feature rectangle profiling
  7.5 Timing analysis
  7.6 Stage profiling
8 Results and Observations
  8.1 Software implementation
  8.2 Hardware implementation
  8.3 Results
9 Extensibility and future work
References

List of Figures

1.1 MAVI system
1.2 MAVI system implementation
3.1 ZedBoard Block Diagram
3.2 ZedBoard Block Diagram
3.3 Xillybus based interface between PS and PL
3.4 Vivado HLS design flow
3.5 SDSoC Environment Flow
4.1 Overall approach for face detection and recognition module
4.2 Haar Features
4.3 Cascade classifier
4.4 Result of face detection on the image from database
4.5 Samples from face recognition database
  (a) Label 1 with shadow
  (b) Label 1 with sunny
  (c) Label 2 with shadow
  (d) Label 2 with sunny
5.1 Flattened algorithm of face detection
5.2 Flat profile using gprof profiler
5.3 Call graph statistics using gprof profiler
6.1 Hardware/software partitioned architecture of face detection
6.2 Dataflow of detect object function
6.3 Architecture for generating integral image window [1]
6.4 Integrated architecture
6.5 Modified architecture
7.1 Feature rectangle profile
7.2 Feature rectangle profiling for different stages
7.3 Cascade classifiers profiling (plot-I)
7.4 Cascade classifiers profiling (plot-II)
8.1 Distribution curve for performance with different scaling factor
8.2 Run time versus accuracy performance chart for different scaling factor (Chart-I)
8.3 Run time versus accuracy performance chart for different scaling factor (Chart-II)
8.4 Design performance
8.5 Sample result for face detection
8.6 Sample results for face detection and recognition

List of Tables

4.1 Face detection parameters
4.2 Details of face recognition database
6.1 Dimensions of integral image buffers
6.2 Details of classifier for face detection
7.1 Partitioning int win buffer
8.1 Resource utilization of face detection hardware accelerator
8.2 Resource utilization of integral win function
8.3 Resource utilization of classify function