Short-term Memory for the C Programming Language MASTER’S THESIS to obtain the Master’s degree at the Faculty of Natural Sciences of the University of Salzburg submitted by Martin Aigner, B.Eng. Academic supervisor Univ.-Prof. Dr.-Ing. Dipl.-Inform. Christoph M. Kirsch Department of Computer Sciences Salzburg, August 2012 Acknowledgments Thanks to Professor Christoph Kirsch and his group for guidance and motivation. I had a great time working on my masters thesis and you inspired me to give my best. My special thanks goes to: Ingrid and Manfred for roots and for wings. To Christine, after all, for so many things. To my family and friends, who helped me getting through, I wouldn’t have made it, without anyone of you! Abstract Short-term memory is a memory model for dynamic heap management [1]. Unlike the traditional persistent memory model implemented in explicit or implicit memory man- agement systems, objects allocated in short-term memory expire after a finite amount of time or may be refreshed to extend their lifetime. We propose a concurrent, incremental, and non-moving implementation of short-term memory for the C programming language called self-collecting mutators which relies on explicit refreshing information and utilizes an explicit dynamic memory allocator. We present an extensive performance evaluation of our implementation using a new benchmark tool called ACDC. Based on size and lifetime characteristics of dynamic objects in allocation-intensive C programs, ACDC models a representative workload for dynamic memory allocators. We empirically verify a small and constant per-object time and space overhead of our implementation and show competitive performance to the persistent heap management model. Keywords: Short-term Memory, Dynamic Heap Management, Multi-threaded Alloca- tor Benchmark Contents 1. Introduction 9 1.1. Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.2. Outline of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2. Process Memory Concepts 12 2.1. Explicit Dynamic Memory Management . . . . . . . . . . . . . . . . . . 13 2.1.1. Sources of Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2. Garbage Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.2.1. Reference Counting Collectors . . . . . . . . . . . . . . . . . . . . 18 2.2.2. Tracing Collectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.2.3. Comparison of Reference Counting and Tracing Collectors . . . . 20 2.2.4. Conservative Garbage Collection . . . . . . . . . . . . . . . . . . 22 2.2.5. Drawbacks of Garbage Collection . . . . . . . . . . . . . . . . . . 23 2.3. The Persistent Memory Model . . . . . . . . . . . . . . . . . . . . . . . . 24 2.4. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 3. Short-term Memory Model 26 3.1. Single-threaded Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.2. Multi-threaded Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.2.1. Global-Time Management . . . . . . . . . . . . . . . . . . . . . . 30 3.3. Sources of Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 3.4. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4. Self-collecting Mutators in C 34 4.1. Design Decisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.1.1. Backwards Compatibility . . . . . . . . . . . . . . . . . . . . . . . 35 4.1.2. Representation of Expiration Dates . . . . . . . . . . . . . . . . . 35 4.1.3. Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Contents 4.1.4. Allocator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 4.2. Operations of Self-collecting Mutators . . . . . . . . . . . . . . . . . . . . 38 4.2.1. Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 4.2.2. Time Progress . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.2.3. Expiration Extensions . . . . . . . . . . . . . . . . . . . . . . . . 43 4.3. Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 4.3.1. Descriptor Root . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 4.3.2. Descriptor Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 4.3.3. Descriptor Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 4.3.4. Descriptor Page List . . . . . . . . . . . . . . . . . . . . . . . . . 48 4.3.5. Expired Descriptor Page List . . . . . . . . . . . . . . . . . . . . 49 4.3.6. Dynamically Allocated Management Data . . . . . . . . . . . . . 49 4.4. Blocking Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 4.5. Debug Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 4.6. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 5. ACDC Benchmark 53 5.1. Characteristics of Dynamic Objects in C Programs . . . . . . . . . . . . 54 5.2. A Notion of Time for a Mutator . . . . . . . . . . . . . . . . . . . . . . . 55 5.3. Modeling the Workload . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 5.3.1. ACDC Runtime Options . . . . . . . . . . . . . . . . . . . . . . . 56 5.3.2. Single Mutator Behavior . . . . . . . . . . . . . . . . . . . . . . . 58 5.3.3. Multi Mutator Behavior . . . . . . . . . . . . . . . . . . . . . . . 63 5.4. Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 5.4.1. Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 5.5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 6. Experimental Evaluation 70 6.1. Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 6.1.1. Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . 70 6.1.2. Technical Measurement Details . . . . . . . . . . . . . . . . . . . 70 6.2. Workload Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 6.2.1. Services Exercised . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 6.2.2. Level of Detail . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 6.2.3. Representativeness . . . . . . . . . . . . . . . . . . . . . . . . . . 74 6.2.4. Timeliness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 7 Contents 6.3. Experimental Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 6.3.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 6.3.2. Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 6.3.3. Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 6.4. Evaluation of the Important Factors of LIBSCM . . . . . . . . . . . . . . 82 6.4.1. Collection Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . 82 6.4.2. Number of Threads . . . . . . . . . . . . . . . . . . . . . . . . . . 83 6.4.3. Time Threshold . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 6.5. Throughput of LIBSCM Compared to the Persistent Memory Model . . . 90 6.6. Approximation Overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 6.7. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 7. Conclusion 98 7.1. Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 A. Source Code of LIBSCM 108 A.1. stm.h . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 A.2. scm-desc.h . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 A.3. stm-debug.h . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 A.4. meter.h . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 A.5. arch.h . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 A.6. scm-desc.c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 A.7. descriptor page list.c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 A.8. meter.c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 A.9. finalizer.c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 A.10.Makefile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 8 1. Introduction Dynamic heap management systems provide an interface to receive (allocate) and return (free) memory objects at runtime. There are two types of interfaces, namely explicit interfaces and implicit interfaces. Both types receive objects through an allocation rou- tine. They differ in the way of returning memory objects. Explicit memory management requires the programmer to explicitly return an object to the management system. This isfastbuterrorpronebecauseitreliesoninformationprovidedbytheprogrammerwhich may be incorrect. Especially in multi-threaded applications concurrent reasoning about the last use of an object is a non-trivial problem. Implicit memory management systems, also known as garbage collectors, solve this problem at the expense of runtime overhead and increased complexity. Also, languages that lack type safety, e.g. the C program- ming language, increase the difficulty to implement implicit memory management. Both approaches guarantee dynamic objects to be persistent for the time between receiving and returning the object. Therefore, we call the model where objects are explicitly or implicitly returned the persistent memory model. Short-term memory [1] is a memory model where objects expire after a finite amount of timewhichmakesitunnecessarytoreturndynamicobjectseitherexplicitlyorimplicitly. In this work we present a concurrent, incremental, and non-moving implementation of short-term memory called self-collecting mutators followed by an extensive performance analysis. “There is as yet no standard suite of benchmarks for evaluating multi- threaded allocators.” [2] This statement from Berger et al. has its origin in the year 2000 and is still valid today. Most recent performance evaluations of dynamic memory allocators are based on simple simulations of allocations patters, e.g., the Larson benchmark [3], or using allocation traces. We present a new benchmark tool designed to create a representative workload 1. Introduction based on lifetime and size characterisitics of dynamic objects observed in allocation intensive applications in [4] and [5] and extend these characteristics to multi-threaded allocation patterns including shared objects. 1.1. Contributions The main contribution of this work is an efficient and scalable, concurrent implementa- tion of self-collecting mutators for the C programming language. Furthermore we con- tribute a multi-threaded memory allocator benchmark tool called ACDC which models a typical allocator workload. We use ACDC to perform a detailed performance evaluation of our implementation of short-term memory. 1.2. Outline of the Thesis The thesis starts with an overview of process memory concepts, i.e., the way a pro- cess organizes its address space. After that we describe the short-term memory model and present the implementation self-collecting mutators. Then we introduce ACDC, a multi-threaded allocator benchmark tool that we use for the final part of the thesis; an extensive performance evaluation of our implementation. Chapter 1, Introduction: We first present te problem and give an outline of the thesis. Chapter 2, Process Memory Concepts: In this chapter we given an overview of existingtechniquestomanagedynamicmemoryobjectsanddiscussthenotionofliveness of such objects. Chapter 3, Short-term Memory Model: Here we present the short-term memory model for managing dynamic objects. We explain the meaning of expiration dates in the context of thread-local and multi-threaded time management. Chapter 4, Self-collecting Mutators inC:Inthischapterwedescribeourconcurrent implementationofshort-termmemoryinC.Wedefinethememorymanagementroutines and the data structures that enable full incrementality and constant-time operations. 10
Description: