
A Journey in Creating an Operating System Kernel: The 539kernel Book PDF

215 Pages·2022·1.772 MB·English


A Journey in Creating an Operating System Kernel
The 539kernel Book

Mohammed Q. Hussain
November 2022

CONTENTS

Chapter 1: Let's Start with the Bootloader
1.1 Introduction
1.2 x86 Assembly Language Overview
1.2.1 Registers
1.2.2 Instruction Set
1.2.3 NASM
1.3 GNU Make
1.3.1 Makefile
1.4 The Emulators
1.5 Writing the Boot Loader
1.5.1 Hard Disk Structure
1.5.2 BIOS Services
1.5.3 A Little Bit More of x86 Assembly and NASM
1.5.4 The Bootloader

Chapter 2: An Overview of x86 Architecture
2.1 Introduction
2.2 x86 Operating Modes
2.3 Numbering Systems
2.4 The Basic View of Memory
2.5 x86 Segmentation
2.5.1 Segmentation in Real Mode
2.5.2 Segmentation in Protected Mode
2.6 x86 Run-time Stack
2.6.1 The Theory: Stack Data Structure
2.6.2 The Implementation: x86 Run-time Stack
2.6.3 Calling Convention
2.6.4 Growth Direction of Run-time Stack
2.6.5 The Problem of Resizing the Run-time Stack
2.7 x86 Interrupts
2.7.1 Interrupt Descriptor Table

Chapter 3: The Progenitor of 539kernel
3.1 Introduction
3.2 The Basic Code of The Progenitor
3.2.1 Writing the Starter
3.2.2 Writing the C Kernel
3.3 Interrupts in Practice
3.3.1 Remapping PICs
3.3.2 Writing ISRs and Loading IDT
3.4 Quick View of the Changes of Makefile
3.5 A Traditionalist Implementer or a Kernelist?

Chapter 4: Process Management
4.1 Introduction
4.2 The Most Basic Work Unit: A Process
4.3 The Basics of Multitasking
4.3.1 Multiprogramming & Time-Sharing
4.3.2 Process Scheduling
4.3.3 Process Context
4.3.4 Preemptive & Cooperative Multitasking
4.4 Multitasking in x86
4.4.1 Task-State Segments
4.4.2 Context Switching in x86
4.5 Process Management in 539kernel
4.5.1 Initializing the Task-State Segment
4.5.2 The Data Structures of Processes
4.5.3 Process Creation
4.5.4 The Scheduler
4.5.5 Running Processes
4.5.6 Finishing up Version T

Chapter 5: Memory Management
5.1 Introduction
5.2 Paging in Theory
5.3 Virtual Memory
5.4 Paging in x86
5.4.1 The Structure of Linear Memory Address
5.4.2 Page Directory
5.4.3 Page Table
5.5 Paging and Dynamic Memory in 539kernel
5.5.1 Dynamic Memory Allocation
5.5.2 Paging
5.5.3 Finishing up Version G

Chapter 6: Filesystems
6.1 Introduction
6.2 ATA Device Driver
6.2.1 The Driver
6.3 Filesystem
6.3.1 The Design of 539filesystem
6.3.2 The Implementation of 539filesystem
6.4 Finishing up Version NE and Testing the Filesystem

Chapter 7: What's Next?
7.1 Introduction
7.1.1 The Design of Kernel's Architecture: Monolithic vs. Microkernel
7.2 In-Process Isolation
7.2.1 Lord of x86 Rings
7.2.2 Endokernel
7.3 Nested Kernel
7.4 Multikernel
7.5 Dynamic Reconfiguration
7.6 Unikernel

INTRODUCTION

About 17 years ago, writing an operating system's kernel was kind of a dream for me. Two years before that, I had just started my journey in the wonderful world of computer science by learning web programming, which made me curious about the different aspects of computers, and of course one of the most interesting of those aspects is operating systems. At that time I wasn't technically ready to write an operating system kernel, so a number of experiments to achieve that goal failed. After these trials many years passed. I learned a lot through these years and tackled a number of other kinds of system software (such as compilers, virtual machines and assemblers) to learn how they work, and even implemented some very simple versions of them to make sure that I had understood their concepts.

In 2017 I asked myself: why don't I implement a simple operating system kernel and achieve one of the oldest things on my to-do list, which was a kind of dream for me? "Fine, but how to make it a useful project for people?" That's what I told myself as a response. Going through this journey was interesting for me as a way to learn more, but I also wanted to make something that's useful for someone other than me, and at that moment the idea of this book was born. At that time I was working on my Master's degree, so I didn't have enough time to work on this project, and that made me defer the work on it until late 2019. After a lot of torture (sorry! dedication?) this book is finally here.

In this book we are going to start a journey of creating a kernel I called 539kernel, which is a really simple 32-bit x86 operating system kernel that supports multitasking and paging and has its own filesystem. I wrote 539kernel for this book and made it as simple as possible, so anyone who would like to learn about operating system kernels can use 539kernel to start. Due to that, some of you may notice that some parts of 539kernel's code are written in a naive way: while writing 539kernel I focused on the readability and simplicity of the code instead of its efficiency. Through this journey you are going to learn a lot about the basics of operating systems, their kernels and, of course, the platform that is going to run the 539kernel that we will create together. By the platform I mean the processors that use the x86 architecture.

For those who don't know, an operating system kernel is the core of any operating system. Its job is to manage the computer's hardware and resources, distribute these resources among the running programs, and provide many services for those programs to make it easy for them to work with these resources and hardware.
This book requires knowledge of the C programming language: the basics, its syntax, defining variables and functions, pointers and so on. You don't need to be a master of C's libraries, for example. The compiler used to create and test 539kernel is GNU GCC 7.5. Assembly language will also be used, but the book doesn't require prior knowledge of it; every aspect of x86 assembly that you need in order to create 539kernel will be explained in this book.

We will use the NASM assembler for our assembly code and GNU Make to build our kernel. QEMU or Bochs will be used as an emulator to test our work through this journey. All three of these tools will be discussed in chapter 1, but you need to set them up on your machine.

The full source code of 539kernel is available on GitHub (https://github.com/MaaSTaaR/539kernel). There are two directories in the root directory. src/ contains the last version of 539kernel, that is, when you finish this book, the code that you will have will be the same as the one in src/. The directory evolution_by_versions/ contains the versions of 539kernel while it was under development through the different chapters of this book.

Finally, I hope that you enjoy reading this book. I would be more than happy to hear your feedback and to have your help in spreading this book, which is freely available at http://539kernel.com.

ACKNOWLEDGMENT

I would like to thank Dr. Hussain Almohri [1] for his kind acceptance to read this book before its release and for his encouragement, feedback and discussions that really helped me. Also, I would like to thank my friends Anas Nayfah, Ahmad Yassen, DJ., Naser Alajmi and my dearest niece Eylaf for their kind support.

Mohammed Q. Hussain ([email protected])
16 November 2022
Kuwait

[1] https://almohri.io/

CHAPTER 1: LET'S START WITH THE BOOTLOADER

1.1 Introduction
The first piece to start with when writing an operating system's kernel is the boot loader, which is the code that is responsible for loading the main kernel from the disk into main memory so the kernel can be executed. Before getting into the details of the boot loader and all other parts of the kernel, we need to learn a little bit about the tools (e.g. compilers and programming languages) that we will use in our journey of creating a kernel. In this chapter, we start with an overview of the tools and their basics, and then we start writing a boot loader.

1.2 x86 Assembly Language Overview

To build a boot loader we need to use assembly language. There are also some parts of an operating system kernel that cannot be written in a high-level language, where assembly language must be used instead, as you will see later in this book. Therefore, a basic knowledge of the target architecture's assembly is required. In our case, the target architecture of our kernel is x86.

The program that takes source code written in assembly language and transforms it into machine language is known as an assembler [1]. There are many assemblers available for x86, but the one that we are going to use is Netwide Assembler (NASM). However, the concepts of x86 assembly are the same, since they are tied to the architecture itself, and the instructions are the same too. So if you grasp the basics, it will be easy to use any other assembler [2] even if it uses a syntax other than NASM's. Don't forget that the assembler is just a tool that helps us generate executable x86 machine code out of assembly code, so any suitable assembler that lets us reach our goal will be enough.

[1] While the program that transforms source code written in a high-level language such as C into machine code is known as a compiler.
[2] Another popular open-source assembler is GNU Assembler (GAS). One of the main differences between NASM and GAS is that the first uses Intel's syntax while the second uses AT&T syntax.
In this section I don't aim to examine the details of x86 or NASM. You can consider this section a quick start on both: the basics will be presented to make you familiar with x86 assembly language, and more advanced concepts will be presented later when we need them. If you are interested in x86 assembly for its own sake, there are multiple online resources and books that explain it in detail.

1.2.1 Registers

In any processor architecture, and x86 is no exception, a register is a small memory inside the processor's chip. Like any other type of memory (e.g. RAM), we can store data inside a register and we can read data from it. Registers are very small and very fast. The processor architecture provides us with multiple registers. In x86 there are two types of registers: general purpose registers and special purpose registers. In general purpose registers we can store any kind of data we want, while the special purpose registers are provided by the architecture for some specific purposes. We will encounter the second type later in our journey of creating 539kernel.

x86 provides us with eight general purpose registers, and to read from or write to them we refer to them by their names in assembly code. The names of these registers are: EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP. While the registers ESI, EDI, EBP and ESP are considered general purpose registers in the x86 architecture [3], we will see later that they store some important data in some cases, and it's better to use them carefully if we are forced to.

The size of each one of x86's general purpose registers is 32 bits (4 bytes), and due to that, they are available only on x86 processors that support the 32-bit architecture [4], such as Pentium 4 for instance.
These 32-bit registers are not available on x86 processors that support only a 16-bit architecture or lower, so, for example, you can't use the register EAX on the Intel 8086 because it is a 16-bit x86 processor and not a 32-bit one.

In the old days, when 16-bit x86 processors were dominant, assembly programmers used the registers AX, BX, CX and DX, each of which is 16 bits (2 bytes) in size. When 32-bit x86 processors came, these registers were extended to 32 bits and their names were changed to EAX, EBX, ECX and EDX. The first letter E of the new names means extended. However, the old names are still usable on 32-bit x86 processors, and they are used to access and manipulate the first 16 bits of the corresponding register. For instance, to access the first 16 bits of the register EAX, the name AX can be used. Furthermore, the first 16 bits of these registers can be divided into two parts, each of which is 8 bits (1 byte) in size and has its own name that can be referred to in the assembly code. The first 8 bits of the register are called the low bits, while the second 8 bits are called the high bits.

Let's take one of these registers as an example. The AX register is a 16-bit register which is a part of the bigger 32-bit EAX register in the 32-bit architecture [5]. AX is divided into two more parts: AL for the low 8 bits and AH for the high 8 bits, as the second letter of each name indicates. The same division holds true for the registers BX, CX and DX. Figure 1 illustrates that division.

Figure 1: How the Registers EAX, EBX, ECX and EDX are Divided in x86

[3] According to Intel's manual.
[4] They are also available on 64-bit x86 CPUs such as Core i7 for instance.
[5] Or in other words, for the 32-bit architecture: the first 16 bits of EAX.

1.2.2 Instruction Set

The processor's architecture provides the programmer with a bunch of instructions that can be used in assembly code.
A processor's instructions resemble the functions [6] of a high-level language which are provided by libraries. In our case, we can consider the processor as the ultimate library for the assembly code. As with functions in high-level programming languages, each instruction has a name and performs a specific job. It can also take parameters, which are called operands. Depending on the instruction itself, an operand can be a static value (e.g. a number), a register name whose stored value the instruction is going to fetch and use, or even a memory location.

The assembly language is really simple. Assembly code is simply a sequence of instructions which will be executed sequentially. The following is an example of assembly code. Don't worry about its functionality right now; you will understand what it does eventually.

    mov ah, 0Eh
    mov al, 's'
    int 10h

As you can see, each line starts with an instruction which is provided to us by the x86 architecture. In the first two lines we use an instruction named mov, and as you can see, this instruction receives two operands which are separated by a comma. In the current usage of this instruction, the first operand is a register name while the second operand is a static value. The third line uses another instruction named int, which receives one operand. When this code is running, it will be executed by the processor sequentially, starting from the first line until it finishes at the last line.

If you are interested in the instructions available on x86, there is a four-volume manual named "Intel® 64 and IA-32 architectures software developer's manual", provided by Intel, that explains each instruction in detail [7].

[6] Or a procedure, for people who work with Algol-like programming languages.

Assigning Values with mov

You can imagine a register as a variable in high-level languages.
We can assign values to a variable, we can change its old value and we can copy its value to another variable. In assembly language, these operations can be performed by the instruction mov, which takes the value of the second operand and stores it in the first operand. You have seen in the previous example the following two lines that use the mov instruction.

    mov ah, 0Eh
    mov al, 's'

Now you can tell that the first line copies the value 0Eh to the register ah, and the second line copies the character s to the register al. Single quotation marks are used in NASM to represent strings or characters, and that's why we have used them in the second line. Based on that, you may have noticed that the value 0Eh is not surrounded by single quotation marks though it contains characters. In fact, this value isn't a string; it is a number represented in the hexadecimal numbering system, and that is why the character h was put at the end of it. That is, putting h at the end of 0E tells NASM that this value is a hexadecimal number. The equivalent of 0E in the decimal numbering system, which we humans use, is 14. That is, 0E and 14 are exactly the same number, but represented in two different numbering systems [8].

1.2.3 NASM

Netwide Assembler (NASM) is an open-source assembler for the x86 architecture which uses Intel's syntax of assembly language. The other well-known syntax for assembly language is AT&T syntax and, of course, there are some differences between the two. The first syntax is the one used in the official manuals of Intel. NASM can be used through

[7] https://software.intel.com/en-us/articles/intel-sdm
[8] Numbering systems will be discussed in more detail later.
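
Since this book assumes knowledge of C, the register layout from section 1.2.1 and the two mov lines discussed above can be sketched in C. This is only an illustrative model under stated assumptions, not code from 539kernel: the type reg32 and the helpers reg_x, reg_l, reg_h, mov_l and mov_h are hypothetical names invented here. The sketch treats a general purpose register such as EAX as a 32-bit integer and uses masks and shifts to reach the parts that assembly code names AX, AL and AH.

```c
#include <stdint.h>

/* Hypothetical model of one 32-bit general purpose register such as EAX.
   These names are illustrative only; they are not part of 539kernel. */
typedef struct {
    uint32_t value; /* the full 32 bits, i.e. what EAX names */
} reg32;

/* The first (lowest) 16 bits, i.e. what AX names inside EAX. */
static uint16_t reg_x(const reg32 *r)
{
    return (uint16_t)(r->value & 0xFFFFu);
}

/* The low 8 bits of the 16-bit part, i.e. what AL names. */
static uint8_t reg_l(const reg32 *r)
{
    return (uint8_t)(r->value & 0xFFu);
}

/* The high 8 bits of the 16-bit part, i.e. what AH names. */
static uint8_t reg_h(const reg32 *r)
{
    return (uint8_t)((r->value >> 8) & 0xFFu);
}

/* Models `mov al, v`: only bits 0..7 change, the rest are preserved. */
static void mov_l(reg32 *r, uint8_t v)
{
    r->value = (r->value & 0xFFFFFF00u) | (uint32_t)v;
}

/* Models `mov ah, v`: only bits 8..15 change, the rest are preserved. */
static void mov_h(reg32 *r, uint8_t v)
{
    r->value = (r->value & 0xFFFF00FFu) | ((uint32_t)v << 8);
}
```

With a variable declared as reg32 eax = { 0x12340000u }, calling mov_h(&eax, 0x0E) and then mov_l(&eax, 's') mirrors the two mov lines above: reg_h(&eax) then equals 14 (the decimal form of 0Eh), reg_l(&eax) equals 0x73 (the ASCII code of the character s), and the upper 16 bits of eax.value are left untouched.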
