
Container orchestration in Kubernetes for scaling a Spotify-alternative (original title: Container orchestration in Kubernetes voor het schalen van een Spotify-alternatief), Daan Olbrechts

80 Pages·2012·2.43 MB·Dutch


Container orchestration in Kubernetes for scaling a Spotify-alternative
Daan Olbrechts
Supervisors: prof. dr. ir. Didier Colle, prof. dr. ir. Mario Pickavet
Counsellors: dr. Wouter Tavernier, Steven Van Rossem, Thomas Soenen
Master's dissertation submitted in order to obtain the academic degree of Master of Science in Industrial Sciences: Informatics
Department of Information Technology
Chair: prof. dr. ir. Daniël De Zutter
Faculty of Engineering and Architecture
Academic year 2016-2017

Preface
This master's dissertation was carried out as part of the programme leading to the degree of Master of Science in Industrial Sciences: Informatics at Ghent University. The chosen topic is based on the general subject "Implementation of Elastic Network Services". I originally chose this topic because, given my interest in infrastructure and system engineering, it seemed an excellent fit for a master's dissertation. Looking back, I am pleased with the choice, because working with the new technologies covered here taught me a great deal.
I would like to thank dr. Wouter Tavernier and Steven Van Rossem for their regular guidance while this research was being carried out. Their focused questions and remarks consistently led to new lines of thought that improved the quality of the dissertation. I would also like to thank prof. dr. ir. Didier Colle for the time he spent assessing my work and for the useful feedback that resulted from it. A word of thanks also goes to prof. dr. ir. Mario Pickavet and Thomas Soenen.

Permission for use on loan
"The author gives permission to make this master dissertation available for consultation and to copy parts of this master dissertation for personal use. In the case of any other use, the copyright terms have to be respected, in particular with regard to the obligation to state expressly the source when quoting results from this master dissertation."
Daan Olbrechts, January 2017

Summary
In this master's dissertation the music streaming service Koel, a Spotify alternative, is scaled using containers on the open-source cloud platform Kubernetes. The goal of the dissertation is to investigate how Kubernetes is set up, how administrators can run applications on a cluster, and how the system can scale applications, automatically or otherwise, when needed. The chosen application, which was originally developed for personal use, is split into several components, and the changes needed to make efficient scaling possible are applied.
The dissertation examines which planning is needed to set up a Kubernetes cluster and how problems that arise along the way are best handled. The application is then scaled on the Kubernetes platform. By treating CPU usage and the number of incoming connections as indicators, the application can even be scaled up automatically. The capabilities of the built-in automatic scaling mechanism are examined and compared with a self-written algorithm. This comparison shows that the built-in mechanism is not sufficient for all possible scenarios.
While this master's dissertation was being carried out, a second dissertation with the same goal but a different platform was also in progress: scaling Koel with virtual machines in OpenStack. The differences between the two dissertations are compared at the end.

Container orchestration in Kubernetes for scaling a Spotify-alternative
Daan Olbrechts
Supervisor(s): prof. dr. ir. Didier Colle, prof. dr. ir. Mario Pickavet, dr. Wouter Tavernier, Steven Van Rossem, Thomas Soenen

Abstract - This thesis researches the capabilities of the Kubernetes container orchestration platform to scale a media streaming service comparable to the commercial service Spotify.
Keywords - Kubernetes, containers, scaling, media server

I. INTRODUCTION
This thesis covers the scaling of a network service on an open-source cloud platform. Being able to scale applications in real-life scenarios is very important for ensuring an acceptable user experience without overcommitting resources and incurring excessive costs. Kubernetes was selected because it is a fast-growing cloud platform for container orchestration that is already in use by several large solution providers, such as Google, who claim it is ready for production workloads.

II. THE APPLICATION
The application chosen for this study is Koel, a media streaming service built primarily in PHP and VueJS. Koel offers basic functionality roughly comparable to the popular commercial cloud music streaming service Spotify. The application relies on several components that each require correct configuration to function as intended when containerized and run in an environment such as a Kubernetes cluster.
A feature offered by Koel is server-side transcoding of music using FFmpeg, a widely used open-source tool for manipulating media files. FFmpeg is used to convert media files to formats supported by clients and optionally to lower the bitrate of hosted music, which could be an interesting feature for mobile users with limited network bandwidth. Transcoding requires a large amount of compute resources and puts a noticeable load on the CPU, especially when performed for multiple clients concurrently. Kubernetes will be used to scale the resources allocated to this task, based on user requirements.
The components that need to be containerized are the server-side PHP application and the MySQL database it relies on. The application is split into a part that serves static content and a part that handles requests requiring PHP application logic to generate a response. Since Koel offers a web-based interface, new users first send requests to the application server for the files needed to build the user interface in the client-side browser. To make sure no compute resources are wasted by serving these files through PHP, which would cause the entire framework Koel is built on to be loaded, a separate NGINX instance serves this static information (this will be referred to as the front-end of the web service). Other requests, for non-static information, are served by PHP combined with an NGINX webserver (this will be referred to as the back-end). A custom container image was built to include all required dependencies of the application, including the FFmpeg tool that is used for transcoding.
To achieve optimal results, resources must be allocated correctly so that requests requiring a dynamic response are handled as fast as possible. Most of the load will come from user-requested music transcoding. By default, the transcoding process generates output as fast as possible, with the exact rate depending on the available processing power. This behavior has some negative effects when used as-is in this context. For example, it is not always necessary to transcode more than the user is going to consume, so valuable compute and network resources are wasted when trying to get as much transcoded information to the client as fast as possible. Because FFmpeg tries to process as much data as possible, it creates a peak in CPU usage until the whole file has been converted. This makes the current CPU load a worthless metric for assessing whether the currently allocated compute resources are sufficient for the requests the service must process.
To solve these problems, the application was adjusted to limit the output of the FFmpeg command to the bitrate requested by the client. The change ensures that the only data transcoded is exactly what the client is ready to consume. With its output limited, FFmpeg must effectively wait until transcoded data is consumed, which changes its impact on the CPU load from a short peak to a constant load over the duration of the requested music file. Since the CPU load now scales linearly with every transcoding request, it is a viable metric for the state of the back-end.
Because the Koel application was not built with scaling in mind, these adjustments are critical to achieving good results on any cloud platform. Applications other than Koel would need similar adjustments, but the exact changes would depend on the workload the application expects.

III. KUBERNETES SET-UP
Kubernetes was set up as a bare-metal installation to create the ideal test environment. The process of setting up the cloud platform starts with a planning phase, in which all requirements are considered and important decisions concerning the physical and logical design of the system are made. The Kubernetes cluster in this test environment consists of one master Node and two non-master Nodes. The master Node has been allocated a publicly accessible IPv4 address that is used to make the service available to the Internet. The planning phase also determines which subnets to use for inter-node communication, Pod networking and Services.

Figure 1: Cluster network

Another important aspect covered before the actual installation of the cluster is the storage solution for the components in the cluster that require decentralized and uninterrupted access to data storage. The served music must evidently be stored somewhere, but the storage solution must be resilient to Node failures and flexible enough to handle frequent expansions of the music library with minimal effort for the administrator, without adding excessive overhead to Pod start-up times. GlusterFS, a scalable network filesystem, was chosen for the task. This distributed storage solution is supported by Kubernetes, which makes it easy for administrators to set up new Pods using storage volumes served by a GlusterFS cluster. Aside from music files, the GlusterFS storage is also used to store album cover images and, more importantly, the database files containing all user and music metadata. Storing the database files this way ensures that no data is lost when the inevitable Node failure occurs and the database Pod is restarted elsewhere in the cluster.
An initial test environment was created using the manual installation guide provided for setting up Kubernetes on bare-metal hardware. Later the kubeadm tool was used because it supports newer versions of Kubernetes and allows easy installation of clusters on recent operating systems.
The installed cluster was then complemented with extra add-ons: a dashboard, which allows easy administration of the cluster, and a cluster resource monitoring solution, which is required for automatic scaling based on resource usage.

IV. DEPLOYING AN APPLICATION IN THE CLUSTER
To deploy the application in the Kubernetes cluster, the required cluster objects must be created. Pods, the term used to describe groups of containers that are scheduled together in Kubernetes, are used as the minimal deployment unit. The exact configuration of these Pods in the cluster is handled by an external controller, which also ensures that enough Pods are online in the cluster at any time. The controller definition includes all configuration needed for the Pods and the containers they manage, including access to network storage.
Pods are assigned an IP address that allows intra-cluster communication, but relying on these addresses to reach an application that runs in a Pod is strongly discouraged: since Pods are assumed not to be durable, their IP addresses should not be considered permanently available. The recommended way to publish applications in a Kubernetes cluster is to use the Service mechanism. This layer 4 solution distributes incoming requests across a specified set of Pods using a round-robin algorithm. For each component type a Service is created to allow reliable communication. Because the front-end and back-end components both present a web service, a load balancer is added to direct the traffic to the right Service.
The chosen solution uses the Kubernetes Ingress resource. Ingress allows administrators to create a set of rules to distribute incoming traffic across all Pods associated with a certain Service. A daemon included in the Pod that runs the load balancer regularly checks the Kubernetes API server for changes to the Ingress configuration or to the set of Pods that incoming requests will be sent to. The publicly available NGINX ingress controller image was used as the load balancer for the Koel application. The default behavior of its configuration was altered so that new requests are only sent to Pods that currently have fewer open connections than others.

V. LOAD TESTING
To accurately approximate the capabilities of the platform to deliver requested resources to users in a real-world scenario, a load test was performed. Locust, a load simulation tool, was used to create patterns that match the requests human users would perform when visiting the web application and streaming music from a list.
The patterns are defined as Python scripts that are evaluated by each simulated user at runtime, allowing complicated logic and dynamic requests to be implemented. This approach to load testing yields the most realistic simulated application traffic. To keep testing as flexible as possible, server responses are interpreted for each simulated user. This also verifies that every part of the application works as intended, because all requests are performed in much the same way as a real user's browser would perform them.
The front-end Pods serve all HTML, CSS and JavaScript needed to render the client-side user interface. Ideally, these resources are only requested at the start of each user session. This means the number of requests that front-end Pods can respond to is also a limiting factor for the maximum growth of the entire service. Another change was made to further increase the maximum number of concurrently served clients: merging the different resources into a single file avoids the overhead caused by multiple requests. After this change a maximum rate of 133 requests per second was reached, with the limiting factor being the underlying gigabit network. This would allow 133 users each second to download the web interface for the application.
An alternative setup was tried out to allow even more traffic to reach the application. By running multiple Ingress controllers in the cluster that are each allowed direct access to a host interface on the Node, administrators can increase the available network bandwidth by adding more host interfaces or by adding more Nodes to the cluster. Front-end Pods are then hosted on each Node so that Ingress controllers have local access to these resources. In a test environment where two test clients have direct access to two Nodes over a gigabit network, a result of 272 requests per second was achieved, more than doubling the original result. The changes also decreased the overall response time, because traffic is no longer handled by Kubernetes Services and because front-end requests no longer have to be routed over the cluster-internal network.

Figure 2: Traffic pattern after changes

Back-end Pods handle all other requests, which require a dynamic response. The most CPU-intensive requests are those requesting a transcoded music stream. Locust was used to simulate an increasing number of users requesting streams, to find the maximum number of concurrent users a single back-end Pod can support without causing excessive delays for other users requesting information from the back-end. By observing the time it takes to reply to metadata requests, a maximum of 14 to 15 concurrent streams per Pod was determined. Any number over 15 concurrent streams slows down the entire application and steadily increases latencies for all users. During the test an increase in CPU usage of 6% to 7% was measured for each new transcoding process, which matches the established concurrency maximum. The exact CPU usage of a process, and thus the exact maximum number of streams per Pod, depends on the file being transcoded: transcoding one file may be more intensive than transcoding another.

VI. SCALING OPTIONS
Once the number of streams concurrently supported by one back-end Pod is known, it is necessary to research how well the platform can scale the available compute resources in time to accommodate increasing user load. Kubernetes is built on the principle of horizontal scaling, a strategy that allocates more compute resources to an application by increasing the number of execution units, in this case Pods. By using Replication Controllers to manage Pods, administrators can manually change the number of replicas for each Pod type. Increasing the number of Pods allows more concurrent streaming requests to be supported, while decreasing the replica count frees up resources so they can be allocated to other applications or saved to reduce costs.
A test was performed to measure the exact time it takes to create new Pods. When scaling up, Kubernetes immediately creates the necessary number of Pods, but the exact time until each of them is in a Ready state is influenced by the number of Pods that were started concurrently. Graph 1 shows an overview of the results that were noted when adding 1, 5 and 20 new Pods. Overall, creating Pods is a fast process, but it is important to consider the differences between scaling steps. Using a larger increment causes a larger delay before the first Pod is ready, but the average time to create a Pod is significantly lower than when each Pod is started separately.

Graph 1: Difference in Pod creation time between varying concurrent scaling increments (chart "Time to first ready Pod when starting concurrent Pods"; vertical axis: time (s); horizontal axis: amount of concurrently started Pods, 1, 5 and 20; series: First Pod Ready, Last Pod Ready)

The fact that manually scaling up or down requires interaction by administrators makes this method less than ideal for hosting applications with unpredictable usage patterns. Kubernetes includes the Horizontal Pod Autoscaler (HPA), which allows automatic scaling of Replication Controllers based on the current CPU load and a specified Target-load. By monitoring the current state of each Pod in a Replication Controller, the HPA decides at regular intervals whether to scale up or down. Its goal is to change the Pod count so that the current load moves closer to the Target-load. This approach is limited by a maximum scaling rate: the HPA limits how fast scaling is allowed and how many new Pods are added with each scaling action. Because these limits cannot be changed by the administrator, the HPA is very hard to use effectively in a situation where frequent scaling is desired. The research also determines that CPU load is not an ideal metric to scale on, especially when there are (many) more user requests than the back-end can respond to. In this case scaling up once will not provide enough capacity for all users, and since the scaling frequency is limited, a lot of user requests will go unanswered over a long time span. Using custom metrics in the HPA, such as the actual number of incoming connections, is not supported.
To address the shortcomings of the HPA, a custom scaling mechanism was implemented as a script that runs in a Pod. Because of the excellent support for custom-built applications that interact with the Kubernetes cluster, it is feasible for administrators to implement custom scaling algorithms that are not as limited as the HPA. An example script was created that scales Replication Controllers based on the current number of incoming connections, which ensures there is enough capacity to service the current number of user requests. The script is then started periodically in a Pod by using the Kubernetes ScheduledJob component, which closely resembles Linux cron jobs.

VII. COMPARING KUBERNETES TO OPENSTACK
Alongside this research, another student worked on a thesis with the same global subject. "Scaling a music streaming service using virtual machines on OpenStack" by Magaly Boddin focuses on scaling the same application in OpenStack using virtual machines. In this other thesis the application setup is mostly the same, with the exception that caching was used to prevent PHP from processing requests for static resources. The back-end results measured in OpenStack virtual machines were also very similar to those measured in Kubernetes Pods. The largest difference between the two platforms when scaling applications seems to be that the scaling mechanism supplied with OpenStack offers much more flexibility, which makes an external scaling mechanism unnecessary. A large difference between the platforms in general is the way new features are introduced: while OpenStack follows steady release cycles, with stable versions every 6 months, the Kubernetes release schedule moves at a higher pace, with new features and bug fixes in new versions almost every week. This high pace has the negative side effect that documentation is sometimes out of date or wrong.
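The external scaling logic that the Kubernetes setup required, and that section VI contrasts with CPU-based scaling, can be illustrated with a small sketch. This is not the script from the thesis: the function names, the replica bounds and the `max_step_factor` cap are assumptions made for illustration. Only the limit of 15 concurrent streams per back-end Pod is taken from section V, and the CPU rule follows the proportional formula given in the Kubernetes HPA documentation.

```python
import math

STREAMS_PER_POD = 15  # per-Pod stream limit reported in section V


def replicas_for_connections(open_connections, streams_per_pod=STREAMS_PER_POD,
                             min_replicas=1, max_replicas=20):
    """Connection-based rule: enough Pods for every open stream.

    min_replicas and max_replicas are hypothetical bounds, not thesis values.
    """
    needed = math.ceil(open_connections / streams_per_pod)
    return max(min_replicas, min(max_replicas, needed))


def replicas_for_cpu(current_replicas, current_cpu_pct, target_cpu_pct,
                     max_step_factor=2):
    """CPU-based rule: proportional scaling, capped per scaling action.

    max_step_factor is an assumed stand-in for the HPA's fixed rate limit.
    """
    desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(1, min(desired, current_replicas * max_step_factor))


if __name__ == "__main__":
    # Two saturated Pods facing 100 stream requests: measured CPU cannot
    # exceed 100%, so the CPU rule underestimates the real demand.
    print(replicas_for_cpu(2, current_cpu_pct=100, target_cpu_pct=50))  # 4
    print(replicas_for_connections(100))  # 7
```

With these inputs the CPU rule asks for 4 Pods while the connection count calls for 7, which is the kind of shortfall the abstract attributes to CPU load as a scaling metric. A real scaler would still have to read the connection count from the load balancer and write the replica count through the Kubernetes API; both steps are left out here.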
Another difference that was noted is the smaller storage footprint of containers. Because every VM image includes an entire OS, which is not the case for container images, there is a large overhead when storing the images used for the different components of the application.

VIII. CONCLUSION
Kubernetes as a platform is a mature solution for handling container workloads in a cluster. It is, however, under heavy development, and because a lot can change between versions, including the introduction or deprecation of features, it is important to evaluate the requirements of the application against the solutions Kubernetes can offer to accommodate them. Deploying containers on Kubernetes is very fast, which allows for rapid scaling and updating of applications. The included mechanism for automatic scaling is insufficient for applications that expect highly fluctuating user demand: the algorithm imposes time limits that cannot be configured, and since custom metrics are currently unsupported, administrators have very few options for influencing the scaling behavior. Kubernetes offers great support for developers to interact with the API server in an easy way, which allows them to create custom scaling mechanisms optimized for their application. Applications can be adjusted to interact with the cluster and scale allocated resources when the application deems it necessary, or an external algorithm can monitor the cluster at regular intervals and perform scaling actions when it detects a shortage or surplus of compute resources.

Contents
Introduction
1 Application
1.1 Koel
1.2 Architecture
1.2.1 Separating front-end and back-end
1.2.2 Making the front-end static
1.2.3 Back-end configuration for music streaming
1.2.4 Changes to the application
1.3 Containerization
2 Kubernetes
2.1 Terms
2.1.1 kubectl
2.1.2 Pod
2.1.3 Node
2.1.4 Replication Controller
2.1.5 Deployment
2.1.6 Service
2.2 Planning
2.2.1 Infrastructure
2.2.2 Network
2.2.3 Storage
2.2.4 Operating system
2.3 Installation
2.3.1 Installing GlusterFS
2.3.2 Setting up Kubernetes on Ubuntu 14.04 LTS
2.3.3 Setting up Kubernetes on Ubuntu 16.04 LTS
2.3.4 Cluster add-ons
3 Application in the cluster
3.1 Database
3.2 Front-end
3.3 Back-end
3.4 Load balancer
