Archer Analysis Pipeline Installation and Usage Guide PDF

40 Pages·2014·25.5 MB·English

Archer Analysis Pipeline Installation and Usage Guide
IFU-008.1 Revision A1

Table of Contents

1 Introduction and Summary
  1.1 Introduction
2 Installation
  2.1 Installation of the Web interface version
    2.1.1 Installation of Oracle VirtualBox
    2.1.2 Configuration of Virtual Box
    2.1.3 Installing the Virtual Machine(s)
    2.1.4 Management of the Virtual Machine instances
  2.2 Installation of the command line version
    2.2.1 Dependencies for the command line installation
    2.2.2 Compilation of BWA
3 Execution
  3.1 Run an analysis using the web interface
    3.1.1 Create a login account
    3.1.2 Run an analysis
    3.1.3 Retrieving results from an analysis
    3.1.4 Full Summary pages
    3.1.5 Adding custom content
  3.2 Command Line execution
    3.2.1 Configuration file format
    3.2.2 For Life Technologies/Ion Torrent PGM/Proton users
    3.2.3 Barcodes file
  3.3 Archer command line execution
4 Description of Output Files and Formats
  4.1 Human Readable Format
    4.1.1 File Name
    4.1.2 Sample name
    4.1.3 QC Filter
    4.1.4 Mapping statistics
    4.1.5 Individual Gene Coverage
    4.1.6 Fusion results
    4.1.7 Full Example of the Human Readable Format Results File
  4.2 Computer Readable Format
    4.2.1 Full Example of a Machine Readable Results Format File

1 Introduction and Summary

1.1 Introduction

This document describes the installation and running procedure for the Archer Analysis Pipeline (AAP). The analysis pipeline is available as a fully contained set of VMs (Virtual Machines) that provides a web-based interface to the analysis pipeline, as well as a set of command line utilities for integration into existing pipelines. Only one of the two approaches should be implemented; there is no need to do both. The VM approach is highly recommended, since it avoids the need to satisfy all the dependencies.

2 Installation

2.1 Installation of the Web interface version

Running the Archer Analysis Pipeline through the Virtual Machines requires a host computer with at least 7 GB of free memory. It is also highly recommended that the hosting computer is not restarted while an analysis is running, as this will invalidate any running analyses. It is equally important to prevent the analysis server from entering a sleep, standby, or hibernation state, as this will interrupt and possibly invalidate any running analyses.

The Virtual Machines can run on a number of different Virtual Machine implementations, such as VMware, but we recommend, and have tested, the free VM implementation from Oracle: VirtualBox on Mac OS X (10.9).
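
The 7 GB free-memory requirement stated above can be checked on the host before the Virtual Machines are started. The following is a minimal sketch and is not part of the Archer distribution: the function names has_enough_memory and available_memory_kb are hypothetical, and reading MemAvailable from /proc/meminfo assumes a reasonably recent Linux host (on Mac OS X the figure would have to come from another source).

```shell
#!/bin/sh
# Minimum free memory per Virtual Machine in GB, as stated in this guide.
REQUIRED_GB=7

# has_enough_memory AVAILABLE_KB [REQUIRED_GB]
# Hypothetical helper: succeeds (exit status 0) when AVAILABLE_KB,
# given in kilobytes, covers the requirement given in gigabytes.
has_enough_memory() {
    available_kb=$1
    required_gb=${2:-$REQUIRED_GB}
    required_kb=$((required_gb * 1024 * 1024))
    [ "$available_kb" -ge "$required_kb" ]
}

# available_memory_kb
# Assumption: Linux host. Prints the MemAvailable value (in kB)
# reported by the kernel in /proc/meminfo.
available_memory_kb() {
    awk '/^MemAvailable:/ { print $2 }' /proc/meminfo
}

# Example use on a Linux host:
#   if has_enough_memory "$(available_memory_kb)"; then
#       echo "Enough free memory for one Virtual Machine"
#   else
#       echo "Less than ${REQUIRED_GB} GB free; close other applications first"
#   fi
```

Keeping the comparison separate from the platform-specific lookup means the same check can be reused with a different requirement passed as the second argument.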

2.1.1 Installation of Oracle VirtualBox

VirtualBox can be obtained from the VirtualBox website:

https://www.virtualbox.org/wiki/Downloads

Download the version appropriate for your platform and run it.

2.1.2 Configuration of Virtual Box

After launching VirtualBox, open the Preferences to configure the Network adapters.

Menu VirtualBox -> Preferences

Select the Network settings and select the “Host-only Network”.

Add a host-only network by clicking the Add Network button. A new Host-only network called “vboxnet0” will be added. Leave the default name, since the VM will assume this as the name. Press OK to finish the configuration of VirtualBox.

2.1.3 Installing the Virtual Machine(s)

The Archer Analysis Pipeline is provided as a set of two .OVA files, one of which (the compute node) is optional. The Head node is required and contains both the Web interface and the analysis code. The compute node Virtual Machine can be used to provide an additional compute node, allowing multiple runs to be analyzed in parallel. The VMs can be downloaded from the Enzymatics website in the “Archer” section:

http://enzymatics.com/archer

1. Head Node Virtual Machine (REQUIRED)
2. Compute Node Virtual Machine (OPTIONAL)

For the documentation below, we assume both the Head node and a compute node are installed.

Save the OVA file(s) to your machine and select the VirtualBox menu item “File” -> “Import Appliance” (Virtual Machines are called “Appliances” in VirtualBox). Select the Head node VM OVA file that was downloaded from the link and import the file. This will take between 1 and 5 minutes, depending on the speed of the host computer. The Head node appliance (VM) is called “analysis-head”.

Optional: If more compute power is required, download the Compute Node VM and import the appliance in the same manner.
The compute node will be called “analysis-compute-1”. This will require a machine with sufficient memory (at least 14 GB in total: 7 GB for the head node and 7 GB for the secondary compute node).

Click the Start button or double-click the appliance to start the Archer Analysis Pipeline Head node (analysis-head). A window will appear that shows it is starting up a new machine. Once the login screen appears as shown below, start the optional compute node appliance (analysis-compute-1). After the login screen appears for this virtual machine, the Archer Analysis Pipeline is ready for use.

It is likely that some notifications are overlaid on the screen as shown below. These can be ignored and removed by clicking the small x in the notification windows.

The Archer Analysis Pipeline is now ready for use. See section 3.1 to start an analysis using the web interface.

2.1.4 Management of the Virtual Machine instances

Occasionally it may be necessary to manage the Virtual Machine instance through the Linux command line. Log in to the Virtual Machine using the following credentials:

Username: root
Password: password123

This user is the so-called superuser account and is able to perform the various tasks required.

NOTE: This account has absolute control over the virtual machine, so it should be used with care. Consult your local IT Department for proper use of this account. For security reasons, it is recommended that you change this password.

Location of the web interface code

The web interface application code is located in the following directory:

/var/www/html/

Occasionally Enzymatics may release a new version of the web interface and/or the underlying analysis pipeline, and we will provide instructions for the update at that time.

The analysis results are located in the following directory:

/var/www/analysis/

This contains a number of directories, each numbered with the Analysis ID as seen in the Web interface. The numbered directories contain all the results and intermediate files.

2.2 Installation of the command line version

The Archer Analysis Pipeline command line version is provided as a downloadable TAR package and can be found on the Enzymatics website:

http://enzymatics.com/archer

• Archer Analysis Pipeline Version 1.0.0

(A more recent version than 1.0.0 may be available on the website. Download the most recent version.)

Click on the link to download the package, then extract it with the following command:

$ tar zxf archer_1.0.0.tar.gz

This will create a directory “archer_1.0.0” containing all the programs and data.

2.2.1 Dependencies for the command line installation

A number of packages and libraries are required to run the Archer Analysis Pipeline successfully on the command line.

NOTE: These requirements are NOT a dependency for running the Graphical Web Interface.

Many modern unix-style operating systems (Linux, Mac OS X, etc.) already have many of the required packages and libraries, but links to the websites are provided in case installation is required.

• freetype-devel (C Library)
  § http://www.freetype.org/
• libpng-devel (C Library)
  § http://www.libpng.org/pub/png/libpng.html
• perl 5.14+
• python 2.7
  § numpy 1.8 (Python Library)
    § http://sourceforge.net/projects/numpy/
  § matplotlib 1.3.1 (Python Library)
    § http://matplotlib.org/
  § pygtk2 (Python Library)
    § http://www.pygtk.org/
  § pygobject2 (Python Library)
• htseq, 0.5.4p5 (Python Module)
  § http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
• pycogent, 1.5.3 (Python Module)
  § http://pycogent.org/
• bedtools, 2.17.0
  § https://code.google.com/p/bedtools/
• samtools, 0.1.19
  § http://samtools.sourceforge.net/
  § Requires ncurses-dev
• tabix 0.2.6
  § http://sourceforge.net/projects/samtools/files/
  § Usually part of samtools.
• freebayes, 0.9.9
  § https://wiki.gacrc.uga.edu/wiki/Freebayes
  § Requires cmake

On Ubuntu, the following are also required:

• liblist-moreutils-perl
  § sudo apt-get install liblist-moreutils-perl
• dos2unix
  § sudo apt-get install dos2unix

2.2.2 Compilation of BWA

The Archer Analysis Pipeline uses a slightly modified version of BWA that was optimized for gene fusion detection. The source code for this version of BWA is provided in the archer package. Although it has already been compiled for the CentOS operating system, it will probably need to be recompiled for the system that will run the analysis pipeline.

Verify that the compiled version provided in the directory archer_1.0.0/bwa_enz is operational by executing the program “bwa_enz”:

$ archer_1.0.0/bwa_enz/bwa_enz

If this produces an error, a recompilation is required:

$ cd archer_1.0.0/bwa_enz
$ make clean
$ make

This will create a new version of the bwa_enz program. This program should be in the path of the user running the Archer pipeline, so copy it to the appropriate location (/usr/bin or equivalent).

3 Execution

This section describes the execution of an analysis on the Archer Analysis Pipeline.

3.1 Run an analysis using the web interface

The Archer Analysis Pipeline can be accessed through any Web Browser. Start your favorite web browser application and enter the following address in the address box, or click the link below:

http://192.168.56.101

The login screen for the Archer Analysis Pipeline will appear as shown below.

Figure 1 Archer login page

If the web page does not appear, check that the virtual machine is running (see section 2.1 for installation of the virtual machines).

3.1.1 Create a login account

The Archer Analysis Pipeline is a fully contained and secure environment that allows users to run their analyses under their own account. Create a login account by selecting the “Create Account” link on the bottom left of the screen. Use your email address as your login and create a new password. In the current beta version, no email can be sent from the VM, so enter a Password Retrieval Question and the appropriate answer. Later, if you lose your password, you will be asked the Password Retrieval Question to reset your password.

Figure 2 Archer account creation page

Click the “Create Account” button to create the account; you will be logged in automatically as well. After a successful login, the home screen will show any currently running jobs.
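
If the login page described in section 3.1 does not load, a quick check from the host can show whether the head node is running and whether its web server answers. The following is a minimal sketch under stated assumptions, not a tool shipped with the Archer Analysis Pipeline: the function names vm_is_running, http_status and is_up are hypothetical, the VM name “analysis-head” and the address http://192.168.56.101 are taken from this guide, and the VBoxManage and curl programs are assumed to be installed on the host.

```shell
#!/bin/sh
# Address of the web interface as given in section 3.1 of this guide.
ARCHER_URL="http://192.168.56.101"

# vm_is_running NAME
# Hypothetical helper: succeeds when VirtualBox lists a running VM with
# exactly this name. "VBoxManage list runningvms" prints one line per
# running VM in the form: "name" {uuid}
vm_is_running() {
    VBoxManage list runningvms | grep -q "^\"$1\" "
}

# http_status URL
# Prints the HTTP status code returned for URL. curl prints 000 when no
# connection could be made within the 5 second limit.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# is_up STATUS
# Treats any 2xx or 3xx status code as a sign that the web server answers.
is_up() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# Example use on the host:
#   vm_is_running analysis-head || echo "Start the analysis-head appliance first"
#   is_up "$(http_status "$ARCHER_URL")" || echo "Web interface not reachable"
```

Checking the VM state first separates a VM that was never started from a network configuration problem, such as a host-only network that is not named “vboxnet0”.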
