1

Introduction to Pichia pastoris

David R. Higgins and James M. Cregg

1. Introduction

Pichia pastoris has become a highly successful system for the expression of heterologous genes. Several factors have contributed to its rapid acceptance, the most important of which include:

1. A promoter derived from the alcohol oxidase I (AOX1) gene of P. pastoris that is uniquely suited for the controlled expression of foreign genes;
2. The similarity of techniques needed for the molecular genetic manipulation of P. pastoris to those of Saccharomyces cerevisiae, one of the most well-characterized experimental systems in modern biology;
3. The strong preference of P. pastoris for respiratory growth, a key physiological trait that greatly facilitates its culturing at high cell densities relative to fermentative yeasts; and
4. A 1993 decision by Phillips Petroleum Company (continued by Research Corporation Technologies [RCT]) to release the P. pastoris expression system to academic research laboratories, the consequence of which has been an explosion in the knowledge base of the system (Fig. 1).

As listed in the Appendix to this volume, more than 100 different proteins have been successfully produced in P. pastoris.

As a yeast, P. pastoris is a single-celled microorganism that is easy to manipulate and culture. However, it is also a eukaryote and capable of many of the posttranslational modifications performed by higher eukaryotic cells, such as proteolytic processing, folding, disulfide bond formation, and glycosylation. Thus, many proteins that end up as inactive inclusion bodies in bacterial systems are produced as biologically active molecules in P. pastoris. The P.
pastoris system is also generally regarded as being faster, easier, and less expensive to use than expression systems derived from higher eukaryotes, such as insect and mammalian tissue culture cell systems, and usually gives higher expression levels.

From: Methods in Molecular Biology, Vol. 103: Pichia Protocols. Edited by D. R. Higgins and J. M. Cregg © Humana Press Inc., Totowa, NJ

Fig. 1. Graph showing the number of publications describing the expression of a foreign protein in P. pastoris each year from 1985 to 1997.

A second role played by P. pastoris in research is not directly related to its use as a protein expression system. P. pastoris serves as a useful model system to investigate certain areas of modern cell biology, including the molecular mechanisms involved in the import and assembly of peroxisomes (Chapter 10), the selective autophagic degradation of peroxisomes, and the organization and function of the secretory pathway in eukaryotes.

In this chapter, we review basic aspects of the P. pastoris expression system and highlight where useful information on the system can be found in this book. Further information on the P. pastoris system can be found in the numerous reviews describing the system (1-9) and the Pichia Expression Kit Instruction Manual (Invitrogen Corporation, Carlsbad, CA). The DNA sequence of many P. pastoris expression vectors and other useful information can be found on the Invitrogen web site (http://www.invitrogen.com).

2. A Brief History of the P. pastoris Expression System

The ability of certain yeast species to utilize methanol as a sole source of carbon and energy was discovered less than 30 years ago by Koichi Ogata (10).
Because methanol could be inexpensively synthesized from natural gas (methane), there was immediate interest in exploiting these organisms for the generation of yeast biomass or single-cell protein (SCP) to be marketed primarily as high-protein animal feed. During the 1970s, Phillips Petroleum Company of Bartlesville, OK, developed media and methods for growing P. pastoris on methanol in continuous culture at high cell densities (>130 g/L dry cell weight) (11). However, during this same period, the cost of methane increased dramatically because of the oil crisis, and the cost of soybeans (the major alternative source of animal feed protein) decreased. As a result, the SCP process was never economically competitive.

In the early 1980s, Phillips Petroleum Company contracted with the Salk Institute Biotechnology/Industrial Associates, Inc. (SIBIA), a biotechnology company located in La Jolla, CA, to develop P. pastoris as a heterologous gene expression system. Researchers at SIBIA isolated the AOX1 gene (and its promoter) and developed vectors, strains, and methods for molecular genetic manipulation of P. pastoris (12-17, Chapters 2, 3, and 6). The combination of strong regulated expression under control of the AOX1 promoter, along with the fermentation media and methods developed for the SCP process, resulted in strikingly high levels of foreign proteins in P. pastoris. In 1993, Phillips Petroleum sold its patent position with the P. pastoris expression system to RCT, the current patent holder. In addition, Phillips Petroleum licensed Invitrogen to sell components of the system to researchers worldwide, an arrangement that continues under RCT.

3. P. pastoris as a Methylotrophic Yeast

P. pastoris is one of approximately a dozen yeast species representing four different genera capable of metabolizing methanol (18). The other genera include Candida, Hansenula, and Torulopsis.
The methanol metabolic pathway appears to be the same in all yeasts and involves a unique set of pathway enzymes (19). The first step in the metabolism of methanol is the oxidation of methanol to formaldehyde, generating hydrogen peroxide in the process, by the enzyme alcohol oxidase (AOX). To avoid hydrogen peroxide toxicity, this first step in methanol metabolism takes place within a specialized organelle, called the peroxisome, that sequesters toxic hydrogen peroxide away from the rest of the cell. AOX is a homo-octamer with each subunit containing one noncovalently bound FAD (flavin adenine dinucleotide) cofactor. Alcohol oxidase has a poor affinity for O2, and methylotrophic yeasts appear to compensate for this deficiency by synthesizing large amounts of the enzyme.

There are two genes in P. pastoris that code for AOX (AOX1 and AOX2), but the AOX1 gene is responsible for the vast majority of alcohol oxidase activity in the cell (16). Expression of the AOX1 gene is tightly regulated and induced by methanol to high levels. In methanol-grown shake-flask cultures, this level is typically ~5% of total soluble protein but can be ≥30% in cells fed methanol at growth-limiting rates in fermentor cultures (20). Expression of the AOX1 gene is controlled at the level of transcription (12,14,16). In methanol-grown cells, ~5% of polyA+ RNA is from the AOX1 gene, whereas in cells grown on other carbon sources, the AOX1 message is undetectable. The regulation of the AOX1 gene is similar to the regulation of the GAL1 gene of S. cerevisiae in that control appears to involve two mechanisms: a repression/derepression mechanism plus an induction mechanism. However, unlike GAL1 regulation, derepressing conditions (e.g., the absence of a repressing carbon source, such as glucose, in the medium) do not result in substantial transcription of the AOX1 gene. The presence of methanol appears to be essential to induce high levels of transcription (14).

4. Secretion of Heterologous Proteins

With P. pastoris, heterologous proteins can either be expressed intracellularly or secreted into the medium. Because P. pastoris secretes only low levels of endogenous proteins and because its culture medium contains no added proteins, a secreted heterologous protein comprises the vast majority of the total protein in the medium (21,22). Thus, secretion serves as a major first step in purification, separating the foreign protein from the bulk of cellular proteins. However, the option of secretion is usually limited to foreign proteins that are normally secreted by their native hosts. Secretion requires the presence of a signal sequence on the foreign protein to target it to the secretory pathway. Although several different secretion signal sequences have been used successfully, including the native secretion signal present on some heterologous proteins, success has been variable. The secretion signal sequence from the S. cerevisiae α-factor prepro peptide has been used with the most success.

5. Common Expression Strains

All P. pastoris expression strains are derivatives of NRRL-Y 11430 (Northern Regional Research Laboratories, Peoria, IL) (Table 1). Most have a mutation in the histidinol dehydrogenase gene (HIS4) to allow for selection of expression vectors containing HIS4 upon transformation (13). Other biosynthetic gene/auxotrophic mutant host marker combinations are also available, but are used less frequently (Chapter 2). All of these strains grow on complex media but require supplementation with histidine (or other appropriate nutrient) for growth on minimal media.

Three types of host strains are available that vary with regard to their ability to utilize methanol, resulting from deletions in one or both AOX genes. Strains with deleted AOX genes sometimes are better producers of a foreign protein than wild-type strains (21,23,24).
These strains also require much less methanol to induce expression, which can be useful in large fermentor cultures where a large amount of methanol is sometimes considered a significant fire hazard. However, the most commonly used expression host is GS115 (his4), which is wild-type with regard to the AOX1 and AOX2 genes and grows on methanol at the wild-type rate (methanol utilization plus [Mut+] phenotype). KM71 (his4 arg4 aox1Δ::ARG4) is a strain in which the chromosomal AOX1 gene is largely deleted and replaced with the S. cerevisiae ARG4 gene (15). As a result, this strain must rely on the much weaker AOX2 gene for AOX and grows on methanol at a slow rate (methanol utilization slow [MutS] phenotype). With many P. pastoris expression vectors, it is possible to insert an expression cassette and simultaneously delete the AOX1 gene of a Mut+ strain (23, Chapters 5 and 13). The third host, MC100-3 (his4 arg4 aox1Δ::SARG4 aox2Δ::Phis4), is deleted for both AOX genes and is totally unable to grow on methanol (methanol utilization minus [Mut-] phenotype) (16,24, Chapter 9).

Table 1
P. pastoris Expression Host Strains

Strain name | Genotype | Phenotype | Reference
Y-11430 | Wild-type | | NRRL(a)
GS115 | his4 | Mut+ His- | 13
KM71 | aox1Δ::SARG4 his4 arg4 | MutS His- | 14
MC100-3 | aox1Δ::SARG4 aox2Δ::Phis4 his4 arg4 | Mut- His- | 16
SMD1168 | pep4Δ his4 | Mut+ His-, protease-deficient | Chapter 7
SMD1165 | prb1 his4 | Mut+ His-, protease-deficient | Chapter 7
SMD1163 | pep4 prb1 his4 | Mut+ His-, protease-deficient | Chapter 7

(a) Northern Regional Research Laboratories, Peoria, IL.

Some secreted foreign proteins are unstable in the P. pastoris culture medium, in which they are rapidly degraded by proteases. Major vacuolar proteases appear to be a significant factor in degradation, particularly in fermentor cultures, because of the high cell density environment in combination with the lysis of a small percentage of cells.
The use of host strains that are defective in these proteases has proven to help reduce degradation in several instances (Chapters 7, 11, and 14). SMD1163 (his4 pep4 prb1), SMD1165 (his4 prb1), and SMD1168 (his4 pep4) are protease-deficient strains that may provide a more suitable environment for expression of certain heterologous proteins. The PEP4 gene encodes proteinase A, a vacuolar aspartyl protease required for the activation of other vacuolar proteases, such as carboxypeptidase Y and proteinase B. Proteinase B, prior to processing and activation by proteinase A, has about half the activity of the processed enzyme. The PRB1 gene codes for proteinase B. Therefore, pep4 mutants display a substantial decrease or elimination in proteinase A and carboxypeptidase Y activities, and partial reduction in proteinase B activity. In the prb1 mutant, only proteinase B activity is eliminated, whereas pep4 prb1 double mutants show a substantial reduction or elimination in all three of these protease activities.

6. Expression Vectors

Plasmid vectors designed for heterologous protein expression in P. pastoris have several common features (Table 2). The foreign gene expression cassette is one of those and is composed of DNA sequences containing the P. pastoris AOX1 promoter, followed by one or more unique restriction sites for insertion of the foreign gene, followed by the transcriptional termination sequence from the P. pastoris AOX1 gene that directs efficient 3' processing and polyadenylation of the mRNAs. Many of these vectors also include the P. pastoris HIS4 gene as a selectable marker for transformation into his4 mutant hosts of P. pastoris, as well as sequences required for plasmid replication and maintenance in bacteria (i.e., ColE1 replication origin and ampicillin-resistance gene). Some vectors also contain AOX1 3' flanking sequences that are derived from a region of the P.
pastoris genome that lies immediately 3' of the AOX1 gene and can be used to direct fragments containing a foreign gene expression cassette to integration at the AOX1 locus by gene replacement (or gene insertion 3' to the AOX1 gene). This is discussed in more detail in Subheading 7. and Chapter 13.

Additional features that are present in certain P. pastoris expression vectors serve as tools for specialized functions. For secretion of foreign proteins, vectors have been constructed that contain a DNA sequence immediately following the AOX1 promoter that encodes a secretion signal. The most frequently used of these is the S. cerevisiae α-factor prepro signal sequence (25,26, Chapter 5). However, vectors containing the signal sequence derived from the P. pastoris acid phosphatase gene (PHO1) are also available.

Vectors with dominant drug-resistance markers that allow for enrichment of strains that receive multiple copies of foreign gene expression cassettes during transformations have been developed. One set of vectors (pPIC3K and pPIC9K) contains the bacterial kanamycin-resistance gene and confers resistance to high levels of G418 on strains that contain multiple copies of these vectors (26, Chapter 5). Another set of vectors (the pPICZ series) contains the Sh ble gene from Streptoalloteichus hindustanus (Chapter 4). This gene is small (375 bp) and confers resistance to the drug Zeocin in Escherichia coli, yeasts (including P. pastoris), and other eukaryotes.

Table 2
Common P. pastoris Expression Vectors

Vector name | Selectable markers | Features | References

Intracellular
pHIL-D2 | HIS4 | NotI sites for AOX1 gene replacement | Sreekrishna, personal communication
pAO815 | HIS4 | Expression cassette bounded by BamHI and BglII sites for generation of multicopy expression vector | (2)
pPIC3K | HIS4 and kanR | Multiple cloning sites for insertion of foreign genes; G418 selection for multicopy strains | (33)
pPICZ | bleR | Multiple cloning sites for insertion of foreign genes; Zeocin selection for multicopy strains; potential for fusion of foreign protein to His6 and myc epitope tags | Chapter 5
pHWO10 | HIS4 | Expression controlled by constitutive GAPp | (27)
pGAPZ | bleR | Expression controlled by constitutive GAPp; multiple cloning site for insertion of foreign genes; Zeocin selection for multicopy strains; potential for fusion of foreign protein to His6 and myc epitope tags | Invitrogen

Secretion
pHIL-S1 | HIS4 | AOX1p fused to PHO1 secretion signal; XhoI, EcoRI, and BamHI sites available for insertion of foreign genes | Sreekrishna, personal communication; Invitrogen
pPIC9K | HIS4 and kanR | AOX1p fused to α-MF prepro signal sequence; XhoI (not unique), EcoRI, NotI, SnaBI, and AvrII sites available for insertion of foreign genes; G418 selection for multicopy strains | (33)
pPICZα | bleR | AOX1p fused to α-MF prepro signal sequence; multiple cloning site for insertion of foreign genes; Zeocin selection for multicopy strains; potential for fusion of foreign protein to His6 and myc epitope tags | Chapter 5
pGAPZα | bleR | Expression controlled by constitutive GAPp; GAPp fused to α-MF prepro signal sequence; multiple cloning site for insertion of foreign genes; Zeocin selection for multicopy strains; potential for fusion of foreign protein to His6 and myc epitope tags | Invitrogen

Because the ble gene serves as the selectable marker for both E. coli and P. pastoris, the ZeoR vectors are much smaller (~3 kb) and easier to manipulate than other P.
pastoris expression vectors. These vectors also contain a multiple cloning site (MCS) with several unique restriction sites for convenience of foreign gene insertion and sequences encoding the His6 and myc epitopes so that foreign proteins can be easily epitope-tagged at their carboxyl termini, if desired.

Another feature present on certain vectors (e.g., pAO815 and the pPICZ vector series) is designed to facilitate the construction of expression vectors with multiple expression cassette copies (Chapter 11). Multiple copies of an expression cassette are introduced in these vectors by inserting an expression cassette bounded by a BamHI and a BglII site into the BamHI site of a vector already containing a single expression cassette copy. The resulting BamHI/BglII junction between the two cassettes can no longer be cleaved by either enzyme, allowing for the insertion of another BamHI-BglII-bounded cassette into the same vector to generate a vector with three cassette copies. The process of addition is repeated until 6-8 copies of a cassette are present in a single final vector that is then transformed into the P. pastoris host strain.

Finally, vectors containing a constitutive P. pastoris promoter derived from the P. pastoris glyceraldehyde-3-phosphate dehydrogenase gene (GAP) have recently become available (27). The GAP promoter is a convenient alternative to the AOX1 promoter for expression of genes whose products are not toxic to P. pastoris. In addition, its use does not involve the use of methanol, which may be problematic in some instances.

7. Integration of Vectors into the P. pastoris Genome

As in S. cerevisiae, linear vector DNAs can generate stable transformants of P. pastoris via homologous recombination between sequences shared by the vector and host genome (13,23, Chapters 5 and 13). Such integrants show strong stability in the absence of selective pressure even when present as multiple copies. All P. pastoris expression vectors carry at least one P.
pastoris DNA segment (the AOX1 or GAP promoter fragment) with unique restriction sites that can be cleaved and used to direct the vector to integrate into the host genome by a single crossover type insertion event (Fig. 2A). Vectors containing the P. pastoris HIS4 gene can also be directed to integrate into the P. pastoris genomic his4 locus.

Expression vectors that contain 3'AOX1 sequences can be integrated into the P. pastoris genome by a single crossover event at either the AOX1 or his4 locus or by a gene replacement (Ω insertion) event at AOX1 (Fig. 2B). The latter event arises from crossovers at both the AOX1 promoter and 3'AOX1 regions of the vector and genome, and results in the deletion of the AOX1 coding region (i.e., gene replacement). Transformants resulting from such an AOX1 replacement event are phenotypically His+ and MutS. As described in Subheading 5., such MutS strains sometimes express higher levels of foreign protein. In addition, a MutS phenotype serves as a convenient indicator to confirm the presence of an integrated expression cassette in the P. pastoris genome.

Fig. 2. Integration of expression vectors into the P. pastoris genome. (A) Single crossover integration into the his4 locus. (B) Integration of vector fragment by replacement of AOX1 gene.

With either single crossover or gene replacement integration strategies and selection for His+ transformants, a significant percentage of transformants will not contain the expression vector. This appears to be the result of gene conversion events between the HIS4 gene on the vector and the P. pastoris his4 locus such that the wild-type HIS4 gene recombines into the genome without any additional vector sequences.
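
The integration outcomes described in this subheading can be summarized as a small lookup. The following Python sketch is illustrative only: it assumes a his4, Mut+ host such as GS115 transformed with a HIS4 vector carrying 3'AOX1 sequences, and the names Outcome, OUTCOMES, candidate_events, and cassette_confirmed are hypothetical. The Mut+ phenotype assigned to single crossover and gene conversion transformants is inferred from the fact that neither event deletes AOX1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    event: str          # recombination event that produced the transformant
    his: str            # histidine phenotype after transformation
    mut: str            # methanol utilization phenotype
    has_cassette: bool  # True if the expression cassette is integrated


# Assumed host: his4, Mut+ (e.g., GS115); assumed vector: HIS4 plus 3'AOX1.
OUTCOMES = [
    Outcome("single crossover at AOX1 or his4", "His+", "Mut+", True),
    Outcome("gene replacement at AOX1", "His+", "MutS", True),
    Outcome("gene conversion at his4", "His+", "Mut+", False),
]


def candidate_events(his: str, mut: str) -> list:
    """Return every event consistent with an observed phenotype."""
    return [o for o in OUTCOMES if o.his == his and o.mut == mut]


def cassette_confirmed(his: str, mut: str) -> bool:
    """True only if all consistent events carry the expression cassette."""
    matches = candidate_events(his, mut)
    return bool(matches) and all(o.has_cassette for o in matches)


if __name__ == "__main__":
    for his, mut in [("His+", "MutS"), ("His+", "Mut+")]:
        events = [o.event for o in candidate_events(his, mut)]
        print(his, mut, cassette_confirmed(his, mut), events)
```

Under these assumptions, a His+ MutS colony is consistent only with gene replacement, which is why the MutS phenotype can serve as an indicator of an integrated cassette, whereas a His+ Mut+ colony may or may not carry the vector and needs further analysis.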
Such gene conversion events account for 10-50% of His+ transformant colonies and appear to occur at highest frequency when using electroporation to introduce vector DNAs.

Multiple gene insertion events at a single locus occur spontaneously at a low but detectable frequency, between 1 and 10% of His+ transformants (28, Chapters 4, 5, 13, and 14). Multicopy events can occur as gene insertions at either the AOX1 or his4 locus and can be detected by DNA analysis methods (e.g., PCR, Southern/dot blotting, or differential hybridization) (29,30) or by methods that directly examine levels of the foreign protein (e.g., activity assay, sodium dodecyl sulfate polyacrylamide gel electrophoresis [SDS-PAGE], or colony immunoblotting) (28,31). As mentioned in Subheading 6., it is possible to enrich transformant populations for ones that have multiple copies of an expression vector by use of either a G418R or ZeoR gene-containing vector and selecting for hyper-resistance to the appropriate drug (26, Chapters 4 and 5). It is important to note that, with the G418R vectors, it is essential to first select for His+ transformants and to then screen for ones that are resistant to G418. With ZeoR vectors, it is possible to directly select for hyper-Zeocin-resistant transformants. Most drug-resistant strains resulting from either the G418R or ZeoR selection methods contain between one and five copies of the expression vector. To find strains with 20 or more copies, it is usually necessary to screen at least 50-100 drug-resistant strains.

8. Posttranslational Modifications

P. pastoris has the potential to perform many of the posttranslational modifications typically associated with higher eukaryotes. These include processing of signal sequences (both pre- and prepro-type), folding, disulfide bridge formation (Chapter 7), and O- and N-linked glycosylation.

Glycosylation of secreted foreign (higher) eukaryotic proteins by P. pastoris and other fungi can be problematic.
In mammals, O-linked oligosaccharides are composed of a variety of sugars, including N-acetylgalactosamine, galactose, and sialic acid. In contrast, lower eukaryotes, including P. pastoris, add O-oligosaccharides solely composed of mannose (Man) residues (Chapter 11). The number of Man residues per chain, their manner of linkage, and the frequency and specificity of O-glycosylation in P. pastoris have yet to be determined. One should not assume that, because a protein is not O-glycosylated by its native host, P. pastoris will not glycosylate it. P. pastoris added O-linked mannose to ~15% of human IGF-1 protein, although this protein is not glycosylated at all in humans. Furthermore, one should not assume that the

