JBC Papers in Press. Published on June 12, 2006 as Manuscript M604221200
The latest version is at http://www.jbc.org/cgi/doi/10.1074/jbc.M604221200
Copyright 2006 by The American Society for Biochemistry and Molecular Biology, Inc.

Systemwide Genomic and Biochemical Comparisons of Sialic Acid Biology Amongst Primates and Rodents – Evidence for Two Modes of Rapid Evolution

Tasha K. Altheide¹*, Toshiyuki Hayakawa¹*#, Tarjei S. Mikkelsen², Sandra Diaz¹, Nissi Varki³ and Ajit Varki¹,⁴

Glycobiology Research and Training Center, Departments of Cellular & Molecular Medicine¹, Medicine⁴, and Pathology³, University of California at San Diego, La Jolla, CA, 92093, ²Broad Institute of MIT and Harvard, Cambridge, MA, USA

*These authors contributed equally to this work
#Current address: Research Institute for Microbial Diseases, Osaka University, Suita, Osaka 565-0871, Japan

Running Head: Comparative Analysis of Sialic Acid Biology

Address correspondence to: Ajit Varki, CMM-E, Room 1065, Mail Code 0687, University of California, San Diego, La Jolla, CA 92093-0687. Phone: 858-534-2214. FAX: 858-534-5611, e-mail: [email protected].

Numerous vertebrate genes are involved in the biology of the oligosaccharide chains attached to glycoconjugates. These genes fall into diverse groups within the conventional Gene Ontology classification. However, they should be evaluated together from functional and evolutionary perspectives in a “biochemical systems” approach, considering each monosaccharide unit’s biosynthesis, activation, transport, modification, transfer, recycling, degradation, and recognition. Sialic acids (Sias) are monosaccharides at the outer end of glycans on cell surface and secreted molecules of vertebrates, mediating recognition by intrinsic or extrinsic (pathogen) receptors. The availability of multiple genome sequences allows a system-wide comparison amongst primates and rodents of all genes directly involved in Sia biology. Taking this approach, we present further evidence for accelerated evolution in Sia-binding domains of CD33-related Siglecs. Other gene classes are more conserved, including those encoding the sialyltransferases that attach Sias to glycans. Despite this conservation, tissue sialylation patterns are shown to differ widely amongst these species, presumably due to rapid evolution of sialyltransferase expression patterns. Analyses of N- and O-glycans of erythrocyte and plasma glycopeptides from these and other mammalian taxa confirmed this phenomenon. Sialic acid modifications on these glycopeptides also appear to be undergoing rapid evolution. This rapid evolution of the sialome presumably results from the ongoing need of organisms to evade microbial pathogens that use Sias as receptors. The rapid evolution of Sia-binding domains of the inhibitory CD33-related Siglecs is likely to be a secondary consequence, as these inhibitory receptors presumably need to keep up with recognition of the rapidly evolving “self” sialome.

Sialic acids (Sias)¹ are negatively charged nine-carbon sugars typically occupying the distal ends of glycan chains on cell surface and secreted molecules in the deuterostome lineage of animals (1, 2). The two most common forms of Sia in vertebrates are N-acetylneuraminic acid (Neu5Ac) and N-glycolylneuraminic acid (Neu5Gc), which differ by a single oxygen atom that is added by the CMP-Neu5Ac hydroxylase (CMAH) enzyme. A third basic type of Sia is 2-keto-3-deoxy-nonulosonic acid (Kdn), less common in mammals. Rarely, the amino group of neuraminic acid (Neu) can remain unmodified. These four forms of Sia are subject to a variety of modifications (most prominently O-acetylation at the 4-, 7-, 8-, or 9-positions), and can be presented in many different linkages to the underlying sugar chain (1-3). The sum total of this diversity has been termed the “sialome” (4)².

Sias are involved in many biological processes, often involving binding by intrinsic and extrinsic Sia-recognizing proteins. As an example of intrinsic Sia recognition, complement factor H uses Sia as a means to identify “self” and prevent autoimmune attack by the alternate complement pathway; in contrast, foreign cells lacking Sia are not protected (5). Current evidence suggests that the CD33-related subset of Siglecs (Sia-recognizing Ig-like lectins) may also serve to recognize host Sias as “self” (4), thus dampening auto-reactivity of cells of the innate immune response. Meanwhile, numerous vertebrate pathogens recognize and bind to glycan structures containing Sias (2), using them as portals to gain entry. Elimination of vertebrate host Sia production to avoid such pathogens is not an option, as this results in embryonic lethality (6). Complicating matters, many successful microbes express surface Sias, mimicking the host and avoiding recognition by many arms of the immune system (7). Taken together, these data suggest that Sias are involved in an ongoing biochemical “arms race” between hosts and pathogens, driven to diversify by “Red Queen” effects³, even while conserving critical endogenous functions (4, 8).

Taking a “biochemical systems” approach to analyzing all extant data (see Figure 1), we find that there are <60 known genetic loci directly involved in the biosynthesis, activation, transport, modification, transfer, recycling, degradation, and recognition of Sias within humans and other vertebrates (Figure 1 and Table 1). With the exception of the CD33-related Siglecs (CD33rSiglecs), all these loci are conserved between the human and mouse genomes, indicating their functional importance. Precursor molecules are first converted into Sias, which are then activated to CMP-Sia and transported to the Golgi apparatus, where members of a family of sialyltransferases transfer them on to the terminal ends of glycan chains in various types of structurally distinct linkages (Figure 1). Sias may also be modified from one form into another, such as from Neu5Ac to Neu5Gc or by the addition of O-acetyl groups, and these alterations are differentially recognized by receptors on the same cell surface or on other cells. Sias attached to macromolecules are eventually cleaved from glycan chains in the lysosome, actively returned to the cytosol, and then recycled or degraded (Figure 1).

Humans and chimpanzees share >99% identity in typical protein sequences (9-12). Of the few published genetic differences between humans and chimpanzees and other “great apes”⁴ with known/potential functional consequences, several involve genes related to Sia biology: a human-specific exon deletion in CMAH resulting in the inability to convert the N-acetylneuraminic acid Neu5Ac to N-glycolylneuraminic acid Neu5Gc (13-16); a human-specific point mutation in SIGLEC12 (previously called Siglec-L1) eliminating its Sia-recognizing property (17); a human-specific upregulation of α2-6 linked Sia expression on selected cell types, presumably due to changed expression of the sialyltransferase ST6GAL1 (18); human-specific changes in one SIGLEC9 exon associated with the accommodation of Neu5Ac recognition by Siglec-9 (19); human-specific loss of an entire primate-specific Siglec gene (SIGLEC13) (20); a human-specific gene conversion of SIGLEC11, causing changes in binding properties and newly induced expression in the brain (21); and selective down-regulation of CD33rSiglecs in human T cells (22). Additional studies suggest other species-specific gene conversion events amongst some hominid Siglecs (23), and additional examples of human-specific changes in Siglec gene expression (22). The finding of so many human-specific functional differences from chimpanzees and other great apes within one biochemical/biological system suggests that it was subjected to major selective pressure(s) at some point(s) in human evolution.

While all these genes are part of a well-defined system (Sia metabolism and function), they are not represented as a single biological process in widely used genomic classification systems such as the Gene Ontology system (24) or PANTHER (25), something that is also true of most other genes involved in glycan biology. These functionally-related genes actually fall into diverse groups within the conventional Gene Ontology classification. They are also (with the exception of CD33-related Siglecs) randomly distributed throughout the genome. We suggest that all these genes should be evaluated together in a “biochemical systems” approach, considering the biosynthesis, activation, transport, modification, transfer, recycling, degradation, and recognition of Sias. Here we undertake such an approach towards understanding the evolution of Sia biology in primates, rodents and other mammals, in combination with selected biochemical studies. We first ask whether specific loci or functional classes of loci in this system have been subjected to adaptive selective pressures, whether any common principles emerge, and whether differences between chimpanzees and humans are more significant, despite a shorter divergence time. We then take a biochemical approach to put the genomic data into context.

Experimental Procedures

Human Loci. We identified fifty-five loci in humans that are known to be (or to potentially be) involved in Sia biology (see Table 1). Human RefSeqs and the genomic sequences of target loci were obtained from the NCBI Locus Link web site (http://www.ncbi.nlm.nih.gov/LocusLink/) in 2003. Some loci have several RefSeqs representing splice variants, and/or show some sequence differences between the RefSeqs and the genome sequences. In the former case, the sequence with the most inclusive number of exons was used as a representative of the locus. In the latter case, the actual genome sequences were used for the analyses.

Identification of Chimpanzee Orthologs. Human RefSeqs or human genome sequences from human loci (excluding the CD33-related Siglecs) were used to extract orthologous coding sequences from the chimpanzee genome assembly (NCBI Build 1 Version 1) as identified by reciprocal best BLASTZ alignments (12). Phred quality scores for each site in chimpanzee sequences were also provided by the Chimpanzee Sequencing and Analysis Consortium. For 8 of the CD33-related human Siglecs (SIGLEC-3, -5 through -10 and -12) high quality sequences were also obtained from our independent high-resolution comparative analyses of human, chimpanzee, baboon, mouse and rat (20). One more Siglec locus (SIGLEC13) is found in the chimpanzee and baboon genomes, but its complete deletion in humans was reported in the above paper (20). SIGLEC13 was, therefore, used only for the domain-specific comparative analyses.

Mouse and Rat Orthologs. RefSeqs of mouse and rat orthologs were obtained from the NCBI Locus Link web site (http://www.ncbi.nlm.nih.gov/LocusLink/). Reliable mouse sequences were obtained for all but 4 loci; reliable rat sequences were obtained for all but 9 sequences. All rodent loci obtained have one sequence as the RefSeq, with the exception of the mouse ST3GalII locus. The representative sequence of the mouse ST3GalII locus was selected by following the procedure described in the first section. The high quality sequences of mouse and rat CD33rSiglec orthologs (SIGLEC-3, -E, -F and -G) were obtained as described above (20).

Evolutionary Analysis. Sequence alignments of coding regions were performed in ClustalW (26) and manually checked to see whether the chimpanzee sequences had insertions or deletions causing frameshifts in the aligned open reading frame. These were handled with reference to human and mouse sequences, which showed identical open reading frames for all loci studied, except for one locus (NAGK, see supplemental text). Chimpanzee-specific insertions were assumed to be errors and were deleted in order to maintain an open reading frame even if they had high quality scores. Frame-shifts caused by deletions were left in the alignments as gaps but the codons they were located in were removed from the analyses (see Supplementary information). The sequences modified by these processes are labeled as “modified sequences” in Supplemental Table 1. In the alignments, some sites that are substitutions or indels between the human and chimpanzee sequence show low-quality Phred scores in the chimpanzee. Since these low-quality sites could be artifacts from the sequencing and base-calling process, a second round of analyses was done whereby such low-quality chimpanzee sites were changed to match the human sequences at the sites in question. The chimpanzee sequences in which substitutions were modified to match the human sequences are named “humanized sequences” (Supplementary Table 1). Several chimpanzee sequences also showed regions of non-called bases, represented by “N”s. Gene sequence regions that had N calls in the chimpanzee sequence were excluded from analyses.

The evolutionary parameters shown in Table 1 were calculated in multiple species comparisons using human, chimpanzee, mouse and rat. The numbers of synonymous (Ks) and nonsynonymous (Ka) substitutions per site were estimated by the method of Nei and Gojobori (1986) (27) with the Jukes-Cantor correction (28). Values for Ka and Ks were calculated using DNASP 3.51 (29) or MEGA2 (30). Statistical tests were performed to assess the significance of evolutionary differences obtained in the analyses by using InStat 0.6 (GraphPad software) or MEGA2.

Protein Secondary Structure Prediction. For secondary structure prediction of the sialyltransferase loci, the New Joint method analysis was performed using web-based software at the PAPIA (Parallel Protein Information Analysis system) web site (Akiyama et al. 1998; http://www.cbrc.jp/papia-cgi/ssp_menu.pl).

Lectin Staining of Sialic Acids on Tissue Sections. Paraffin sections of lung, kidney and spleen samples from 7 humans, 8 chimpanzees, 6 rats and 6 mice were deparaffinized, blocked and overlaid with predetermined concentrations of biotinylated SNA lectin or biotinylated MAH lectin, or with control reagent. Binding was detected with alkaline phosphatase-labeled Streptavidin, using Vector Blue substrate, with Nuclear Fast Red counterstaining and aqueous mounting. Samples were washed in Tris buffered saline containing 0.2% Tween with 1% bovine serum albumin to block non-specific binding. Digital photomicrographs were taken while viewing with an Olympus BH2 microscope with Macrofire camera and Adobe Photoshop.

Preparation of erythrocyte ghosts and plasma. Blood from multiple taxa was collected directly into BD Vacutainer tubes containing EDTA, stored at 4°C overnight, and then spun at 2000 × g for 10 min at 4°C. The plasma was removed and stored frozen until further work-up. The buffy coat was removed and the erythrocyte pellet washed twice with 10 volumes of ice-cold PBS pH 7.4. Lysis of erythrocytes was accomplished by adding 15 volumes of ice-cold 10 mM Tris-HCl, pH 7.5 containing 1 mM EDTA. The sample was transferred into glass Sorvall tubes and centrifuged at 10,000 × g for 20 min. The supernatant was carefully aspirated so as not to disturb the remaining “soft” pellet. The creamy, particulate material that did not diffuse easily (representing contaminating white cells) was also removed. The tubes were filled with lysis buffer and centrifuged again. The process was repeated until ghosts were white. The last wash was made with ice-cold water containing 0.01% BHT as a preservative.

Glycopeptide preparation from erythrocyte ghosts and plasma. Plasma (0.5 ml) was lyophilized in a glass conical tube and 250 µl of water was added. 0.5 ml of the ghosts were transferred into a glass conical tube, assuming ~50% water. The lipids were extracted from each of the above samples with 20 volumes of Chloroform/Methanol 2:1 (v/v), using a Brinkmann Instruments Polytron (Westbury, N.Y.) at a high setting for 30-60 sec. The samples were centrifuged at 800 × g for 5 min after each extraction. All supernatants containing the lipids were pooled into a single glass vessel. Each sample was extracted again with Chloroform/Methanol 2:1 (v/v). The pellets were extracted twice with Chloroform/Methanol 1:1 (v/v), and twice with Chloroform/Methanol 1:2 (v/v). The remaining glycoprotein pellet was extracted with 95% ethanol, and the supernatant also added to the pool. The glycoprotein pellet was immediately dissolved in 100 mM Tris-HCl pH 6.5. Small MW molecules were removed from the samples by performing dialysis using 3500 MWCO tubing against a 500-fold volume of 100 mM Tris-HCl pH 6.5, 2 mM EDTA at 4°C overnight. The retentate was recovered and digested with 1/10th volume of 20 mg/ml of Proteinase K made in 50 mM Tris-HCl pH 8.0, 2 mM calcium acetate, incubating at 50°C for 8 hrs. At the end of the day, another aliquot of the 10X Proteinase K solution was added to the sample and the digestion left incubating overnight. The enzyme was inactivated by boiling for 10 min, the sample centrifuged to remove particulates, and the resulting supernatant was loaded onto a 1 ml column of DEAE-Sephacel (GE Healthcare), equilibrated in 20 mM Tris-HCl pH 6.5, 0.1 M NaCl (Bame and Esko, 1989). The column run-through was collected and reloaded onto the column. The column was washed with 30 ml of 20 mM Tris-HCl pH 6.5, 0.1 M NaCl. The column run-through fractions containing glycopeptides were pooled with the wash, and dialysis was performed against a 100-fold volume of water at 4°C using 1000 MWCO tubing for 12-16 hr. The dialysis solution was changed to 2 mM EDTA for 8-12 hr, and changed back to water overnight. The sample from the dialysis tubing was recovered, frozen, and lyophilized. The resulting powder was dissolved in 1 ml of water, transferred to a smaller container and frozen and lyophilized again. The resulting glycopeptides were recovered and weighed.

Release of N- and O-glycans from glycopeptides by automated hydrazinolysis. The glycopeptides were dissolved in 500 µl of water, and a 2 mg equivalent transferred to a GlycoPrep reactor vial, frozen, and lyophilized. Hydrazinolysis was performed in the N+O mode using an automated hydrazinolysis instrument (GlycoPrep 1000, Oxford GlycoSciences, U.K.), which was set to heat at 95°C for 4 hr followed by automated purification (16-24 hr). The released glycans were filtered through 0.5 µm PTFE filters to remove silica gel particles and lyophilized.

Analysis of N- and O-glycans by HPAEC-PAD. Free oligosaccharides were analyzed by HPAEC-PAD (31) on a CarboPac PA-1 column (4 × 250 mm) in line on a DX500 HPLC system equipped with a PAD detector and an AS3500 ThermoSeparations autosampler. The various oligosaccharides were eluted with a linear gradient of sodium acetate from 20-250 mM over 60 min in 100 mM sodium hydroxide. Data acquisition and processing was performed with Dionex PeakNet software. Elution profiles of the glycans were compared to standard N- and O-glycans of known elution behaviour.

Determination of the sialic acid types in erythrocyte ghosts and plasma samples. Sialic acids were released from the erythrocyte ghost or plasma glycopeptides by hydrolysis in 2 M acetic acid at 80°C for 3 hr. The released Sias were separated from high molecular weight proteins by passing through an Amicon Microcon-10 filter. The flow-through was derivatized with an equal volume of 2X DMB reagent (32) and heated at 50°C for 2.5 hr. The fluorescent tagged sialic acids were separated on a Varian Microsorb-MV 100-5 C18 column (4 × 250 mm) in the isocratic mode using 85% water, 8% acetonitrile, and 7% methanol and detected with a Spectrovision FD-300 fluorescence detector with the excitation set at 373 nm and emission at 448 nm. Elution profiles were compared to standard sialic acids of a known elution profile.

Results

Identification of Loci and Analysis of Functional Categories. Genes involved in Sia biology encode proteins with widely differing functions, ranging from cell-surface receptors that recognize Sias, to enzymes cleaving Sias in the lysosome, to transporters making them available for re-use by the cell (see Table 1, Figure 1). Chimpanzee orthologs were identified for all 55 human loci known to be involved in Sia biology (see Supplemental Text), indicating that there have been no major chimpanzee-specific deletions in this system. Although identifiable, not all loci were analyzable (see Table 1 and Supplemental Text). Additionally, the presence of all corresponding loci in the mouse genome (with the exception of some primate-specific CD33rSiglecs, see Supplemental Text) suggests that these loci are generally conserved in mammals. The loci fall into different functional biochemical categories, which we have termed “Biosynthesis”, “Activation, Transport, and Transfer (ATT)”, “Modification”, “Recognition”, and “Recycling and Degradation (RD)”. “Biosynthesis” refers to loci involved in the production of Sias from precursor molecules, such as UDP-GlcNAc and ManNAc, and includes epimerases, kinases, and phosphatases. “Activation, Transport, and Transfer (ATT)” refers to loci that activate free Sia into the nucleotide donor CMP-Sia and transport it into the lumen of the Golgi, where multiple sialyltransferases (STs) then transfer the Sias from the CMP-donors to newly synthesized glycoconjugates. Modification of CMP-Neu5Ac to CMP-Neu5Gc occurs by the action of the CMAH locus, which is a pseudogene in humans but is functional in chimpanzees (13). Additional “Modification” genes presumed to be involved in other modifications of Sias such as O-acetylation, O-methylation, O-sulfation etc. have yet to be identified. “Recycling and Degradation” genes encode sialidases which release Sias from glycan chains, as well as a stabilizer protein, a lysosomal sialic acid O-acetylesterase, a lysosomal Sia exporter, and the Sialate:pyruvate lyase that cleaves free Sias in the cytosol into pyruvate and acylmannosamines. “Recognition” molecules do not directly participate in the Sia biochemical life cycle, but act as receptors for Sias. The major category of these molecules is the Siglecs, a family of cell-surface receptors that recognize and bind to different linkages and structural variants of Neu5Ac and Neu5Gc, both in cis and in trans on cell surfaces (4, 33). Other known Sia-recognizing intrinsic receptors include E-, P- and L-Selectins (34-36), factor H (5), and L1CAM (37). We also included the G domains of two laminin loci (LAMA1 and LAMA2) in this classification, since these domains are thought to recognize Sias (38), although conclusive proof is lacking.

Some of these functional labels correspond generally to those listed for loci in the Gene Ontology (24) or PANTHER (25) databases, but most are more specific in the context of Sia biology. A few loci appear to have additional functions or capabilities external to the Sia biology pathway (e.g. the RENBP gene product is also a Renin-binding protein, and the PPGB gene product is a cathepsin protease that also serves to stabilize lysosomal beta-galactosidase). By taking a systematic “sialic acid biochemistry-based” approach to grouping these loci, rather than a strict categorical label via the current GO ontology scheme⁵, we hoped to uncover information that is specifically relevant to the evolution and diversity of Sia biology in humans and other mammals.

Differences in Evolution Rates Amongst Functional Gene Categories Indicate Rapid Evolution in Genes That Recognize Sialic Acids. We found overall differences in the evolutionary rates between functional categories of human and chimpanzee loci involved in Sia biology (Table 2). The percent amino acid divergence between human and chimpanzee ranged from 0% at several loci to 4.40% (CD33) (Supplemental Table 1). The recognition category (N=16) had the highest average amino acid divergence across categories (1.84%), followed by the RD category (N=7) with 1.02% divergence. The activation/transport/transfer (N=20) and biosynthesis (N=5) categories had lower levels of amino acid divergence (0.82% and 0.78%, respectively).

As a measure of the rate of evolution, we used the Ka/Ks ratio, a commonly used statistic that provides an indication of selection for amino acid changes during evolution. This ratio is based on the rate of non-synonymous (amino acid changing) nucleotide substitutions compared to the rate of synonymous (non-amino acid changing) nucleotide substitutions between two taxa. Both numbers need to be normalized to the number of possible events that could have occurred. Thus, the ratio is calculated as (the number of nonsynonymous substitutions / the total number of nonsynonymous sites in the sequence of interest) divided by (the number of synonymous substitutions / the total number of synonymous sites in the same sequence). The underlying assumption is that synonymous changes (Ks) are neutral with regard to natural selection, and should occur at a fixed rate in a given region of the genome. In contrast, the non-synonymous changes (Ka) could represent selection, if they occur at a higher rate than expected from the background Ks rate. A ratio greater than 1 is thus commonly used as an indicator of strong positive selection and accelerated evolution, and the higher the ratio the greater the relative number of non-synonymous substitutions (and hence potential adaptive evolution). However, many protein sequences simultaneously experience strong purifying selection (disallowing deleterious amino acid changes) over most of their length and can thus be targeted by adaptive evolution (positive selection) at only a few sites. Thus, a Ka/Ks ratio taken over all sites in a given protein can result in a value much less than 1, even if strong positive selection has occurred at one or a few sites.

Taking this approach, the average ratio across all 49 loci is 0.322, slightly greater than the human-chimpanzee genome-wide average of 0.23 (12). Within humans and chimpanzees, the recognition category had the greatest average Ka/Ks ratio (0.465), followed by the RD category (0.292), Biosynthesis (0.293), and ATT (0.213). The one currently known modification gene, CMAH, has been pseudogenized in humans by a 92-bp exon deletion (13, 14), so comparisons between human and chimpanzee are not appropriate. The average Ka/Ks values between the recognition and ATT categories, the two extremes, are significantly different from each other (P = 0.003, T test), and a comparison between the recognition category and the RD category approaches significance (P = 0.08).

Average Ka/Ks ratios for the Sia functional categories are greater for primates compared to rodents across all categories (Figure 2), a general finding consistent with other studies. For 39 loci for which reliable rat orthologs are available (see Supplementary Table 2), rodent Ka/Ks ratios ranged from 0.005 to 0.757, and the average Ka/Ks ratio from M-R comparisons is 0.156, a smaller value than that from the H-C comparison. Primates have significantly greater average Ka/Ks values than rodents for the ATT (P = 0.005, T test) and RD (P = 0.009, T test) categories. As with the H-C pair, the M-R comparison showed the highest average Ka/Ks ratio for the Recognition group, although there is no statistically significant difference between taxa for this category. Among 33 orthologous loci examined between primates and rodents (excluding the CD33rSiglecs, which are not strictly orthologous), 22 (67%) show greater Ka/Ks ratios in H-C comparisons than in M-R comparisons (P = 0.05 by a binomial test, data not shown), suggesting an overall acceleration in primates compared to rodents.

Overall, the high rate of substitution and the relatively high Ka/Ks values suggest that the recognition category is evolving more rapidly than the others. This difference between gene categories may reflect a difference in evolutionary environment. Previous work in the anthocyanin pathway (39) suggested that genes upstream in a biosynthetic pathway tend to evolve more slowly than downstream ones. While the Sia biology pathway as we have defined it here is not strictly linear, there is a general trend towards early-acting loci such as those involved in biosynthesis and activation/transport/transfer evolving under more constraint than downstream loci such as those in the recognition category (Table 2).

Within the Recognition group, Siglecs account for 56% (9 of 16) of human genes and 42% (5 of 12) of mouse and rat genes. Ka/Ks ratios for Siglec loci are significantly greater than ratios for non-Siglec members of the recognition group in both primates (P = 0.006) and rodents (P = 0.007) (Table 3), indicating that Siglecs are driving the higher values for this category. This difference appears to come mainly from an increase in Ka values rather than Ks values (Table 3), consistent with the notion that Siglecs may be undergoing adaptive evolution in humans and primates (19, 20). Indeed, comparisons of the chimpanzee and human genomes indicated that CD33rSiglecs are among the fastest evolving groups of genes in the entire genome (12).

The Sia-binding Ig V-set Domains of Siglecs are Evolving Most Rapidly. Ka/Ks ratios calculated across the entire coding regions of genes can miss important changes due to substitutions at a relatively small number of sites. We therefore looked for domain-specific evolutionary changes between humans and chimpanzees. Such potentially important changes were found in Siglecs, Sialyltransferases, and HF1. Details regarding the first two are presented here, and evaluation of HF1 will be reported elsewhere.

Siglecs have multiple extracellular immunoglobulin (Ig)-like domains followed by a single transmembrane domain and a short cytoplasmic tail (4, 33). The first Ig-like domain (Ig1; V-set Ig-like domain) is known to be responsible for Sia recognition. Prior analyses have suggested domain-specific accelerated evolution associated with a functional change in the Ig1 domain of human SIGLEC9 (19), as well as a more rapid accumulation of nonsynonymous substitutions compared to an adjacent domain (Ig2; C2-set Ig-like domain) (20). These data indicate that Ig1 might be the target for evolutionary change in the primate lineage. We therefore reexamined our prior analyses (20), by examining non-CD33rSiglec loci (SIGLEC1 and CD22) and excluding SIGLEC11 and SIGLEC5 (due to recent published and unpublished evidence of gene conversion). The Siglec loci thus used are as follows: human SIGLEC1, CD22, CD33, SIGLEC6, -7, -8, -9, -10 and -12 (SIGLEC13 is deleted in the human genome (20)); chimpanzee SIGLEC1, CD22, CD33, SIGLEC6, -7, -8, -9, -10, -12 and -13; baboon CD33, SIGLEC6, -8, -9, -10 and -13 (SIGLEC7 and -12 are deleted in the baboon genome; baboon SIGLEC1 and CD22 are not available (20)). For mouse and rat, we used the available reliable sequences, which were Cd33, Siglec-E, -F and -G. Orthology between primate and rodent CD33rSiglecs is unclear because several exon/domain-shuffling events appear to have occurred in the primate lineage (20). Thus, we could not reliably compare individual CD33rSiglec primate and rodent genes.

In comparisons between closely related species such as the human and the chimpanzee, a lack of synonymous nucleotide substitution can result in a Ks of 0, which renders the Ka/Ks ratio statistically meaningless. Thus we used the statistic Ka-Ks to detect the signature of natural selection (Ka-Ks > 0, = 0, and < 0 are consistent with positive selection, neutral evolution and purifying selection, respectively). In human-chimpanzee (H-C) comparisons, 6 of 9 V-set Ig1-coding sequences show a Ka-Ks value of > 0, and only one C2-set Ig2-coding sequence showed a Ka-Ks of > 0 (see Supplementary Table 3). Fisher’s exact test supported the significance of this difference (P = 0.0498), indicating that Ka > Ks is found more frequently in Ig1 domains than in Ig2 domains. In human-baboon and chimpanzee-baboon comparisons, the proportion of genes showing Ka > Ks was nearly equal between Ig1 and Ig2 (Supplementary Table 3). The mean Ka-Ks value of each Ig1- and Ig2-coding sequence was calculated in every comparison. Mann-Whitney tests (paired) indicate a significant difference of mean Ka-Ks values between Ig1- and Ig2-coding sequences in H-C comparisons (P = 0.0195), supporting the hypothesis that Sia-binding Ig1 is the target of accelerated evolution in human and chimpanzee lineages. Similar tests in human-baboon (H-B), chimpanzee-baboon (C-B) and mouse-rat (M-R) comparisons gave mean Ka-Ks values between Ig1- and Ig2-coding regions that showed the same trend, but were not statistically significant (P > 0.05).

The above approach compares Ig1- and Ig2-coding sequences that are only ~400 bp and ~300 bp, respectively. To obtain more robust statistical power, we concatenated all available Siglec Ig1- or all Ig2-coding sequences for each species. For concatenated Ig1 (con-Ig1), all primate comparisons showed Ka/Ks > 1, indicating rapid evolution (Figure 3). The mouse-rat comparison did not have a Ka/Ks value greater than 1 (0.821), but this is still rather high. In contrast, all comparisons of concatenated Ig2 sequences (con-Ig2) gave relatively low Ka/Ks ratios. We performed Fisher’s exact tests to compare rates of synonymous and nonsynonymous evolution between con-Ig1 and con-Ig2. Concatenated Ig1 domains had a greater number of total substitutions than con-Ig2 domains in all species pairs. All species pairs also had significant differences in the proportions of nonsynonymous and synonymous substitutions between con-Ig1 and con-Ig2 (P < 0.010 for all four comparisons), with more nonsynonymous changes in concatenated Ig1 (data not shown). Taken together, the above findings indicate that an accelerated accumulation of nonsynonymous substitutions has occurred in Ig1 compared to Ig2, and that the Sia recognition function of the Siglec Ig1 domains is more rapidly evolving, in at least two different mammalian clades, primates and rodents.

Sialyltransferase Sequences are Highly Conserved but Their Tissue Expression Patterns are Not. Sialyltransferases (STs) are responsible for the formation of sialylglycoconjugates by transferring the Sia group from CMP-Sia to one of many possible glycoconjugate acceptors. In striking contrast to the Sia-recognizing proteins, ST sequences were found to be highly conserved amongst primates and rodents (see Table 2, Supplemental Tables 1 and 2). Despite this, we found that the actual tissue pattern of Sia linkages generated by these enzymes varies widely across different tissue types among humans, chimpanzees, mice, and rats (Figure 4). Using the lectins Sambucus nigra agglutinin (SNA) and Maackia amurensis hemagglutinin (MAH) to detect alpha 2-6 and alpha 2-3 Sia linkages, respectively, we found many inter-species differences and only a few consistent similarities (see Figure 4). For example, expression of SNA-positive alpha 2-6 linked Sia in the lung bronchioles was human-specific. Alpha 2-6 linked Sia is also expressed in B cell areas of spleen in human, chimpanzee and mouse, but not in the rat. SNA reactivity in the red pulp area of the spleen was only seen in the chimpanzee. In the kidney distal tubules, expression of alpha 2-6 linked Sia was found discordantly in human and rat. However, expression of this linkage is preserved across all four species in endothelial cells and kidney glomeruli. Expression of MAH-positive alpha 2-3 linked Sia was also found in T cell areas of spleen and in kidney glomeruli of all four species examined.

... locations of helix, coil, and sheet structures among primates and rodents suggests that one locus (ST8SIA3) has potentially important structural changes between rodents due to both mouse-specific and rat-specific amino acid changes, and that two additional loci (ST6GALNAC3 and ST8SIA2) show potentially major structural changes in primates resulting from human-specific amino acid changes (Figure 5A and Figure 5B). Of these, the human-specific change in ST8SIA2 is of particular interest, since it appears to be mainly expressed in fetal brain (45) and generates polysialic acid chains, which are known to be involved in regulating neural plasticity and neurite outgrowth (46-48).

Multi-Species Comparisons of Erythrocyte
In contrast, it was only seen in and Plasma Protein N- and O-Glycans ad e d chimpanzee lung bronchial epithelium goblet Confirms Rapid Evolution of Sialylation fro cells and in the red pulp of chimpanzee spleen. Patterns. The above tissue sialylation patterns m h Thus, each species appears to have were determined using linkage-specific ttp ://w experienced specific gains and losses of Sia lectins, which do not differentiate amongst w w expression, despite general conservation of ST different classes of glycans, nor do they .jb c sequences. provide information about underlying glycan .o rg Species-specific Changes in Sialylmotifs. structure. In order to obtain further b/ y While the causes of species-specific biochemical evidence for the diversity of gu e s differences in sialylation are mostly unclear, a tissue sialylation, we studied glycoproteins t o n few focused sequence changes in ST catalytic from erythrocyte ghosts and plasma proteins N o v domains could have effects on in several mammalian taxa (mice were not em b sialyltransferase action. All eukaryotic studied because of the small quantities of er 2 2 sialyltransferases have four conserved peptide material obtainable). Total N- and O-glycans , 2 0 regions in their catalytic domains, referred to were released by hydrazinolysis and profiled 18 as sialylmotifs “L” (long) and “S” (short) (40) using Dionex HPAEC with PAD detection. “3” (41) and “VS” (very short) (42). The L- As can be seen from Figure 6, the elution sialylmotif is mainly involved in donor profiles of negatively charged (sialylated) substrate binding (43) and the S-sialylmotif is glycans from each species was unique, important for binding to both donor and indicating that sialylation patterns are also acceptor substrates (44). We identified a unique (we did not further study the other number of species-specific amino acid potential cause of diversity, varying N-glycan changes in the sialylmotif regions of several branching). 
N- and O-glycan diversity is sialyltransferases. Since crystal structures of pronounced between taxa, as evidenced by sialyltransferases are not currently available, gains, losses, and shifts of various peaks. For protein secondary structure prediction was example, both ghost and plasma proteins show performed to obtain information about differences, with peak shifts in the gorilla and consequences of these species-specific amino orangutan compared to the other primates; a acid changes. Comparison of predicted relative lack of tri- and tetra-sialyl N-glycans 10
