M E T H O D S I N M O L E C U L A R M E D I C I N ETM PPssyycchhiiaattrriicc GGeenneettiiccss MMeetthhooddss aanndd RReevviieewwss EEddiitteedd bbyy MMaarriioonn LLeebbooyyeerr,, ,, MMDD PPhhDD FFrraannkk BBeelllliivviieerr,, ,, MMDD PPhhDD HHuummaannaa PPrreessss Psychiatric Genetics 3 1 Psychiatric Genetics Overview on Achievements, Problems, Perspectives Wolfgang Maier 1. The Progress of Psychiatric Genetics Psychiatric genetics is a relatively new term for an old research question: “Are behavioral and psychological conditions and devia- tions inherited?” The systematic empirical inquiries in this field started in the late nineteenth century with the work of F. Galton and his monograph Talent and Character, which was motivated by Darwin’s theory and the concept of degeneration. During the twen- tieth century, the methodological standard of the field was improved by the development of epidemiological, biometrical, and clinical research tools. This was the precondition to perform valid family, twin, and adoption studies. These methods revealed that all psychi- atric disorders aggregate in families, and that genes influence the manifestation of these disorders. It became clear that the degree of familiality and extent of genetic influence varies among diseases, with schizophrenia showing the strongest genetic background and disorders such as obsessive-compulsive and borderline personality disorder showing the weakest genetic background. Although there is some overlap, the familial patterns of diagnoses reveal a surpris- From: Methods in Molecular Medicine, vol. 77: Psychiatric Genetics: Methods and Reviews Edited by: M. Leboyer and F. Bellivier © Humana Press Inc., Totowa, NJ 3 4 Maier ingly high specificity, which was considered an argument for the appropriateness of diagnostic definitions. Considering the limita- tions in the pathophysiological understanding of psychiatric disor- ders, “breeding true” of diagnosis in families became the hallmark indicator of clinical validity (1). Segregation analyses of the specific mode of transmission were performed in many family samples over an extended period of time. One major goal was to find Mendelian patterns. It took decades to rule out the theory that the familial pattern of aggregation does not fit into the Mendelian mode of transmission. Environmental influ- ences on the manifestation of all psychiatric disorders were also unequivocally demonstrated. Thus, like other common diseases, all psychiatric disorders revealed a complex genetic and a multifacto- rial etiology rather than a monogenic etiology. Since about 1980, developments in molecular genetics made is possible to systematically map genes on the DNA strand by so-called linkage studies without any knowledge of the “true” patho- physiology and of the gene products (proteins) involved. This strat- egy required: • systems of positional DNA markers placed densely on the whole genome—first restriction-fragment-length polymorphism (RFLP), then microsatellite and now single-nucleotide polymor- phism (SNP) markers; and • samples of genetically informative families, each with more than one affected case (e.g., extended pedigrees with multiple cases or pairs of affected siblings). Linkage analysis identifies regions on the genome that hosts disease genes through the position of markers that segregate together with the disease in the families. This method is most con- clusive when the disease is transmitted in a Mendelian fashion. Thus, monogenic diseases were the first target for this method. Thousands of disease genes for monogenic (Mendelian) diseases were successfully mapped and subsequently identified by stepwise application of this strategy during the last two decades. The detec- tion of disease genes and etiologically relevant proteins using only Psychiatric Genetics 5 positional information (positional cloning) became the major tool in revealing the etiology of Mendelian diseases. Simultaneously, the success of the positional cloning approach in monogenic diseases motivated hopes and optimism that the genetic basis of more complex diseases (with a genetic component but with- out Mendelian transmission) would be revealed, including the most common chronic diseases. Their etiology is not as fully understood as that of monogenic diseases, presumably because of phenotypic and genetic heterogeneity; this is particularly true for all psychiatric disorders. Therefore, the positional cloning strategy offers an espe- cially promising method to reveal the unknown etiology of psychi- atric disorders, because other strategies have failed to fully elucidate the etiology and pathophysiology. The positional cloning strategy based on linkage analysis was first applied two decades ago to complex diseases, yet the early hopes for a new success story of the linkage strategy failed. Until now, the search for genes was disappointing for all complex diseases, par- ticularly for psychiatric disorders. Frustration initiated a process of revising the most appropriate strategy. Arguments and proposals can be subdivided into two lines of reasoning: 1. What is the most appropriate analytic strategy to detect disease genes for complex diseases? 2. How can phenotypes be properly defined in order to detect disease genes? How should the etiological heterogeneity of common diseases be approached? This book addresses these important questions with a series of articles. This outlines the current status of progress in psychiatric genetics and discusses perspectives on these questions on a more general level. 2. The Search for Genes: Current Status 2.1. Linkage Studies Genome-wide linkage studies are the key to finding the genes that carry mutations causative for monogenic diseases. Is this strat- 6 Maier egy as useful for complex diseases? Although a positive answer to this crucial question is not guaranteed—especially with regard to psychiatric disorders—genome-wide linkage studies in specific psy- chiatric disorders were also initially claimed to be the success strat- egy. A series of chromosomal regions with at least suggestive linkage to the disease emerged in the various genome-wide scans. These positive results are contrasted by an unexpected pattern of findings: • The linkage signals were only modest, and a very broad inter- val on the genome was implicated independently of the structure of the family sample under study (extended family or affected sibs). • Even the strongest linkage results were not consistently repli- cable. In general, some of the initial reports with at least suggestive evidence for linkage to schizophrenia, manic-depressive illness, or alcoholism were replicable with similar magnitude of the linkage signal, but neither of the initially positive linkage results were con- sistently replicable in four or more scans. • Linkage strategy in large extended pedigrees with a Mende- lian-like pattern of familial loading did not produce linkage signals that were clearly more distinct and pronounced than in samples of affected siblings up to now (e.g., the most distinct signal in schizo- phrenia observed by Brzustowicz et al. (2) in an inbred sample in contrast to an outbred population with signals up to 6.5). In particu- lar, there is no single extended family with a known influential dis- ease gene. • In light of this scenario, meta-analyses were offered as a con- sensus strategy. However, even after combining multiple samples with approx 1000 families, the magnitude of the linkage signals did not exceed the magnitude observed in the first positive result (3). The analogy to monogenic diseases would recommend first to replicate and then to systematically sharpen the linkage signal (i.e., increase the magnitude of the signal, and reduce the length of the linked region) in a stepwise manner by extension to other informa- tive families. Finally, the disease gene can be identified in this stepwise manner. Linkage in monogenic diseases is powerful, and recombination events between marker and disease loci can be iden- Psychiatric Genetics 7 tified, which is impossible in complex traits. Thus, it does not come as a surprise that this strategy does not work in common diseases using the available tools, as the extension of the sample size does not increase the magnitude of linkage. This constellation has been anticipated on theoretical grounds (see ref. 4). From the mutiple genome-wide scans in schizophrenia, bipolar disorder, alcoholism, or late-onset Alzheimer’s disease, we can con- clude that: 1. No single gene causes any of these disorders. Thus, susceptibility genes rather than causal disease genes are operating; otherwise, a sharp, consistently replicable linkage signal should have been detected. 2. There is no evidence that a major gene contributes most to the genetic variance. 3. Multiple susceptibility genes account for each of these disorders; neither of these contributing genes is necessary and/or sufficient for the manifestation of the disorder (vulnerability or susceptibility genes in complex disorders in contrast to causal genes in monogenic diseases). 4. Each of these multiple genes contributes only modest effects. Some authors speculate that in schizophrenia, for example, the contribu- tion of each susceptibility gene is limited to an odds ratio of less than 2.0(5). 5. The genetic heterogeneity cannot be decomposed to more homoge- neous subtypes. In particular, subtypes of the major psychiatric dis- orders that are influenced by a single gene or a major gene have not been found by linkage studies (although postulated on the basis of segregation analysis). Thus, although a few susceptibility genes have been suggested, no susceptibility gene has been clearly identified in major psychiat- ric disorders with more than 50% heritability (such as schizophre- nia or bipolar disorder). Given the difficulties of narrowing down a candidate region in complex diseases in a systematic manner (as in monogenic diseases), additional opportunities and tools are required to find vulnerability genes. The few successful examples of identi- fying susceptibility genes for complex diseases reveal the need for additional strategies or favorable conditions: 8 Maier • Good luck: In late-onset Alzheimer’s disease, a candidate gene ApoE was located in a linked region, and was confirmed first by association and finally by functional studies (6). • A combination with association or linkage disequilibrium strategy: In diabetes type 2, a promising candidate gene (calpain- 10) was detected in a linked region using a combined linkage- association approach (7). Thus, although some progress has been made, the speed needed for disease-gene discoveries is substantially slower than expected when the first linkage studies with DNA markers began nearly 20 yr ago. Several factors may contribute to the lack of replicability of positive linkage findings and to other disappointments and may challenge our initial assumptions, but these may also stimulate new, more promising approaches: • Magnitude of gene effects: Given the results of genome scans in psychiatric disorders, the susceptibility genes are likely to con- tribute only with small or modest effects. Thus far, linkage analysis has been enormously successful in detecting causal or major gene effects, but not for small effects. In addition, model-based consider- ations have demonstrated that association studies are usually far more powerful in detecting minor or modest gene effects. • Non-additive interaction of susceptibility genes: Biometrical analysis of the familial pattern of aggregation of diagnoses make it possible to draw conclusions on the putative number of underlying interacting genes and on the mode of interaction. The analysis of cumulative family studies by Risch (8) suggested the non-additive interaction of multiple genes in schizophrenia. Risch et al. (9) also concluded from the extended and widespread weak linkage signals detected in a genome scan in autism that more than 20 different loci are interacting. Linkage analysis may also identify interacting loci, but only with a distinct loss of power. Thus, in the presence of non- additive interaction, the required sample size is even higher. • Strength of the magnitude of linkage signals across popula- tions: Some linkage signals were found to be only replicable with comparable genetic background, but not in other populations. Indeed, some susceptibility genes (such as ApoE4 for late-onset Psychiatric Genetics 9 Alzheimer’s disease) are only influential in some populations (ApoE4 is mainly relevant in Caucasian but not in black popula- tions) (6). In schizophrenia, some linkage findings on 8p, 9q, and 15q were exclusively replicable in African populations, whereas 10p was until now only replicable among Caucasian populations (5). • Sample size problem: Small effect sizes as odds ratios (OR) of about 1.5 require unrealistically large numbers of informative fami- lies (e.g., affected sib-pairs). Risch and Merikangas (10)calculated for OR = 1.5 the number of families required to detect the gene by linkage analysis as 18.000 and more depending on the model; the currently available family sample sizes (~200) are at best able to identify genes with an OR of approx 4. It is evident from these con- siderations that narrowing down the candidate region to the disease gene cannot be accomplished by linkage analysis alone. • The sample size required for replication of a specific true link- age finding in complex disorders is substantially higher than for detecting one among many susceptibility genes (11). Thus, consid- ering the available sample sizes and the previously mentioned com- plicating factors, replication of “true” linkage findings cannot regularly be expected. Even a single replication of a reported link- age among 10 replication tests is a non-random event that argues for the validity of the initial positive result. Currently, the positional cloning approach through linkage ana- lysis has also proven disappointing in non-psychiatric complex diseases. The human genome project produced millions of polymor- phic genetic markers for fine-mapping of candidate regions, which will improve the power to detect linkage and to refine the candidate regions (see Chapter 3). However, there appear to be serious inher- ent limitations of linkage analysis in complex diseases. It even remains doubtful that the application of most informative marker systems such as SNPs will be able to identify susceptibility genes with modest effects (12). Therefore, the skeptical attitudes on the utility of linkage analysis in complex diseases are gaining more and more acceptance (12,13). Thus, alternatives to linkage analyses are receiving growing attention. 10 Maier 2.2. Association Studies Another strategy to identify susceptibility genes is association studies in case-control samples. This strategy has thus far focused on candidate genes, which are either hypothetically involved in the pathophysiology of the disease (functional candidates), or located in candidate regions identified by linkage analysis (positional can- didates). Association studies are based on linkage disequilibrium, which refers to correlations among neighboring alleles in the genome.“True” associations between a disease and a marker allele may have two meanings: • the marker allele impacts on the risk for the disease, or • a genetic variant near the marker allele is the actual determi- nant and is in linkage disequilibrium with the disease allele. Generally, many studies have followed this approach. The asso- ciation approach was only clearly successful in psychiatric disor- ders in identifying the two functional candidates—the ADH-2 and ALDH-2 genes—as susceptibility genes for alcoholism (with the ADH-2*2 and ALDH-2*2 alleles shown to be less common among Asian alcoholics (14). Similarly, an association between ApoE4 and late-onset Alzheimer’s disease has been proven with no negative report in Caucasian populations after linkage analysis identified ApoE as a positional candidate. Functional studies have shown that the identified ADH/ALDH alleles and the ApoE4 allele are suscep- tibility alleles that directly increase disease risk. In other diseases and candidate genes, the results are very diverse and difficult to interpret. Reported associations were followed by some positive replications. But there is no claim for association without non-replication. Thus, the association strategy was blamed as the cause of a very high number of false-positives. However, this limitation is not a result of the association technique, but of the inappropriate chosen levels of significance (12). Meta-analyses for particularly promising associations covering several thousand patients and controls—e.g., 5-HT-2a-receptor or D3-receptor gene in schizophrenia (15,16)—were performed to clarify this diversity; Psychiatric Genetics 11 relative risks for susceptibility alleles of 1.2 to 1.5 were suggested for a very limited number of claimed associations in schizophrenia. A major advantage of association compared to linkage studies is their relatively high efficiency in the detection of genes with small effect size. Thus, it was suggested that testing every gene in the genome for association may be more feasible than detecting a sus- ceptibility gene by linkage analysis (10). Although thousands of cases and controls are needed, this strat- egy is a priori more realistic. However, difficulties and warnings with the association strategy should not be ignored and have to be weighed against the prospects and limitations of linkage analysis (12,17,18). As there is not a convincingly optimal decision for either of two strategies, both must be considered as complementary. There are several unresolved problems with the association strategy. One problem is the selection of the most appropriate study group: Are all cases with a specific diagnosis appropriate, or only those with a secondary case in the family? Should probands with comorbidity for two disorders, each with a genetic determination, also be included? A related problem: Should the non-genetic influences on the manifesta- tion of the disorder being studied be taken into consideration? Would an adjustment for impacting environmental factors increase the power of analysis, or even decrease the power (19)? Valid answers depend on the knowledge of underlying etiologi- cal mechanisms, which are largely unknown for psychiatric disor- ders. Currently, decisions must be based on the most plausible assumptions. 2.3. Combination of Linkage and Association It has already been demonstrated that a combination of the link- age and the association strategy may overcome the limitations of either strategy alone: the identification of ApoE as a susceptibility gene for late-onset Alzheimer’s disease, and calpain-10 as a suscep- tibility gene for non-insulin-dependent diabetes. In both cases, link- age analysis identified a candidate region. Either: