Molecular and Human Genetics
Baylor College of Medicine
Houston, TX, US
Dan L Duncan Comprehensive Cancer Center
Baylor College of Medicine
Houston, Texas, United States


BS from Fordham University
MS from Columbia University
PhD from Columbia University
Post-Doctoral Fellowship at University of Tübingen

Professional Interests

  • Statistical genetics
  • Method development
  • Genetic epidemiology
  • Gene mapping and identification
  • Complex and Mendelian traits, e.g. Hearing Impairment

Professional Statement

I am currently a tenured Professor in the Department of Molecular and Human Genetics at Baylor College of Medicine and Director of the Center for Statistical Genetics. I am also an adjunct Professor in the Department of Statistics at Rice University and a Senior Research Associate at The Rockefeller University. I received a Master of Science degree in Biostatistics in 1989 and a Doctor of Philosophy degree in Epidemiology in 1994 from Columbia University. My interests lie in statistical genetics and genetic epidemiology. In addition to applied work mapping disease/trait loci, I have worked extensively on developing methods to aid in gene identification and in understanding disease etiology.

Currently, my major interest is in the development of methods to analyze rare variants. This work includes the Combined Multivariate and Collapsing (CMC) method (Li and Leal, 2008), which was the first test specifically developed to detect complex trait rare variant associations, and the Kernel Based Adaptive Cluster (KBAC) method (Liu and Leal, 2010a). My group proceeded to develop additional rare variant association methods: to control type I error when there is missing variant data (Auer et al. 2013); to control type I error when sequencing is used for variant discovery in a subset of samples (Liu and Leal, 2012a); to detect secondary trait associations (Liu and Leal, 2012b; Liu and Leal, 2012c); and to analyze family data (Liu and Leal, 2012d; He et al. 2014). We also developed methods to estimate genetic effects and quantify the heritability for rare variants (Liu and Leal 2012e). Simulation programs, SimRare (Li, Wang and Leal 2012) and SeqPower (Wang et al. 2014a), were developed in order to evaluate power and type I error for rare variant association methods. The Variant Association Tools (VAT) pipeline was developed by my group to perform both quality control and association analysis of sequence, genotype and imputed data, with an emphasis on the analysis of rare variants (Wang et al. 2014b). Additionally, we investigated the best strategies to design rare variant studies (Li and Leal 2009) and to perform replication studies (Liu and Leal 2010b). Recently we published an article on analyzing rare variants in trio data, and these methods were applied to the analysis of autism (He et al. 2014). We are continuing to develop rare variant association and linkage (parametric and allele sharing) methods for analyzing family data. Some of these methods are being developed specifically for a project on autism, for example methods that incorporate information from both transmitted and de novo events.

On the applied side, I have studied a wide variety of phenotypes including autism, platelet reactivity, bipolar disorder, coronary diseases (i.e. LVOTO and aneurysm and dissection) and non-syndromic hearing impairment (NSHI). The study on NSHI, for which I am the principal investigator, has been funded for the past 18 years by the National Institute on Deafness and Other Communication Disorders. This research is currently funded by two National Institutes of Health (NIH) R01 grants for which I am the sole principal investigator. Over 650 families with NSHI from Pakistan, USA, Jordan, Switzerland, Poland and Turkey have been ascertained, leading to the identification and publication of >20 new NSHI loci as well as eleven novel NSHI genes. This study now utilizes exome sequencing, which has greatly enhanced the speed of novel gene identification. A number of genes have been identified using exome sequence data, including KARS (Santos-Cortez et al. 2013), TBC1D24 (Rehman et al. 2014) and ADCY1 (Santos-Cortez et al. 2014). In addition to studying NSHI, I have recently expanded my research to several other Mendelian traits, with phenotypes that involve the eyes, skin, nails, brain and bone. Thus far we have collected 85 families, for which we are currently mapping these traits and following up the findings with exome sequencing.

I am also currently involved in the analysis of several large-scale next-generation sequencing projects: National Heart, Lung and Blood Institute (NHLBI) - Exome Sequencing Project (ESP), Mendelian Exome Sequencing Project, Minority Health-GRID Network on Hypertension and the Colorectal Cancer Susceptibility Genome Sequencing Project. I led the statistical analysis of the NHLBI-ESP project, one of the first large-scale exome sequencing projects. Using data from ~6,800 exomes, analysis was performed on >70 heart, blood and lung related traits. Data from this project have also been used for the study of population genetics (Tennessen et al. 2012; Fu et al. 2013; O’Connor et al. 2013). I presented highlights of the findings from NHLBI-ESP at the plenary session of the American Society of Human Genetics meeting in 2012, and I am currently leading the writing team describing the major results from the study.

My research is currently supported by 10 grants from the NIH and a grant from the Department of Defense. I have published >170 peer-reviewed articles and I am either sole, first or senior author on >65 of these articles. I am an associate editor for the American Journal of Human Genetics and PLoS Genetics and serve on the program committee for the American Society of Human Genetics. I was previously a member of the board of directors for the International Society of Genetic Epidemiology (2007-2013) and was also the President of this society in 2012.

I am also involved in teaching and organizing courses in statistical genetics. For the past 19 years I have organized and taught the Advanced Gene Mapping Course, which is held annually at The Rockefeller University. This course is supported by a grant from the NIH for which I am the principal investigator. I also organize and teach statistical genetics courses at the Max Delbrück Center in Berlin. In recent years, I have also taught courses at Beijing University, Fudan University (Shanghai), Kyoto University, University of Helsinki, University of Toronto, University of Oslo, Erasmus University (Rotterdam), Max Planck Institute (Berlin) and Seoul National University. I am involved in training graduate students and postdoctoral fellows from Rice University and Baylor College of Medicine as well as mentoring a number of international pre-doctoral trainees.


Auer PL, Wang G, NHLBI Exome Sequencing Project, Leal SM (2013) Testing for rare variant associations in the presence of missing data. Genet Epidemiol 37:529-38. PMID: 23757187 PMCID: in progress

He A, O’Roak BJ, Smith JD, Wang G, Hooker S, Li B, Kan M, Krumm N, Nickerson DA, Shendure J, Eichler EE, Leal SM (2014) Rare variant extensions of the transmission disequilibrium test: Application to autism exome sequence data. Am J Hum Genet 94:33-46. PMID: 24360806

Fu W, O’Connor TD, Jun G, Kang HM, Abecasis G, Leal SM, Gabriel S, Altshuler D, Shendure J, Nickerson DA, Bamshad MJ, Population Genetics Working Group, Broad GO, Seattle GO, NHLBI Exome Sequencing Project, Akey JM (2013) Analysis of 6,515 exomes reveals a very recent origin of most human protein-coding variants. Nature 493:216-20 PMID: 23201682; PMCID (in progress)

Li B, Leal SM (2008) Methods for Detecting Associations with Rare Variants for Common Diseases: Application to Analysis of Sequence Data. Am J Hum Genet 83:311-21 PMID: 18691683; PMCID: PMC2842185

Li B, Leal SM (2009) Discovery of rare variants via sequencing: Implications for the design of complex trait association studies. PLoS Genet 5:e1000481 PMID: 19436704; PMCID: PMC2674213

Li B, Wang G, Leal SM (2012) SimRare: a program to generate and analyze sequence-based data for association studies of quantitative and qualitative traits. Bioinformatics 28:2703-4 PMID: 22914216; PMCID: PMC3467746

Liu DJ, Leal SM (2010a) A novel adaptive method for the analysis of next generation sequencing data to detect complex trait associations with rare variants due to gene main effects and interactions. PLoS Genet 6(10):e1001156 PMID: 20976247; PMCID: PMC2954824

Liu DJ, Leal SM (2010b) Replication strategies for rare variant complex trait association studies via next generation sequencing. Am J Hum Genet 87:790-801 PMID: 21129725; PMCID: PMC2997372

Liu DJ, Leal SM (2012a) SEQCHIP: A powerful method to integrate sequence and genotype data for the detection of rare variant associations. Bioinformatics 28:1745-51 PMID: 22556370; PMCID: PMC3381973

Liu DJ, Leal SM (2012b) A flexible likelihood framework for mapping secondary phenotypes in genetic studies using selected samples: application to sequence data. Eur J Hum Genet 20:449-56 PMID: 22166943; PMCID: PMC3306858

Liu DJ, Leal SM (2012c) A unified method for detecting secondary trait associations with rare variants: Application to sequence data. PLoS Genet 8:e1003075 PMID: 23166519; PMCID: PMC3499373

Liu DJ, Leal SM (2012d) A unified framework for detecting rare variant quantitative trait associations in pedigree and unrelated individuals via sequence data. Hum Hered 73:105-22 PMID: 22555759; PMCID: PMC3369372

Liu DJ, Leal SM (2012e) Estimating genetic effects and quantifying missing heritability explained by identified rare variant associations. Am J Hum Genet 91:585-96 PMID: 23022102; PMCID: PMC3484659

O’Connor TD, Fu W, NHLBI GO Exome Sequencing Project, ESP Population Genetics and Statistical Analysis Working Group, Mychaleckyj JC, Logsdon B, Auer P, Carlson C, Leal SM, Smith J, Rieder M, Bamshad MJ, Nickerson DA, Akey JM (2013) Rare variation facilitates inferences of fine-scale population structure in humans. PLoS One 8:e65834. PMID: 23861739; PMCID: PMC3701690

Rehman AU, Santos-Cortez RL, Morell RJ, Drummond MC, Ito T, Lee K, Khan A, Basra M, Wasif N, Ayub M, Ali R, Raza S, University of Washington Center for Mendelian Genomics, Nickerson DA, Shendure J, Bamshad M, Riazuddin S, Billington N, Khan SN, Friedman PL, Griffith AJ, Ahmad W, Riazuddin S, Leal SM, Friedman TB (2014) Mutations in TBC1D24, a gene associated with epilepsy, also cause nonsyndromic deafness DFNB86. Am J Hum Genet 94:144-52. PMID: 24387994; PMCID: PMC3882911

Santos-Cortez RL, Lee K, Azeem Z, Antonellis P, Pollock LM, Khan S, Irfanullah, Andrade-Elizondo PB, Chiu I, Adams MD, Basit S, Smith JD, University of Washington Center for Mendelian Genomics, Nickerson DA, McDermott Jr. BM, Ahmad W, Leal SM (2013) Mutations in KARS, encoding Lysyl-tRNA synthetase, cause autosomal recessive nonsyndromic hearing impairment DFNB89. Am J Hum Genet 93:132-40. PMID: 23768514; PMCID: PMC3710764

Santos-Cortez RLP, Lee K, Giese AP, Ansar M, Amin-ud-din M, Rehn K, Xin X, Aziz A, Chiu I, Ali RH, Smith JD, University of Washington Center for Mendelian Genomics, Shendure J, Bamshad M, Nickerson DA, Ahmed ZM, Ahmad W, Riazuddin S, Leal SM (2014) Adenylate Cyclase 1 (ADCY1) mutations cause recessive hearing impairment in humans and defects in hair cell function and hearing in zebrafish. Hum Mol Genet 23:3289-98 PMID: 24482543; PMCID: PMC4030782

Tennessen JA, Bigham AW, O’Connor TD, Fu W, Kenny EE, Gravel S, McGee S, Do R, Liu X, Jun G, Kang HM, Jordan D, Leal SM, Gabriel S, Rieder MJ, Abecasis G, Altshuler D, Nickerson DA, Boerwinkle E, Sunyaev S, Bustamante CD, Bamshad MJ, Akey JM, Broad GO, Seattle GO, on behalf of the NHLBI Exome Sequencing Project (2012) Evolution and functional impact of rare coding variation from deep sequencing of 2,440 human exomes. Science 337:64-9 PMID: 22604720; PMCID (in progress)

Wang GT, Li B, Santos-Cortez RLP, Peng B, Leal SM (2014a) Power analysis and sample size estimation for sequence-based association studies. Bioinformatics PMID: 24778108; PMCID (in progress)

Wang GT, Peng B, Leal SM (2014b) Variant association tools for quality control and analysis of large-scale sequence and genotyping array data. Am J Hum Genet 94:770-83 PMID: 24791902; PMCID: PMC4067555

Selected Publications