Statistical Genetics Lab
The lab, headed by Dr. Yongtao Guan, is a computational lab whose primary research interest is developing statistical and computational methods to address problems arising in genetics and genomics. Current projects include:
- Structure of haplotypes and local ancestry inference
Motivated by approximating coalescence and recombination, we developed a two-layer hidden Markov model to detect the structure of (ancestral) haplotypes. This allows us to model two scales of linkage disequilibrium (LD) and thereby infer the local ancestry of admixed individuals. Our method has the following advantages compared to competing methods: 1) it works directly with diploid genotypes in the source populations; 2) it cleanly handles missing data; and 3) it has high resolution: it can detect ancestry tracts with lengths of a few tenths of a centimorgan. See the software page for the ELAI software and its manual.
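How an HMM can absorb missing data is easiest to see in a much simpler stand-in. The sketch below is not ELAI and is not the two-layer model: it is a single-layer HMM over haploid 0/1 alleles, with hypothetical names (`ancestry_posterior`, `freqs`, `switch`) and made-up allele frequencies. It only illustrates one standard way to marginalize a missing observation, namely giving it emission probability 1 in every hidden state.

```python
def ancestry_posterior(obs, freqs, switch):
    """Posterior ancestry at each marker for one haploid sequence.

    obs    : list of alleles, each 0, 1, or None (missing)
    freqs  : freqs[k][m] = frequency of allele 1 at marker m in population k
             (assumed strictly between 0 and 1)
    switch : assumed probability of an ancestry switch between adjacent markers
    """
    K, M = len(freqs), len(obs)

    def emit(k, m):
        if obs[m] is None:          # missing data: every state explains it equally
            return 1.0
        p = freqs[k][m]
        return p if obs[m] == 1 else 1.0 - p

    def trans(j, k):
        # Stay put with prob 1 - switch; otherwise redraw ancestry uniformly.
        return (1.0 - switch) * (j == k) + switch / K

    def normalize(v):
        s = sum(v)
        return [x / s for x in v]

    # Forward pass (rescaled at each marker to avoid underflow).
    fwd = [normalize([emit(k, 0) / K for k in range(K)])]
    for m in range(1, M):
        prev = fwd[-1]
        fwd.append(normalize([
            emit(k, m) * sum(prev[j] * trans(j, k) for j in range(K))
            for k in range(K)
        ]))

    # Backward pass (also rescaled).
    bwd = [[1.0] * K for _ in range(M)]
    for m in range(M - 2, -1, -1):
        bwd[m] = normalize([
            sum(trans(k, j) * emit(j, m + 1) * bwd[m + 1][j] for j in range(K))
            for k in range(K)
        ])

    # Combine into per-marker posteriors.
    return [normalize([fwd[m][k] * bwd[m][k] for k in range(K)]) for m in range(M)]


# Toy input: two populations with invented allele frequencies, one missing call.
toy_freqs = [[0.9] * 6, [0.1] * 6]
toy_obs = [1, 1, None, 0, 0, 0]
post = ancestry_posterior(toy_obs, toy_freqs, switch=0.1)
```

The missing call at the third marker needs no imputation step; its posterior is driven entirely by the neighboring markers through the transition probabilities.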
- Haplotype-phenotype association mapping
Using the two-layer model, we can define and infer local haplotype sharing (LHS) between cohort individuals. Then, using a random-effects model, we link phenotypes to the LHS at each marker to perform association mapping. Because LHS is inferred using local haplotypes, our method detects (unspecified) haplotype association. Our method has the following advantages: 1) it works directly with diplotypes; 2) it avoids arbitrariness in specifying haplotypes; and 3) it requires the same number of tests as single-SNP analysis. The haplotype association method can be extended to multiple related phenotypes. A visiting student, Hanli Xu, is working on this project.
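One common way to write a random-effects model of this kind is sketched below. The notation is an assumption for illustration and may differ from the lab's formulation: $y$ is the phenotype vector for $n$ individuals, $S_m$ is the $n \times n$ LHS matrix at marker $m$ (assumed positive semidefinite), and $\sigma_m^2$ and $\sigma_e^2$ are variance components.

```latex
y = \mu \mathbf{1} + u_m + \varepsilon, \qquad
u_m \sim \mathcal{N}\left(0, \sigma_m^2 S_m\right), \qquad
\varepsilon \sim \mathcal{N}\left(0, \sigma_e^2 I\right)
```

Under this sketch, testing marker $m$ for association amounts to testing $H_0: \sigma_m^2 = 0$, which gives one test per marker.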
- De novo assembly and variant calling
Having identified difficulties associated with the de Bruijn graph-based approach to de novo assembly, we are currently developing algorithms and software packages for a Monte Carlo approach. A postdoc, Liang Zhao, who is trained in computer science, is working on this project.
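The text does not itemize the difficulties, so the following is only a generic illustration of one well-known issue with de Bruijn graphs: a repeat longer than k - 1 bases makes a node branch, and the graph alone no longer determines a unique path. The function names and the toy read are hypothetical, and this is not the lab's Monte Carlo approach.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])   # edge: prefix -> suffix
    return graph


def branching_nodes(graph):
    """Nodes with more than one successor, i.e. points of ambiguity."""
    return sorted(node for node, nxt in graph.items() if len(nxt) > 1)


# Toy read in which "ATG" occurs twice, followed by different bases.
toy_reads = ["ATGCATGA"]
branches_k3 = branching_nodes(de_bruijn(toy_reads, 3))
branches_k5 = branching_nodes(de_bruijn(toy_reads, 5))
```

With k = 3 the node `TG` has two successors (`GC` and `GA`), while k = 5 spans the repeat and removes the branch. Real data rarely allow k to be raised this freely, because longer k-mers demand higher coverage and are more sensitive to sequencing errors.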
- Directed acyclic graphs and epistatic interactions
A directed acyclic graph (DAG) specifies a joint distribution on all nodes. When there is no epistatic interaction between the parental nodes of a node of interest, the direction of the edges can point either way with equal probabilities. If we penalize each edge that goes into the node of interest, the posterior edges that jointly point to the node must contain interactions that can overcome the penalty. This is the rationale behind using DAGs to detect interactions. Computation is the main challenge. Quan Zhou, a graduate student from the SCBMB program at Baylor, is working on this project.
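Two of these ideas can be made concrete in the simplest possible case, a single edge between two discrete nodes. The sketch below is an illustration under assumptions, not the lab's method: `max_loglik` and `penalized_score` are hypothetical helpers, the counts are invented, and a flat per-edge penalty stands in for whatever penalty is actually used. It shows that a DAG's likelihood factorizes as a product of P(node | parents), that A -> B and B -> A fit the data equally well, and that penalizing edges into the node of interest breaks that tie.

```python
import math
from collections import Counter


def max_loglik(data, parents):
    """Maximized log-likelihood of discrete data under a DAG.

    data    : list of dicts mapping node name -> observed value
    parents : dict mapping node name -> tuple of its parent node names
    Uses the factorization P(x) = prod over nodes of P(node | parents),
    with each conditional table set to its empirical frequencies.
    """
    total = 0.0
    for node, pa in parents.items():
        joint = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
        marg = Counter(tuple(row[p] for p in pa) for row in data)
        for (pa_vals, _), n in joint.items():
            total += n * math.log(n / marg[pa_vals])
    return total


def penalized_score(data, parents, target, penalty):
    """Log-likelihood minus a flat penalty per edge entering `target`."""
    return max_loglik(data, parents) - penalty * len(parents[target])


# Invented 2x2 table of counts for binary nodes A and B.
counts = [(0, 0, 30), (0, 1, 10), (1, 0, 5), (1, 1, 25)]
data = [{"A": a, "B": b} for a, b, n in counts for _ in range(n)]

a_to_b = {"A": (), "B": ("A",)}   # edge A -> B
b_to_a = {"B": (), "A": ("B",)}   # edge B -> A

ll_ab = max_loglik(data, a_to_b)
ll_ba = max_loglik(data, b_to_a)
score_ab = penalized_score(data, a_to_b, target="B", penalty=1.0)
score_ba = penalized_score(data, b_to_a, target="B", penalty=1.0)
```

Here `ll_ab` and `ll_ba` agree up to floating-point error, so the data alone cannot orient the edge, whereas `score_ab` is lower than `score_ba` because only A -> B enters the target node B.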
Students interested in statistical genetics are encouraged to e-mail Yongtao Guan, Ph.D., to discuss potential rotation projects.