BS: University of Illinois, 1989, Mathematics/Theory of Computation
MAJOR ACADEMIC/RESEARCH INTEREST
Statistical Methods of Analyzing Gene Expression Data
Expression data are generated by hybridizing transcripts from tissues under controlled conditions to microarrays or gene chips. If one gene regulates (up or down) another gene, or both are involved in a biochemical pathway, their expression profiles over time will correlate. Expression data are often analyzed using clustering procedures: clusters represent sets of genes displaying coordinately regulated expression profiles. Because expression data contain significant amounts of random variation, and because clusters depend on the procedure applied, it is useful to assign confidence measures to clusters. Specifically, we have implemented an algorithm in the statistical programming language R that assigns confidence measures to groupings of genes obtained by clustering routines. Permutation testing and convex hull methods are used to simulate pseudo-random gene expression data sets, and statistics obtained from these randomly generated sets provide a basis for comparison with the original data.
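The permutation idea can be illustrated with a minimal Python sketch. It is not the R implementation described above, and it leaves out the convex hull step. The function names, the tightness statistic (mean pairwise Pearson correlation), and the choice to shuffle each gene's time points independently are all assumptions made for illustration.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation information
    return sxy / (sxx * syy) ** 0.5

def cluster_tightness(profiles):
    """Mean pairwise correlation within one cluster (needs >= 2 profiles)."""
    n = len(profiles)
    return mean(
        pearson(profiles[i], profiles[j])
        for i in range(n)
        for j in range(i + 1, n)
    )

def permutation_confidence(profiles, n_perm=1000, seed=0):
    """Fraction of permuted data sets that are less tight than the observed cluster.

    Assumption: each permuted data set shuffles the time points of every gene
    independently, which removes coordinated regulation while keeping each
    gene's own set of values.
    """
    rng = random.Random(seed)
    observed = cluster_tightness(profiles)
    below = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(p, len(p)) for p in profiles]
        if cluster_tightness(shuffled) < observed:
            below += 1
    return below / n_perm

# Hypothetical cluster of three co-regulated genes over six time points.
cluster = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
]
confidence = permutation_confidence(cluster, n_perm=200)
print(f"tightness={cluster_tightness(cluster):.3f}, confidence={confidence:.2f}")
```

A confidence near 1 means that randomly generated data rarely produce a cluster as coherent as the observed one.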
My contribution to the GeneX OpenSource gene expression database and software system [http://sourceforge.net/projects/genex] consists of several gene expression normalization and analysis programs, two of which present a novel approach to clustering techniques. These methods are being generalized for application to microarray data generated on different technology platforms (Affymetrix, NimbleGen™, and custom two-color cDNA arrays). Enhancements are being made to include metrics that provide the researcher with more biologically meaningful results.
Experimental Design and Normalization Methods
As genetic data continue to accumulate rapidly, there is a need for immediate data analysis methods to assess experiments while they are in progress. Properties of the experimental design, which provide control and understanding of the sources of variation in both signal and noise, affect the manner in which data should be analyzed and appropriate models constructed. The ultimate aim of any gene expression data analyst is to be involved in the experimental design of the microarray. Too often, experiments are placed in the analyst's hands without proper design. Poorly designed experiments most often yield meaningless results, and always increase the effort (and creativity) required of the analyst. I am currently developing several experimental designs for plant and human array experiments, with several sets of both positive and negative controls, and am assessing their performance within different experiments.
Graph-theoretic Modeling of Temporal Gene Expression Data
The analysis of large amounts of microarray data is a significant challenge for the researcher. The parallel assay of thousands of data points, not all of which are independent, across a number of temporal states, provides an interesting platform for statistical analyses and the construction of models. Identifying clusters within temporal gene expression profiles is equivalent to finding patterns in time series data. Although standard hierarchical clustering techniques can be applied to this type of data, no standard tools to identify such patterns exist. I have developed a graph-theoretic approach for constructing putative functional network models that suggest hypotheses about functions of unknown genes. This technique has been applied to several experiments of Dr. John Cushman at the University of Nevada, Reno, with promising results. Specifically, the experiments measure the expression levels of the common ice plant, Mesembryanthemum crystallinum, under abiotic stress. Ice plant is a facultative halophyte, which can shift from C3 to Crassulacean acid metabolism (CAM) photosynthesis in response to environmental stress conditions such as water stress or hypersalinity. By understanding the complex adaptive mechanisms of this plant, a long-term goal is the deployment of these processes in agriculturally important crops to improve drought and salinity tolerance. An innovative distance metric is under development to provide a measure of similarity between any pair of genes in a more biologically grounded manner than commonly used distance metrics. Using these similarity relations, a bi-directional graph is generated by connecting genes based on their degree of similarity. From this graph one can detect "clusters" within the structure of the graph's connectivity.
These clusters provide hypotheses of gene function and interaction, and guide the association of genes with biochemical pathway changes involved in stress responses and adaptive mechanisms of the organism under study. An ongoing study also focuses on the post-analysis findings and the biological meaning behind clusters, an often-neglected step in microarray analysis.
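The graph construction described above can be sketched in a few lines of Python. Because the distance metric under development is not specified here, plain Pearson correlation stands in for it. The threshold, the gene names, and the use of connected components as "clusters" are likewise assumptions made for illustration rather than the method itself.

```python
from collections import deque
from statistics import mean

def pearson(x, y):
    """Pearson correlation, used here as a stand-in similarity measure."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def similarity_graph(profiles, threshold=0.9):
    """Connect two genes when the similarity of their profiles meets the threshold."""
    genes = list(profiles)
    graph = {g: set() for g in genes}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph

def connected_components(graph):
    """Breadth-first search; each component is treated as one putative cluster."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(component)
    return clusters

# Hypothetical temporal profiles (four time points per gene).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.5, 2.0],
    "geneE": [1.0, 5.0, 1.0, 5.0],
}
clusters = connected_components(similarity_graph(profiles))
print([sorted(c) for c in clusters])
# [['geneA', 'geneB'], ['geneC', 'geneD'], ['geneE']]
```

Connected components are the simplest notion of a cluster in such a graph. Denser substructures, such as cliques or highly connected subgraphs, would be a stricter alternative.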
Modeling Gene Interactions with Combinatorial Methods
Complex networks are often used to model hierarchical social, biological, or communication systems, as well as genetic systems. As a first approximation, Boolean networks are often used. As part of my research at the Virginia Bioinformatics Institute, Professor Reinhard Laubenbacher and I developed a method of encoding a Boolean network as a collection of simplicial complexes. We also established a combinatorial analogue of the homotopy theory of topological spaces to analyze these simplicial complexes. The resulting combinatorial invariants provide information on the dynamics of the network. By representing genetic relationships via (Boolean) network structures, applications of combinatorial homotopy theory may reveal overall network behavior and patterns of influence within and across gene subgroups.
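To make the notion of Boolean network dynamics concrete, the following Python sketch simulates a small synchronous Boolean network and enumerates its attractors. It does not implement the simplicial complex encoding or the combinatorial homotopy invariants mentioned above. The three-gene negative feedback loop and all function names are hypothetical.

```python
from itertools import product

def step(state, rules):
    """Apply every gene's Boolean rule to the current state simultaneously."""
    return tuple(rule(state) for rule in rules)

def attractor_from(state, rules):
    """Iterate from a start state until a state repeats; return the cycle reached."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, rules)
    return tuple(seen[seen.index(state):])

def all_attractors(rules):
    """Collect the distinct cycles reached from every possible start state."""
    attractors = set()
    for state in product((False, True), repeat=len(rules)):
        cycle = attractor_from(state, rules)
        # Rotate each cycle to start at its smallest state so that the same
        # cycle found from different start states is stored only once.
        k = cycle.index(min(cycle))
        attractors.add(cycle[k:] + cycle[:k])
    return attractors

# Hypothetical network: c activates a, a activates b, b represses c.
rules = [
    lambda s: s[2],      # a(t+1) = c(t)
    lambda s: s[0],      # b(t+1) = a(t)
    lambda s: not s[1],  # c(t+1) = NOT b(t)
]
for cycle in sorted(all_attractors(rules), key=len):
    print(len(cycle), ["".join("1" if v else "0" for v in s) for s in cycle])
```

For this example the eight states split into one cycle of length 2 and one of length 6, the kind of dynamical information that network invariants aim to summarize.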
Visualization of Microarray Gene Expression Data
An artificial heatmap of the intensity levels of a two-color cDNA microarray is generated for each channel, and for the background-corrected ratio values. This image allows the user to quickly determine whether any spatial variation appears on the array, or whether control spots are behaving as predicted. The tool is also applicable to high-density oligonucleotide arrays, such as those made by Affymetrix and NimbleGen™. This technique provides the researcher with a bird's-eye view of each array in the experiment. The software is written in the R programming language and is simple to use.
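As a rough illustration of the idea (the tool itself is written in R and produces an image), the Python sketch below computes background-corrected log ratios, places them at their physical spot positions, and renders a text-mode heatmap. The field names, the flooring of corrected signals, and the character ramp are assumptions made for this example.

```python
import math

def log_ratio(red, red_bg, green, green_bg, floor=1.0):
    """Background-corrected log2(red/green); corrected signals are floored to stay positive."""
    r = max(red - red_bg, floor)
    g = max(green - green_bg, floor)
    return math.log2(r / g)

def ratio_grid(spots, n_rows, n_cols):
    """Place each spot's log ratio at its physical (row, col) position on the array."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for s in spots:
        grid[s["row"]][s["col"]] = log_ratio(
            s["red"], s["red_bg"], s["green"], s["green_bg"]
        )
    return grid

def render(grid, limit=2.0, ramp=" .:-=+*#@"):
    """Map log ratios clipped to [-limit, limit] onto a character ramp, one array row per line."""
    lines = []
    for row in grid:
        chars = []
        for value in row:
            if value is None:
                chars.append("?")  # no measurement at this position
                continue
            clipped = max(-limit, min(limit, value))
            index = round((clipped + limit) / (2 * limit) * (len(ramp) - 1))
            chars.append(ramp[index])
        lines.append("".join(chars))
    return "\n".join(lines)

# Hypothetical 1 x 3 array: green-dominant, balanced, and red-dominant spots.
spots = [
    {"row": 0, "col": 0, "red": 350, "red_bg": 100, "green": 1100, "green_bg": 100},
    {"row": 0, "col": 1, "red": 600, "red_bg": 100, "green": 600, "green_bg": 100},
    {"row": 0, "col": 2, "red": 1100, "red_bg": 100, "green": 350, "green_bg": 100},
]
print(repr(render(ratio_grid(spots, 1, 3))))
```

On a full array, a gradient or blotch in the rendered grid would point to spatial variation rather than biology.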
Visualization of Haplotype Sharing and Fine Mapping using SNP Data
For the analysis of data stemming from our high-throughput genotyping experiments, we have developed a tool that automates the selection of SNPs for fine-mapping genetic associations. The tool generates a graph of genotypes from phased chromosomes that are grouped by haplotype via a hierarchical clustering approach to display long-range linkage disequilibrium patterns for a given allele of interest. We are currently using phased chromosome data from the HapMap project, and among other things, highlight those SNPs included on the Affymetrix 100K SNP GeneChip. These graphs make it possible to identify the haplotypes on which an associated SNP occurs and identify the region likely to contain the causative variant for a given association.
A separate module within HapMapper identifies SNPs that serve to distinguish haplotypes, as well as those in strong linkage disequilibrium with an associated allele, and those that are proxies for other SNPs in the region. These data are integrated into the visual display, aiding in the selection of SNPs for fine mapping haplotypes that contain the associated allele. The software is written in R and has been implemented for our use in fine-mapping several regions of interest.
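The identification of SNPs in strong linkage disequilibrium with an associated allele can be sketched as follows. This is a minimal Python illustration using the standard r² statistic on phased 0/1 haplotypes, not the HapMapper code. The function names, the 0.8 threshold, and the toy haplotypes are assumptions.

```python
def allele_freq(haplotypes, snp):
    """Frequency of the allele coded 1 at one SNP across phased chromosomes."""
    return sum(h[snp] for h in haplotypes) / len(haplotypes)

def r_squared(haplotypes, snp_a, snp_b):
    """Pairwise linkage disequilibrium (r^2) between two biallelic SNPs."""
    p_a = allele_freq(haplotypes, snp_a)
    p_b = allele_freq(haplotypes, snp_b)
    p_ab = sum(h[snp_a] * h[snp_b] for h in haplotypes) / len(haplotypes)
    d = p_ab - p_a * p_b  # the disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return 0.0 if denom == 0 else d * d / denom

def proxies(haplotypes, target, threshold=0.8):
    """Indices of SNPs whose r^2 with the target SNP meets the threshold."""
    n_snps = len(haplotypes[0])
    return [
        s for s in range(n_snps)
        if s != target and r_squared(haplotypes, target, s) >= threshold
    ]

# Hypothetical phased chromosomes: eight haplotypes typed at four SNPs.
haplotypes = [
    (1, 1, 0, 1),
    (1, 1, 0, 0),
    (1, 1, 1, 1),
    (1, 1, 1, 0),
    (0, 0, 0, 1),
    (0, 0, 0, 0),
    (0, 0, 1, 1),
    (0, 0, 1, 0),
]
print(proxies(haplotypes, target=0))  # [1]
```

Here SNP 1 always co-occurs with SNP 0 and so serves as a proxy for it, while SNPs 2 and 3 are independent of it.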
Cushman JC, Tillett RL, Wood JA, Branco JA, Schlauch KA. Large-scale mRNA expression profiling in the common ice plant, Mesembryanthemum crystallinum, performing C3 photosynthesis and Crassulacean acid metabolism (CAM). (2008). J. Exp. Botany. In press.
Deluc LG, Grimplet J, Wheatley MD, Tillett RL, Quilici DR, Osborne C, Schooley DA, Schlauch KA, Cushman JC, Cramer GR. Transcriptomic and metabolite analyses of Cabernet Sauvignon grape berry development. (2007). BMC Genomics. 2007 Nov 22;8(1):429 [Epub ahead of print]
Herbert, A, Lenburg, M, Ulrich, D, Gerry, N, Schlauch, K, Christman, M. Open access database of candidate quantitative-trait associations from a SNP-based genome-wide association study of the Framingham Heart Study. (2007). Nature Genetics, 39 (2), 135-136.
Grimplet, J, Deluc, L, Tillett, R, Wheatley, M, Schlauch, K, Cramer, G, Cushman, J. Tissue-specific mRNA expression profiling in grape berry tissues. (2007). BMC Genomics, 8, 187.
Vincent, D, Ergül, A, Bohlman, MC, Tattersall, EA, Tillett, RL, Wheatley, MD, Woolsey, R, Quilici, DR, Joets, J, Schlauch, K, Schooley, D, Cushman, JC, and Cramer, GR. Proteomic analysis reveals differences between Vitis vinifera L. cv. Chardonnay and cv. Cabernet Sauvignon and their responses to water deficit and salinity. (2007). Journal of Exp. Botany, 58:1873-1892.
Tattersall EA, Grimplet J, Deluc LG, Wheatley MD, Vincent D, Osborne C, Ergül A, Lomen E, Blank RR, Schlauch KA, Cushman JC, Cramer GR. (2007) Transcript abundance profiles reveal larger and more complex responses of grapevine to chilling as compared to osmotic and salinity stress. Funct. Int. Genomics. 7:317-33.
Cramer, G, Ergul, A, Grimplet, J, Tillett, R, Tattersall, E, Bohlmann, MC, Vincent, D, Sonderegger, J, Evans, J, Osborne, C, Quilici, D, Schlauch, K, Schooley, D and Cushman, J. Transcript and metabolite profiling of grapevines exposed to gradually increasing, long-term water deficit or isoosmotic salinity. (2006). Functional & Integrative Genomics, Online First.
Baranova, A, Gowder, S, Schlauch, K, Elariny, H, Collantes, R, Afendy, A, Ong, J, Goodman, Z, Chandhoke, V, Younossi, ZM. Gene Expression of Leptin, Resistin, and Adiponectin in the Adipose Tissue of Obese Patients with Non-Alcoholic Fatty Liver Disease and Insulin Resistance. (2006). Obesity Surgery, 16 (9), 1118-1125.
Baranova, A, Gowder, S, Naouar, S, King, S, Schlauch, K, Jarrar, M, Ding, Y, Cook, B, Chandhoke, V and Christensen, A. Expression profile of ovarian tumors: distinct signature of Sertoli-Leydig cell tumor. (2006). Int J Gynecol Cancer, 16, 1963-1962.
Espinoza, C, Vega, A, Medina, C, Schlauch, K, Cramer, G and Arce-Johnson, P. Gene expression associated with compatible viral diseases in grapevine cultivars. (2006). Functional and Integrative Genomics, Online First.
Cramer, GR, Ergul, A, Vincent, D, Bohlmann, C, Grimplet, J, Tattersall, EA, Tillett, R, Evans, J, Quilici, D, Schooley, D, Cushman, J, Schlauch, K, and Mendes, P. Integrative functional genomics of abiotically-stressed grapevine: A system for discovery of gene and plant functions. (2006). Proceedings of the International Grape Genomics Symposium, pp. 30-37. Editors: W. P. Qiu and L. G. Kovacs, Missouri State University, Springfield, Missouri.
Younossi, Z, Baranova, A, Ziegler, K, Del Giacco, L, Schlauch, K, Born, T, Elariny, H, Gorreta, F, VanMeter, A and Younoszai, A. (2005). A Genomic and Proteomic Study of the Spectrum of Non-alcoholic Fatty Liver Disease. Hepatology, 42, (3), 665-674.
Baranova, A, Collantes, R, Gowder, S, Elariny, H, Schlauch, K, Younoszai, A, King, S, Randhawa, M, Pusulury, S, Alsheddi, T, Ong, J, Martin, L, Chandhoke, V and Younossi, ZM. Obesity-related differential gene expression in the visceral adipose tissue. (2005). Obesity Surgery, 15 (6), 758-765.
Baranova, A, Schlauch, K, Gowder, S, Collantes, R, Chandhoke, V and Younossi, ZM. Microarray Technology in the Study of Obesity and Non Alcoholic Fatty Liver Disease. (2005). Liver International, 25: 1091-1096.
Younossi, ZM, Gorreta, F, Ong, JP, Schlauch, K, Del Giacco, L, Elariny, H, Van Meter, A, Younoszai, A, Goodman, A, Baranova, A, Christensen, A, Grant, G and Chandhoke, V. Hepatic Gene Expression in Patients with Obesity-related Non-Alcoholic Steatohepatitis. (2005). Liver International, 25, (4), 760-771.
Cramer, GR, Cushman, JC, Schooley, DA, Schlauch, K, Quilici, D, Vincent, D, Bohlman, MC, Ergul, A, Tattersall, EAR, Tillett, R, Evans, J and Delacruz, R. Progress in Bioinformatics: “The Challenge of Integrating Transcriptomic, Proteomic and Metabolomic Information”. (2005). Acta Horticulturae 689:417-425.
Davletova S, Schlauch K, Coutu J and Mittler, R. The zinc-finger protein Zat12 plays a central role in reactive oxygen and abiotic stress signaling in Arabidopsis. (2005). Plant Physiol. 139, 847-856.
Davletova, S, Rizhsky, L, Liang, H, Shengqiang, Z, Oliver, D, Coutu, J, Shulaev, V, Schlauch, K, and Mittler, R. Cytosolic Ascorbate Peroxidase 1 Is a Central Component of Reactive Oxygen Gene Network of Arabidopsis. (2005). The Plant Cell, 17:268-281.
Munneke, B, Schlauch, K, Simonsen, K, Beavis, WD and Doerge, RW. Adding Confidence to Gene Expression Clustering. (2005). Genetics, 170: 2003-2011.
Lee, JK, Laudeman, T, Kanter, J, James, T, Siadaty, MS, Knaus, WA, Prorok, A, Bao, Y, Freeman, B, Puiu, D, Wen, LM, Buck, G, Schlauch, K, Weller, J, and Fox, JW. GeneX Va: VBC Open Source Microarray Database and Analysis Software for Multiple Users in Biomedical Research. (2004). Biotechniques, 36:634-642.
Laubenbacher, R, and Schlauch, K. (2000). An Algorithm for the Quillen-Suslin Theorem for Quotients of Polynomial Rings by Monomial Ideals. Journal of Symbolic Computation, 30 (5), 555-571.
College of Agriculture, Biotechnology and Natural Resources
University of Nevada, Reno