Dr. Sumeet Dua

Max P. and Robbie L. Watson Eminent Scholar Chair

Sree Harsha Pothireddy (2006)

Discovery of Active Metabolic Paths Using Association Rules; MS-CS Practicum; Student: Sree Harsha Pothireddy (2006)

High-throughput experiments in molecular biology, enabled by increased computational power, have made a wide variety of gene-expression and proteomic data available for research and analysis. Analyzing multiple data types for a single purpose is an effective methodology for novel biological discovery. Although the complexity and heterogeneity of these data sources make comparison and integration harder, the flexibility of data mining techniques offers promise for rapid analysis of such disparate data. Determining the active metabolic paths within a large network of paths is one significant area of such research. Our work proposes a novel data mining methodology that determines active and biologically significant metabolic pathways, and patterns within a given metabolic pathway, by combining pathway information and gene-expression data using association rule theory. Most publicly available metabolic pathway databases represent metabolic processes diagrammatically; however, most of these diagrams do not indicate whether the presented paths are biologically active. By combining gene-expression data and metabolic pathway information, we are able to confirm long-range, biologically significant findings in several well-known metabolic pathways: the Pentose-Phosphate, Oxidative-Phosphorylation, Riboflavin Metabolism, and Purine Metabolism pathways.
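As a rough illustration of how association rules can link pathway structure to gene-expression data, the sketch below scores a rule "gene A expressed -> gene B expressed" for each pair of consecutive steps along a path, using the standard support and confidence measures. It is a minimal sketch under stated assumptions, not the procedure used in the practicum: the gene names, the toy experiments, the thresholds, and the helper names `rule_metrics` and `active_edges` are all hypothetical, and expression values are assumed to have already been discretized into "expressed" or "not expressed" per experiment.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    Each transaction is the set of genes called "expressed" in one experiment.
    """
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_ante = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n if n else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


def active_edges(path, transactions, min_support=0.5, min_confidence=0.8):
    """Keep consecutive gene pairs on `path` whose rule passes both thresholds.

    The thresholds are illustrative defaults, not values from the source.
    """
    kept = []
    for a, b in zip(path, path[1:]):
        support, confidence = rule_metrics(transactions, {a}, {b})
        if support >= min_support and confidence >= min_confidence:
            kept.append((a, b))
    return kept


# Hypothetical toy data: one set of expressed genes per experiment.
experiments = [
    frozenset({"g1", "g2", "g3"}),
    frozenset({"g1", "g2"}),
    frozenset({"g1", "g2", "g3"}),
    frozenset({"g4"}),
]
# Hypothetical path: genes encoding consecutive steps of one metabolic path.
path = ["g1", "g2", "g3", "g4"]

print(active_edges(path, experiments))
```

In this toy setting only the first step of the path passes both thresholds, which mirrors the idea that a diagram may list a full path while the expression data support only part of it.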
We compare our methodology with two major previous works on determining active pathways. Because we studied a wider range of paths and ranked our pathways according to the number of active sub-paths, our results provide a deeper and more accurate understanding of metabolic pathways. Rather than claiming that an entire pathway is active, we provide sufficient evidence in this work to show that only certain sub-paths within it are active. We have confirmed our results against KEGG and other reliable, federally funded database sources, which strengthens our claim that the pathways we obtained are biologically significant and active. In short, our method yields long-term, biologically significant facts through a simple and effective implementation of an association rule-based technique that combines disparate data.
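The ranking of pathways by their number of active sub-paths can be sketched as follows. This is an illustrative sketch only: it assumes that an "active sub-path" is a maximal run of consecutive steps whose links were judged active, which may differ from the definition used in the practicum. The pathway names, gene names, and the helpers `active_subpaths` and `rank_pathways` are hypothetical.

```python
def active_subpaths(path, active_edge_set):
    """Split `path` into maximal runs of consecutive active links."""
    subpaths, current = [], []
    for a, b in zip(path, path[1:]):
        if (a, b) in active_edge_set:
            if not current:
                current = [a]
            current.append(b)
        else:
            if current:
                subpaths.append(current)
            current = []
    if current:
        subpaths.append(current)
    return subpaths


def rank_pathways(pathways, active_edge_set):
    """Rank pathways by how many active sub-paths they contain (most first)."""
    counts = {
        name: sum(len(active_subpaths(p, active_edge_set)) for p in paths)
        for name, paths in pathways.items()
    }
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical input: each pathway is a list of paths (ordered gene lists).
pathways = {
    "pathway_A": [["g1", "g2", "g3", "g4", "g5"]],
    "pathway_B": [["g6", "g7", "g8"]],
}
# Hypothetical set of links already judged active, e.g. by a rule-based filter.
active = {("g1", "g2"), ("g2", "g3"), ("g4", "g5"), ("g6", "g7")}

print(rank_pathways(pathways, active))
```

Reporting the sub-paths themselves, rather than a single active or inactive label per pathway, keeps the result in line with the point above that only certain sub-paths within a pathway are active.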
