Dr. Sumeet Dua

Max P. and Robbie L. Watson Eminent Scholar Chair

Sireesha Krishna Guntaka (2006)

A Computational Framework for Autonomous Comparison of Protein Classification Schemas; MS-CS Practicum, Student: Sireesha Krishna Guntaka (2006)

The completed research assesses Orthoprot and Dihedprot, two separately proposed, novel techniques for protein structural comparison and classification. A computational framework has been developed to compare the performance of these two techniques against one another and against the Pride2 classification method. The objective of this framework (the protein mining engine) is to allow proteomic researchers to compare and analyze the strength of the protein classification techniques currently in use; further development is planned for the near future. To achieve this, the three classification techniques (Orthoprot, Dihedprot, and Pride2) have been ported to the World Wide Web using the Matlab webserver.
The Orthoprot classifier uses the secondary geometric descriptors of the phi dihedral angle and bond distance to represent a protein’s structure. The Dihedprot classifier employs two dihedral angles to represent a protein’s secondary structure. The Orthoprot classifier performs a wavelet analysis, whereas the Dihedprot classifier applies a two-dimensional Fast Fourier Transform; in both cases, only a specified number of coefficients is retained to represent each protein.
The first test of the protein mining engine used a 45-protein dataset to compare the strength of the Orthoprot and Dihedprot classifiers against that of Pride2. First, the classification results of the Orthoprot and Dihedprot classifiers and the distance matrix of Pride2 were obtained from the computational framework. Then a standalone program was used to compute the dendrogram, the percentage accuracy, and the kappa statistic from the Pride2 distance matrix.
The second test used an 80-protein dataset to compare the performance of the Orthoprot and Dihedprot classifiers against each other. To accomplish this, a dendrogram, a confusion matrix, the percentage accuracy, the kappa statistic, the false alarm rate, and the ROC plots for both classifiers were analyzed. In summary, the protein mining engine serves as a tool to compare and analyze the results of various protein structural comparison and classification techniques with those of the Orthoprot and Dihedprot classifiers.
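
As an illustration of two of the evaluation measures named above, the following minimal sketch computes the percentage accuracy and the kappa statistic from a confusion matrix. It assumes the kappa statistic is Cohen's kappa, which the abstract does not state explicitly. The function name `accuracy_and_kappa` and the example counts are hypothetical; they are not taken from the practicum or its results.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, kappa) for a square confusion matrix.

    confusion[i][j] is the number of proteins whose true class is i
    and whose predicted class is j. Kappa here is Cohen's kappa
    (an assumption; the source only says "kappa statistic").
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of proteins on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    # Undefined when expected == 1 (all proteins in a single class).
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class counts, for illustration only.
example = [[20, 5],
           [10, 15]]
accuracy, kappa = accuracy_and_kappa(example)
print(f"accuracy = {accuracy:.1%}, kappa = {kappa:.2f}")
```

Kappa corrects the raw accuracy for the agreement that would be expected by chance, which is why the two measures are usually reported together when classifiers are compared on the same dataset.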
