
Past Projects

Fall 2008-Spring 2009

Master's Thesis: Structural Pattern Detection and Domain Recognition for Protein Function Prediction

Supervised by

Proteins are essential components of the cell that control and affect all of its functions. In proteins, structural patterns consist of a few amino acids that assemble in a specific arrangement. Because of their specific structures, these patterns are recognized as functionally important sites and are conserved even in distantly related proteins. Moreover, several structural patterns merge to form domains, which are also associated with the protein's function.

In this work, we introduced a method for finding structural patterns common to a protein pair by using graphlet mappings. We represented protein structures as graphs and then generated graphlets. Local alignments were produced by mapping the generated graphlets between the two proteins. By merging these local alignments, we then tried to recognize functionally important domains.
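
As a rough illustration of the graph representation described above, the sketch below builds a residue contact graph and enumerates its connected 3-node graphlets. It is a minimal sketch under stated assumptions, not the thesis implementation: the 7.0 distance cutoff, the function names `contact_graph` and `three_node_graphlets`, and the toy coordinates are all hypothetical.

```python
from itertools import combinations
from math import dist


def contact_graph(coords, cutoff=7.0):
    """Build an undirected graph whose nodes are residues and whose edges
    join residues lying within `cutoff` of each other (assumed threshold)."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def three_node_graphlets(adj):
    """List connected 3-node subgraphs as (kind, nodes) pairs,
    where kind is 'triangle' (3 edges) or 'path' (2 edges)."""
    found = []
    for a, b, c in combinations(sorted(adj), 3):
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if edges == 3:
            found.append(("triangle", (a, b, c)))
        elif edges == 2:
            found.append(("path", (a, b, c)))
    return found


# Toy coordinates, one point per residue (hypothetical, not a real structure).
coords = [(0, 0, 0), (5, 0, 0), (0, 6, 0), (11, 0, 0)]
graph = contact_graph(coords)
graphlets = three_node_graphlets(graph)
```

Mapping graphlets between two proteins would then amount to pairing entries of the two `graphlets` lists that have the same kind and compatible residues; that matching step is not shown here.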

These common domains are very useful in protein function prediction, fold classification, and homology detection. Our algorithm was first applied to the fold classification problem, where 80% accuracy was observed. It was also used for protein function prediction, where 97% accuracy was observed.

Fall 2008 EnzyMiner: Automatic Identification of Protein Level Mutations and Their Impact on Target Enzymes from PubMed Abstracts

Supervised by Ugur Sezerman, Ph.D., Biological Sciences and Bio Engineering, Sabanci University

A better understanding of the mechanisms behind an enzyme's functionality and stability, as well as knowledge of mutations and their impact, is crucial for researchers working with enzymes. Although several enzyme databases are currently available, the scientific literature still remains, by and large, the up-to-date source for learning the effects of a mutation on an enzyme. However, going through vast amounts of scientific documents to extract information on a desired mutation has always been a time-consuming process. In this project, therefore, our aim was to develop a unique method, termed EnzyMiner, that automatically identifies the PubMed abstracts containing information on the impact of a protein-level mutation on the stability and/or activity of a given enzyme.

EnzyMiner identifies the abstracts that contain a protein mutation for a given enzyme and, with the help of information extraction and machine learning techniques, checks whether each abstract is related to a disease. For disease-related abstracts, the mutation list and direct links to the abstracts are retrieved from the system and displayed on the web. Abstracts that are not disease-related are, in addition to having their mutation list extracted, categorized into two groups according to whether the mutation affects the enzyme's stability or its functionality; these results are also displayed on the web.

Suveyda Yeniterzi and Ugur Sezerman, EnzyMiner: Automatic Identification of Protein Level Mutations and Their Impact on Target Enzymes from PubMed Abstracts, accepted for publication in BMC Bioinformatics.

Fall 2007 Developing A New Approach to Measure the Similarities of Protein Structures Using Network Properties
  • Reyyan Yeniterzi, M. Sc., Computer Science and Engineering, Sabanci University
  • Alper Kucukural, Ph.D. student, Biological Sciences and Bio Engineering, Sabanci University
  • Ugur Sezerman, Ph.D., Biological Sciences and Bio Engineering, Sabanci University
  • Nilay Noyan, Ph.D., Manufacturing Systems and Industrial Engineering, Sabanci University

Protein structure prediction is one of the most important research areas in bioinformatics. CASP is a worldwide experiment in this area that assesses the quality of the methods and results of international research. CASP evaluation is based on comparing each model with the corresponding native structure. In this work, we aimed to estimate a new function for measuring the similarity between model and native protein structures. Moments of graph-theoretical properties were used to derive a similarity measure between two protein structures, and multiple linear regression was applied to these graph properties to estimate the new function.
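
To make the idea of comparing structures through moments of graph properties more concrete, here is a minimal sketch that uses node degree as the graph property. The choice of degree, the three moments, and the names `degree_moments` and `moment_difference` are assumptions for illustration; the properties and moments actually used in the project are not listed here.

```python
def degree_moments(adj):
    """Return (mean, variance, skewness) of the node-degree distribution
    of a graph given as {node: set_of_neighbours}."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    n = len(degrees)
    mean = sum(degrees) / n
    var = sum((d - mean) ** 2 for d in degrees) / n
    if var == 0:
        skew = 0.0  # all degrees equal: no asymmetry to measure
    else:
        skew = sum((d - mean) ** 3 for d in degrees) / n / var ** 1.5
    return mean, var, skew


def moment_difference(adj_model, adj_native):
    """Absolute differences between the moments of two graphs.
    These could serve as inputs to a regression (assumed usage)."""
    pairs = zip(degree_moments(adj_model), degree_moments(adj_native))
    return tuple(abs(a - b) for a, b in pairs)


# Toy graphs (hypothetical): a triangle as "native", a 3-node path as "model".
native = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
model = {0: {1}, 1: {0, 2}, 2: {1}}
features = moment_difference(model, native)
```

A multiple linear regression would then fit one weight per entry of `features` against a known quality score for each model; that fitting step is left out of this sketch.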

Suveyda Yeniterzi, Reyyan Yeniterzi, Alper Kucukural, Nilay Noyan and Ugur Sezerman, A New Approach to Measure the Similarities of Protein Structures Using Network Properties, Proceedings of the 3rd International Symposium on Health Informatics and Bioinformatics (HIBIT08).

A New Approach to Measure the Similarities of Protein Structures Using Network Properties, poster presented at BIOSYSBIO 2008, Synthetic Biology, Systems Biology and Bioinformatics, April 20-22, 2008, London, UK.

Fall 2007 Determining the important career factors which affect the success of a manager as a CEO
  • Reyyan Yeniterzi, M. Sc., Computer Science and Engineering, Sabanci University
  • Ugur Sezerman, Ph.D., Biological Sciences and Bio Engineering, Sabanci University
  • Nilay Noyan, Ph.D., Manufacturing Systems and Industrial Engineering, Sabanci University
  • Ayse Karaevli, Faculty of Management, Sabanci University

We approached this problem in two ways. Our first method used factor analysis to examine the underlying structure among the variables. As the second approach, we used a genetic algorithm to find a subset of features that helps us classify CEOs more accurately.

Fall 2006 - Spring 2007
Graduation Project: Error Tolerance of Gene Networks and Their Robustness to Different Timescales

Supervised by Canan Atılgan, Ph.D., Material Science and Engineering, and Ali Rana Atılgan, Ph.D., Sabanci University

A gene network is composed of genes that interact with each other and thereby affect each other's expression levels in the cell. In this work, we implemented gene networks computationally and investigated the effect of random errors in the expression of genes. We further included genes that operate at different time scales, and we investigated how robust the gene networks are under these changes.

Fall 2006 - Spring 2007 Using Genetic Algorithms to Select the Minimum Number of Features for Classification
  • Reyyan Yeniterzi, M. Sc., Computer Science and Engineering, Sabanci University
  • Alper Kucukural, Ph.D. student, Biological Science and Engineering, Sabanci University
  • Ugur Sezerman, Ph.D., Biological Science and Engineering, Sabanci University

Selecting the most relevant factors from genetic profiles that can optimally characterize cellular states is of crucial importance in identifying complex disease genes and biomarkers for disease diagnosis. In this work, we present a genetic algorithm approach to the feature subset selection problem that can be used to select an optimal set of genes for classifying gene expression data. We implemented a dynamic parent generation procedure inspired by nature. The idea that fitter and fewer genes (features) make up fitter and more evolved parents enabled us to reduce the number of genes dynamically. In this way, we could obtain the optimal number of features with the highest classification accuracy for each data set.
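
The trade-off between classification accuracy and the number of selected features can be sketched with a plain genetic algorithm over bit masks. This is an illustrative sketch only: it uses a standard elitist GA with one-point crossover, not the dynamic parent generation procedure described above, and the per-feature `penalty`, the parameter defaults, and the `score_fn` callback (standing in for a classifier's accuracy) are all assumptions.

```python
import random


def fitness(mask, score_fn, penalty=0.01):
    """Score of the selected subset minus a small cost per selected feature,
    so that smaller subsets win when scores are equal."""
    return score_fn(mask) - penalty * sum(mask)


def select_features(n_features, score_fn, pop_size=20, generations=40,
                    mutation_rate=0.05, seed=0):
    """Evolve bit masks (1 = feature kept) and return the fittest one.
    Requires n_features >= 2 for the crossover cut point."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, score_fn), reverse=True)
        parents = pop[: pop_size // 2]            # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)    # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, score_fn))


def toy_score(mask):
    """Hypothetical stand-in for classifier accuracy:
    only features 0 and 3 are informative."""
    return (mask[0] + mask[3]) / 2


best = select_features(8, toy_score)
```

In practice `score_fn` would train and evaluate a classifier on the columns selected by the mask, which is far more expensive than the toy scorer used here.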

Alper Kucukural, Reyyan Yeniterzi, Suveyda Yeniterzi and Ugur Sezerman, Evolutionary Selection of Minimum Number of Features for Classification of Gene Expression Data Using Genetic Algorithms, Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO 2007).

Suveyda Yeniterzi, Reyyan Yeniterzi, Alper Kucukural and Ugur Sezerman, Feature Selection with Genetic Algorithms on Medical Data, Proceedings of the 2nd International Symposium on Health Informatics and Bioinformatics (HIBIT08).

Summer 2006 Determining the relationship between the secondary structure of proteins and their mutation rates during selection

Supervised by Serafim Batzoglou, Ph.D., Computer Science Department, Stanford University

Proteins with strong structural requirements evolve more slowly than proteins with weak constraints, because stringent negative selection pressure limits the number of substitutions. Therefore, selection associated with structural requirements is the main factor that determines the rate of protein evolution. In this work, we focused on the relationship between mutation rates, especially nonsynonymous substitution rates, and the secondary structure of proteins.

Spring 2006 Developed an Online Search Database for SU Sponsored Research Award/Proposal Projects
  • Reyyan Yeniterzi, M. Sc., Computer Science and Engineering, Sabanci University
  • Akdes Serin, Ph.D. student, International Max Planck Research School for Computational Biology and Scientific Computing

We developed an online search database application with the necessary administrative functions. Apache, PHP and MySQL were used.

Summer 2005 Transcription Factor Binding Site Determination Using Data Mining Methods

Transcription factors (TFs) control the expression levels of genes by binding to regulatory DNA sequences in the genome. Finding these regulatory sequences will enable the determination of TFs. We used data mining tools to find TF binding motifs. Using structural information from TF-DNA complexes, we performed association rule mining to determine the binding residues of a TF. By combining these rules, we built a predictor that can predict the binding site. Moreover, using the rules derived from the genomic sequences together with TF sequences, our algorithm is able to determine the possible regulatory motifs of a given TF.
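
Association rule mining ranks candidate rules by their support and confidence. The sketch below computes both for a single rule over a handful of observations. The feature labels, the `rule_stats` helper, and the toy data are hypothetical and only illustrate the two measures; they are not taken from the project's data.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.
    Each transaction is a set of items."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_ante = sum(antecedent <= t for t in transactions)
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    support = n_both / len(transactions)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical observations: one set of labels per residue position.
observations = [
    {"ARG", "major_groove", "binds"},
    {"ARG", "major_groove", "binds"},
    {"ARG", "surface"},
    {"LEU", "core"},
]
support, confidence = rule_stats(
    observations, {"ARG", "major_groove"}, {"binds"}
)
```

Here the rule holds in two of the four observations (support 0.5) and in every observation where its antecedent appears (confidence 1.0). A predictor could keep only the rules whose support and confidence exceed chosen thresholds.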

Transcription Factor Binding Site Determination Using Data Mining Methods, poster presented at FEBS 2006, Federation of European Biochemical Societies, June 24-29, 2006, Istanbul, Turkey.

 

 

 