Delasa Aghamirzaie

Interpreting Cancer Patient Genomics Data to Assist Clinicians

Delasa Aghamirzaie, Erhan Bilal, Takahiko Koyama, Fang Wang, Filippo Utro, Kahn Rhrissorrakrai, Raquel Norel, Laxmi Parida, and Ajay Royyuru

Presentation Intern Seminar Series, IBM T.J. Watson Research Center, August 2015

CodeWise: An accurate support vector machine classifier for assessing coding potential of transcripts using several sequential and structural features

Delasa Aghamirzaie, Lenwood S. Heath, Ruth Grene and Eva Collakova

Presentation Biological Data Science meeting, November 2014, Cold Spring Harbor Laboratory.

Abstract

The emergence of next-generation sequencing has revealed significant levels of transcriptional activity within both unannotated and annotated regions of the genome, leading to the construction of novel transcripts. These novel transcripts may be located in genic regions (antisense, overlapping intronic, or overlapping exonic) or in intergenic regions, and they may be coding or noncoding. Hence, one of the main tasks is to functionally characterize these novel transcripts and to determine whether they are coding or noncoding (ncRNA). Although the functions of coding genes have been studied for many years, efforts to characterize the functions of noncoding transcripts have begun only recently. Several lines of evidence implicating ncRNAs in the control of development, growth, and disease have been reported. ncRNAs can act through different mechanisms, such as chromatin modification (epigenetic control of gene expression), RNA-protein interactions, and transcriptional interference. We propose a support vector machine classifier that classifies novel transcripts as coding or noncoding with over 98% accuracy. Several sequential and structural features were compiled for training the classifier. The classifier has been used to classify novel assembled transcripts from RNA-sequencing pipelines for soybean and Arabidopsis, and it can be adapted to any species.
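As a rough illustration of what "sequential features" for coding-potential classification can look like, the sketch below computes three simple quantities from a transcript sequence. The feature choice (longest forward-strand ORF, ORF coverage, GC content) and all function names are assumptions made for this example; the abstract does not list the features actually used by the classifier.

```python
# Illustrative sequence features for coding-potential classification.
# ASSUMPTION: these features and names are hypothetical examples, not the
# feature set of the classifier described in the abstract.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Length (nt) of the longest ATG..stop ORF on the forward strand."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
    return best


def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sequence_features(seq: str) -> dict:
    """Bundle the features into a dict that could feed a classifier."""
    orf = longest_orf_length(seq)
    return {
        "length": len(seq),
        "longest_orf": orf,
        "orf_coverage": orf / len(seq) if seq else 0.0,
        "gc_content": gc_content(seq),
    }
```

Feature vectors of this kind would then be passed to a classifier such as a support vector machine; long, well-covered ORFs tend to indicate coding transcripts, which is why ORF-based features are a common starting point.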

Alternative Splicing During Soybean Embryo Development

Ruth Grene, Delasa Aghamirzaie

Presentation 2nd Plant Genomics Congress, September 2014, St. Louis, MO.

Abstract

Developing soybean seeds accumulate oils, proteins, and carbohydrates that are used as oxidizable substrates providing metabolic precursors and energy during seed germination, in a process that has been intensively studied at the biochemical, but not yet at the genomic level. Seed maturation also involves highly regulated processes that are only partially understood. RNA sequencing was used to provide comprehensive information concerning transcriptional and post-transcriptional events that take place in developing soybean embryos. Distinct classes of alternatively spliced isoforms were detected and corresponding changes in their levels on a global scale during soybean embryo development were distinguished, using bioinformatics tools. Novel and known splice variants (SV) involved in various metabolic and developmental processes, including central carbon and nitrogen metabolism, induction of maturation and dormancy, and splicing itself were identified. The SVs were analyzed in further detail for their coding potential, conservation, and their protein domains using machine-learning tools. Coding and noncoding SVs were detected, including transcripts where alterations in individual domains had occurred over the time-course of embryo development. Changes in subcellular localization of the resulting proteins, protein-protein, enzyme-substrate interactions, and/or regulation of protein activities in developing oilseed embryos may occur as a result of these alternative splicing activities.

Transcriptome-Wide Functional Characterization Reveals Novel Relationships Among Differentially Expressed Transcripts In Developing Soybean Embryos

Delasa Aghamirzaie, Dhruv Batra, Lenwood S. Heath, Andrew Schneider, Ruth Grene and Eva Collakova

Journal Paper Accepted, BMC Genomics.

Abstract

Background

Transcriptomics reveals the existence of transcripts of different coding potential and strand orientation. Alternative splicing (AS) can yield proteins with altered number and types of functional domains, suggesting the global occurrence of transcriptional and post-transcriptional events. Many biological processes, including seed maturation and desiccation, are regulated post-transcriptionally (e.g., by AS), leading to the production of more than one coding or noncoding sense transcript from a single locus.

Results

We present an integrated computational framework to predict isoform-specific functions of plant transcripts. This framework includes a novel plant-specific weighted support vector machine classifier called CodeWise, which predicts the coding potential of transcripts with over 96% accuracy, and several other tools enabling global sequence similarity, functional domain, and co-expression network analyses. First, this framework was applied to all detected transcripts (103,106), of which 13% were predicted by CodeWise to be noncoding RNAs in developing soybean embryos. Second, to investigate the role of AS during soybean embryo development, a population of 2,938 alternatively spliced and differentially expressed splice variants was analyzed and mined with respect to timing of expression. Conserved domain analyses revealed that AS resulted in global changes in the number, types, and extent of truncation of functional domains in protein variants. Isoform-specific co-expression network analysis using ArrayMining and clustering analyses revealed specific sub-networks and potential interactions among the components of selected signaling pathways related to seed maturation and the acquisition of desiccation tolerance. These signaling pathways involved abscisic acid- and FUSCA3-related transcripts, several of which were classified as noncoding and/or antisense transcripts and were co-expressed with corresponding coding transcripts. Noncoding and antisense transcripts likely play important regulatory roles in seed maturation- and desiccation-related signaling in soybean.

Conclusions

This work demonstrates how our integrated framework can be implemented to make experimentally testable predictions regarding the coding potential, co-expression, co-regulation, and function of transcripts and proteins related to a biological process of interest.
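The co-expression step mentioned in the Results can be pictured with a minimal sketch: transcripts whose expression profiles over the time course are strongly correlated (positively or negatively) are linked by an edge. This is a generic Pearson-correlation sketch, not the ArrayMining analysis used in the paper; the function names, the 0.9 threshold, and the transcript identifiers are all hypothetical.

```python
# Minimal co-expression sketch based on Pearson correlation.
# ASSUMPTION: threshold, names, and data layout are illustrative only;
# this does not reproduce the ArrayMining-based analysis in the paper.
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(vx * vy)


def coexpression_edges(profiles, threshold=0.9):
    """Return (a, b, r) for every transcript pair with |r| >= threshold."""
    names = sorted(profiles)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges
```

Keeping negative correlations (via the absolute value) matters for the antisense case discussed above, where a noncoding transcript may vary inversely with its coding partner.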

Changes in RNA Splicing in Developing Soybean (Glycine max) Embryos

Delasa Aghamirzaie, Mahdi Nabiyouni, Yihui Fang, Curtis Klumas, Lenwood S. Heath, Ruth Grene and Eva Collakova

Journal Paper Biology 2013, 2(4), 1311-1337; doi:10.3390/biology2041311

Abstract

Developing soybean seeds accumulate oils, proteins, and carbohydrates that are used as oxidizable substrates providing metabolic precursors and energy during seed germination. The accumulation of these storage compounds in developing seeds is highly regulated at multiple levels, including the transcriptional and post-transcriptional levels. RNA sequencing was used to provide comprehensive information about transcriptional and post-transcriptional events that take place in developing soybean embryos. Bioinformatics analyses led to the identification of different classes of alternatively spliced isoforms and corresponding changes in their levels on a global scale during soybean embryo development. Alternative splicing was associated with transcripts involved in various metabolic and developmental processes, including central carbon and nitrogen metabolism, induction of maturation and dormancy, and splicing itself. Detailed examination of selected RNA isoforms revealed alterations in individual domains that could result in changes in subcellular localization of the resulting proteins, protein-protein and enzyme-substrate interactions, and regulation of protein activities. Different isoforms may play an important role in regulating developmental and metabolic processes occurring at different stages in developing oilseed embryos.

Metabolic and Transcriptional Reprogramming in Developing Soybean (Glycine max) Embryos

Eva Collakova, Delasa Aghamirzaie, Yihui Fang, Curtis Klumas, Farzaneh Tabataba, Akshay Kakumanu, Elijah Myers, Lenwood S. Heath and Ruth Grene

Journal Paper Metabolites 2013, 3(2), 347-372; doi:10.3390/metabo3020347

Abstract

Soybean (Glycine max) seeds are an important source of seed storage compounds, including protein, oil, and sugar used for food, feed, chemical, and biofuel production. We assessed detailed temporal transcriptional and metabolic changes in developing soybean embryos to gain a systems biology view of developmental and metabolic changes and to identify potential targets for metabolic engineering. Two major developmental and metabolic transitions were captured enabling identification of potential metabolic engineering targets specific to seed filling and to desiccation. The first transition involved a switch between different types of metabolism in dividing and elongating cells. The second transition involved the onset of maturation and desiccation tolerance during seed filling and a switch from photoheterotrophic to heterotrophic metabolism. Clustering analyses of metabolite and transcript data revealed clusters of functionally related metabolites and transcripts active in these different developmental and metabolic programs. The gene clusters provide a resource to generate predictions about the associations and interactions of unknown regulators with their targets based on “guilt-by-association” relationships. The inferred regulators also represent potential targets for future metabolic engineering of relevant pathways and steps in central carbon and nitrogen metabolism in soybean embryos and drought and desiccation tolerance in plants.

A Highly Parallel Multi-class Pattern Classification on GPU

Mahdi Nabiyouni, Delasa Aghamirzaie

Conference Paper Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)

Abstract

Multi-class pattern classification has a variety of applications and can be achieved using artificial neural networks (ANNs). There are two major system architectures for using ANNs in multi-class pattern classification: a single ANN or multiple ANNs. Regardless of the architecture, a main concern is that training time increases dramatically as the number of pattern classes and the size of the training datasets grow, which can render the ANN infeasible. In this paper, the vast computational power of Graphics Processing Units (GPUs) is utilized to mitigate this problem. Different architectures and different methods of feeding pattern classes are implemented on a GPU platform. Different methods are proposed to achieve maximum parallelism and thereby maximize throughput. Our implementation exceeds the state of the art in the literature in terms of speed and the accurate use of GPU resources. As a result, the run time of the proposed approach is about 75% shorter than that of previous approaches. In the multi-ANN architecture, due to the inherent parallelism of the proposed implementation, the execution time of a digit recognition system is reduced from seven hours on a CPU to about 4 seconds on a GPU.

Reduction of process variation effect on FPGAs using multiple configurations

Delasa Aghamirzaie, Seyyed Ahmad Razavi, Morteza Saheb Zamani, Mahdi Nabiyouni

Conference Paper VLSI System on Chip Conference (VLSI-SoC), 2010 18th IEEE/IFIP

Abstract

In recent years, parameter variations have presented critical challenges for the manufacturability and yield of integrated circuits. In this paper, a new method is proposed for improving the timing yield of field programmable gate array (FPGA) devices affected by random and systematic within-die variation. By selecting an appropriate configuration from a set of functionally equivalent configurations, the average critical path delay is reduced under conditions of large random and systematic variation, taking spatial correlation into account. Compared to the previous approach, which is limited to a fixed placement, our method improves timing yield by attempting several placements and routings, without lengthy placement and routing phases, to handle systematic variations and spatial correlation. The average critical path delay is reduced by 7% compared to the previous work over 20 MCNC benchmarks.
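The core idea of choosing among functionally equivalent configurations can be sketched in a few lines: for each manufactured chip, pick the configuration whose critical path delay on that chip is smallest, then compare the resulting average with what a single fixed configuration would give. This is only a sketch of the selection step under the assumption that per-chip delays are already known; the function names are hypothetical, and how the paper generates the configurations or estimates delays is not shown here.

```python
# Per-chip selection among functionally equivalent FPGA configurations.
# ASSUMPTION: each chip is a dict mapping a configuration name to its
# critical path delay on that chip; names and layout are hypothetical.


def best_configuration(delays_by_config):
    """Return (name, delay) of the fastest configuration on one chip."""
    name = min(delays_by_config, key=delays_by_config.get)
    return name, delays_by_config[name]


def average_delay_fixed(chips, config):
    """Average critical path delay when every chip uses the same configuration."""
    return sum(chip[config] for chip in chips) / len(chips)


def average_delay_selected(chips):
    """Average critical path delay when each chip uses its own best configuration."""
    return sum(best_configuration(chip)[1] for chip in chips) / len(chips)
```

Because the per-chip minimum can never exceed the delay of any one fixed choice, the selected average is at most the best fixed average; the gain grows as within-die variation makes different chips favor different configurations.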

Expresso: Exploring Arabidopsis’ transcription factors and target genes from ChIP-Seq data

Delasa Aghamirzaie, Shuchi Wu, Karthik Raja Velmurugan and Ruth Grene

Poster Biological Data Science meeting, Cold Spring Harbor Laboratory, November 2014.

Expresso: Exploring Arabidopsis’ transcription factors and target genes from ChIP-Seq data

Delasa Aghamirzaie, Shuchi Wu, Karthik Raja Velmurugan and Ruth Grene

Poster 2nd Plant Genomics Congress, September 2014, St. Louis, MO.

Changes in RNA Splicing in Soybean Seed Embryos

Delasa Aghamirzaie, Lenwood S. Heath, Ruth Grene and Eva Collakova

Poster q-bio conference, August 2014, Santa Fe, New Mexico.

Changes in RNA Splicing in Soybean Seed Embryos

Delasa Aghamirzaie, Lenwood S. Heath, Ruth Grene and Eva Collakova

Poster PPWS Symposium, April 2014, Best Poster Award

Modeling and Identifying Regulatory Modules in Soybean (Glycine max) Time Series Gene Expression Data using Bayesian Networks

Delasa Aghamirzaie, Dhruv Batra, Eva Collakova, Lenwood S. Heath and Ruth Grene

Poster International Conference on Computational Cell Biology (ICCCB 2013), August 2013, Blacksburg, Virginia.