Gene expression prediction in Brassica napus using deep learning

Kevin Rockenbach

Associated student, JLU

Deep learning has been shown to be a useful tool for a multitude of tasks involving complex pattern recognition, including biological predictions from genomic data. My PhD project aims to develop deep learning models that predict crop gene expression directly from genomic sequence. Prediction tasks include overall expression levels, measured as median transcript abundances, as well as tissue-specific expression patterns. Brassica napus serves as a model organism for implementing such prediction models in crop plants with complex polyploid genomes. While polyploidy poses challenges to attaining high-quality ground truth data on which deep learning models can be trained, a well-fitted model may elucidate some of the mechanisms determining homoeolog-specific gene expression and may provide valuable insight into the regulatory architecture of the Brassica napus genome. Ultimately, we want to understand the extent to which promoters and other elements contribute to gene expression in Brassica napus, and we plan to use our models to predict the impact of small-scale variants as well as structural variation on gene expression. Possible future applications include screening for genomic regions and genotypes of interest for breeding purposes and, perhaps, generating novel synthetic regulatory sequences.
