Computational tools for metabolic modeling and gene duplication analysis
This thesis presents new computational methods to analyse both the short- and long-term effects of temperature increase on biological systems. First, we consider how an organism acclimates to increased temperatures on short timescales. We develop AccliNet, a novel network regression method based on acclimation times, which uses prior knowledge of functional links between genes to improve the performance of the algorithm. AccliNet is compared with existing algorithms and shown to improve on them. Next, we examine the organism's metabolic response to changing temperatures in more detail, and develop methods to model and simulate the fluxes of metabolites through a metabolic network. In particular, we construct a simplified model of aerobic respiration for an Antarctic species and, given a gene expression dataset across different temperatures, develop two machine learning approaches to model the fluxes through the metabolic network. The first approach is based on denoising autoencoders; it is compared with a traditional Bayesian inference approach and found to be more accurate. The second approach models the unknown data distributions using a Generative Adversarial Network (GAN) to learn a stochastic differential equation (SDE) path through the sampled data points. It is compared with the autoencoder approach, as well as with other algorithms, and is found to have similar accuracy but to be less robust to noise than the autoencoder approach. Lastly, we consider the long-term effects of changing temperatures on biological systems. In particular, we develop a novel package for phylogenetic analysis, called PhylSim, which allows the simulation and study of adaptation and evolution under different climate change scenarios.
We apply the package to the case of adaptation of Antarctic species to their environment in recent evolutionary history. The work in this thesis was carried out in collaboration with the British Antarctic Survey, and used genetic datasets of Antarctic organisms, although the methods developed here are general and can be readily applied to other datasets as well. Thus, the proposed modeling framework holds some promise for tackling important problems in the future, in areas ranging from bioinformatics to environmental science.