his.se Publications
1 - 41 of 41
  • 1.
    Anders, Patrizia
    University of Skövde, School of Humanities and Informatics.
    A bioinformaticians view on the evolution of smell perception. 2006. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Background:

    The origin of vertebrate sensory systems still contains many mysteries and thus challenges to bioinformatics. Especially the evolution of the sense of smell maintains important puzzles, namely the question whether or not the vomeronasal system is older than the main olfactory system. Here I compare receptor sequences of the two distinct systems in a phylogenetic study, to determine their relationships among several different species of the vertebrates.

    Results:

    Receptors of the two olfactory systems share little sequence similarity and prove to be a challenge in multiple sequence alignment. However, recent dramatic improvements in alignment tools allow for better results and high confidence. Different strategies and tools were employed and compared to derive a high-quality alignment that holds information about the evolutionary relationships between the different receptor types. The resulting Maximum-Likelihood tree supports the theory that the vomeronasal system is an ancestor of the main olfactory system rather than an evolutionary novelty of tetrapods.

    Conclusions:

    The connections between the two systems of smell perception might be much more fundamental than the common architecture of receptors. A better understanding of these parallels is desirable, not only with respect to our view on evolution, but also in the context of the further exploration of the functionality and complexity of odor perception. Along the way, this work offers a practical protocol through the jungle of programs concerned with sequence data and phylogenetic reconstruction.

  • 2.
    Andersson, Malin
    University of Skövde, Department of Computer Science.
    A method for identification of putatively co-regulated genes. 2002. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    The genomes of several organisms have been sequenced and the need for methods to analyse the data is growing. In this project a method is described that tries to identify co-regulated genes. The method identifies transcription factor binding sites, documented in TRANSFAC, in the non-coding regions of genes. The algorithm counts the number of common binding sites and the number of unique binding sites for each pair of genes and decides if the genes are co-regulated. The result of the method is compared with the correlation between the gene expression patterns of the genes. The method is tested on 21 gene pairs from the genome of Saccharomyces cerevisiae. The algorithm first identified binding sites from all organisms. The accuracy of the program was very low in this case. When the algorithm was modified to only identify binding sites found in plants the accuracy was much improved, from 52% to 76% correct predictions.
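    The pairwise counting step described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis implementation: the thresholds `min_common` and `max_unique_ratio`, the decision rule, and the example site lists are all assumptions made for the sketch, and the binding-site lookup against TRANSFAC is replaced by hand-written placeholder data.

```python
from itertools import combinations


def site_overlap(sites_a, sites_b):
    """Return (common, unique) binding-site counts for one gene pair.

    common: sites present in both genes; unique: sites present in only one.
    """
    a, b = set(sites_a), set(sites_b)
    return len(a & b), len(a ^ b)


def predict_coregulated(gene_sites, min_common=2, max_unique_ratio=1.0):
    """Hypothetical decision rule (an assumption, not taken from the thesis):
    call a pair co-regulated when it shares at least `min_common` sites and
    the unique sites do not outnumber `max_unique_ratio` times the common ones.
    """
    predictions = {}
    for g1, g2 in combinations(sorted(gene_sites), 2):
        common, unique = site_overlap(gene_sites[g1], gene_sites[g2])
        predictions[(g1, g2)] = (
            common >= min_common and unique <= max_unique_ratio * common
        )
    return predictions


# Placeholder data: gene names and site labels are invented for illustration.
gene_sites = {
    "geneA": ["site1", "site2", "site3"],
    "geneB": ["site1", "site2"],
    "geneC": ["site4"],
}
predictions = predict_coregulated(gene_sites)
print(predictions)
```

    Each prediction could then be compared against the correlation of the two genes' expression profiles, as the abstract describes for its 21 gene pairs.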

  • 3.
    Axelsson, K. F.
    et al.
    Department of Orthopaedic Surgery, Skaraborg Hospital, Skövde, Sweden / Geriatric Medicine, Department of Internal Medicine and Clinical Nutrition, Institute of Medicine, University of Gothenburg, Gothenburg, Sweden.
    Wallander, M.
    Geriatric Medicine, Department of Internal Medicine and Clinical Nutrition, Institute of Medicine, University of Gothenburg, Gothenburg, Sweden / Department of Medicine Huddinge, Karolinska Institute, Stockholm, Sweden.
    Johansson, H.
    Institute for Health and Ageing, Catholic University of Australia, Melbourne, Vic., Australia.
    Lundh, Dan
    University of Skövde, School of Health and Education. University of Skövde, Health and Education.
    Lorentzon, M.
    Geriatric Medicine, Department of Internal Medicine and Clinical Nutrition, Institute of Medicine, University of Gothenburg, Gothenburg, Sweden / Geriatric Medicine, Sahlgrenska University Hospital, Mölndal, Sweden.
    Hip fracture risk and safety with alendronate treatment in the oldest-old. 2017. In: Journal of Internal Medicine, ISSN 0954-6820, E-ISSN 1365-2796, Vol. 282, no 6, p. 546-559. Article in journal (Refereed)
    Abstract [en]

    Background. There is high evidence for secondary prevention of fractures, including hip fracture, with alendronate treatment, but alendronate's efficacy to prevent hip fractures in the oldest-old (≥80 years old), the population with the highest fracture risk, has not been studied. Objective. To investigate whether alendronate treatment amongst the oldest-old with prior fracture was related to decreased hip fracture rate and sustained safety. Methods. Using a national database of men and women undergoing a fall risk assessment at a Swedish healthcare facility, we identified 90 795 patients who were 80 years or older and had a prior fracture. Propensity score matching (four to one) was then used to identify 7844 controls to 1961 alendronate-treated patients. The risk of incident hip fracture was investigated with Cox models and the interaction between age and treatment was investigated using an interaction term. Results. The case and control groups were well balanced in regard to age, sex, anthropometrics and comorbidity. Alendronate treatment was associated with a decreased risk of hip fracture in crude (hazard ratio (HR) 0.62 (0.49-0.79), P < 0.001) and multivariable models (HR 0.66 (0.51-0.86), P < 0.01). Alendronate was related to reduced mortality risk (HR 0.88 (0.82-0.95)) but increased risk of mild upper gastrointestinal symptoms (UGI) (HR 1.58 (1.12-2.24)). The alendronate association did not change with age for hip fractures or mild UGI. Conclusion. In old patients with prior fracture, alendronate treatment reduces the risk of hip fracture with sustained safety, indicating that this treatment should be considered in these high-risk patients.

  • 4.
    Birkmeier, Bettina
    University of Skövde, School of Humanities and Informatics.
    Integrating Prior Knowledge into the Fitness Function of an Evolutionary Algorithm for Deriving Gene Regulatory Networks. 2006. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The topic of gene regulation is a major research area in the bioinformatics community. In this thesis prior knowledge from Gene Ontology in the form of templates is integrated into the fitness function of an evolutionary algorithm to predict gene regulatory networks. The resulting multi-objective fitness functions are then tested with MAPK network data taken from KEGG to evaluate their respective performances. The results are presented and analyzed. However, a clear tendency cannot be observed. The results are nevertheless promising and can provide motivation for further research in that direction. Therefore different ideas and approaches are suggested for future work.

  • 5.
    Chawade, Aakash
    University of Skövde, School of Humanities and Informatics.
    Inferring Gene Regulatory Networks in Cold-Acclimated Plants by Combinatorial Analysis of mRNA Expression Levels and Promoter Regions. 2006. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Understanding the cold acclimation process in plants may help us develop genetically engineered plants that are resistant to cold. The key to understanding this process is to study the genes, and thus the gene regulatory network, involved in cold acclimation. Most of the existing approaches [1-8] for deriving regulatory networks rely only on gene expression data. Since expression data is usually noisy and sparse, the networks generated by these approaches are usually incoherent and incomplete. Hence a new approach is proposed here that analyzes the promoter regions along with the expression data when inferring the regulatory networks. In this approach genes are grouped into sets if they contain similar over-represented motifs or motif pairs in their promoter regions and if their expression pattern follows the expression pattern of the regulating gene. The network thus derived is evaluated using known literature evidence, functional annotations and statistical tests.

  • 6.
    Chen, Lei
    University of Skövde, School of Humanities and Informatics.
    Construction of Evolutionary Tree Models for Oncogenesis of Endometrial Adenocarcinoma. 2005. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Endometrial adenocarcinoma (EAC) is the fourth leading cause of carcinoma in women worldwide, but not much is known about the genetic factors involved in this complex disease. During the EAC process, it is well known that losses and gains of chromosomal regions do not occur completely at random, but partly through some flow of causality. In this work, we used three different algorithms based on the frequency of genomic alterations to construct 27 tree models of oncogenesis. So far, no study applying pathway models to microsatellite marker data has been reported. Data from genome-wide scans with microsatellite markers were classified into 9 data sets, according to two biological approaches (solid tumor cell and corresponding tissue culture) and three different genetic backgrounds provided by intercrossing the susceptible rat BDII strain and two normal rat strains. In agreement with a previous study, the tree models indicate that three main regions (I, II and III) and two subordinate regions (IV and V) are likely to be involved in EAC development. The tree models also produced further information about these regions, such as their likely order and relationships. A high consistency among the tree models and the relationship among p19, Tp53 and Tp53 inducible protein genes provided supportive evidence for the reliability of the results.

  • 7.
    Dodda, Srinivasa Rao
    University of Skövde, School of Humanities and Informatics.
    Improvements and extensions of a web-tool for finding candidate genes associated with rheumatoid arthritis. 2005. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Quantitative Trait Locus (QTL) analysis is a statistical method used to restrict genomic regions contributing to specific phenotypes. To further localize genes in such regions a web tool called “Candidate Gene Capture” (CGC) was developed by Andersson et al. (2005). The CGC tool was based on the textual description of genes defined in the human phenotype database OMIM. Even though the CGC tool works well, it was limited by a number of inconsistencies in the underlying database structure, static web pages and some gene descriptions without properly defined function in the OMIM database. Hence, in this work the CGC tool was improved by redesigning its database structure, adding dynamic web pages and improving the prediction of unknown gene function by using exon analysis. The changes in database structure reduced the number of tables considerably, eliminated redundancies and made data retrieval more efficient. A new method for prediction of gene function was proposed, based on the assumption that similarity between exon sequences is associated with biochemical function. Using Blast with 20380 exon protein sequences and a threshold E-value of 0.01, 639 exon groups were obtained with an average of 11 exons per group. When estimating the functional similarity, it was found that on average 72% of the exons in a group had at least one Gene Ontology (GO) term in common.

  • 8.
    Engerberg, Malin
    University of Skövde, Department of Computer Science.
    Development of database support for production of doubled haploids. 2002. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    In this project relational and Lotus Notes database technology are evaluated with regard to their suitability for providing computer-based support in plant breeding in general and in the production of doubled haploids specifically. The two developed databases are compared based on a set of requirements produced together with the DH-group, who are the main users of the databases. The results indicate that both Lotus Notes and the relational databases are able to fulfil all needs documented in this project, although both systems have their limitations. An often expressed opinion is that it is difficult to combine biology and databases. The experience gained in this project however suggests that this need not be the case in instances where data is not as complicated as often discussed. Observations made during this project indicate that data warehousing with integrated data mining and OLAP tools is surprisingly similar to how the DH-group at Svalöf Weibull works and could be a suitable solution for the production of doubled haploids.

  • 9.
    Genheden, Samuel
    University of Skövde, School of Humanities and Informatics.
    A fast protein-ligand docking method. 2006. Independent thesis Basic level (degree of Bachelor), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In this dissertation a novel approach to protein-ligand docking is presented. First an existing method to predict putative active sites is employed. These predictions are then used to cut down the search space of an algorithm that uses the fast Fourier transform to calculate the geometrical and electrostatic complementarity between a protein and a small organic ligand. A simplified hydrophobicity score is also calculated for each active site. The docking method could be applied either to dock ligands in a known active site or to rank several putative active sites according to their biological feasibility. The method was evaluated on a set of 310 protein-ligand complexes. The results show that with respect to docking the method with its initial parameter settings is too coarse grained. The results also show that with respect to ranking of putative active sites the method works quite well.

  • 10.
    Gunnarsson, Ida
    University of Skövde, Department of Computer Science.
    Deriving Protein Networks by Combining Gene Expression and Protein Chip Analysis. 2002. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    In order to derive reliable protein networks it has recently been suggested that the combination of information from both gene and protein level is required. In this thesis a combination of gene expression and protein chip analysis was performed when constructing protein networks. Proteins with high affinity to the same substrates and encoded by genes with high correlation are here thought to constitute reliable protein networks. The protein networks derived are unfortunately not as reliable as was hoped. According to the tests performed, the method derived in this thesis performs only slightly better than chance. However, the poor results may depend on the data used, since mismatching and shortage of data have been evident.

  • 11.
    Gustafsson, Sara
    University of Skövde, Department of Computer Science.
    Evaluation of analysis methods for identification of differentially expressed genes in oligonucleotide microarray data. 2003. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    Microarrays are part of a new class of biotechnologies which allow the monitoring of expression levels for thousands of genes simultaneously. The problem is now to make sense of the resulting massive data set. In this thesis the results from five different methods for differential analysis of oligonucleotide microarray data are evaluated. The methods are the simple classic t-test and Mann-Whitney U test, the software GeneSpring and Significance Analysis of Microarrays (SAM), and the use of Affymetrix software in combination with a scoring system. The methods are used to analyse two different microarray data sets with different numbers of replicates. These data sets are further divided in different ways to examine different questions that are still unsolved problems in microarray technology. The aim of the evaluation is to examine the reliability of the results obtained from differential analysis of microarray data.
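    To make the two classical tests named in this abstract concrete, the sketch below computes their test statistics for a single gene measured in two groups. It is an illustration under stated assumptions: the unequal-variance (Welch) form of the t statistic is chosen here for simplicity and may differ from the variant used in the thesis, the function names are hypothetical, the expression values are invented, and no p-values are computed.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """t statistic for two samples without assuming equal variances."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))


def mann_whitney_u(x, y):
    """U statistic for x: count of pairs (xi, yj) with xi > yj; ties count 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Invented expression values for one gene, three replicates per condition.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]

t_stat = welch_t(treated, control)
u_stat = mann_whitney_u(treated, control)  # 9.0: every treated value exceeds every control value
print(t_stat, u_stat)
```

    With only three replicates per group the rank-based U statistic can take very few distinct values, which is one reason the number of replicates matters when comparing such methods.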

  • 12.
    Hettne, Kristina
    University of Skövde, Department of Computer Science.
    Using nuclear receptor interactions as biomarkers for metabolic syndrome. 2003. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    Metabolic syndrome is taking on epidemic proportions, especially in developed countries. Each risk factor component of the syndrome independently increases the risk of developing coronary artery disease. The risk factors are obesity, dyslipidemia, hypertension, diabetes type 2, insulin resistance, and microalbuminuria. Nuclear receptors are a family of receptors that has recently received a lot of attention due to their possible involvement in metabolic syndrome. Putting the receptors into context with their co-factors and ligands may reveal therapeutic targets not found by studying the receptors alone. Therefore, in this thesis, interactions between genes in nuclear receptor pathways were analysed with the goal of investigating whether these interactions can supply leads to biomarkers for metabolic syndrome. Metabolic syndrome donor gene expression data from the BioExpress database was analysed with the APRIORI algorithm (Agrawal et al. 1993) for generating and mining association rules. No association rules were found to function as biomarkers for metabolic syndrome, but the resulting rules show that the data mining technique successfully found associations between genes in signaling pathways.
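    As a rough illustration of association-rule mining in the Apriori style, the sketch below finds rules between pairs of items using support and confidence thresholds. It is a simplified two-level version written for this example, not the thesis pipeline: the item labels, the thresholds `min_support` and `min_confidence`, and the donor records are all invented, and discretizing expression values into items such as "high" is an assumption.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return rules (lhs, rhs, support, confidence) between single items."""
    items = sorted({i for t in transactions for i in t})
    # Level 1: keep only items that are frequent on their own.
    frequent = [i for i in items if support({i}, transactions) >= min_support]
    rules = []
    # Level 2: candidate pairs are built from frequent items only, which is
    # the Apriori pruning idea (a frequent pair needs frequent members).
    for a, b in combinations(frequent, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Invented donor records: each set lists the genes scored as highly expressed.
donors = [
    {"geneX_high", "geneY_high"},
    {"geneX_high", "geneY_high", "geneZ_high"},
    {"geneY_high"},
    {"geneX_high", "geneY_high"},
]
rules = pair_rules(donors)
print(rules)
```

    Here the only rule that passes both thresholds is "geneX_high implies geneY_high"; the reverse direction has confidence 0.75 and is dropped.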

  • 13.
    Hillerton, Thomas
    University of Skövde, School of Bioscience.
    Predicting adverse drug reactions in cancer treatment using a neural network based approach. 2018. Independent thesis Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis
  • 14.
    Holmgren, Gustav
    et al.
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre. Department of Clinical Chemistry and Transfusion Medicine, Institute of Biomedicine, University of Gothenburg, Sahlgrenska University Hospital, Gothenburg, Sweden / Takara Bio Europe AB, Gothenburg, Sweden.
    Sartipy, Peter
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre. AstraZeneca Gothenburg, CVMD GMed, GMD, Mölndal, Sweden.
    Andersson, Christian X.
    Takara Bio Europe AB, Gothenburg, Sweden.
    Lindahl, Anders
    Department of Clinical Chemistry and Transfusion Medicine, Institute of Biomedicine, University of Gothenburg, Sahlgrenska University Hospital, Gothenburg, Sweden.
    Synnergren, Jane
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre.
    Expression profiling of human pluripotent stem cell-derived cardiomyocytes exposed to doxorubicin - integration and visualization of multi omics data. 2018. In: Toxicological Sciences, ISSN 1096-6080, E-ISSN 1096-0929, Vol. 163, no 1, p. 182-195. Article in journal (Refereed)
    Abstract [en]

    Anthracyclines, such as doxorubicin, are highly efficient chemotherapeutic agents against a variety of cancers. However, anthracyclines are also among the most cardiotoxic therapeutic drugs presently on the market. Chemotherapeutic-induced cardiomyopathy is one of the leading causes of disease and mortality in cancer survivors. The exact mechanisms responsible for doxorubicin-induced cardiomyopathy are not completely known, but the fact that the cardiotoxicity is dose-dependent and that there is a variation in time-to-onset of toxicity, and gender- and age differences suggests that several mechanisms may be involved.

    In the present study, we investigated doxorubicin-induced cardiotoxicity in human pluripotent stem cell-derived cardiomyocytes using proteomics. In addition, different sources of omics data (protein, mRNA, and microRNA) from the same experimental setup were further combined and analyzed using newly developed methods to identify differential expression in data of various origin and types. Subsequently, the results were integrated in order to generate a combined visualization of the findings.

    In our experimental model system, we exposed cardiomyocytes derived from human pluripotent stem cells to doxorubicin for up to two days, followed by a wash-out period of an additional 12 days. Besides an effect on the cell morphology and cardiomyocyte functionality, the data show a strong effect of doxorubicin on all molecular levels investigated. Differential expression patterns that show a linkage between the proteome, transcriptome, and the regulatory microRNA network were identified. These findings help to increase the understanding of the mechanisms behind anthracycline-induced cardiotoxicity and suggest putative biomarkers for this condition.

  • 15.
    Huque, Enamul
    University of Skövde, School of Humanities and Informatics.
    Shape Analysis and Measurement for the HeLa cell classification of cultured cells in high throughput screening. 2006. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Feature extraction by digital image analysis and cell classification is an important task for cell culture automation. In High Throughput Screening (HTS), where thousands of data points are generated and processed at once, features are extracted and cells are classified to decide whether the cell culture is progressing smoothly or not. The culture is restarted if a problem is detected. In this thesis project HeLa cells, which are human epithelial cancer cells, are selected for the experiment. The purpose is to classify two types of HeLa cells in culture: cells in cleavage, which are round floating cells (stressed or dead cells are also round and floating), and normally growing cells, which are attached to the substrate. As the number of cells in cleavage will always be smaller than the number of cells which are growing normally and attached to the substrate, the cell count of attached cells should be higher than that of the round cells. Five different HeLa cell images are used. For each image, every single cell is obtained by image segmentation and isolation. Different mathematical features are computed for each cell. The feature set for this experiment is chosen in such a way that the features are robust, discriminative and have good generalisation quality for classification. Almost all the features presented in this thesis are rotation, translation and scale invariant, so they are expected to perform well in discriminating objects or cells with any classification algorithm. Some new features are added which are believed to improve the classification result. The feature set is considerably broader than the restricted sets which have been used in previous work. The features are implemented against a common interface so that the library can be extended and integrated into other applications. These features are fed into a machine learning algorithm called Linear Discriminant Analysis (LDA) for classification. Cells are then classified as ‘Cells attached to the substrate’ (Cell Class A) or ‘Cells in cleavage’ (Cell Class B). LDA selects features by leaving out and adding shape features for increased performance. On average an accuracy higher than ninety-five percent is obtained in the classification result, which is validated by visual classification.

  • 16.
    Hurme, Mikko
    et al.
    Department of Psychology, University of Turku, Finland / Centre for Cognitive Neuroscience, University of Turku, Finland / Turku Brain and Mind Centre, University of Turku, Finland.
    Koivisto, Mika
    Department of Psychology, University of Turku, Finland / Centre for Cognitive Neuroscience, University of Turku, Finland / Turku Brain and Mind Centre, University of Turku, Finland.
    Revonsuo, Antti
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre. Department of Psychology, University of Turku, Finland / Centre for Cognitive Neuroscience, University of Turku, Finland / Turku Brain and Mind Centre, University of Turku, Finland.
    Railo, Henry
    Department of Psychology, University of Turku, Finland / Centre for Cognitive Neuroscience, University of Turku, Finland / Turku Brain and Mind Centre, University of Turku, Finland.
    Early processing in primary visual cortex is necessary for conscious and unconscious vision while late processing is necessary only for conscious vision in neurologically healthy humans. 2017. In: NeuroImage, ISSN 1053-8119, E-ISSN 1095-9572, Vol. 150, p. 230-238. Article in journal (Refereed)
    Abstract [en]

    The neural mechanisms underlying conscious and unconscious visual processes remain controversial. Blindsight patients may process visual stimuli unconsciously despite their V1 lesion, promoting anatomical models, which suggest that pathways bypassing V1 support unconscious vision. On the other hand, physiological models argue that the major geniculostriate pathway via V1 is involved in both unconscious and conscious vision, but in different time windows and in different types of neural activity. According to physiological models, feedforward activity via V1 to higher areas mediates unconscious processes whereas feedback loops of recurrent activity from higher areas back to V1 support conscious vision. With transcranial magnetic stimulation (TMS) it is possible to study the causal role of a brain region during specific time points in neurologically healthy participants. In the present study, we measured unconscious processing with the redundant target effect, a phenomenon where participants respond faster to two stimuli than one even when one of the stimuli is not consciously perceived. We tested the physiological feedforward-feedback model of vision by suppressing conscious vision by interfering selectively either with early or later V1 activity with TMS. Our results show that early V1 activity (60 ms) is necessary for both unconscious and conscious vision. During later processing stages (90 ms), V1 contributes selectively to conscious vision. These findings support the feedforward-feedback model of consciousness.

  • 17.
    Jacobsson, Annelie
    University of Skövde, Department of Computer Science.
    Comparing NR Expression among Metabolic Syndrome Risk Factors. 2003. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    The metabolic syndrome is a cluster of metabolic risk factors such as diabetes type II, dyslipidemia, hypertension, obesity, microalbuminuria and insulin resistance, which in recent years has increased greatly in many parts of the world. In this thesis decision trees were applied to the BioExpress database, including both clinical data about donors and gene expression data, to investigate nuclear receptors' ability to serve as markers for the metabolic syndrome. Decision trees were created and the classification performance for each individual risk factor was then analysed. The rules generated from the risk factor trees were compared in order to search for similarities and dissimilarities. The comparisons of rules were performed in pairs of risk factors, in groups of three and on all risk factors, and they resulted in the discovery of a set of genes of which the most interesting were the Peroxisome Proliferator Activated Receptor - Alpha, the Peroxisome Proliferator Activated Receptor - Gamma and the Glucocorticoid Receptor. These genes appear in pathways associated with the metabolic syndrome and in the recent scientific literature.

  • 18.
    Karjalainen, Merja
    University of Skövde, Department of Computer Science.
    Analysing subsets of gene expression data to find putatively co-regulated genes. 2002. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    This project is an investigation of whether analysing subsets of time series gene expression data can give additional information about putatively co-regulated genes, compared to only using the whole time series. The original gene expression data set was partitioned into subsets and similarity was computed for both the whole time series and the subsets. Pearson correlation was used as the similarity measure between gene expression profiles. The results indicate that analysing co-expression in subsets of gene expression data derives true-positive connections, with respect to co-regulation, that are not detected by only using the whole time series data. Unfortunately, with the actual data set, chosen similarity measure and partitioning of the data, randomly generated connections have the same number of true positives as the ones derived by the applied analysis. However, given the multi-factorial nature of gene regulation, further analysis of subsets of gene expression data is worthwhile. For example, other similarity measures, data sets and ways of partitioning the data set should be tried.
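    The comparison between whole-series and subset correlation can be sketched as below. This is only an illustration of the idea: the fixed, non-overlapping window partitioning, the function names and the two expression profiles are assumptions made for the example, and the thesis may have partitioned its data differently.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def windowed_correlations(x, y, window):
    """Correlation within each consecutive, non-overlapping window."""
    return [
        pearson(x[i:i + window], y[i:i + window])
        for i in range(0, len(x) - window + 1, window)
    ]


# Invented profiles: the genes move together early and diverge later.
gene1 = [1, 2, 3, 4, 4, 3, 2, 1]
gene2 = [2, 4, 6, 8, 1, 2, 3, 4]

whole = pearson(gene1, gene2)
parts = windowed_correlations(gene1, gene2, window=4)
print(whole, parts)
```

    In this toy case the whole-series correlation is weak, while the first window shows a strong positive correlation that the whole-series view hides, which is the kind of connection the abstract describes.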

  • 19.
    Keller, Jens
    University of Skövde, School of Humanities and Informatics.
    Clustering biological data using a hybrid approach: Composition of clusterings from different features. 2008. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Clustering of data is a well-researched topic in computer science. Many approaches have been designed for different tasks. In biology many of these approaches are hierarchical and the result is usually represented in dendrograms, e.g. phylogenetic trees. However, many non-hierarchical clustering algorithms are also well-established in biology. The approach in this thesis is based on such common algorithms. The algorithm which was implemented as part of this thesis uses a non-hierarchical graph clustering algorithm to compute a hierarchical clustering in a top-down fashion. It performs the graph clustering iteratively, with a previously computed cluster as input set. The innovation is that it focuses on another feature of the data in each step and clusters the data according to this feature. Common hierarchical approaches in biology cluster, for example, a set of genes according to the similarity of their sequences. The clustering then reflects a partitioning of the genes according to their sequence similarity. The approach introduced in this thesis uses many features of the same objects. These features can vary; in biology they may be, for instance, similarities of the sequences, of gene expression or of motif occurrences in the promoter region. As part of this thesis not only the algorithm itself was implemented and evaluated, but also a complete software package providing a graphical user interface. The software was implemented as a framework providing the basic functionality, with the algorithm as a plug-in extending the framework. The software is meant to be extended in the future, integrating a set of algorithms and analysis tools related to the process of clustering and analysing data not necessarily related to biology.

    The thesis deals with topics in biology, data mining and software engineering and is divided into six chapters. The first chapter gives an introduction to the task and the biological background. It gives an overview of common clustering approaches and explains the differences between them. Chapter two shows the idea behind the new clustering approach and points out differences and similarities between it and common clustering approaches. The third chapter discusses the aspects concerning the software, including the algorithm. It illustrates the architecture and analyses the clustering algorithm. After the implementation the software was evaluated, which is described in the fourth chapter, pointing out observations made due to the use of the new algorithm. Furthermore this chapter discusses differences and similarities to related clustering algorithms and software. The thesis ends with the last two chapters, namely conclusions and suggestions for future work. Readers who are interested in repeating the experiments which were made as part of this thesis can contact the author via e-mail, to get the relevant data for the evaluation, scripts or source code.
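
    The top-down idea described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the graph clustering step is replaced here by connected components of a thresholded similarity graph, and the gene names, feature functions and threshold are hypothetical.

```python
from collections import defaultdict


def graph_clusters(items, similarity, threshold):
    """Stand-in graph clustering (assumption): connected components of the
    graph that links two items when similarity(a, b) >= threshold."""
    items = list(items)
    parent = {x: x for x in items}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if similarity(a, b) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for x in items:
        groups[find(x)].append(x)
    return sorted(sorted(g) for g in groups.values())


def hybrid_clustering(items, features, threshold=0.5):
    """Split by the first feature, then split each cluster by the next one."""
    node = {"members": sorted(items), "children": []}
    if features and len(node["members"]) > 1:
        first, rest = features[0], features[1:]
        for cluster in graph_clusters(node["members"], first, threshold):
            node["children"].append(hybrid_clustering(cluster, rest, threshold))
    return node


# Hypothetical toy data: a sequence family and an expression level per gene.
seq_family = {"g1": "A", "g2": "A", "g3": "A", "g4": "B"}
expr_level = {"g1": 1.0, "g2": 1.1, "g3": 5.0, "g4": 5.2}


def seq_sim(a, b):
    return 1.0 if seq_family[a] == seq_family[b] else 0.0


def expr_sim(a, b):
    return 1.0 / (1.0 + abs(expr_level[a] - expr_level[b]))


# Level 1 uses sequence similarity, level 2 uses expression similarity.
tree = hybrid_clustering(expr_level, [seq_sim, expr_sim])
```

    In this toy run the first level separates g4 from the other genes by sequence family, and the second level splits the remaining cluster by expression. Any other non-hierarchical graph clustering routine could be substituted for `graph_clusters`.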

  • 20.
    Kristinsson, Vilhelm Yngvi
    University of Skövde, School of Humanities and Informatics.
    The effect of normalization methods on the identification of differentially expressed genes in microarray data. 2007. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In this thesis the effect of normalization methods on the identification of differentially expressed genes is investigated. A zebrafish microarray dataset called Swirl was used in this thesis work. First the Swirl dataset was extracted and visualized to assess whether the robust spline and print-tip loess normalization methods are appropriate for normalizing this dataset. The dataset was then normalized with the two normalization methods and the differentially expressed genes were identified with the LimmaGUI program. The results were then evaluated by investigating which genes overlap after applying the different normalization methods and which ones are identified uniquely after applying each method. The results showed that after the normalization methods were applied the differentially expressed genes that were identified by the LimmaGUI program did differ to some extent but the difference was not considered to be major. Thus the main conclusion is that the choice of normalization method does not have a major effect on the resulting list of differentially expressed genes.
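
    The overlap evaluation described in this abstract amounts to comparing two gene lists. The following is a minimal sketch of such a comparison, not the procedure used in the thesis; the gene identifiers are hypothetical placeholders rather than Swirl results, and the Jaccard index is an added summary that the abstract does not mention.

```python
def compare_gene_lists(list_a, list_b):
    """Report shared and unique genes for two differential-expression lists."""
    a, b = set(list_a), set(list_b)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: shared genes divided by all genes seen in either list.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical gene lists after each normalization method.
robust_spline = ["geneA", "geneB", "geneC", "geneD"]
print_tip_loess = ["geneB", "geneC", "geneD", "geneE"]

result = compare_gene_lists(robust_spline, print_tip_loess)
```

    A Jaccard index close to 1 would correspond to the kind of conclusion reported here, namely that the choice of normalization method changes the resulting gene list only slightly.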

  • 21.
    Laurio, Kim
    et al.
    University of Skövde, School of Humanities and Informatics. University of Skövde, The Systems Biology Research Centre.
    Svensson, Thomas
    Biovitrum AB, Göteborg, Sweden.
    Jirstrand, Mats
    Fraunhofer-Chalmers Research Center for Industrial Mathematics, Göteborg, Sweden.
    Nilsson, Patric
    University of Skövde, School of Life Sciences. University of Skövde, The Systems Biology Research Centre.
    Gamalielsson, Jonas
    University of Skövde, School of Humanities and Informatics. University of Skövde, The Systems Biology Research Centre.
    Olsson, Björn
    University of Skövde, School of Humanities and Informatics. University of Skövde, The Systems Biology Research Centre.
    Evolutionary search for improved path diagrams. 2007. In: Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics: 5th European Conference, EvoBIO 2007, Valencia, Spain, April 11-13, 2007. Proceedings / [ed] Elena Marchiori, Jason H. Moore, Jagath C. Rajapakse, Springer Berlin/Heidelberg, 2007, p. 114-121. Conference paper (Refereed)
    Abstract [en]

    A path diagram relates observed, pairwise, variable correlations to a functional structure which describes the hypothesized causal relations between the variables. Here we combine path diagrams, heuristics and evolutionary search into a system which seeks to improve existing gene regulatory models. Our evaluation shows that once a correct model has been identified it receives a lower prediction error compared to incorrect models, indicating the overall feasibility of this approach. However, with smaller samples the observed correlations gradually become more misleading, and the evolutionary search increasingly converges on suboptimal models. Future work will incorporate publicly available sources of experimentally verified biological facts to computationally suggest model modifications which might improve the model’s fitness.

  • 22.
    Lindefelt, Lisa
    University of Skövde, Department of Computer Science.
    Predicting gene expression using artificial neural networks. 2002. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    Today one of the greatest aims within the area of bioinformatics is to gain a complete understanding of the functionality of genes and the systems behind gene regulation. Regulatory relationships among genes seem to be of a complex nature since transcriptional control is the result of complex networks interpreting a variety of inputs. It is therefore essential to develop analytical tools detecting complex genetic relationships.

    This project examines whether the data mining technique artificial neural network (ANN) can detect regulatory relationships between genes. As an initial step towards finding regulatory relationships with the help of ANN, the goal of this project is to train an ANN to predict the expression of an individual gene. The genes predicted are the nuclear receptor PPAR-g and the insulin receptor. Predictions for each of the two target genes were made using different datasets of gene expression data as input for the ANN. The results of the predictions of PPAR-g indicate that it is not possible to predict the expression of PPAR-g under the circumstances for this experiment. The results of the predictions of the insulin receptor indicate that ANN cannot be ruled out as a method for predicting the expression of an individual gene.

  • 23.
    Lorentzon, Fredrik
    University of Skövde, Department of Computer Science.
    Data modelling and implementation of a chemical compounds database. 2003. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    In this project a relational DBMS has been designed, developed and implemented. The RDBMS handles information on cellular behaviour in response to many different chemical compounds. The RDBMS is accessed with a Common Gateway Interface (CGI) programmed with Perl and the new system replaces the previous system that was used at the biotech company Neuronova. The results indicate that the developed RDBMS system will improve the work, especially for the Lead Discovery department. The data is structured in a stricter way. The complete system offers the benefit of integration with other databases at the company including systems for EST sequences and for target discovery.

    The full text is not available because parts of it are confidential.

  • 24.
    Martinez Maestre, Andreu
    University of Skövde, School of Bioscience.
    PVC: Proximity Value Clustering: A new clustering method without human interaction. 2018. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
  • 25.
    Mathew, Sumi
    University of Skövde, School of Humanities and Informatics.
    A method to identify the non-coding RNA gene for U1 RNA in species in which it has not yet been found. 2007. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Background

    Non-coding RNAs are RNA molecules that do not code for proteins but play structural, catalytic or regulatory roles in the organisms in which they are found. These RNAs generally conserve their secondary structure more than their primary sequence. It is possible to look for protein coding genes using sequence signals like promoters, terminators, start and stop codons etc. However, this is not the case with non-coding RNAs since these signals are weakly conserved in them. Hence the situation with non-coding RNAs is more challenging. Therefore a protocol is devised to identify U1 RNA in species not previously known to have it.

    Results

    Covariance models are sufficient to identify non-coding RNAs, but they are very slow, so a filtering step is needed before using them to reduce the search space for identifying these genes. For this filtering, the protocol for identifying U1 RNA genes employs the pattern matcher RNABOB, which can conduct secondary structure pattern searches. The descriptor for RNABOB is made automatically such that it can also represent the bulges and interior loops in helices of RNA. The protocol is compared with the Rfam and Weinberg & Ruzzo approaches and has been able to identify new U1 RNA homologues in the Apicomplexan group, where U1 RNA has not previously been found.

    Conclusions

    The method has been used to identify the gene for U1 RNA in certain species in which it has not been detected previously. The identified genes may be further analyzed by wet laboratory techniques for the confirmation of their existence.

  • 26.
    Muhammad, Ashfaq
    University of Skövde, School of Humanities and Informatics.
    Design and Development of a Database for the Classification of Corynebacterium glutamicum Genes, Proteins, Mutants and Experimental Protocols. 2006. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Coryneform bacteria are widely distributed in nature and are rod-like, aerobic soil bacteria capable of growing on a variety of sugars and organic acids. Corynebacterium glutamicum is a nonpathogenic species of Coryneform bacteria used for industrial production of amino acids. There are three main publicly available genome annotations, Cg, Cgl and NCgl, for C. glutamicum. All three annotations have different numbers of protein coding genes and varying numbers of overlaps of similar genes. The original data is only available in text files. In this format of genome data, it was not easy to search and compare the data among different annotations and it was impossible to make an extensive multidimensional customized formal search against different protein parameters. Comparison of all genome annotations for construction of deletion and over-expression mutants, graphical representation of genome information, such as gene locations, neighboring genes, orientation (direct or complementary strand), overlapping genes and gene lengths, and graphical output for structure function relation by comparison of predicted trans-membrane domains (TMD) and functional protein domains and protein motifs was not possible when the data is inconsistent and redundant on various publicly available biological database servers. There was therefore a need for a system of managing the data for mutants and experimental setups. In spite of the fact that the genome sequence is known, until now no databank providing such a complete set of information has been available. We solved these problems by developing a standalone relational database software application covering data processing, protein-DNA sequence extraction and management of lab data. The result of the study is an application named CORYNEBASE, which is software that meets our aims and objectives.

  • 27.
    Naswa, Sudhir
    University of Skövde, School of Humanities and Informatics.
    Representation of Biochemical Pathway Models: Issues relating conversion of model representation from SBML to a commercial tool. 2005. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Background: Computational simulation of complex biological networks lies at the heart of systems biology since it can confirm the conclusions drawn by experimental studies of biological networks and guide researchers to produce fresh hypotheses for further experimental validation. Since this iterative process helps in development of more realistic system models a variety of computational tools have been developed. In the absence of a common format for representation of models these tools were developed in different formats. As a result these tools became unable to exchange models amongst them, leading to development of SBML, a standard exchange format for computational models of biochemical networks. Here the formats of SBML and one of the commercial tools of systems biology are being compared to study the issues which may arise during conversion between their respective formats. A tool StoP has been developed to convert the format of SBML to the format of the selected tool.

    Results: The basic format of SBML representation which is in the form of listings of various elements of a biochemical reaction system differs from the representation of the selected tool which is location oriented. In spite of this difference the various components of biochemical pathways including multiple compartments, global parameters, reactants, products, modifiers, reactions, kinetic formulas and reaction parameters could be converted from the SBML representation to the representation of the selected tool. The MathML representation of the kinetic formula in an SBML model can be converted to the string format of the selected tool. Some features of the SBML are not present in the selected tool. Similarly, the ability of the selected tool to declare parameters for locations, which are global to those locations and their children, is not present in the SBML.

    Conclusions: Differences in representations of pathway models may include differences in terminologies, basic architecture and software capabilities, and the adoption of different standards for similar things. But the overall similarity of the domain of pathway models enables us to interconvert these representations. The selected tool should develop support for unit definitions, events and rules. It is recommended that SBML develop a facility for parameter declaration at compartment level and that the selected tool develop a facility for function declaration.

  • 28.
    Nordström, Rickard
    University of Skövde, Department of Computer Science.
    3DPOPS: From carbohydrate sequence to 3D structure. 2002. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    In this project a web-based system called 3DPOPS has been designed, developed and implemented. The system creates initial 3D structures of oligosaccharides according to user input data and is intended to be integrated with an automated 3D prediction system for saccharides. The web interface uses a novel approach with a dynamically updated graphical representation of the input carbohydrate. The interface is embedded in a web page as a Java applet. The needs of both expert and novice users are met by informative messages, a familiar concept and a dynamically updated graphical user interface in which only valid input can be created.

    A set of test sequences was collected from the CarbBank database. An initial structure for each sequence could be created. All contained the information necessary to serve as starting points in a conformation search carried out by a 3D prediction system for carbohydrates.

  • 29.
    Olsson, Elin
    University of Skövde, Department of Computer Science.
    Deriving Genetic Networks Using Text Mining. 2002. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    On the Internet an enormous amount of information is available that is represented in an unstructured form. The purpose of a text mining tool is to collect this information and present it in a more structured form. In this report text mining is used to create an algorithm that searches abstracts available from PubMed and finds specific relationships between genes that can be used to create a network. The algorithm can also be used to find information about a specific gene. Using the algorithm, all connections but one in the network created by Mendoza et al. (1999) were verified. The remaining connection contained implicit information. The results suggest that the algorithm is better at extracting information about specific genes than finding connections between genes. One advantage of the algorithm is that it can also find connections between genes and proteins and between genes and other chemical substances.

  • 30.
    Pasalic, Zlatana
    University of Skövde, Department of Computer Science.
    Evaluation of search models for Molecular Replacement using MolRep. 2002. Independent thesis Basic level (degree of Bachelor). Student thesis
    Abstract [en]

    The aim of this study is to use several homology models of different completeness and accuracy and to evaluate them as search models for Molecular Replacement (MR). Three structural groups are evaluated: the α-, β- and α/β-groups. From every group one template structure and a couple of search models are selected. The search models are manipulated and evaluated. B-factor manipulation, side chain removal and homology modelling are the ways the search models are manipulated. This work shows that B-factor manipulation does not improve the search models. The work also shows that removing the side chains does not improve the search models. Finally the work shows that homology modelling did not produce better search models.

  • 31.
    Pohl, Matin
    University of Skövde, School of Humanities and Informatics.
    Using an ontology to enhance metabolic or signaling pathway comparisions by biological and chemical knowledge. 2006. Student thesis
    Abstract [en]

    Motivation:

    As genome-scale efforts are ongoing to investigate metabolic networks of miscellaneous organisms the amount of pathway data is growing. Simultaneously an increasing amount of gene expression data from microarrays becomes available for reverse engineering, delivering e.g. hypothetical regulatory pathway data. To keep the growing volume of data manageable and to keep track of genuinely new information, analysis tools are needed. One vital task is the comparison of pathways for detection of similar functionalities, overlaps, or in case of reverse engineering, detection of known data corroborating a hypothetical pathway. A comparison method using ontological knowledge about molecules and reactions will feature a more biological point of view, which graph-theoretical approaches have lacked so far. Such a comparison attempt based on an ontology is described in this report.

    Results:

    An algorithm is introduced that performs a comparison of pathways component by component. The method was applied to two selected databases and the results showed that it is not satisfactory as a stand-alone method. Further development possibilities are suggested and steps toward an integrated method using several approaches are recommended.

    Availability:

    The source code, used database snapshots and pictures can be requested from the author.

  • 32.
    Poudel, Sagar
    University of Skövde, School of Humanities and Informatics.
    GPCR-Directed Libraries for High Throughput Screening. 2006. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Guanine nucleotide binding protein (G-protein) coupled receptors (GPCRs), the largest receptor family, are enormously important for the pharmaceutical industry as they are the target of 50-60% of all existing medicines. The discovery of many new GPCRs by the “human genome project” opens up new opportunities for developing novel therapeutics. High throughput screening (HTS) of chemical libraries is a well established method for finding new lead compounds in drug discovery. Despite some success this approach has suffered from the near absence of more focused and specifically targeted libraries. To improve the hit rates and to exploit the full potential of current corporate screening collections, this thesis work identified and analysed the critical drug-binding positions within the GPCRs, based on their overall sequence, their transmembrane regions and their drug binding fingerprints. A classification based on drug binding fingerprints was made as a basis for successful pharmacophore modelling and virtual screening, which facilitates the development of more specific and focused targeted libraries for HTS.

  • 33.
    Rahpeymai, Neda
    University of Skövde, Department of Computer Science.
    Data Mining with Decision Trees in the Gene Logic Database: A Breast Cancer Study. 2002. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    Data mining approaches have been increasingly used in recent years in order to find patterns and regularities in large databases. In this study, the C4.5 decision tree approach was used for mining the Gene Logic database, which contains biological data. The decision tree approach was used in order to identify the most relevant genes and risk factors involved in breast cancer, in order to separate healthy patients from breast cancer patients in the data sets used. Four different tests were performed for this purpose. Cross validation was performed, for each of the four tests, in order to evaluate the capacity of the decision tree approaches in correctly classifying ‘new’ samples. In the first test, the expression of 108 breast related genes, shown in appendix A, for 75 patients was used as input to the C4.5 algorithm. This test resulted in a decision tree containing only four genes considered to be the most relevant in order to correctly classify patients. Cross validation indicates an average accuracy of 89% in classifying ‘new’ samples. In the second test, risk factor data was used as input. The cross validation result shows an average accuracy of 87% in classifying ‘new’ samples. In the third test, both gene expression data and risk factor data were put together as one input. The cross validation procedure for this approach again indicates an average accuracy of 87% in classifying ‘new’ samples. In the final test, the C4.5 algorithm was used in order to indicate possible signalling pathways involving the four genes identified by the decision tree based on only gene expression data. In some cases, the C4.5 algorithm found trees suggesting pathways which are supported by the breast cancer literature. Since not all pathways involving the four putative breast cancer genes are known yet, the other suggested pathways should be further analyzed in order to increase their credibility.

    In summary, this study demonstrates the application of decision tree approaches for the identification of genes and risk factors relevant for the classification of breast cancer patients.

  • 34.
    Sentausa, Erwin
    University of Skövde, School of Humanities and Informatics.
    Time course simulation replicability of SBML-supporting biochemical network simulation tools. 2006. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Background: Modelling and simulation are important tools for understanding biological systems. Numerous modelling and simulation software tools have been developed for integrating knowledge regarding the behaviour of a dynamic biological system described in mathematical form. The Systems Biology Markup Language (SBML) was created as a standard format for exchanging biochemical network models among tools. However, it is not certain yet whether actual usage and exchange of SBML models among the tools of different purpose and interfaces is assessable. Particularly, it is not clear whether dynamic simulations of SBML models using different modelling and simulation packages are replicable.

    Results: Time series simulations of published biological models in SBML format are performed using four modelling and simulation tools which support SBML to evaluate whether the tools correctly replicate the simulation results. Some of the tools do not successfully integrate some models. In the time series output of the successful simulations, there are differences between the tools.

    Conclusions: Although SBML is widely supported among biochemical modelling and simulation tools, not all simulators can replicate time-course simulations of SBML models exactly. This incapability of replicating simulation results may harm the peer-review process of biological modelling and simulation activities and should be addressed accordingly, for example by specifying in the SBML model the exact algorithm or simulator used for replicating the simulation result.

  • 35.
    Simu, Tiberiu
    University of Skövde, School of Humanities and Informatics.
    A method for extracting pathways from Scansite-predicted protein-protein interactions. 2006. Independent thesis Advanced level (degree of Master (One Year)). Student thesis
    Abstract [en]

    Protein interaction is an important mechanism for cellular functionality. Computational predictions of protein interactions are in many cases available from public resources (for example Scansite). These predictions can be further combined with other information sources to generate hypothetical pathways. However, when using computational methods for building pathways, the process may become time consuming, as it requires multiple iterations and consolidating data from different sources. We have tested whether it is possible to generate graphs of protein-protein interaction by using only domain-motif interaction data and the degree to which it is possible to automate this process by developing a program that is able to aggregate, under user guidance, query results from different information sources. The data sources used are Scansite and SwissProt. Visualisation of the graphs is done with Osprey, an external program freely available for academic purposes. The graphs obtained by running the software show that although it is possible to combine publicly available data and theoretical protein-protein interaction predictions from Scansite, further efforts are needed to increase the biological plausibility of these collections of data. It is possible, however, to reduce the dimensionality of the obtained graphs by focusing the searches on a certain tissue of interest.

  • 36.
    Sizyoogno, Crisencia
    University of Skövde, School of Bioscience. Uppsala University, department of Immunology, Genetics and Pathology, IGP.
    A canonical correlation analysis-based approach to identify causal genes in atherosclerosis. 2018. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Genome-wide association studies (GWASs) have identified hundreds of loci that are strongly associated with coronary artery disease and its risk factors. However, the causal variants and genes remain unknown for the vast majority of the identified loci. Zebrafish model systems coupled with clustered regularly interspaced short palindromic repeats (CRISPR)-associated 9 (CRISPR-Cas9) mutagenesis have made it possible to systematically characterize candidate genes in GWAS-identified loci. In this thesis, canonical correlation analysis (CCA) was used to identify putative causal genes in multiplexed genetic screens for atherogenic traits in zebrafish larvae in an efficient manner. The two datasets used in this thesis contained genes and phenotypes obtained through sequencing and high-throughput imaging of fish larvae. Dataset 1 contained 7 genes and 11 phenotypes (n = 384) and dataset 2 contained 4 genes and 11 phenotypes (n = 384). CCA’s multiple genes vs. multiple phenotypes analysis in dataset 1 identified the genes met, pepd, timd4 and vegfa as associated with total cholesterol, triglycerides, glucose and corrected lipid deposition, as well as with co-localization of macrophages and lipid deposition, of neutrophils and lipid deposition, and of macrophages and neutrophils. In dataset 2, CCA found the previously reported correlation of the genes apobb1 and apoea with total cholesterol, low-density lipoprotein and triglycerides, as well as with co-localization of neutrophils and lipids. In comparison with a hierarchical linear model, CCA represents a powerful and promising tool for identifying causal genes for cardiovascular diseases in data from zebrafish model systems.

  • 37.
    Synnergren, Jane
    et al.
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre.
    Dönnes, Pierre
    SciCross AB, Skövde, Sweden.
    Current Perspectives on Multi-Omics Data Integration With Application on Toxicity Biomarkers Discovery. 2018. In: Open Access journal of Toxicology, ISSN 2474-7599, Vol. 2, no 5, p. 1-2, article id OAJT.MS.ID.555597. Article, review/survey (Refereed)
  • 38.
    Synnergren, Jane
    et al.
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre.
    Ghosheh, Nidal
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre.
    Dönnes, Pierre
    SciCross AB.
    Integration of Biomedical Big Data Requires Efficient Batch Effect Reduction. 2018. In: 10th International Conference on Bioinformatics and Computational Biology (BICOB): Las Vegas, Nevada, USA 19 – 21 March 2018 / [ed] Hisham Al-Mubaid, Qin Ding, Oliver Eulenstein, 2018, p. 76-82. Conference paper (Refereed)
    Abstract [en]

    Efficiency in dealing with batch effects will be the next frontier in large-scale biological data analysis, particularly when involving the integration of different types of datasets. Large-scale omics techniques have quickly developed during the last decade and huge amounts of data are now generated, which has started to revolutionize the area of medical research. With the increase in the volume of data across the whole spectrum of biology, problems related to data analytics are continuously increasing as analysis and interpretation of these large volumes of molecular data have become a real challenge. Tremendous efforts have been made to obtain data from various molecular levels and the most recent trends show that more and more researchers are now trying to integrate data of various molecular types to inform hypotheses and biological questions. Tightly connected to this work are the batch-related biases that are commonly apparent between different datasets, but these problems are often not tackled. In the present study the ComBat algorithm was applied and evaluated on two different data integration problems. Results show that the batch effects present in the integrated datasets could be efficiently removed by applying the ComBat algorithm.

  • 39.
    Theodoropoulou, Eleftheria
    University of Skövde, School of Bioscience.
    DNA METHYLATION AGE ACCELERATION IN MULTIPLE SCLEROSIS. 2018. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Age acceleration is a measure indicating if a tissue is aging at an expected rate or not. In this study, the epigenetic clock was used to calculate age acceleration based on DNA methylation values in Multiple Sclerosis datasets. The samples were of whole blood, purified blood cell types and neurons and included individuals with the disease, as well as controls. Various factors were explored for their effect on the age acceleration in the context of the disease. In addition, three different normalisation options (no normalisation, Noob and Funnorm normalisation) were compared in order to assess their effect on the output of the epigenetic clock algorithm. Finally, a workflow was proposed for the epigenetic clock analysis, highlighting suitable methods for processing, analysing statistically and visualising the data.

  • 40.
    Ulfenborg, Benjamin
    et al.
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre.
    Karlsson, Alexander
    University of Skövde, School of Informatics. University of Skövde, The Informatics Research Centre.
    Riveiro, Maria
    University of Skövde, School of Informatics. University of Skövde, The Informatics Research Centre.
    Améen, Caroline
    Takara Bio Europe AB, Gothenburg, Sweden.
    Åkesson, Karolina
    Takara Bio Europe AB, Gothenburg, Sweden.
    Andersson, Christian X.
    Takara Bio Europe AB, Gothenburg, Sweden.
    Sartipy, Peter
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre. Cardiovascular and Metabolic Disease Global Medicines Development Unit, AstraZeneca, Mölndal, Sweden.
    Synnergren, Jane
    University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre.
    A data analysis framework for biomedical big data: Application on mesoderm differentiation of human pluripotent stem cells2017In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 12, no 6, article id e0179613Article in journal (Refereed)
    Abstract [en]

    The development of high-throughput biomolecular technologies has resulted in the generation of vast omics data at an unprecedented rate. This is transforming biomedical research into a big data discipline, where the main challenges relate to the analysis and interpretation of data into new biological knowledge. The aim of this study was to develop a framework for biomedical big data analytics, and apply it for analyzing transcriptomics time series data from early differentiation of human pluripotent stem cells towards the mesoderm and cardiac lineages. To this end, transcriptome profiling by microarray was performed on differentiating human pluripotent stem cells sampled at eleven consecutive days. The gene expression data was analyzed using the five-stage analysis framework proposed in this study, including data preparation, exploratory data analysis, confirmatory analysis, biological knowledge discovery, and visualization of the results. Clustering analysis revealed several distinct expression profiles during differentiation. Genes with an early transient response were strongly related to embryonic and mesendoderm development, for example CER1 and NODAL. Pluripotency genes, such as NANOG and SOX2, exhibited substantial downregulation shortly after onset of differentiation. Rapid induction of genes related to metal ion response, cardiac tissue development, and muscle contraction was observed around days five and six. Several transcription factors were identified as potential regulators of these processes, e.g. POU1F1, TCF4 and TBP for muscle contraction genes. Pathway analysis revealed temporal activity of several signaling pathways, for example the inhibition of WNT signaling on day 2 and its reactivation on day 4. This study provides a comprehensive characterization of biological events and key regulators of the early differentiation of human pluripotent stem cells towards the mesoderm and cardiac lineages. The proposed analysis framework can be used to structure data analysis in future research, both in stem cell differentiation, and more generally, in biomedical big data analytics.

  • 41.
    Zichner, Thomas
    University of Skövde, School of Humanities and Informatics.
    Building graph models of oncogenesis by using microRNA expression data2008Independent thesis Advanced level (degree of Magister), 20 points / 30 hpStudent thesis
    Abstract [en]

    MicroRNAs (miRNAs) are a class of small non-coding RNAs that control gene expression by targeting mRNAs and triggering either translation repression or RNA degradation. Several groups pointed out that miRNAs play a major role in several diseases, including cancer. This is assumed since the expression level of several miRNAs differs between normal and cancerous cells. Further, it has been shown that miRNAs are involved in cell proliferation and cell death.

    Because of this role it is suspected that miRNAs could serve as biomarkers to improve tumor classification, therapy selection, or prediction of survival. In this context, it is questioned, among other things, whether miRNA deregulations in cancer cells occur according to some pattern or in a rather random order. With this work we contribute to answering this question by adapting two approaches (Beerenwinkel et al. (J Comput Biol, 2005) and Höglund et al. (Gene Chromosome Canc, 2001)), developed to derive graph models of oncogenesis for chromosomal imbalances, to miRNA expression data and applying them to a breast cancer data set. Further, we evaluated the results by comparing them to results derived from randomly altered versions of the used data set.

    We could show that miRNA deregulations most likely follow a rough temporal order, i.e. some deregulations occur early and some occur late in cancer progression. Thus, it seems possible that the expression level of some miRNAs can be used as an indicator for the stage of a tumor. Further, our results suggest that the overexpression of mir-21 and of mir-102 are initial events in breast cancer oncogenesis.

    Additionally, we identified a set of miRNAs showing a cluster-like behavior, i.e. their deregulations often occur together in a tumor, but other deregulations are less frequently present. These miRNAs are let-7d, mir-10b, mir-125a, mir-125b, mir-145, mir-206, and mir-210.

    Further, we could confirm the strong relationship between the expression of mir-125a and mir-125b.
