Högskolan i Skövde

Bioinformatics tools for discovery and evaluation of biomarkers: Applications in clinical assessment of cancer
University of Skövde, School of Bioscience. University of Skövde, The Systems Biology Research Centre. (Bioinformatik) ORCID iD: 0000-0001-9242-4852
2016 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Cancer is a disease characterized by abnormal proliferation of cells in the body and ranks as the second leading cause of death worldwide. In order to improve cancer patient care, a major focus of cancer research is to discover biomarkers. A biomarker is a biological molecule found in tissues or body fluids and can be used to predict or assess disease states. The aim of this thesis is to develop bioinformatics tools for discovery and evaluation of novel biomarkers from high-throughput datasets.

MicroRNAs (miRNAs) are short non-coding RNAs that function as negative regulators of gene expression. Dysregulation of miRNAs in cancer is frequently reported, making them interesting as biomarker candidates. GenoScan was developed for genome-wide discovery of miRNA-coding genes, as a first step in the identification of novel miRNA biomarkers.

High-throughput technologies such as microarrays allow researchers to measure the expression of thousands of genes or miRNAs simultaneously. The Decision Trunk Classifier (DTC) algorithm has been developed to screen datasets from these experiments for biomarker candidates. When applied to a miRNA expression dataset for endometrial cancer (EC) samples vs. controls, a two-marker model with 98 % accuracy was generated. These miRNAs (hsa-miR-183-5p and hsa-miRPlus-C1070) are promising as biomarkers for EC screening.

The miREC database was developed to store gene and miRNA data from curated expression profiling studies of EC, as well as gene-miRNA regulatory connections. Using gene-miRNA interaction networks from miREC, the roles of miRNAs in cancer hallmark acquisition can be clarified. To further support exploratory analysis of expression data, DTC was extended with partial least squares regression models. The resulting PLS-DTC algorithm can be used to gain deeper insights into the perturbation of biological processes and pathways.

Place, publisher, year, edition, pages
Örebro: Örebro University, 2016, p. 75
Series
Örebro Studies in Medicine, ISSN 1652-4063 ; 130
Keywords [en]
Algorithms, biomarkers, machine learning, classification, cancer, microRNA database, microRNA discovery, partial least squares
National Category
Bioinformatics and Computational Biology; Cancer and Oncology; Cell and Molecular Biology
Research subject
Medical sciences; Bioinformatics
Identifiers
URN: urn:nbn:se:his:diva-11824; ISBN: 978-91-7529-111-6 (print); OAI: oai:DiVA.org:his-11824; DiVA, id: diva2:893602
Public defence
2016-02-03, Insikten (Portalen), Skövde, 23:05 (English)
Available from: 2016-01-22 Created: 2016-01-12 Last updated: 2025-02-05 Bibliographically approved
List of papers
1. Genome-wide discovery of miRNAs using ensembles of machine learning algorithms and logistic regression
2015 (English) In: International Journal of Data Mining and Bioinformatics, ISSN 1748-5681, Vol. 13, no 4, p. 338-359. Article in journal (Refereed) Published
Abstract [en]

In silico prediction of novel miRNAs from genomic sequences remains a challenging problem. This study presents a genome-wide miRNA discovery software package called GenoScan and evaluates two hairpin classification methods. These methods, one ensemble-based and one using logistic regression, were benchmarked along with 15 published methods. In addition, the sequence-folding step is addressed by investigating the impact of secondary structure prediction methods and the choice of input sequence length on prediction performance. Both the accuracy of secondary structure predictions and the accuracy of miRNA prediction are evaluated. In the benchmark of hairpin classification methods, the regression model achieved the highest classification accuracy. Of the structure prediction methods evaluated, ContextFold achieved the highest agreement between predicted and experimentally determined structures. However, both the choice of secondary structure prediction method and input sequence length had limited impact on hairpin classification performance.
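As a rough illustration of hairpin classification with logistic regression, the sketch below fits a two-feature model by gradient descent. It is not the GenoScan implementation: the features (GC fraction, fraction of paired bases in a dot-bracket structure), the toy sequences, and all function names are assumptions made for this example only.

```python
import math

def features(seq, structure):
    """Two illustrative (assumed) features: GC fraction and paired-base fraction."""
    gc = sum(base in "GC" for base in seq) / len(seq)
    paired = sum(ch in "()" for ch in structure) / len(structure)
    return [gc, paired]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias term
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Probability that feature vector x belongs to the positive (hairpin) class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])

# Made-up training examples: (sequence, dot-bracket structure, label)
examples = [
    ("GGCUAGCUAGCC", "((((....))))", 1),
    ("GCGCAUAUGCGC", "((((....))))", 1),
    ("AUAUAUAUAUAU", "............", 0),
    ("AAUUAACCAAUU", "..(......)..", 0),
]
X = [features(seq, struct) for seq, struct, _ in examples]
y = [label for _, _, label in examples]
weights = train(X, y)
```

Calling `predict_proba(weights, features(seq, structure))` then gives a score between 0 and 1 for a candidate sequence; a real classifier would rely on a much richer feature set and a proper benchmark.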

Place, publisher, year, edition, pages
InderScience Publishers, 2015
National Category
Bioinformatics and Computational Biology
Research subject
Natural sciences; Bioinformatics
Identifiers
urn:nbn:se:his:diva-11759 (URN); 10.1504/IJDMB.2015.072755 (DOI); 000366135400002 (); 26547983 (PubMedID); 2-s2.0-84946741012 (Scopus ID)
Available from: 2015-12-15 Created: 2015-12-15 Last updated: 2025-02-07 Bibliographically approved
2. Classification of tumor samples from expression data using decision trunks
2013 (English) In: Cancer Informatics, E-ISSN 1176-9351, Vol. 12, p. 53-66. Article in journal (Refereed) Published
Abstract [en]

We present a novel machine learning approach for the classification of cancer samples using expression data. We refer to the method as "decision trunks," since it is loosely based on decision trees, but contains several modifications designed to achieve an algorithm that: (1) produces smaller and more easily interpretable classifiers than decision trees; (2) is more robust in varying application scenarios; and (3) achieves higher classification accuracy. The decision trunk algorithm has been implemented and tested on 26 classification tasks, covering a wide range of cancer forms, experimental methods, and classification scenarios. This comprehensive evaluation indicates that the proposed algorithm performs at least as well as current state-of-the-art algorithms in terms of accuracy, while producing classifiers that include on average only 2-3 markers. We suggest that the resulting decision trunks have clear advantages over other classifiers due to their transparency, interpretability, and their correspondence with human decision-making and clinical testing practices. © the author(s), publisher and licensee Libertas Academica Ltd.
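To give a feel for how a classifier with only 2-3 markers can be applied, the sketch below walks a short chain of single-marker threshold rules. It is a simplified illustration and not the published decision trunk algorithm: the rule layout, the marker names, the thresholds, and the `classify` function are all hypothetical.

```python
def classify(sample, trunk, default):
    """Walk the chain of rules and stop at the first rule that gives a verdict."""
    for marker, low, high, below_label, above_label in trunk:
        value = sample[marker]
        if value < low:
            return below_label
        if value > high:
            return above_label
        # Value lies in the undecided band: defer to the next marker.
    return default

# Hypothetical two-marker model on made-up expression values.
trunk = [
    ("marker_A", 2.0, 5.0, "control", "tumor"),
    ("marker_B", 1.0, 1.0, "control", "tumor"),
]

high_a = classify({"marker_A": 6.1, "marker_B": 0.3}, trunk, default="control")
deferred = classify({"marker_A": 3.0, "marker_B": 0.3}, trunk, default="control")
```

Each decision reads as a plain threshold test on one marker, which is the kind of transparency the abstract emphasizes; how such rules and thresholds are actually learned is described in the paper itself.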

Place, publisher, year, edition, pages
Sage Publications, 2013
Keywords
Biomarkers, Classification, Gene expression, Machine learning, accuracy, article, classification algorithm, controlled study, decision making, decision tree, intermethod comparison, learning algorithm
National Category
Computer Sciences; Cancer and Oncology
Research subject
Natural sciences; Bioinformatics
Identifiers
urn:nbn:se:his:diva-8394 (URN); 10.4137/CIN.S10356 (DOI); 23467331 (PubMedID); 2-s2.0-84874202131 (Scopus ID)
Note

CC BY-NC 3.0

© the author(s), publisher and licensee Libertas Academica Ltd. This is an open access article. Unrestricted non-commercial use is permitted provided the original work is properly cited.

Available from: 2013-08-12 Created: 2013-08-12 Last updated: 2023-04-27 Bibliographically approved
3. miREC: a database of miRNAs involved in the development of endometrial cancer
2015 (English) In: BMC Research Notes, E-ISSN 1756-0500, Vol. 8, no 1, article id 104. Article in journal (Refereed) Published
Abstract [en]

Background

Endometrial cancer (EC) is the most frequently diagnosed gynecological malignancy and the fourth most common cancer diagnosis overall among women. As with many other forms of cancer, it has been shown that certain miRNAs are differentially expressed in EC and these miRNAs are believed to play important roles as regulators of processes involved in the development of the disease. With the rapidly growing number of studies of miRNA expression in EC, there is a need to organize the data, combine the findings from experimental studies of EC with information from various miRNA databases, and make the integrated information easily accessible for the EC research community.

Findings

The miREC database is an organized collection of data and information about miRNAs shown to be differentially expressed in EC. The database can be used to map connections between miRNAs and their target genes in order to identify specific miRNAs that are potentially important for the development of EC. The aim of the miREC database is to integrate all available information about miRNAs and target genes involved in the development of endometrial cancer, and to provide a comprehensive, up-to-date, and easily accessible source of knowledge regarding the role of miRNAs in the development of EC. Database URL: http://www.mirecdb.org.

Conclusions

Several databases have been published that store information about all miRNA targets that have been predicted or experimentally verified to date. It would be a time-consuming task to navigate between these different data sources and literature to gather information about a specific disease, such as endometrial cancer. The miREC database is a specialized data repository that, in addition to miRNA target information, keeps track of the differential expression of genes and miRNAs potentially involved in endometrial cancer development. By providing flexible search functions it becomes easy to search for EC-associated genes and miRNAs from different starting points, such as differential expression and genomic loci (based on genomic aberrations).
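The kind of lookup described above, starting from a gene and finding differentially expressed miRNAs that target it, can be sketched with a small in-memory structure. This is only an illustration and does not reflect the actual miREC schema or interface: the record layout, the placeholder names (`miR-A`, `GENE1`, and so on), and the `mirnas_targeting` function are assumptions.

```python
# Hypothetical records: miRNA -> direction of expression change and target genes.
mirna_records = {
    "miR-A": {"expression": "up", "targets": {"GENE1", "GENE2"}},
    "miR-B": {"expression": "down", "targets": {"GENE2", "GENE3"}},
    "miR-C": {"expression": "up", "targets": {"GENE3"}},
}

def mirnas_targeting(gene, records, expression=None):
    """Return sorted miRNA names that target `gene`, optionally filtered by expression."""
    hits = []
    for name, record in records.items():
        if gene not in record["targets"]:
            continue
        if expression is not None and record["expression"] != expression:
            continue
        hits.append(name)
    return sorted(hits)

all_hits = mirnas_targeting("GENE2", mirna_records)
up_hits = mirnas_targeting("GENE2", mirna_records, expression="up")
```

A production database would store these relations in indexed tables and add further starting points for search, such as genomic loci, as the abstract describes.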

Place, publisher, year, edition, pages
BioMed Central, 2015
Keywords
Endometrial cancer, MicroRNA, Database
National Category
Cancer and Oncology
Research subject
Medical sciences; Bioinformatics; Infection Biology
Identifiers
urn:nbn:se:his:diva-10891 (URN); 10.1186/s13104-015-1052-9 (DOI); 25889518 (PubMedID); 2-s2.0-84940717539 (Scopus ID)
Note

CC BY 4.0

© 2015 Ulfenborg et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Available from: 2015-05-05 Created: 2015-05-05 Last updated: 2024-01-17 Bibliographically approved

Authority records

Ulfenborg, Benjamin