Högskolan i Skövde

Lubovac-Pilav, Zelmina (ORCID iD: orcid.org/0000-0001-6427-0315)
Publications (10 of 28)
Åkesson, J., Hojjati, S., Hellberg, S., Raffetseder, J., Khademi, M., Rynkowski, R., . . . Gustafsson, M. (2023). Proteomics reveal biomarkers for diagnosis, disease activity and long-term disability outcomes in multiple sclerosis. Nature Communications, 14(1), Article ID 6903.
2023 (English). In: Nature Communications, E-ISSN 2041-1723, Vol. 14, no 1, article id 6903. Article in journal (Refereed). Published.
Abstract [en]

Sensitive and reliable protein biomarkers are needed to predict disease trajectory and personalize treatment strategies for multiple sclerosis (MS). Here, we use the highly sensitive proximity-extension assay combined with next-generation sequencing (Olink Explore) to quantify 1463 proteins in cerebrospinal fluid (CSF) and plasma from 143 people with early-stage MS and 43 healthy controls. With longitudinally followed discovery and replication cohorts, we identify CSF proteins that consistently predict both short- and long-term disease progression. Lower levels of neurofilament light chain (NfL) in CSF are superior in predicting the absence of disease activity two years after sampling (replication AUC = 0.77) compared to all other tested proteins. Importantly, we also identify a combination of 11 CSF proteins (CXCL13, LTA, FCN2, ICAM3, LY9, SLAMF7, TYMP, CHI3L1, FYB1, TNFRSF1B and NfL) that predict the severity of disability worsening according to the normalized age-related MS severity score (replication AUC = 0.90). The identification of these proteins may help elucidate pathogenetic processes and might aid decisions on treatment strategies for persons with MS.

Place, publisher, year, edition, pages
Springer Nature, 2023
National Category
Neurosciences Rheumatology and Autoimmunity Bioinformatics and Systems Biology Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
urn:nbn:se:his:diva-23344 (URN); 10.1038/s41467-023-42682-9 (DOI); 37903821 (PubMedID); 2-s2.0-85175444895 (Scopus ID)
Funder
Swedish Foundation for Strategic Research, SB16-0011; The Swedish Brain Foundation; Knut and Alice Wallenberg Foundation; Swedish Research Council, 2019-04193; Swedish Research Council, 2018-02776; Swedish Research Council, 2020-02700; Swedish Research Council, 2020-00014; Swedish Research Council, 2021-03092; Medical Research Council of Southeast Sweden (FORSS), FORSS-315121; Swedish Association of Persons with Neurological Disabilities, F2018-0052
Note

CC BY 4.0

e-mail: mika.gustafsson@liu.se

The study was funded by the Swedish Foundation for Strategic Research (SB16-0011 [M.G., J.E.]), the Swedish Brain Foundation, Knut and Alice Wallenberg Foundation, and Margareth AF Ugglas Foundation, Swedish Research Council (2019-04193 [M.G.], 2018-02776 [J.E.], 2020-02700 [F.P.], 2020-00014 [Z.L.P.], 2021-03092 [J.E.]), the Medical Research Council of Southeast Sweden (FORSS-315121 [J.E.]), NEURO Sweden (F2018-0052 [J.E.]), ALF grants, Region Östergötland, the Swedish Foundation for MS Research and the European Union’s Marie Sklodowska-Curie (813863 [J.E.]). The authors would like to acknowledge support of the Clinical biomarker facility at SciLifeLab Sweden for providing assistance in protein analyses.

Open access funding provided by Linköping University.

Available from: 2023-11-08. Created: 2023-11-08. Last updated: 2023-11-09. Bibliographically approved.
Jurcevic, S., Keane, S., Borgmästars, E., Lubovac-Pilav, Z. & Ejeskär, K. (2022). Bioinformatics analysis of miRNAs in the neuroblastoma 11q-deleted region reveals a role of miR-548l in both 11q-deleted and MYCN amplified tumour cells. Scientific Reports, 12(1), Article ID 19729.
2022 (English). In: Scientific Reports, E-ISSN 2045-2322, Vol. 12, no 1, article id 19729. Article in journal (Refereed). Published.
Abstract [en]

Neuroblastoma is a childhood tumour that is responsible for approximately 15% of all childhood cancer deaths. Neuroblastoma tumours with amplification of the oncogene MYCN are aggressive; however, another aggressive subgroup exists that lacks MYCN amplification and instead has a deleted region at chromosome arm 11q. Twenty-six miRNAs are located within the breakpoint region of chromosome 11q and have been checked for a possible involvement in the development of neuroblastoma due to the genomic alteration. Target genes of these miRNAs are involved in pathways associated with cancer, including proliferation, apoptosis and DNA repair. We show that miR-548l, found within the 11q region, is downregulated in neuroblastoma cell lines with 11q deletion or MYCN amplification. In addition, we show that restoring the miR-548l level in a neuroblastoma cell line led to decreased proliferation of these cells as well as a decrease in the percentage of cells in the S phase. We also found that miR-548l overexpression suppressed cell viability and promoted apoptosis, while miR-548l knockdown promoted cell viability and inhibited apoptosis in neuroblastoma cells. Our results indicate that 11q-deleted neuroblastoma and MYCN amplified neuroblastoma coalesce by downregulating miR-548l.

Place, publisher, year, edition, pages
Springer Nature, 2022
National Category
Bioinformatics and Systems Biology Biomedical Laboratory Science/Technology Bioinformatics (Computational Biology) Cancer and Oncology Medical Genetics Cell and Molecular Biology
Research subject
Infection Biology; Translational Medicine TRIM; Bioinformatics
Identifiers
urn:nbn:se:his:diva-22068 (URN); 10.1038/s41598-022-24140-6 (DOI); 000885172100065 (); 36396668 (PubMedID); 2-s2.0-85142197814 (Scopus ID)
Funder
Swedish Childhood Cancer Foundation
Note

CC BY 4.0

© 2022 Springer Nature Limited

We thank the Swedish Childhood Cancer Fund and Assar Gabrielsson Found for financial support.

Open access funding provided by University of Skövde.

Correspondence and requests for materials should be addressed to S.J.

Available from: 2022-11-21. Created: 2022-11-21. Last updated: 2023-01-16. Bibliographically approved.
de Weerd, H. A., Åkesson, J., Guala, D., Gustafsson, M. & Lubovac-Pilav, Z. (2022). MODalyseR—a novel software for inference of disease module hub regulators identified a putative multiple sclerosis regulator supported by independent eQTL data. Bioinformatics Advances, 2(1), Article ID vbac006.
2022 (English). In: Bioinformatics Advances, E-ISSN 2635-0041, Vol. 2, no 1, article id vbac006. Article in journal (Refereed). Published.
Abstract [en]

Motivation: Network-based disease modules have proven to be a powerful concept for extracting knowledge about disease mechanisms, predicting for example disease risk factors and side effects of treatments. Plenty of tools exist for the purpose of module inference, but less effort has been put into simultaneously utilizing knowledge about regulatory mechanisms for predicting disease module hub regulators.

Results: We developed MODalyseR, a novel software for identifying disease module regulators and reducing modules to the most disease-associated genes. This pipeline integrates and extends the previously published software packages MODifieR and ComHub and thereby provides a user-friendly network medicine framework combining the concepts of disease modules and hub regulators for precise disease gene identification from transcriptomics data. To demonstrate the usability of the tool, we designed a case study for multiple sclerosis that revealed IKZF1 as a promising hub regulator, which was supported by independent ChIP-seq data.

Availability and implementation: MODalyseR is available as a Docker image at https://hub.docker.com/r/ddeweerd/modalyser with user guide and installation instructions found at https://gustafsson-lab.gitlab.io/MODalyseR/.

Supplementary information: Supplementary data are available at Bioinformatics Advances online.

Place, publisher, year, edition, pages
Oxford University Press, 2022
National Category
Bioinformatics and Systems Biology
Research subject
Bioinformatics
Identifiers
urn:nbn:se:his:diva-21058 (URN); 10.1093/bioadv/vbac006 (DOI); 36699378 (PubMedID); 2-s2.0-85148565848 (Scopus ID)
Funder
Knowledge Foundation, dnr HSK219/26; Swedish Foundation for Strategic Research, SB16-0011; Swedish Research Council, 2019-04193
Note

CC BY 4.0

Correspondence: Mika Gustafsson

Advance Access Publication Date: 25 January 2022

Funding: This work was supported by the Knowledge Foundation [dnr HSK219/26]; Swedish Foundation for Strategic Research [SB16-0011]; and Swedish Research Council [grant 2019-04193].

Available from: 2022-04-13. Created: 2022-04-13. Last updated: 2023-03-02. Bibliographically approved.
Badam, T. V. S., de Weerd, H. A., Martínez-Enguita, D., Olsson, T., Alfredsson, L., Kockum, I., . . . Gustafsson, M. (2021). A validated generally applicable approach using the systematic assessment of disease modules by GWAS reveals a multi-omic module strongly associated with risk factors in multiple sclerosis. BMC Genomics, 22(1), Article ID 631.
2021 (English). In: BMC Genomics, E-ISSN 1471-2164, Vol. 22, no 1, article id 631. Article in journal (Refereed). Published.
Abstract [en]

Background: There exist few, if any, practical guidelines for predictive and falsifiable multi-omic data integration that systematically integrate existing knowledge. Disease modules are popular concepts for interpreting genome-wide studies in medicine but have so far not been systematically evaluated and may lead to corroborating multi-omic modules.

Results: We assessed eight module identification methods in 57 previously published expression and methylation studies of 19 diseases using GWAS enrichment analysis. Next, we applied the same strategy for multi-omic integration of 20 datasets of multiple sclerosis (MS), and further validated the resulting module using both GWAS and risk-factor-associated genes from several independent cohorts. Our benchmark of modules showed that in immune-associated diseases modules inferred from clique-based methods were the most enriched for GWAS genes. The multi-omic case study using MS data revealed the robust identification of a module of 220 genes. Strikingly, most genes of the module were differentially methylated upon the action of one or several environmental risk factors in MS (n = 217, P = 10^−47) and were also independently validated for association with five different risk factors of MS, which further stressed the high genetic and epigenetic relevance of the module for MS.

Conclusions: We believe our analysis provides a workflow for selecting modules and our benchmark study may help further improvement of disease module methods. Moreover, we also stress that our methodology is generally applicable for combining and assessing the performance of multi-omic approaches for complex diseases.

Place, publisher, year, edition, pages
BioMed Central, 2021
Keywords
Benchmark, Data integration, Disease modules, Genome-wide association analysis, Methylomics, Multi-omics, Multiple sclerosis, Network analysis, Network modules, Protein network analysis, Risk factors, Transcriptomics
National Category
Bioinformatics and Systems Biology Immunology in the medical area Medical Genetics
Research subject
Bioinformatics
Identifiers
urn:nbn:se:his:diva-20535 (URN); 10.1186/s12864-021-07935-1 (DOI); 000692402600002 (); 34461822 (PubMedID); 2-s2.0-85113734842 (Scopus ID)
Funder
Swedish Research Council, 2015–03807; Swedish Research Council, 2018–02638; EU, Horizon 2020, grant 818170; Knut and Alice Wallenberg Foundation, 2019.0089; Knowledge Foundation, 20170298; Swedish Foundation for Strategic Research, SB16–0095; Swedish National Infrastructure for Computing (SNIC), SNIC 2020/5–177, LiU-2018-12 and LiU-2019-25
Note

CC BY 4.0

© 2021, The Author(s)

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Correspondence: mika.gustafsson@liu.se

This work was supported by the Swedish Research Council (grant 2015–03807 (M.G.), grant 2018–02638 (M.J.)), the Swedish Foundation for Strategic Research (grant SB16–0095 (M.G.)), the Center for Industrial IT (CENIIT) (M.G.), European Union Horizon 2020/European Research Council Consolidator grant (Epi4MS, grant 818170 (M.J.)), Knut and Alice Wallenberg Foundation (grant 2019.0089 (M.J.)) and the Knowledge Foundation (grant 20170298 (Z.L.)). Computational resources were granted by Swedish National Infrastructure for Computing (SNIC; SNIC 2020/5–177, LiU-2018-12 and LiU-2019-25). The funding bodies had no role in the study and collection, analysis, and interpretation of data and in writing the manuscript. Open Access funding provided by Linköping University.

Available from: 2021-09-09. Created: 2021-09-09. Last updated: 2024-01-17. Bibliographically approved.
Åkesson, J., Lubovac-Pilav, Z., Magnusson, R. & Gustafsson, M. (2021). ComHub: Community predictions of hubs in gene regulatory networks. BMC Bioinformatics, 22(1), Article ID 58.
2021 (English). In: BMC Bioinformatics, E-ISSN 1471-2105, Vol. 22, no 1, article id 58. Article in journal (Refereed). Published.
Abstract [en]

BACKGROUND: Hub transcription factors, regulating many target genes in gene regulatory networks (GRNs), play important roles as disease regulators and potential drug targets. However, while numerous methods have been developed to predict individual regulator-gene interactions from gene expression data, few methods focus on inferring these hubs.

RESULTS: We have developed ComHub, a tool to predict hubs in GRNs. ComHub makes a community prediction of hubs by averaging over predictions by a compendium of network inference methods. Benchmarking ComHub against the DREAM5 challenge data and two independent gene expression datasets showed a robust performance of ComHub over all datasets.

CONCLUSIONS: In contrast to other evaluated methods, ComHub consistently scored among the top performing methods on data from different sources. Lastly, we implemented ComHub to work with both predefined networks and to perform stand-alone network inference, which will make the method generally applicable.
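
One way to picture the community prediction described in the abstract is to average each regulator's outdegree over the networks predicted by several inference methods. The Python sketch below is illustrative only: the function names, the use of plain outdegree as the hub score, and the toy edge lists are assumptions, not ComHub's actual implementation or API.

```python
from collections import defaultdict


def outdegrees(edges):
    """Count how many edges leave each regulator in one predicted network."""
    counts = defaultdict(int)
    for regulator, _target in edges:
        counts[regulator] += 1
    return dict(counts)


def community_hub_scores(networks):
    """Average each regulator's outdegree over all predicted networks.

    A regulator that is absent from a network contributes an outdegree of 0.
    """
    per_method = [outdegrees(edges) for edges in networks]
    regulators = set().union(*per_method) if per_method else set()
    return {
        reg: sum(deg.get(reg, 0) for deg in per_method) / len(per_method)
        for reg in regulators
    }


# Toy (regulator, target) predictions from three hypothetical methods.
networks = [
    [("TF1", "g1"), ("TF1", "g2"), ("TF2", "g1")],
    [("TF1", "g1"), ("TF1", "g3"), ("TF1", "g4")],
    [("TF2", "g2"), ("TF1", "g2")],
]
scores = community_hub_scores(networks)
# Regulators ordered from the strongest to the weakest predicted hub.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Averaging over methods dampens the influence of any single method that over- or under-predicts edges for a regulator, which is the intuition behind a community prediction.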

Place, publisher, year, edition, pages
Springer Nature, 2021
Keywords
Gene regulatory networks, Hubs, Master regulators, Network inference
National Category
Bioinformatics and Systems Biology
Research subject
Bioinformatics; INF502 Biomarkers
Identifiers
urn:nbn:se:his:diva-19478 (URN); 10.1186/s12859-021-03987-y (DOI); 000617736000001 (); 33563211 (PubMedID); 2-s2.0-85100810993 (Scopus ID)
Note

CC BY 4.0

The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. CC0 1.0

Available from: 2021-02-18. Created: 2021-02-18. Last updated: 2024-01-17. Bibliographically approved.
Magnusson, R. & Lubovac-Pilav, Z. (2021). TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes. BMC Bioinformatics, 22(1), Article ID 440.
2021 (English). In: BMC Bioinformatics, E-ISSN 1471-2105, Vol. 22, no 1, article id 440. Article in journal (Refereed). Published.
Abstract [en]

Background: Transcription factors (TFs) are the upstream regulators that orchestrate gene expression, and therefore a centrepiece in bioinformatics studies. While a core strategy to understand the biological context of genes and proteins includes annotation enrichment analysis, such as Gene Ontology term enrichment, these methods are not well suited for analysing groups of TFs. This is particularly true since such methods do not aim to include downstream processes, and given a set of TFs, the expected top ontologies would revolve around transcription processes.

Results: We present the TFTenricher, a Python toolbox that focuses specifically on identifying gene ontology terms, cellular pathways, and diseases that are over-represented among genes downstream of user-defined sets of human TFs. We evaluated the inference of downstream gene targets with respect to false positive annotations, and found an inference based on co-expression to best predict downstream processes. Based on these downstream genes, the TFTenricher uses some of the most common databases for gene functionalities, including GO, KEGG and Reactome, to calculate functional enrichments. By applying the TFTenricher to differential expression of TFs in 21 diseases, we found significant terms associated with disease mechanism, while the gene set enrichment analysis on the same dataset predominantly identified processes related to transcription.

Conclusions and availability: The TFTenricher package enables users to search for biological context in any set of TFs and their downstream genes. The TFTenricher is available as a Python 3 toolbox at https://github.com/rasma774/Tftenricher, under a GNU GPL license and with minimal dependencies.
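
The enrichment step mentioned in the abstract can be illustrated with a generic over-representation test. The sketch below assumes a one-sided hypergeometric test, which the abstract does not specify. The function names and the toy gene sets are hypothetical and are not taken from the TFTenricher package.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric draw.

    n_selected genes are drawn without replacement from n_background genes,
    of which n_annotated carry the annotation term of interest.
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def term_enrichment(target_genes, term_to_genes, background):
    """Compute an enrichment p-value per term for a set of downstream genes."""
    background = set(background)
    targets = set(target_genes) & background
    results = {}
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        overlap = len(targets & annotated)
        results[term] = hypergeom_enrichment_p(
            len(background), len(annotated), len(targets), overlap
        )
    return results


# Toy data: ten background genes and two hypothetical annotation terms.
background = [f"g{i}" for i in range(10)]
term_to_genes = {"term_A": ["g0", "g1", "g2"], "term_B": ["g7", "g8", "g9"]}
downstream = ["g0", "g1", "g2", "g3"]
p_values = term_enrichment(downstream, term_to_genes, background)
```

In practice the p-values would also need a multiple-testing correction across all tested terms, which is omitted here for brevity.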

Place, publisher, year, edition, pages
Springer Nature, 2021
National Category
Bioinformatics and Systems Biology Cancer and Oncology Biochemistry and Molecular Biology
Research subject
Bioinformatics
Identifiers
urn:nbn:se:his:diva-20586 (URN); 10.1186/s12859-021-04357-4 (DOI); 000696540200002 (); 34530727 (PubMedID); 2-s2.0-85115057681 (Scopus ID)
Note

CC BY 4.0

Correspondence: rasmus.magnusson@his.se, School of Bioscience, Systems Biology Research Center, University of Skövde, Skövde, Sweden

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Available from: 2021-09-23. Created: 2021-09-23. Last updated: 2024-01-17. Bibliographically approved.
de Weerd, H. A., Badam, T. V. S., Martínez-Enguita, D., Åkesson, J., Muthas, D., Gustafsson, M. & Lubovac-Pilav, Z. (2020). MODifieR: an ensemble R package for inference of disease modules from transcriptomics networks. Bioinformatics, 36(12), 3918-3919
2020 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 36, no 12, p. 3918-3919. Article in journal (Refereed). Published.
Abstract [en]

MOTIVATION: Complex diseases are due to the dense interactions of many disease-associated factors that dysregulate genes that in turn form so-called disease modules, which have been shown to be a powerful concept for understanding pathological mechanisms. There exist many disease module inference methods that rely on somewhat different assumptions, but there is still no gold standard or best performing method. Hence, there is a need for combining these methods to generate robust disease modules.

RESULTS: We developed MODule IdentiFIER (MODifieR), an ensemble R package of nine disease module inference methods from transcriptomics networks. MODifieR uses standardized input and output allowing the possibility to combine individual modules generated from these methods into more robust disease-specific modules, contributing to a better understanding of complex diseases.

AVAILABILITY: MODifieR is available under the GNU GPL license and can be freely downloaded from https://gitlab.com/Gustafsson-lab/MODifieR and as a Docker image from https://hub.docker.com/r/ddeweerd/modifier.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
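
The idea of combining modules from several inference methods into one more robust module can be sketched as a simple vote: keep the genes that appear in at least a chosen number of individual modules. This Python sketch is an illustration under that assumption. MODifieR itself is an R package, and the function name, threshold, and gene labels below are hypothetical.

```python
from collections import Counter


def consensus_module(modules, min_support=2):
    """Return genes reported by at least `min_support` of the given modules.

    Each module is an iterable of gene identifiers; duplicates inside a
    single module are counted once.
    """
    counts = Counter(gene for module in modules for gene in set(module))
    return {gene for gene, n in counts.items() if n >= min_support}


# Toy modules from three hypothetical inference methods.
modules = {
    "method_a": ["g1", "g2", "g3"],
    "method_b": ["g2", "g3", "g4"],
    "method_c": ["g3", "g5", "g5"],
}
robust = consensus_module(modules.values(), min_support=2)
strict = consensus_module(modules.values(), min_support=3)
```

Raising `min_support` trades module size for agreement between methods, so the threshold is a tuning choice rather than a fixed rule.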

Place, publisher, year, edition, pages
Oxford University Press, 2020
National Category
Bioinformatics and Systems Biology
Research subject
Bioinformatics; INF501 Integration of -omics Data
Identifiers
urn:nbn:se:his:diva-18387 (URN); 10.1093/bioinformatics/btaa235 (DOI); 000550127500051 (); 32271876 (PubMedID); 2-s2.0-85087321319 (Scopus ID)
Note

CC BY 4.0

"Applications Note". "Systems biology". 

Available from: 2020-04-15. Created: 2020-04-15. Last updated: 2022-04-13. Bibliographically approved.
Björn, N., Badam, T., Spalinskas, R., Brandén, E., Koyi, H., Lewensohn, R., . . . Gréen, H. (2020). Whole-genome sequencing and gene network modules predict gemcitabine/carboplatin-induced myelosuppression in non-small cell lung cancer patients. NPJ Systems Biology and Applications, 6(1), Article ID 25.
2020 (English). In: NPJ Systems Biology and Applications, E-ISSN 2056-7189, Vol. 6, no 1, article id 25. Article in journal (Refereed). Published.
Abstract [en]

Gemcitabine/carboplatin chemotherapy commonly induces myelosuppression, including neutropenia, leukopenia, and thrombocytopenia. Predicting patients at risk of these adverse drug reactions (ADRs) and adjusting treatments accordingly is a long-term goal of personalized medicine. This study used whole-genome sequencing (WGS) of blood samples from 96 gemcitabine/carboplatin-treated non-small cell lung cancer (NSCLC) patients and gene network modules for predicting myelosuppression. Association of genetic variants in PLINK found 4594, 5019, and 5066 autosomal SNVs/INDELs with p ≤ 1 × 10^−3 for neutropenia, leukopenia, and thrombocytopenia, respectively. Based on the SNVs/INDELs we identified the toxicity module, consisting of 215 unique overlapping genes inferred from MCODE-generated gene network modules of 350, 345, and 313 genes, respectively. These module genes showed enrichment for differentially expressed genes in rat bone marrow, human bone marrow, and human cell lines exposed to carboplatin and gemcitabine (p < 0.05). Then, using 80% of the patients as training data, random LASSO reduced the number of SNVs/INDELs in the toxicity module into a feasible prediction model consisting of 62 SNVs/INDELs that accurately predict both the training and the test (remaining 20%) data with high (CTCAE 3–4) and low (CTCAE 0–1) maximal myelosuppressive toxicity completely, with the receiver-operating characteristic (ROC) area under the curve (AUC) of 100%. The present study shows how WGS, gene network modules, and random LASSO can be used to develop a feasible and tested model for predicting myelosuppressive toxicity. Although the proposed model predicts myelosuppression in this study, further evaluation in other studies is required to determine its reproducibility, usability, and clinical effect.

Place, publisher, year, edition, pages
Nature Publishing Group, 2020
National Category
Bioinformatics and Systems Biology Medical Genetics
Research subject
Bioinformatics
Identifiers
urn:nbn:se:his:diva-18947 (URN); 10.1038/s41540-020-00146-6 (DOI); 000568927100001 (); 32839457 (PubMedID); 2-s2.0-85089776223 (Scopus ID)
Note

CC BY 4.0

Available from: 2020-08-26. Created: 2020-08-26. Last updated: 2020-11-12. Bibliographically approved.
Weishaupt, H., Johansson, P., Sundström, A., Lubovac-Pilav, Z., Olsson, B., Nelander, S. & Swartling, F. J. (2019). Batch-normalization of cerebellar and medulloblastoma gene expression datasets utilizing empirically defined negative control genes. Bioinformatics, 35(18), 3357-3364
2019 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 35, no 18, p. 3357-3364. Article in journal (Refereed). Published.
Abstract [en]

Motivation: Medulloblastoma (MB) is a brain cancer predominantly arising in children. Roughly 70% of patients are cured today, but survivors often suffer from severe sequelae. MB has been extensively studied by molecular profiling, but often in small and scattered cohorts. To improve cure rates and reduce treatment side effects, accurate integration of such data to increase analytical power will be important, if not essential.

Results: We have integrated 23 transcription datasets, spanning 1350 MB and 291 normal brain samples. To remove batch effects, we combined the Removal of Unwanted Variation (RUV) method with a novel pipeline for determining empirical negative control genes and a panel of metrics to evaluate normalization performance. The documented approach enabled the removal of a majority of batch effects, producing a large-scale, integrative dataset of MB and cerebellar expression data. The proposed strategy will be broadly applicable for accurate integration of data and incorporation of normal reference samples for studies of various diseases. We hope that the integrated dataset will improve current research in the field of MB by allowing more large-scale gene expression analyses.

Place, publisher, year, edition, pages
Oxford University Press, 2019
National Category
Bioinformatics and Systems Biology
Research subject
Bioinformatics
Identifiers
urn:nbn:se:his:diva-16769 (URN); 10.1093/bioinformatics/btz066 (DOI); 000487327500019 (); 30715209 (PubMedID); 2-s2.0-85072349088 (Scopus ID)
Note

CC BY-NC 4.0

Available from: 2019-04-11. Created: 2019-04-11. Last updated: 2023-09-21. Bibliographically approved.
Borgmästars, E., de Weerd, H. A., Lubovac-Pilav, Z. & Sund, M. (2019). miRFA: an automated pipeline for microRNA functional analysis with correlation support from TCGA and TCPA expression data in pancreatic cancer. BMC Bioinformatics, 20(1), 1-17, Article ID 393.
2019 (English). In: BMC Bioinformatics, E-ISSN 1471-2105, Vol. 20, no 1, p. 1-17, article id 393. Article in journal (Refereed). Published.
Abstract [en]

BACKGROUND: MicroRNAs (miRNAs) are small RNAs that regulate gene expression at a post-transcriptional level and are emerging as potentially important biomarkers for various disease states, including pancreatic cancer. In silico-based functional analysis of miRNAs usually consists of miRNA target prediction and functional enrichment analysis of miRNA targets. Since miRNA target prediction methods generate a large number of false positive target genes, further validation to narrow down interesting candidate miRNA targets is needed. One commonly used method correlates miRNA and mRNA expression to assess the regulatory effect of a particular miRNA. The aim of this study was to build a bioinformatics pipeline in R for miRNA functional analysis including correlation analyses between miRNA expression levels and its targets on mRNA and protein expression levels available from The Cancer Genome Atlas (TCGA) and The Cancer Proteome Atlas (TCPA). TCGA-derived expression data of specific mature miRNA isoforms from pancreatic cancer tissue was used.

RESULTS: Fifteen circulating miRNAs with significantly altered expression levels detected in pancreatic cancer patients were queried separately in the pipeline. The pipeline generated predicted miRNA target genes, enriched Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Predicted miRNA targets were evaluated by correlation analyses between each miRNA and its predicted targets. MiRNA functional analysis in combination with Kaplan-Meier survival analysis suggests that hsa-miR-885-5p could act as a tumor suppressor and should be validated as a potential prognostic biomarker in pancreatic cancer.

CONCLUSIONS: Our miRNA functional analysis (miRFA) pipeline can serve as a valuable tool in biomarker discovery involving mature miRNAs associated with pancreatic cancer and could be developed to cover additional cancer types. Results for all mature miRNAs in TCGA pancreatic adenocarcinoma dataset can be studied and downloaded through a shiny web application at https://emmbor.shinyapps.io/mirfa/.
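
The correlation step described in the abstract, in which a miRNA's expression is compared with that of its predicted targets, can be sketched as follows. The published pipeline is written in R. This Python version is only an illustration: it assumes Pearson correlation, and the function names and expression values are made up for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def rank_targets(mirna_expr, target_expr):
    """Return (gene, r) pairs sorted from most negative to most positive r."""
    pairs = [(gene, pearson(mirna_expr, values))
             for gene, values in target_expr.items()]
    return sorted(pairs, key=lambda pair: pair[1])


# Toy expression values across five samples (hypothetical numbers).
mirna = [1, 2, 3, 4, 5]
targets = {
    "geneA": [10, 8, 6, 4, 2],
    "geneB": [1, 2, 3, 4, 5],
    "geneC": [2, 1, 4, 3, 5],
}
ranked = rank_targets(mirna, targets)
```

Because miRNAs typically repress their targets, predicted targets with strongly negative correlation are the natural candidates to prioritize, which is why the list is sorted from the most negative coefficient upwards.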

Place, publisher, year, edition, pages
BioMed Central, 2019
Keywords
Functional enrichment, Mature miRNA, Pancreatic cancer, TCGA, TCPA, miRNA functional analysis, miRNA target prediction
National Category
Bioinformatics and Systems Biology
Research subject
Bioinformatics; INF502 Biomarkers
Identifiers
urn:nbn:se:his:diva-17456 (URN); 10.1186/s12859-019-2974-3 (DOI); 000475761100001 (); 31311505 (PubMedID); 2-s2.0-85069159500 (Scopus ID)
Note

CC BY 4.0

Available from: 2019-07-19. Created: 2019-07-19. Last updated: 2024-01-17. Bibliographically approved.