TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes
2021 (English). In: BMC Bioinformatics, E-ISSN 1471-2105, Vol. 22, no 1, article id 440. Article in journal (Refereed), Published
Abstract [en]
Background: Transcription factors (TFs) are the upstream regulators that orchestrate gene expression, and are therefore a centrepiece of bioinformatics studies. While a core strategy to understand the biological context of genes and proteins includes annotation enrichment analysis, such as Gene Ontology term enrichment, these methods are not well suited for analysing groups of TFs. This is particularly true since such methods do not aim to include downstream processes, and given a set of TFs, the expected top ontologies would revolve around transcription processes.
Results: We present the TFTenricher, a Python toolbox that focuses specifically on identifying gene ontology terms, cellular pathways, and diseases that are over-represented among genes downstream of user-defined sets of human TFs. We evaluated the inference of downstream gene targets with respect to false positive annotations, and found that an inference based on co-expression best predicted downstream processes. Based on these downstream genes, the TFTenricher uses some of the most common databases for gene functionalities, including GO, KEGG and Reactome, to calculate functional enrichments. By applying the TFTenricher to differential expression of TFs in 21 diseases, we found significant terms associated with disease mechanism, whereas gene set enrichment analysis on the same dataset predominantly identified processes related to transcription.
Conclusions and availability: The TFTenricher package enables users to search for biological context in any set of TFs and their downstream genes. The TFTenricher is available as a Python 3 toolbox at https://github.com/rasma774/Tftenricher, under a GNU GPL license and with minimal dependencies.
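The idea of over-representation among downstream genes can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, which is a common choice for annotation enrichment but is not stated in the abstract, and it does not use the TFTenricher API. The function names (`hypergeom_sf`, `enrich`), the gene labels and the term names are hypothetical, and the set of downstream targets is taken as given rather than inferred from co-expression.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """Return P(X >= k) for X ~ Hypergeometric(M, n, N).

    M: size of the background gene set
    n: number of background genes annotated with the term
    N: number of downstream target genes drawn from the background
    """
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def enrich(targets, annotations, background):
    """Compute a one-sided enrichment p-value per annotation term.

    targets: iterable of downstream gene names (assumed already inferred)
    annotations: dict mapping a term name to an iterable of gene names
    background: iterable of all genes considered
    """
    background = set(background)
    targets = set(targets) & background
    pvalues = {}
    for term, genes in annotations.items():
        genes = set(genes) & background
        overlap = len(targets & genes)
        pvalues[term] = hypergeom_sf(
            overlap, len(background), len(genes), len(targets)
        )
    return pvalues


# Toy data: 20 background genes, two hypothetical annotation terms.
background = [f"g{i}" for i in range(20)]
annotations = {
    "term_A": [f"g{i}" for i in range(0, 5)],
    "term_B": [f"g{i}" for i in range(10, 20)],
}
targets = ["g0", "g1", "g2", "g3", "g10", "g11"]

pvalues = enrich(targets, annotations, background)
for term, p in sorted(pvalues.items(), key=lambda item: item[1]):
    print(f"{term}: p = {p:.4f}")
```

In this toy case four of the six targets carry `term_A`, so it receives the smaller p-value. A real analysis would additionally correct for multiple testing across all terms, which this sketch omits.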
Place, publisher, year, edition, pages
Springer Nature, 2021. Vol. 22, no 1, article id 440
National Category
Bioinformatics and Systems Biology; Cancer and Oncology; Biochemistry and Molecular Biology
Research subject
Bioinformatics
Identifiers
URN: urn:nbn:se:his:diva-20586; DOI: 10.1186/s12859-021-04357-4; ISI: 000696540200002; PubMedID: 34530727; Scopus ID: 2-s2.0-85115057681; OAI: oai:DiVA.org:his-20586; DiVA, id: diva2:1596628
Note
CC BY 4.0
Correspondence: rasmus.magnusson@his.se School of Bioscience, Systems Biology Research Center, University of Skövde, Skövde, Sweden
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
2021-09-23 2021-09-23 2024-01-17 Bibliographically approved