Högskolan i Skövde

Multi-Assignment Clustering: Machine learning from a biological perspective
University of Skövde, School of Bioscience. University of Skövde, Systems Biology Research Environment. (Translationell bioinformatik, Translational Bioinformatics) ORCID iD: 0000-0001-9242-4852
University of Skövde, Informatics Research Environment. University of Skövde, School of Informatics. (Skövde Artificial Intelligence Lab (SAIL)) ORCID iD: 0000-0003-2973-3112
University of Skövde, Informatics Research Environment. University of Skövde, School of Informatics. Department of Computer Science and Informatics, School of Engineering, Jönköping University, Sweden. (Skövde Artificial Intelligence Lab (SAIL)) ORCID iD: 0000-0003-2900-9335
Takara Bio Europe AB, Gothenburg, Sweden.
2021 (English). In: Journal of Biotechnology, ISSN 0168-1656, E-ISSN 1873-4863, Vol. 326, p. 1-10. Article in journal (Refereed). Published.
Abstract [en]

A common approach for analyzing large-scale molecular data is to cluster objects sharing similar characteristics. This assumes that genes with highly similar expression profiles are likely participating in a common molecular process. Biological systems are extremely complex and challenging to understand, with proteins having multiple functions that sometimes need to be activated or expressed in a time-dependent manner. Thus, the strategies applied for clustering of these molecules into groups are of key importance for translation of data to biologically interpretable findings. Here we implemented a multi-assignment clustering (MAsC) approach that allows molecules to be assigned to multiple clusters, rather than single ones as in commonly used clustering techniques. When applied to high-throughput transcriptomics data, MAsC increased power of the downstream pathway analysis and allowed identification of pathways with high biological relevance to the experimental setting and the biological systems studied. Multi-assignment clustering also reduced noise in the clustering partition by excluding genes with a low correlation to all of the resulting clusters. Together, these findings suggest that our methodology facilitates translation of large-scale molecular data into biological knowledge. The method is made available as an R package on GitLab (https://gitlab.com/wolftower/masc).
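
The multi-assignment idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (the published method is the R package linked above). It assumes that cluster centroids are already available, for example from k-means, that similarity is measured with Pearson correlation, and that a gene joins every cluster whose centroid it correlates with at or above a cutoff. The function names, the 0.6 cutoff, and the toy expression profiles are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # a flat profile has no defined correlation; treat as 0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def multi_assign(profiles, centroids, min_corr):
    """Assign each gene to every cluster whose centroid it correlates with.

    Genes that fall below min_corr for all centroids are excluded
    instead of being forced into their nearest cluster.
    """
    assignments, excluded = {}, []
    for gene, profile in profiles.items():
        hits = [k for k, centroid in enumerate(centroids)
                if pearson(profile, centroid) >= min_corr]
        if hits:
            assignments[gene] = hits
        else:
            excluded.append(gene)
    return assignments, excluded


# Hypothetical centroids over four time points (assumed to come from k-means).
centroids = [
    [0.0, 1.0, 2.0, 3.0],  # cluster 0: steadily rising
    [0.0, 2.0, 2.0, 0.0],  # cluster 1: transient peak
]

# Hypothetical gene expression profiles.
profiles = {
    "rising": [1.0, 2.0, 3.0, 4.5],
    "rise_and_dip": [0.0, 2.0, 3.0, 2.0],
    "noisy": [2.0, 0.0, 3.0, 1.0],
}

assignments, excluded = multi_assign(profiles, centroids, min_corr=0.6)
print(assignments)
print(excluded)
```

In this toy example the "rise_and_dip" profile resembles both centroids and is placed in both clusters, while the "noisy" profile correlates with neither and is left out, mirroring the noise reduction the abstract describes. The cutoff controls that trade-off: a higher value excludes more genes, a lower value produces more overlapping clusters.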

Place, publisher, year, edition, pages
Elsevier, 2021. Vol. 326, p. 1-10
Keywords [en]
Clustering, K-means, annotation enrichment, multiple cluster assignment, pathways, transcriptomics
National Category
Bioinformatics and Computational Biology
Research subject
Bioinformatics; Skövde Artificial Intelligence Lab (SAIL)
Identifiers
URN: urn:nbn:se:his:diva-19329
DOI: 10.1016/j.jbiotec.2020.12.002
ISI: 000616124700001
PubMedID: 33285150
Scopus ID: 2-s2.0-85097644109
OAI: oai:DiVA.org:his-19329
DiVA, id: diva2:1510637
Note

CC BY 4.0

Available from: 2020-12-16 Created: 2020-12-16 Last updated: 2025-09-29 Bibliographically approved

Open Access in DiVA

fulltext (4781 kB), 469 downloads
File information
File name: FULLTEXT01.pdf
File size: 4781 kB
Checksum (SHA-512): bd48d37481e84dd6a115b882b83fd45a63cae5b4b19a8ab4617ab66df776e34741564d00a5814e6f87b7f8de2a54b53087789431c56290869230d1dfb72a3064
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text, PubMed, Scopus

Authority records

Ulfenborg, Benjamin; Karlsson, Alexander; Riveiro, Maria; Sartipy, Peter; Synnergren, Jane
