Högskolan i Skövde

Capturing genes with high impact based on reconstruction errors produced by variational autoencoders
University of Skövde, School of Bioscience.
2023 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 30 credits / 45 HE credits. Student thesis
Abstract [en]

In this work we present a novel method for extracting potential hub genes, transcription factors, and regions with densely interconnected protein-protein interaction (PPI) networks from RNA-seq data. To this end we train variational autoencoders, a generative machine learning framework, and extract the gene-wise reconstruction errors. The reconstruction error produced during training is treated here as a measure of a gene's impact on the transcriptome.
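
As a minimal sketch of what a gene-wise reconstruction error and the resulting ranking could look like, the example below assumes that the error is the mean squared difference between the input expression matrix and its reconstruction, taken per gene across all samples. The abstract does not state which error measure the thesis uses, so the choice of mean squared error is an assumption, as are the function names and the toy values; the reconstruction would in practice come from the trained autoencoder.

```python
from statistics import fmean


def gene_wise_reconstruction_error(expression, reconstruction):
    """Return one mean squared error per gene (column), averaged over samples (rows).

    Both arguments are lists of rows; each row holds one value per gene.
    Mean squared error is an assumed error measure, not taken from the thesis.
    """
    n_genes = len(expression[0])
    errors = []
    for g in range(n_genes):
        squared = [
            (x_row[g] - r_row[g]) ** 2
            for x_row, r_row in zip(expression, reconstruction)
        ]
        errors.append(fmean(squared))
    return errors


def rank_genes(gene_names, errors):
    """Return (gene, error) pairs sorted by ascending reconstruction error."""
    return sorted(zip(gene_names, errors), key=lambda pair: pair[1])


# Hypothetical toy data: 2 samples x 3 genes.
genes = ["GENE_A", "GENE_B", "GENE_C"]
expression = [[1.0, 2.0, 3.0], [2.0, 0.0, 1.0]]
reconstruction = [[1.0, 1.0, 3.5], [2.0, 1.0, 0.5]]  # stand-in for model output

errors = gene_wise_reconstruction_error(expression, reconstruction)
ranking = rank_genes(genes, errors)
for gene, error in ranking:
    print(gene, error)
```

Genes at the top of this ranking are the ones reconstructed with the lowest error. If the error is collected during training, as described above, the same per-gene calculation could be accumulated over epochs instead of being computed once.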

The method handles large datasets (3.5 GB and more) in reasonable time on consumer-grade computers without GPU acceleration. This allows users without access to large computational resources to work with large expression datasets as well.

The final ranking based on reconstruction errors is subject to less bias than most currently available hub gene inference methods. In addition, no prior gene regulatory network inference is required. However, introducing a bias can help to focus on certain genes of interest. Here we introduced such a bias by restricting the analysis to genes present in the STRING database, which also eased the subsequent analysis.

Analysis of the reconstruction errors showed that genes with low reconstruction error tend to be genes of central importance to the dataset used for training. For healthy cells these were genes associated with housekeeping mechanisms, and for breast cancer data they were genes associated with breast cancer. In the breast cancer-specific data we found, for example, a high frequency of HOX family members linked specifically to breast cancer. For data covering different types of cancer the picture was broader and included a wide range of genes associated with different cancer types.

Transcription factors were also highly enriched among the genes with low reconstruction error. Not only the regions with the lowest reconstruction error show a high enrichment of transcription factors; other regions show transcription factor enrichment as well. Transcription factors from these other regions differ in their correlation patterns.
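
One common way to quantify such an enrichment is a one-sided hypergeometric test on the top of the ranked gene list. The sketch below is an illustration under that assumption; the abstract does not say which enrichment statistic the thesis uses, and the function names, gene identifiers, and window size are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, hits):
    """Return P(X >= hits) when drawing `drawn` genes without replacement
    from `population` genes, of which `annotated` are transcription factors."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def tf_enrichment_in_top(ranked_genes, tf_genes, top_n):
    """Count transcription factors among the first `top_n` ranked genes
    and return (hits, one-sided hypergeometric p-value)."""
    hits = sum(1 for gene in ranked_genes[:top_n] if gene in tf_genes)
    annotated = sum(1 for gene in ranked_genes if gene in tf_genes)
    p_value = hypergeom_enrichment_p(len(ranked_genes), annotated, top_n, hits)
    return hits, p_value


# Hypothetical ranking (lowest reconstruction error first) and TF annotation.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10"]
transcription_factors = {"G1", "G2", "G4"}

hits, p_value = tf_enrichment_in_top(ranked, transcription_factors, top_n=4)
print(hits, p_value)
```

The same function can be applied to any other slice of the ranking by passing a reordered or windowed gene list, which is one way to compare enrichment across different regions.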

Regions with low reconstruction error and/or high transcription factor enrichment show a high PPI enrichment and exhibit densely interconnected networks.

Place, publisher, year, edition, pages
2023. p. 37
National Category
Bioinformatics and Systems Biology
Identifiers
URN: urn:nbn:se:his:diva-22977
OAI: oai:DiVA.org:his-22977
DiVA, id: diva2:1780505
Subject / course
Systems Biology
Educational program
Systems Biology with specialization in Bioinformatics - Master's Programme
Supervisors
Examiners
Available from: 2023-07-05. Created: 2023-07-05. Last updated: 2023-07-05. Bibliographically approved

Open Access in DiVA

fulltext (9487 kB), 98 downloads
File information
File name: FULLTEXT01.pdf. File size: 9487 kB. Checksum: SHA-512
968f2d139af52cb9e8c3f459b6a530c68d696ae63879fc1decf429b8a35311ece5e15f5cd100976202ad7ea5e320313d13c5a4807845e2fb4f95187f1ac3c6d5
Type: fulltext. Mimetype: application/pdf
