Högskolan i Skövde

Evaluation of clusterings of gene expression data
University of Skövde, Department of Computer Science.
2000 (English). Independent thesis, Advanced level (degree of Master (One Year)). Student thesis
Abstract [en]

Recent literature has investigated the use of different clustering techniques for analysis of gene expression data. For example, self-organizing maps (SOMs) have been used to identify gene clusters of clear biological relevance in human hematopoietic differentiation and the yeast cell cycle (Tamayo et al., 1999). Hierarchical clustering has also been proposed for identifying clusters of genes that share common roles in cellular processes (Eisen et al., 1998; Michaels et al., 1998; Wen et al., 1998). Systematic evaluation of clustering results is as important as generating the clusters. However, this is a difficult task, which is often overlooked in gene expression studies. Several gene expression studies claim success of the clustering algorithm without showing a validation of complete clusterings, for example Ben-Dor and Yakhini (1999) and Törönen et al. (1999).

In this dissertation we propose an evaluation approach based on a relative entropy measure that uses additional knowledge about genes (gene annotations) besides the gene expression data. More specifically, we use gene annotations in the form of an enzyme classification hierarchy to evaluate clusterings. This classification is based on the main chemical reactions that are catalysed by enzymes. Furthermore, we evaluate clusterings with purely statistical measures of cluster validity (compactness and isolation).
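As an illustration only, the idea of scoring a cluster by relative entropy against annotations can be sketched as follows. The abstract does not give the exact formula used in the thesis, so this sketch assumes the common Kullback-Leibler form: the distribution of annotation classes inside one cluster is compared with the distribution over all annotated genes. The function name `relative_entropy` and the labels `EC1` and `EC2` (standing in for top-level enzyme classes) are hypothetical.

```python
import math
from collections import Counter


def relative_entropy(cluster_labels, background_labels):
    """Return D(p || q) in bits.

    p is the annotation-class distribution inside one cluster and q is the
    distribution over all annotated genes. This is an assumed definition,
    not necessarily the measure used in the thesis.
    """
    if not cluster_labels:
        raise ValueError("cluster must contain at least one annotated gene")
    p_counts = Counter(cluster_labels)
    q_counts = Counter(background_labels)
    n_p = len(cluster_labels)
    n_q = len(background_labels)
    total = 0.0
    for label, count in p_counts.items():
        if q_counts[label] == 0:
            # A class seen in the cluster must also occur in the background.
            raise ValueError(f"label {label!r} missing from background")
        p = count / n_p
        q = q_counts[label] / n_q
        total += p * math.log2(p / q)
    return total


# Hypothetical annotations: eight genes, evenly split over two enzyme classes.
background = ["EC1"] * 4 + ["EC2"] * 4
pure_cluster = ["EC1"] * 4            # every gene shares one class
mixed_cluster = ["EC1", "EC1", "EC2", "EC2"]  # mirrors the background
```

Under this assumed definition, a cluster whose genes all share one enzyme class scores higher than a cluster whose class mix simply mirrors the whole data set, which scores zero.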

The experiments include applying two types of clustering methods (SOMs and hierarchical clustering) on a data set for which good annotation is available, so that the results can be partly validated from the viewpoint of biological relevance.

The evaluation of the clusters indicates that clusters obtained from hierarchical average linkage clustering have much higher relative entropy values and lower compactness and isolation compared to SOM clusters. Clusters with high relative entropy often contain enzymes that are involved in the same enzymatic activity. On the other hand, the compactness and isolation measures do not seem to be reliable for evaluation of clustering results.
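The compactness and isolation measures mentioned above can be illustrated with a minimal sketch. The abstract does not define them, so the definitions here are assumptions: compactness is taken as the mean pairwise Euclidean distance between cluster members, and isolation as the smallest distance from any member to any non-member. The function names and the sample expression profiles are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def compactness(cluster):
    """Mean pairwise distance between members (assumed definition).

    Smaller values mean a tighter cluster. A single-member cluster
    has no pairs and is given 0.0.
    """
    pairs = [(a, b) for i, a in enumerate(cluster) for b in cluster[i + 1:]]
    if not pairs:
        return 0.0
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)


def isolation(cluster, others):
    """Smallest member-to-non-member distance (assumed definition).

    Larger values mean the cluster is better separated from the rest.
    """
    return min(euclidean(a, b) for a in cluster for b in others)


# Hypothetical two-dimensional expression profiles.
cluster = [(0.0, 0.0), (0.0, 2.0)]
others = [(3.0, 0.0), (10.0, 10.0)]
```

Both measures depend only on the expression values, not on annotations, which is why the abstract treats them separately from the relative entropy evaluation.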

Place, publisher, year, edition, pages
Skövde: Institutionen för datavetenskap, 2000, p. 100
Keywords [en]
Gene expression analysis, Evaluation of clusterings, Self organizing maps, Average linkage clustering, Annotations
National Category
Information Systems
Identifiers
URN: urn:nbn:se:his:diva-484
OAI: oai:DiVA.org:his-484
DiVA, id: diva2:2863
Presentation
(English)
Uppsok
Social and Behavioural Science, Law
Supervisors
Available from: 2008-01-11 Created: 2008-01-11 Last updated: 2018-01-12

Open Access in DiVA

fulltext (1113 kB), 237 downloads
File information
File name: FULLTEXT01.ps
File size: 1113 kB
Checksum (MD5): eb89767237820e2a964c0c6ee1301e471517ccba5789f3cb6edf73a0407d4d720b2d1bb8
Type: fulltext
Mimetype: application/postscript
fulltext (450 kB), 333 downloads
File information
File name: FULLTEXT02.pdf
File size: 450 kB
Checksum (SHA-512): 911bc5738e5e8efcfd220d42b725ec253ae745844534c278f879f20ade9410781765cce774c7a81719d551911d6fbdee38362a6b0f1dfc2fca69ca98895da93b
Type: fulltext
Mimetype: application/pdf

Total: 572 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.

Total: 308 hits