Högskolan i Skövde

A comparative study of social bot classification techniques
University of Skövde, School of Informatics.
University of Skövde, School of Informatics.
University of Skövde, School of Informatics.
2019 (English). Independent thesis, Basic level (degree of Bachelor), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

With social media rising in popularity in recent years, so-called social bots have infiltrated these platforms, spamming and manipulating people all over the world. Many methods have been proposed to solve this problem, with varying success. This study compares some of these methods on a dataset of Twitter account metadata, to give companies helpful information when deciding how to address the problem. Two machine learning algorithms and a human survey are compared on their ability to classify accounts. The algorithms are the supervised algorithm random forest and the unsupervised algorithm k-means. Two ways of running these algorithms are also evaluated: the machine-learning-as-a-service platform BigML and the Python library Scikit-learn. Additionally, the metadata features that are most valuable to the supervised algorithm and to the human survey participants are compared. Results show that supervised machine learning is the superior technique for social bot identification, with an accuracy of almost 99%. In conclusion, the choice depends on the expertise of the company and on whether a relevant training dataset is available, but in most cases supervised machine learning is recommended.
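The kind of comparison the abstract describes can be illustrated with a minimal Scikit-learn sketch. This is not the thesis authors' code: the data is synthetic, the three feature names are hypothetical stand-ins for Twitter account metadata, and the hyperparameters are assumptions chosen only for illustration. It trains a random forest on labelled accounts, clusters the same accounts with k-means without using the labels, and reports accuracy for both together with the random forest's feature importances.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical metadata features; the thesis' actual feature set is not given here.
FEATURES = ["followers_count", "friends_count", "statuses_count"]


def make_synthetic_accounts(n_per_class=300, seed=0):
    """Generate made-up account metadata: label 0 = human, label 1 = bot."""
    rng = np.random.default_rng(seed)
    humans = rng.normal(loc=[500, 400, 2000], scale=[150, 120, 600],
                        size=(n_per_class, 3))
    bots = rng.normal(loc=[50, 1500, 8000], scale=[30, 300, 1500],
                      size=(n_per_class, 3))
    X = np.vstack([humans, bots])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


def compare(seed=0):
    """Return (random forest accuracy, k-means accuracy, feature importances)."""
    X, y = make_synthetic_accounts(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)

    # Supervised: the random forest learns from the labelled training accounts.
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X_train, y_train)
    rf_acc = accuracy_score(y_test, forest.predict(X_test))

    # Unsupervised: k-means only sees the features, never the labels.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=seed).fit(X_train)
    clusters = kmeans.predict(X_test)
    # Cluster ids are arbitrary, so score both possible id-to-label mappings
    # and keep the better one.
    km_acc = max(accuracy_score(y_test, clusters),
                 accuracy_score(y_test, 1 - clusters))

    importances = dict(zip(FEATURES, forest.feature_importances_))
    return rf_acc, km_acc, importances


if __name__ == "__main__":
    rf_acc, km_acc, importances = compare()
    print(f"random forest accuracy: {rf_acc:.3f}")
    print(f"k-means accuracy:       {km_acc:.3f}")
    for name, weight in importances.items():
        print(f"{name}: {weight:.3f}")
```

The accuracies printed by this sketch reflect only the synthetic data and say nothing about the figures reported in the thesis. The feature importances are the supervised half of the feature comparison mentioned in the abstract; the human survey half has no code equivalent.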

Place, publisher, year, edition, pages
2019. p. 48
Keywords [en]
manual bot classification, social bot, metadata, machine learning, supervised learning, unsupervised learning, random forest, k-means
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:his:diva-16994
OAI: oai:DiVA.org:his-16994
DiVA, id: diva2:1321724
Subject / course
Information Technology
Educational program
Computer Science - Specialization in Systems Development
Supervisors
Examiners
Available from: 2019-06-10. Created: 2019-06-09. Last updated: 2019-06-10. Bibliographically approved.

Open Access in DiVA

fulltext (1067 kB), 782 downloads
File information
File name: FULLTEXT01.pdf
File size: 1067 kB
Checksum (SHA-512): 4126bed7df8ce6bdac6543989043d0c9c207c7106f2062f2a719688b7193afa6ce0c1d330e4649f426cd88a61cb439a3a40187a95755a37e6949559f3c19c56b
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Örnbratt, Filip; Isaksson, Jonathan; Willing, Mario