Högskolan i Skövde

Real-time Object Detection for the Visually Impaired: An On-Device Federated Learning Approach
University of Skövde, School of Informatics.
2024 (English). Independent thesis, Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis
Abstract [en]

Visually impaired people often face difficulties in recognizing objects around them, which can make everyday tasks harder and reduce independence. Advances in artificial intelligence and computer vision now make it possible to design systems that detect and describe objects in real time. The goal of this project was to create such a system with a focus on speed, efficiency, and privacy, so it could run directly on affordable devices without relying on cloud services. The model was trained on the VizWiz dataset, which contains photos taken by blind and low-vision users. A lightweight YOLOv8-small model was trained and fine-tuned, then optimized through model compression techniques, including pruning, knowledge distillation, and quantization. The final version was converted into TensorRT engines in FP16 and INT8 formats, which are suitable for high-speed, low-power inference on devices such as the NVIDIA Jetson Nano. Privacy was a key consideration. Instead of sending images to external servers, the system was designed with future support for Federated Learning, allowing devices to train locally and share only model updates. Although full implementation of Federated Learning was not part of this work, the design allows easy integration. Testing on a laptop with a webcam showed that the system runs smoothly on limited hardware while providing useful object detection. Future improvements could include running tests on real devices such as smart glasses or mobile boards, involving visually impaired users for feedback, expanding the dataset, and allowing each model to adapt to personal needs. Together, these steps can turn the system into a practical and trustworthy tool that gives visually impaired people more confidence and independence in their daily lives.
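The abstract describes devices that train locally and share only model updates, but notes that Federated Learning was not implemented in this work. As an illustration only, the following is a minimal sketch of how a server could combine such updates using standard federated averaging (FedAvg), where each client's weights count in proportion to its local sample count. The function name `federated_average`, the `Weights` type, and the parameter name `head.bias` are hypothetical and do not come from the thesis.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine client weights into one model, weighted by local sample counts.

    Each entry in `updates` is (client_weights, number_of_local_samples).
    No images leave the client; only these weights are shared.
    """
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros with the same shape as the first client's weights.
    first_weights, _ = updates[0]
    averaged = {name: [0.0] * len(vals) for name, vals in first_weights.items()}

    for weights, n in updates:
        share = n / total  # this client's contribution to the average
        for name, vals in weights.items():
            acc = averaged[name]
            for i, v in enumerate(vals):
                acc[i] += share * v
    return averaged


if __name__ == "__main__":
    # Two illustrative clients with made-up weights and sample counts.
    client_a = ({"head.bias": [0.0, 2.0]}, 10)
    client_b = ({"head.bias": [4.0, 6.0]}, 30)
    print(federated_average([client_a, client_b]))
```

In a real deployment the weights would be model tensors rather than Python lists, and the aggregation would typically be combined with secure transport; this sketch only shows the averaging step.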

Place, publisher, year, edition, pages
2024. p. 56
Keywords [en]
Object Detection, YOLOv8, Federated Learning, Jetson Nano, Assistive Technology, TensorRT
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:his:diva-25922
OAI: oai:DiVA.org:his-25922
DiVA, id: diva2:2007312
Subject / course
Information Technology (Informationsteknologi)
Educational program
Data Science - Master’s Programme
Available from: 2025-10-17 Created: 2025-10-17 Last updated: 2025-10-17 Bibliographically approved

Open Access in DiVA

fulltext (2273 kB), 85 downloads
File information
File name: FULLTEXT01.pdf
File size: 2273 kB
Checksum (SHA-512): 35052c657b3ea963b641479ebe7819fee2ff83948e1268c8cc3ee4dc490fad7c1528474b2e91b55dae009c323478451a137ea2d7d575583fb8858c743a44a9f3
Type: fulltext
Mimetype: application/pdf
