Immersive Real-Time Language Translator for Augmented Reality – Research Unit Virtual & Augmented Reality

Description

Recent advances in Automatic Speech Recognition (ASR), Text-to-Speech (TTS) and Large Language Models (LLMs) have made real-time transcription and translation of spoken language increasingly practical. This thesis will explore the design and implementation of an immersive communication framework for Augmented Reality (AR) that enables real-time communication between multiple users who do not share a common language.

The system will integrate ASR for real-time speech recognition, LLMs for accurate translation and TTS for natural voice synthesis of the translated text, enabling effective multilingual interaction. The thesis focuses on performance, user experience and the reduction of language barriers in real-time communication. The system will be implemented on physical AR glasses.
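As a rough illustration of how the ASR, translation and TTS stages described above could be chained, the following TypeScript sketch passes one utterance through all three stages and records the end-to-end latency. Every interface and function name here (SpeechRecognizer, Translator, SpeechSynthesizer, translateUtterance and the stand-in stages) is a hypothetical placeholder introduced for this example. None of them is a Lens Studio or Spectacles API, and the actual thesis design may differ.

```typescript
// Hypothetical stage interfaces; these are NOT Lens Studio or Spectacles APIs.
interface SpeechRecognizer {
  transcribe(audio: Float32Array, language: string): Promise<string>;
}
interface Translator {
  translate(text: string, from: string, to: string): Promise<string>;
}
interface SpeechSynthesizer {
  synthesize(text: string, language: string): Promise<Float32Array>;
}

// Result of one pass through the pipeline, including end-to-end latency.
interface TranslatedUtterance {
  sourceText: string;
  translatedText: string;
  speech: Float32Array;
  latencyMs: number;
}

// Runs ASR, then translation, then TTS, strictly in sequence.
// The clock is injectable so latency measurement can be tested deterministically.
async function translateUtterance(
  audio: Float32Array,
  from: string,
  to: string,
  asr: SpeechRecognizer,
  translator: Translator,
  tts: SpeechSynthesizer,
  now: () => number = () => Date.now()
): Promise<TranslatedUtterance> {
  const start = now();
  const sourceText = await asr.transcribe(audio, from);
  const translatedText = await translator.translate(sourceText, from, to);
  const speech = await tts.synthesize(translatedText, to);
  return { sourceText, translatedText, speech, latencyMs: now() - start };
}

// Stand-in stages with canned behaviour, used only to exercise the pipeline.
const fakeAsr: SpeechRecognizer = {
  transcribe: async () => "guten Morgen",
};
const fakeTranslator: Translator = {
  // Tags the text instead of translating it; a real stage would call an LLM.
  translate: async (text, from, to) => `[${from}->${to}] ${text}`,
};
const fakeTts: SpeechSynthesizer = {
  // Returns a silent buffer with one sample per character of the input text.
  synthesize: async (text) => new Float32Array(text.length),
};

// Usage: logs "[de->en] guten Morgen"
translateUtterance(new Float32Array(16000), "de", "en", fakeAsr, fakeTranslator, fakeTts)
  .then((result) => console.log(result.translatedText));
```

Because the three stages run one after another in this sketch, their latencies add up. A real-time system would probably need to stream partial results between stages instead of waiting for each one to finish, which is one of the performance questions the thesis could investigate.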

Tasks

Design and develop a real-time language translator
Technologies to integrate: ASR, TTS, LLMs
Run a user study to evaluate the system

Requirements

Good command of English (source code, comments and the final report must be written in English)
Programming skills (especially JavaScript)
Knowledge of AR is advantageous

Environment

The project will be developed in Lens Studio for the new Snapchat Spectacles.

Contact

For more information, please contact Matteo Bosco – matteo.bosco@tuwien.ac.at