A GPU-Accelerated RAG-Based Telegram Assistant for Supporting Parallel Processing Students

Guy Tel-Zur

Published: 2025/9/15

Abstract

This project addresses a critical pedagogical need: offering students continuous, on-demand academic assistance beyond conventional reception hours. I present a domain-specific Retrieval-Augmented Generation (RAG) system powered by a quantized Mistral-7B Instruct model and deployed as a Telegram bot. The assistant enhances learning by delivering real-time, personalized responses aligned with the "Introduction to Parallel Processing" course materials. GPU acceleration significantly improves inference latency, enabling practical deployment on consumer hardware. This approach demonstrates how consumer GPUs can enable affordable, private, and effective AI tutoring for HPC education.

A GPU-Accelerated RAG-Based Telegram Assistant for Supporting Parallel Processing Students | SummarXiv | SummarXiv