Adaptive Learning with On-Device Foundation Models: Design and Real-Time Performance Evaluation
Course Instructor
Pramod Gupta
Abstract
Existing mobile learning tools often rely heavily on cloud-based language models, compromising privacy, consistency, and multimodal context awareness. Moreover, current mobile AI systems struggle to generate reliable structured educational content: JSON-based parsing frequently fails, and multimodal signals (PDFs, voice notes, conversation history) are rarely fused coherently. This project presents Inquis, an on-device, AI-powered adaptive study companion that integrates Apple’s foundation models with techniques for structured generation, multimodal context fusion, and intelligent scheduling. We introduce a type-safe content generation pipeline using Swift’s @Generable macro, enabling the language model to emit strongly typed Swift structures with 100% structurally valid outputs. To evaluate real-time feasibility, we benchmark on-device foundation model performance across three task classes on a smartphone-class device: simple factual queries (2.57 s mean), structured flashcard generation (4.44 s mean, n=40), and PDF summarization (6.74 s mean, n=40), all within practical real-time latency bounds. Finally, a hybrid voice architecture combines low-latency cloud transcription with an on-device fallback to maintain offline functionality and privacy. These results show that privacy-preserving, on-device foundation models can support robust, multimodal, adaptive learning experiences for next-generation mobile educational systems.