AI-Powered Eye Tracking for Early Detection of Reading Difficulties
ReadTrack – AI-based eye tracking platform for early screening of reading challenges in educational settings
Use Case:
Development of an AI-enhanced, markerless eye tracking solution that supports early detection of reading difficulties such as dyslexia, using custom hardware and deep learning models. The system provides accurate gaze estimation without beacons or visual markers.
Outcome:
- Fully operational AI pipeline for gaze detection using custom three-camera hardware (frontal, left eye, right eye);
- Implementation of an end-to-end deep learning framework based on ResNet-18 architecture with early and late fusion strategies;
- Integration of semantic segmentation (EllSeg) for iris and pupil tracking with improved robustness to occlusion, lighting, and head movement;
- Collection of over 180,000 annotated triplets for model training, achieving an average error of ~1–2 cm for gaze point estimation;
- Real-time gaze inference exposed via an API with separate training and inference modes, integrated into a GUI with visualization support.
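The early- and late-fusion strategies mentioned above differ only in where the three camera streams (frontal, left eye, right eye) are combined: before the backbone, by stacking views channel-wise, or after per-view feature extraction, by concatenating feature vectors. The sketch below shows the two data flows; the placeholder `backbone` and `regression_head` functions stand in for the trained ResNet-18 feature extractor and gaze regressor, and all names and dimensions are illustrative assumptions rather than the actual ReadTrack implementation.

```python
import numpy as np

# Placeholder for a ResNet-18 backbone: maps an image tensor (C, H, W)
# to a fixed-length feature vector. In the real system this is a
# trained CNN; here a fixed random projection keeps the sketch runnable.
def backbone(image: np.ndarray, feat_dim: int = 512) -> np.ndarray:
    rng = np.random.default_rng(image.shape[0])  # deterministic per channel count
    w = rng.standard_normal((image.size, feat_dim)) / np.sqrt(image.size)
    return image.reshape(-1) @ w

# Placeholder regression head: maps fused features to a 2-D gaze point (x, y).
def regression_head(features: np.ndarray) -> np.ndarray:
    rng = np.random.default_rng(0)
    w = rng.standard_normal((features.size, 2)) / np.sqrt(features.size)
    return features @ w

def early_fusion(frontal, left, right):
    # Early fusion: stack the three views channel-wise, run ONE backbone.
    stacked = np.concatenate([frontal, left, right], axis=0)  # (9, H, W)
    return regression_head(backbone(stacked))

def late_fusion(frontal, left, right):
    # Late fusion: one backbone PER view, then concatenate the features.
    feats = np.concatenate([backbone(v) for v in (frontal, left, right)])
    return regression_head(feats)

# Toy triplet: three 3-channel views of the same fixation.
frontal = np.ones((3, 16, 16))
left = np.ones((3, 16, 16))
right = np.ones((3, 16, 16))

gaze_early = early_fusion(frontal, left, right)  # 2-D gaze point
gaze_late = late_fusion(frontal, left, right)    # 2-D gaze point
```

Late fusion keeps a dedicated feature extractor per camera at the cost of roughly three times the backbone compute; early fusion shares one backbone but forces all views into a common resolution before stacking.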
Ecosystem Support:
StairwAI support enabled AI development by DEPTHEN, leveraging GPU cloud infrastructure (DataCrunch) and expert mentoring. The project benefited from flexibility in voucher allocation and integration guidance for training pipelines and API development.
AI Relevance:
This success story demonstrates AI accessibility through:
- Development of personalized models from minimal training time (15 min);
- Edge integration with commodity hardware;
- Elimination of traditional visual markers;
- Support for inclusive education through scalable, non-invasive AI-based screening tools.
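One way the short per-user training step listed above could work is a lightweight calibration fit on top of a shared base model: the user fixates a handful of known on-screen targets, and an affine correction is fit to the base model's predictions by least squares. The sketch below illustrates that idea; it is a hedged assumption for illustration, not the documented ReadTrack personalization procedure, and all coordinates are in cm.

```python
import numpy as np

def fit_affine_correction(predicted: np.ndarray, target: np.ndarray) -> np.ndarray:
    # Augment predictions with a bias column so [x, y, 1] @ A -> corrected (x, y),
    # then solve for A in the least-squares sense.
    X = np.hstack([predicted, np.ones((len(predicted), 1))])
    A, *_ = np.linalg.lstsq(X, target, rcond=None)
    return A  # shape (3, 2): 2x2 scale/rotation plus a bias row

def apply_correction(predicted: np.ndarray, A: np.ndarray) -> np.ndarray:
    X = np.hstack([predicted, np.ones((len(predicted), 1))])
    return X @ A

# Simulated calibration session: true fixation targets, and base-model
# predictions distorted by a known scale error and offset (hypothetical).
rng = np.random.default_rng(42)
true_points = rng.uniform(0, 30, size=(20, 2))          # 20 targets on screen (cm)
predicted = 0.9 * true_points + np.array([1.5, -0.8])   # systematic per-user error

A = fit_affine_correction(predicted, true_points)
corrected = apply_correction(predicted, A)
```

Because the simulated error is itself affine, the fitted correction recovers the true fixation points almost exactly; with real data it would instead reduce, not eliminate, the per-user residual.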
Summary:
Omolab developed ReadTrack, an innovative AI-based eye tracking platform aimed at supporting early identification of reading difficulties in school-aged children. The system uses a custom three-camera device and deep learning models to predict a user’s gaze point across multiple screens, without needing visual markers. The project involved building a full pipeline with training and inference capabilities, applying a ResNet-18 model with both early and late fusion of the input streams. Using a dataset of over 180,000 annotated image triplets, the team achieved high precision in gaze estimation (~1–2 cm average error). The product is designed to support scalable deployment in schools and speech therapy settings, enhancing Omolab’s existing assistive solutions. The work was supported by the StairwAI ecosystem, which provided cloud GPU access and AI expertise.

