With their rapid growth in recent years, Massive Open Online Courses (MOOCs) present both opportunities and obstacles to learning at scale. On one hand, MOOCs give learners access to diverse, high-quality learning materials at low cost, and allow them "to control where, what, how and with whom they learn" (Kop et al. 2011). As a result, there were around 16.8 million registered MOOC learners by the end of 2014 (Fisher 2012). On the other hand, educators and researchers have raised concerns about low completion rates (10% in Fowler 2013; less than 7% in Parr 2013), frequent in-session interruptions (Fowler 2013), and the lack of interaction among students and instructors. In current MOOCs, pre-recorded lecture videos, split into 3–15 minute segments, are the dominant format for knowledge dissemination. In fact, major MOOC providers such as Coursera, edX, and Udacity have released mobile apps that allow learners to consume video materials "on the move".

Unfortunately, MOOCs today face at least three major challenges. First, learners are more prone to "mind wandering" (MW, or zoning out) in non-classroom environments (Morris et al. 2014), in part because of external distractions and the lack of sustained motivation when studying alone. Second, the current design of MOOCs is primarily uni-directional, i.e., from instructors to students. Although feedback forms and learner activity logs (e.g., log-in frequency, in-page dwell time, click-through rates) can be used to infer learning efficacy (Killingsworth et al. 2011), such measurements capture learners' cognitive states only indirectly. As a result, instructors have little information on how well lectures are received by learners. Finally, there is little personalization of instruction. It is hard for MOOC instructors to tailor learning materials to individual learners' needs and learning processes: unlike in traditional classrooms, instructors can no longer rely on facial cues and in-class activities to spot learners who are struggling or mind wandering.


Fig. 1 AttentiveLearner uses the back camera as both a video play control channel in MOOC and an implicit heart rate sensing channel in learning.
In response to these challenges, we propose AttentiveLearner (Fig. 1), an intelligent mobile learning system that supports attentive, bi-directional learning on unmodified mobile phones. AttentiveLearner uses on-lens finger gestures as an intuitive control mechanism for video playback (i.e., covering and holding the camera lens plays an instructional video, and uncovering the lens pauses it; Fig. 1 right). More importantly, AttentiveLearner implicitly extracts learners' heart rates and infers "zoning out" events by analyzing fingertip transparency changes captured by the built-in camera. With this MW information, AttentiveLearner has the potential to enable adaptive tutoring features on today's mobile phones (e.g., alerting learners when they zone out, or providing more relevant review exercises). AttentiveLearner can also help instructors improve their syllabi and teaching style by providing an aggregated timeline view of learners' attention levels synchronized with the learning material.
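The on-lens play/pause gesture can be illustrated with a simple heuristic: when a fingertip covers the lens under the flash, frames become dominated by red (light transmitted through skin) and nearly uniform. The sketch below is only an illustration of this idea, not AttentiveLearner's actual detector; the function names and thresholds are assumptions chosen for clarity.

```python
import numpy as np

def lens_covered(frame: np.ndarray,
                 red_ratio_threshold: float = 0.5,
                 uniformity_threshold: float = 30.0) -> bool:
    """Heuristically decide whether a fingertip covers the lens.

    `frame` is an H x W x 3 uint8 RGB image. A covered frame is
    dominated by red and spatially near-uniform; the thresholds here
    are illustrative, not AttentiveLearner's tuned values.
    """
    mean_rgb = frame.reshape(-1, 3).mean(axis=0)
    red_ratio = mean_rgb[0] / (mean_rgb.sum() + 1e-9)
    # Per-channel spatial standard deviation, averaged over channels:
    # low values mean the frame is nearly uniform (lens blocked).
    uniformity = frame.astype(np.float64).std(axis=(0, 1)).mean()
    return red_ratio > red_ratio_threshold and uniformity < uniformity_threshold

class PlaybackController:
    """Cover-to-play, uncover-to-pause control loop (hypothetical)."""
    def __init__(self) -> None:
        self.playing = False

    def on_frame(self, frame: np.ndarray) -> None:
        # Each incoming camera frame updates the playback state.
        self.playing = lens_covered(frame)
```

In practice such a detector would run on every preview frame delivered by the phone's camera API, with hysteresis to avoid flicker at the decision boundary.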



Fig. 2 AttentiveLearner’s course enrollment and navigation interface.

AttentiveLearner can be fully integrated into a MOOC platform. A learner registers and logs in to the platform (Fig. 2 a), then browses the courses s/he has registered for (Fig. 2 b). To watch a tutorial video, the learner simply selects it on the mobile phone (Fig. 2 c). AttentiveLearner's video GUI is intuitive and easy to use: covering the back camera lens plays the video; uncovering the lens pauses it. More importantly, while the learner covers the back camera lens to watch a video, his/her PPG signal is implicitly recorded and converted into features used to infer the learner's cognitive states (Fig. 3). Several widgets accompany AttentiveLearner. The Camera View widget shows the back camera lens' view in real time; e.g., the red color in Fig. 3 is the actual skin color of the covering fingertip under the phone's flash light. The Attentive Indicator widget shows the lens-covering status, which directly determines whether the physiological signal is being recorded (detailed in Fig. 3 left). The PPG Signal indicator shows the real-time PPG signal recorded implicitly from the covering fingertip. Last but not least, the Extracted Features widget shows, in real time, the extracted features used to infer the learner's cognitive states. With these functionalities, AttentiveLearner can implicitly infer a learner's cognitive states while s/he watches tutorial videos, providing a bi-directional, attentive MOOC learning environment on unmodified smartphones without dedicated hardware.
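The core of the implicit sensing pipeline is turning the fingertip's per-frame red intensity into a heart-rate estimate: each cardiac pulse modulates the light transmitted through the skin, producing a periodic PPG waveform. The sketch below shows one minimal way to do this (detrend, then count spaced local maxima); it is a simplified stand-in for the more robust models used in this line of work (e.g., BayesHeart's probabilistic approach), and its parameter names and thresholds are illustrative assumptions.

```python
import numpy as np

def estimate_heart_rate(red_means: np.ndarray, fps: float = 30.0) -> float:
    """Estimate heart rate (BPM) from a PPG trace.

    `red_means` holds the mean red-channel intensity of each camera
    frame while the fingertip covers the lens. This is a simple
    peak-counting sketch, not AttentiveLearner's actual estimator.
    """
    x = red_means - red_means.mean()      # remove the DC offset
    min_gap = int(fps * 0.33)             # refractory period: caps at ~180 BPM
    peaks, last = [], -min_gap
    for i in range(1, len(x) - 1):
        # A peak is a positive local maximum far enough from the last one.
        if x[i] > 0 and x[i] >= x[i - 1] and x[i] > x[i + 1] and i - last >= min_gap:
            peaks.append(i)
            last = i
    if len(peaks) < 2:
        return 0.0                        # not enough beats to estimate
    mean_interval = np.diff(peaks).mean() / fps   # seconds per beat
    return 60.0 / mean_interval
```

Features such as mean heart rate, beat-to-beat variability, and their changes over a video segment could then be fed to a classifier to flag likely mind-wandering episodes.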


Fig. 3 Attentive Widget and indicators of AttentiveLearner during MOOC learning (some visual analytics can be turned off). 


  • Xiang Xiao, Jingtao Wang, Understanding and Detecting Divided Attention in Mobile MOOC Learning, Short Paper, Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2017), Denver, CO, May 6–11, 2017. (to appear) | (press release)
  • Phuong Pham, Jingtao Wang, Understanding Emotional Responses to Mobile Video Advertisements via Physiological Signal Sensing and Facial Expression Analysis, Proceedings of the 22nd ACM Conference on Intelligent User Interfaces (IUI 2017), Limassol, Cyprus, March 13–16, 2017. (to appear)
  • Phuong Pham, Jingtao Wang, Adaptive Review for Mobile MOOC Learning via Implicit Physiological Signal Sensing, Proceedings of the ACM International Conference on Multimodal Interaction (ICMI 2016), Tokyo, Japan, November 12–16, 2016. (pdf) | Best Student Paper Award
  • Xiang Xiao, Jingtao Wang, Context and Cognitive-State Triggered Interventions for Mobile MOOC Learning, Proceedings of the ACM International Conference on Multimodal Interaction (ICMI 2016), Tokyo, Japan, November 12–16, 2016. (pdf)
  • Phuong Pham, Jingtao Wang, AttentiveVideo: Quantifying Emotional Responses to Mobile Video Advertisements, Demo, ACM International Conference on Multimodal Interaction (ICMI 2016), Tokyo, Japan, November 12–16, 2016. (pdf)
  • Xiang Xiao, Jingtao Wang, Towards Attentive, Bi-directional MOOC Learning on Mobile Devices, Proceedings of the ACM International Conference on Multimodal Interaction (ICMI 2015), Seattle, WA, November 9–13, 2015. (pdf) | Best Paper Nomination
  • Xiang Xiao, Phuong Pham, Jingtao Wang, AttentiveLearner: Adaptive Mobile MOOC Learning via Implicit Cognitive States Inference, Demo, ACM International Conference on Multimodal Interaction (ICMI 2015), Seattle, WA, November 9–13, 2015. (pdf)
  • Phuong Pham, Jingtao Wang, AttentiveLearner: Improving Mobile MOOC Learning via Implicit Heart Rate Tracking, Proceedings of the 17th International Conference on Artificial Intelligence in Education (AIED 2015), Madrid, Spain, June 22–26, 2015. (pdf)
  • Xiangmin Fan, Jingtao Wang, BayesHeart: A Probabilistic Approach for Robust, Low-Latency Heart Rate Monitoring on Camera Phones, Proceedings of the 20th ACM Conference on Intelligent User Interfaces (IUI 2015), Atlanta, GA, March 29–April 1, 2015. (pdf)