Taking a multidisciplinary, evidence-based approach to modeling video difficulty, this study investigates the impact of multimodal complexity on language learners' self-ratings of video difficulty, while accounting for the effects of learner differences and video production styles. A set of 320 instructional videos from the Corpus of Second Language Video Complexity was analyzed using natural language processing and computer vision algorithms to extract and compute a wide range of multimodal complexity indices. The results of a linear mixed-effects model demonstrated that pitch variation and academic spoken formulaic sequences facilitated viewing comprehension, whereas infrequent words, image clutter, and the numbers of visual objects, salient objects, visual texts, shots, and moving objects all impeded it. The study concludes with practical implications for English as a Foreign Language teachers and practitioners, alongside an agenda for future research in this area.