Characterising and exploiting evolving human visual interest in videos

29 October 2014

Time: 14:00
Location: Meeting Room Ofek, Polo Scientifico e Tecnologico "Fabio Ferrari" (Edificio Povo 1), via Sommarive 5 - Povo, Trento

Speaker:

  • Harish Katti, Indian Institute of Science 

Abstract
Regions in video streams that attract human interest contribute significantly to human understanding of the video. Predicting salient and informative Regions of Interest (ROIs) from a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions, and smart insertion of dialog (closed-caption text) into the video stream, can be significantly improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency in yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical and algorithmic method, gaze buffering, is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on-the-fly using content-aware video retargeting, making the videos suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on-the-fly in the video stream. Our method ensures that dialog captions are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalisation of the application to suit the biases and preferences of individual users.

Recently published in ACM Transactions on Multimedia

About the Speaker
Harish Katti is a post-doctoral researcher in the vision research group at the Center for Neuroscience, Indian Institute of Science. His interests are at the intersection of human cognition and visual media. He has a bachelor's degree in computer science from Karnatak University, a master's degree in bio-medical engineering from IIT Bombay and a PhD from the National University of Singapore.

Contact Person Regarding this Talk:
Nicu Sebe, sebe [at] disi.unitn.it