Toward Automated Understanding of Instructional Videos

April 10, 2017

Time: April 10, 2017, 10:30 am
Location: Room Garda, Polo Scientifico e Tecnologico "Fabio Ferrari", via Sommarive 5, Povo (Trento)

Speaker

Prof. Jason Corso, University of Michigan

Abstract

Complex human activities involve many sub-activities, which in turn have further sub-activities, and so on.  Understanding the sequence of these steps is critical to understanding long-term complex activities, like those involved in cooking, healthcare, interaction, etc.  However, manually annotating these sub-activities is cumbersome and error-prone.  We hence propose a deep learning model that takes a first step toward understanding such complex human activities.  Our first model on this hard problem seeks to automatically discover the temporal segments in each video.  We apply this model to instructional cooking videos that are unstructured, untrimmed and from the web.

We experiment with a new version of our YouCook dataset that contains twenty-five samples of more than 100 different cooking recipes, fully annotated with natural language recipes.  Time permitting, I will discuss the incorporation of language models both in guidance for the process learning and for the description of the steps.  I will also discuss related work that incorporates human input as side information during inference on the problem of pose estimation.

About the Speaker

Prof. Corso is an associate professor of Electrical Engineering and Computer Science at the University of Michigan.  He received his PhD and MSE degrees at The Johns Hopkins University in 2005 and 2002, respectively, and the BS degree with honors from Loyola College in Maryland in 2000, all in Computer Science.  He spent two years as a post-doctoral fellow at the University of California, Los Angeles.

From 2007 to 2014 he was a member of the Computer Science and Engineering faculty at SUNY Buffalo.  He is the recipient of the Google Faculty Research Award (2015), the Army Research Office Young Investigator Award (2010), the NSF CAREER Award (2009), the SUNY Buffalo Young Investigator Award (2011), and the Link Foundation Fellowship in Advanced Simulation and Training (2003), and he was a member of the 2009 DARPA Computer Science Study Group.  Corso has authored more than one hundred peer-reviewed papers on topics of his research interest, including computer vision, robot perception, data science, and medical imaging.  He is a member of the AAAI, ACM, and MAA and a senior member of the IEEE.

Contact person regarding this talk: Nicu Sebe, sebe [at] disi.unitn.it