Breaking the Natural Language Understanding pipeline

April 1, 2019

Date & Time: April 1, 2019, 2:30 to 4:30 pm
Venue: Room A204, Via Sommarive 5 - Polo Ferrari 1 (Povo, TN)

Speaker

Prof. Frédéric Béchet, Aix Marseille University

Abstract

Natural Language Understanding (NLU) is the process of producing semantic interpretations from words and other linguistic events that are automatically detected in a text document or a speech signal. The semantic model used to express the interpretation of a sentence or an utterance can be specific to a given application (for example, an intent+slot/value model for a spoken dialog system) or can be a general-purpose semantic model, such as FrameNet or Abstract Meaning Representation (AMR).

For such models, the standard linguistic processing pipeline consists of a chain of sequential processes such as Part-of-Speech tagging, chunking, Named-Entity (NE) recognition, syntactic parsing and, finally, semantic analysis. This architecture is clearly sub-optimal, as each error at a given level can lead to more errors at the next level, following a "snowball" effect. This phenomenon is particularly critical when processing speech transcriptions, which are prone to errors.
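As a rough illustration of this compounding, the sketch below multiplies per-stage accuracies to estimate how often a whole pipeline is correct. The accuracy figures are hypothetical, chosen only for the example, and the calculation assumes that errors at each stage are independent. That assumption is a simplification: it does not model one stage's error causing errors downstream, which is the effect described above.

```python
from math import prod

# Hypothetical per-stage accuracies, for illustration only.
stage_accuracy = {
    "pos_tagging": 0.97,
    "chunking": 0.95,
    "ne_recognition": 0.92,
    "syntactic_parsing": 0.90,
    "semantic_analysis": 0.88,
}


def pipeline_accuracy(accuracies):
    """Probability that every stage is correct, assuming independent errors."""
    return prod(accuracies)


overall = pipeline_accuracy(stage_accuracy.values())
print(f"Estimated end-to-end accuracy: {overall:.3f}")
```

Even under these simplified assumptions, the end-to-end figure is lower than that of the weakest single stage.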

In the last few years, approaches based on continuous vector space representations of words and Deep Neural Networks (DNN) have been proposed to unify several Natural Language Processing (NLP) tasks into a single model that can be optimized selectively for the target application. This effort to "break the pipeline" is one of the very interesting properties of DNN methods, leading to better performance and more robustness in the parsing process.

Another way to "break the pipeline" is to change the parsing paradigm. Instead of processing each sentence or utterance as a whole, starting only when the whole sequence of words is known, transition-based parsing methods take a stream of words as input. This breaks the pipeline even further by removing the need for sentence segmentation prior to parsing. Several NLP tasks can be performed at the same time in a transition-based parser, and they can also be implemented with DNN models.
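The stream-oriented behaviour can be sketched with a minimal arc-standard style parser that consumes words one at a time and emits dependency arcs as soon as they can be built. This is only an illustrative sketch, not the speaker's system: the lexicon TAGS and the rule-based toy_policy are hypothetical stand-ins for a learned classifier (for example a DNN) that would choose the next transition.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical toy lexicon; a real system would predict tags itself.
TAGS: Dict[str, str] = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "chased": "VERB",
}


def toy_policy(stack: List[str]) -> str:
    """Pick the next transition from the top two stack items (toy rules)."""
    if len(stack) < 2:
        return "SHIFT"
    pair = (TAGS.get(stack[-2]), TAGS.get(stack[-1]))
    if pair in {("DET", "NOUN"), ("NOUN", "VERB")}:
        return "LEFT_ARC"   # top of the stack is the head of the item below it
    if pair == ("VERB", "NOUN"):
        return "RIGHT_ARC"  # item below the top is the head of the top
    return "SHIFT"


def parse_stream(words: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (head, dependent) arcs while reading words one at a time."""
    stack: List[str] = []
    for word in words:
        stack.append(word)  # SHIFT: push the incoming word
        while True:
            action = toy_policy(stack)
            if action == "LEFT_ARC":
                dependent = stack.pop(-2)
                yield (stack[-1], dependent)
            elif action == "RIGHT_ARC":
                dependent = stack.pop()
                yield (stack[-1], dependent)
            else:
                break  # wait for the next word in the stream


arcs = list(parse_stream(iter("the dog chased a cat".split())))
print(arcs)
```

The parser never sees the full word sequence in advance: each arc is produced from the stack contents alone, which is why no prior segmentation step is required in this setup.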

This talk will briefly introduce the different tasks involved in a standard NLU pipeline, then present two ways of "breaking" this pipeline: first by means of multi-task DNN methods, then through the transition-based parsing paradigm.

About the Speaker

Frédéric Béchet is a researcher in the field of Speech and Natural Language Processing. His research activities are mainly focused on Spoken Language Understanding for both Spoken Dialogue Systems and Speech Mining applications. After studying Computer Science at the University of Marseille, he obtained his PhD in Computer Science in 1994 from the University of Avignon, France. Since then he has worked at the Ludwig Maximilian University in Munich, as an Assistant Professor at the University of Avignon, and as an invited professor at AT&T Research Shannon Lab in Florham Park, New Jersey.

Frédéric Béchet is currently a full Professor of Computer Science at Aix Marseille University, and a member of the Natural Language Processing research group of the Laboratoire Informatique et Systèmes - LIS UMR 7020.

Frédéric Béchet is the author or co-author of over 100 refereed papers in journals and international conferences and holds two patents. He has been an Associate Editor for IEEE Signal Processing Letters since 2012, has served on the reviewing committees of several international conferences (ICASSP, Interspeech, ASRU, HLT, EMNLP) and has been an invited reviewer for several journals, including Speech Communication, IEEE Signal Processing Letters, IEEE Transactions on Speech and Audio Processing, and Traitement Automatique des Langues.

Frédéric Béchet was an elected member of the IEEE Speech and Language Processing Technical Committee and is currently vice-president of the board of the French Natural Language Processing association ATALA.

Frédéric Béchet has been involved in many French and European research programs in the fields of Speech Processing and Spoken Dialog Systems: FP5 SMADA STREP, FP6 LUNA STREP, FP6 PASCAL NoE, ANR EPAC, ANR SEQUOIA, ANR EDYLEX, ANR DECODA, ANR PERCOL, ANR ASFALDA, ANR ORFEO, ANR DATCHA. He was the Coordinator of several French programs such as PERCOL, DECODA and DATCHA, as well as Principal Investigator for Aix Marseille University for a US DARPA-funded project (BOLT 2012-2015).

In terms of managerial activities, Frédéric Béchet is currently vice-director of the Laboratoire Informatique et Systèmes - LIS UMR 7020.

Contact: giuseppe.riccardi [at] unitn.it (Giuseppe Riccardi)