Structural Kernels and Neural Network Models for Question Answering Systems

PhD candidate Massimo Nicosia

20 aprile 2018

Time: h 09:30 am
Location: Room Ofek, Polo Ferrari 1 - Via Sommarive 5, Povo (TN)

PhD Candidate

Massimo Nicosia

Abstract of Dissertation

Tree kernels and neural networks are powerful machine learning models for extracting patterns from data. Tree kernels compute the similarity between two tree-structured text representations that may incorporate syntactic and semantic information. Neural networks map words into informative embeddings, and learn complex non-linear decision functions by applying a number of transformations to the input. Joining the two approaches is an exciting research direction. In this work, which is set in a Question Answering (QA) context, we apply the individual models to classification and ranking tasks. More importantly, we explore the intersection of tree kernels and neural networks, with the goal of developing more accurate models. Initially, we focus on a challenging QA task, the resolution of Crossword Puzzles (CPs), and improve an automatic CP solver by tackling two problems: (i) answering crossword clues by reranking snippets from a search engine, and (ii) clue paraphrasing, which is extremely useful for finding clues with the same answers. We apply reranking models based on syntactic structures, and therefore tree kernels, to increase the accuracy and speed of the solver. In addition, we design and evaluate a composite kernel that combines a kernel over structures, and a kernel on neural network induced representations. Going beyond the neural feature vector approach, we develop a structural kernel that exploits a deep siamese network for evaluating the similarity between words. We assess the resulting model on two classification tasks: question classification and sentiment analysis. To conclude, we study QA models that establish links between question and candidate answer passages using semantic information. First, we present our tree kernel model for answer sentence selection, which captures relations between important question words and entities in the answer. Then, we build a neural network model that can be trained to extract semantic features from text, and eventually establish links between text pairs. We show that such network is able to better model the notion of question-answer relatedness on several QA datasets, compared to the tree kernel model.