QACNN Experiment Results

MM 0606/2018


Implementation details

In the preprocessing step, pre-trained GloVe vectors are used for the word embeddings, and they are not updated during training.
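To make this concrete, here is a minimal PyTorch sketch (not the original implementation) of loading pre-trained GloVe vectors into a frozen embedding layer; the vocabulary size, embedding dimension, and the `glove_matrix` placeholder are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Assumed sizes for illustration; the real values depend on the GloVe release used.
vocab_size, embed_dim = 50000, 300

# Placeholder for the real GloVe weight matrix, shape (vocab_size, embed_dim).
glove_matrix = torch.randn(vocab_size, embed_dim)

# freeze=True keeps the embeddings fixed, so they are not updated during training.
embedding = nn.Embedding.from_pretrained(glove_matrix, freeze=True)

word_ids = torch.tensor([[1, 5, 42]])     # a batch of word indices
vectors = embedding(word_ids)             # shape: (1, 3, 300); no gradients flow here
```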

The number of sentences in each passage is padded to 101.

The number of words in each sentence is padded to 100.

The number of words in each query and choice is padded to 50.
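A minimal sketch of this padding scheme, assuming word-index inputs and a padding index of 0 (the function names are illustrative, not from the original code):

```python
import numpy as np

MAX_SENTS, MAX_WORDS, MAX_QUERY = 101, 100, 50
PAD_ID = 0  # assumed padding index

def pad_words(word_ids, max_len):
    """Truncate or right-pad a list of word indices to a fixed length."""
    word_ids = list(word_ids)[:max_len]
    return word_ids + [PAD_ID] * (max_len - len(word_ids))

def pad_passage(passage):
    """Pad a passage (a list of sentences, each a list of word indices)
    to a fixed (101, 100) integer matrix."""
    sents = [pad_words(s, MAX_WORDS) for s in passage[:MAX_SENTS]]
    sents += [[PAD_ID] * MAX_WORDS] * (MAX_SENTS - len(sents))
    return np.array(sents, dtype=np.int64)

def pad_query_or_choice(seq):
    """Pad a query or a candidate answer to 50 words."""
    return np.array(pad_words(seq, MAX_QUERY), dtype=np.int64)
```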

All of the CNN kernels $\bar{\bar{\bar{W}}}_1^A$, $\bar{\bar{\bar{W}}}_2^A$, $\bar{\bar{\bar{W}}}_1^R$, $\bar{\bar{\bar{W}}}_2^R$ use three different kernel widths $d \in \{1, 3, 5\}$, and each width uses the same number of kernels, $l = 128$.
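As an illustration only, one way to realize such a kernel bank in PyTorch is three parallel 1-D convolutions, one per kernel width, each with 128 output channels; the input layout and padding choices below are assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class ConvBank(nn.Module):
    """One bank of CNN kernels: widths 1, 3, 5, each with 128 feature maps."""
    def __init__(self, in_dim, widths=(1, 3, 5), n_kernels=128):
        super().__init__()
        # "same" padding so every width produces feature maps of the input length
        self.convs = nn.ModuleList(
            nn.Conv1d(in_dim, n_kernels, kernel_size=w, padding=w // 2) for w in widths
        )

    def forward(self, x):                     # x: (batch, in_dim, seq_len)
        return [torch.relu(conv(x)) for conv in self.convs]
```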

Dropout is applied in each CNN layer with a dropout rate of 0.8. The Adam optimizer is used to train the model with an initial learning rate of 0.001.
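A minimal sketch of this training configuration (the model below is a placeholder; only the dropout rate of 0.8 and the Adam learning rate of 0.001 come from the text):

```python
import torch
import torch.nn as nn

model = nn.Sequential(                        # placeholder CNN layer for illustration
    nn.Conv1d(300, 128, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Dropout(p=0.8),                        # dropout applied to the CNN layer output
)

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```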

MovieQA Results

The MovieQA dataset targets automatic story comprehension from both video and text.

The dataset consists of almost 15,000 multiple-choice question-answer pairs. Diverse sources of information in this dataset, such as plots, scripts, subtitles, and video captions, can be used to infer the answers.

Only the plot synopses are used in this work.

The MovieQA dataset is well suited for evaluating QACNN because movie plots are much longer than the passages in typical reading comprehension tasks.

Each question comes with a set of five highly plausible choices, only one of which is correct.

In the MovieQA benchmark, there are 1,958 QA pairs in the validation set and 3,138 QA pairs in the test set.

An ensemble model is also used; it combines eight models trained in separate runs with identical architecture and hyper-parameters. On the validation set, the ensemble achieves 79.0% accuracy.

On the test set, the ensemble achieves 79.99% accuracy, which is the state of the art.

| Model | Dev set | Test set |
| --- | --- | --- |
| Cosine word2vec | 46.4 | 45.63 |
| Cosine TFIDF | 47.6 | 47.36 |
| SSCB Compare Aggregate | 48.5 | – |
| Compare Aggregate | 72.1 | 72.9 |
| QACNN | 77.6 | 75.84 |
| Convnet Fusion | – | 77.63 |
| QACNN (ensemble) | 79.0 | 79.99 |
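A minimal sketch of the ensembling step described above, assuming eight trained models that each return per-choice logits (the call signature is an assumption for illustration):

```python
import torch

def ensemble_predict(models, passage, query, choices):
    """Average the answer probabilities of independently trained models
    (identical architecture and hyper-parameters) and pick the best choice."""
    probs = []
    with torch.no_grad():
        for model in models:
            logits = model(passage, query, choices)     # shape: (num_choices,)
            probs.append(torch.softmax(logits, dim=-1))
    avg_probs = torch.stack(probs).mean(dim=0)
    return int(avg_probs.argmax())
```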

MCTest Results
