QACNN Discussion
MM 0606/2018
QACNN focuses on the multiple-choice QA task.
QACNN matches the passage against the candidate choices based on query information.
One of the most important ideas in QACNN is the two-stage attention map.
The first attention map is at the word level, representing the importance of each word in the passage to a given query; the second attention map is at the sentence level, representing the importance of each sentence in the passage to a given query.
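To make the two stages concrete, here is a minimal NumPy sketch. The dot-product similarity and max-pooling used here are simplifying assumptions; the actual QACNN derives these maps with convolutional layers, so treat the shapes and pooling as illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: S sentences of W words, a query of Q words, D-dim embeddings.
S, W, Q, D = 4, 10, 6, 32
rng = np.random.default_rng(0)
passage = rng.normal(size=(S, W, D))  # (sentence, word, dim)
query = rng.normal(size=(Q, D))       # (word, dim)

# 3D passage-query similarity map: one score per (sentence, passage word, query word).
sim = np.einsum('swd,qd->swq', passage, query)

# Stage 1 -- word-level attention: importance of each passage word to the query
# (best match over query words, normalized over the words of each sentence).
word_attn = softmax(sim.max(axis=-1), axis=-1)        # (S, W)

# Stage 2 -- sentence-level attention: importance of each sentence to the query
# (best word match per sentence, normalized over sentences).
sent_attn = softmax(sim.max(axis=(-1, -2)), axis=-1)  # (S,)
```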
Two-stage Effect Experiment
The difference between a one-stage QACNN and the two-stage QACNN is examined.
For the one-stage QACNN, the passage is not split into sentences.
That is, the passage-query and passage-choice similarity maps are 2D rather than 3D.
The word-level passage representation is convolved to obtain the passage feature directly, without the second stage involved.
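For contrast, a sketch of the one-stage variant under the same assumed embeddings as above: with the passage flattened to a single word sequence, the similarity map loses its sentence axis, so there is nothing for a second, sentence-level stage to attend over.

```python
import numpy as np

S, W, Q, D = 4, 10, 6, 32
rng = np.random.default_rng(0)
flat_passage = rng.normal(size=(S * W, D))  # passage as one flat word sequence
query = rng.normal(size=(Q, D))

# 2D passage-query similarity map: (passage word, query word), no sentence axis.
sim_2d = flat_passage @ query.T             # shape (40, 6)
```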
The table below shows the results: the modified one-stage QACNN reaches 66.8% accuracy on the validation set, more than ten percentage points lower than 78.1%, the original QACNN's accuracy on the validation set.
| Model | Dev set (%) | Test set (%) |
|---|---|---|
| One-stage QACNN | 66.8 | |
| QACNN (without attention) | 69.6 | |
| QACNN (only word-level attention) | 72.5 | |
| QACNN (only sentence-level attention) | 75.1 | |
| QACNN (single) | 77.6 | 75.84 |
| QACNN (ensemble) | 79.0 | 79.99 |
Attention Effect Experiment
The goal is to validate the effect of query-based attention in QACNN.
Three variants are obtained by modifying the original QACNN layer, as described below (a toggle-style sketch follows the list):
1) For the first one, the QACNN layer is modified so that both the sentence-level attention map and the word-level attention map are removed. This modified model lacks query information.
Therefore, the final output representation and the query representation are concatenated together before the prediction layer.
The experimental result is shown in the table above; it is almost ten percentage points lower than the original model's.
2) For the second one, only the sentence-level attention is removed from the QACNN layer; the word-level attention is kept.
3) For the last one, instead of removing the sentence-level attention, the word-level attention is removed from QACNN.
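The sketch below expresses these three ablations as toggles over the two attention stages. The flag names and the uniform fallback weights are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def two_stage_attention(sim, use_word=True, use_sent=True):
    """Compute both attention maps from a (sentences, words, query_words)
    similarity map; a disabled stage falls back to uniform weights."""
    S, W, _ = sim.shape
    word_attn = softmax(sim.max(axis=-1), axis=-1) if use_word else np.full((S, W), 1.0 / W)
    sent_attn = softmax(sim.max(axis=(-1, -2)), axis=-1) if use_sent else np.full(S, 1.0 / S)
    return word_attn, sent_attn

sim = np.random.default_rng(0).normal(size=(4, 10, 6))
variants = {
    "without attention": (False, False),             # ablation 1
    "only word-level attention": (True, False),      # ablation 2
    "only sentence-level attention": (False, True),  # ablation 3
}
for name, (w, s) in variants.items():
    word_attn, sent_attn = two_stage_attention(sim, use_word=w, use_sent=s)
    print(name, word_attn.shape, sent_attn.shape)
```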
From the table, QACNN (only word-level attention) performs better than QACNN (without attention);
QACNN (only sentence-level attention) performs better than QACNN (only word-level attention).
The original QACNN, which contains both word-level and sentence-level attention, performs best of all.
Thus, both word-level attention and sentence-level attention contribute to QACNN's performance.
However, sentence-level attention seems to play the more important role.