Open Sourcing a Version of the DGCNN Reading Comprehension QA Model (Keras Version)

By Su Jianlin (苏剑林) | August 20, 2019

Last year I wrote "DGCNN: A CNN-based Reading Comprehension Question Answering Model", which introduced a simple, purely convolutional question-answering model. That version was implemented in TensorFlow and was not open-sourced. Over the past few days I took some time to reimplement it in Keras and decided to open-source it.

Model Overview

I will not repeat the basic introduction to DGCNN here. The model released here is not an exact copy of the earlier one; it includes several modifications, and only the changed parts are described below.

1. The model released here scores approximately 0.72 on the offline validation set (the previous model scored approximately 0.75);

2. This model uses characters as the basic unit, together with the "Mixed Character-Word Embedding" I explored earlier (the previous model was word-based);

3. This model removes all manual features (the previous model used 8 manual features);

4. This model removes Position Embeddings (the previous model concatenated Position Embeddings to the input);

5. The model architecture and training details have been slightly adjusted.

Among these, characters are used as the unit to make the model's labeling more flexible, since answer boundaries no longer depend on word segmentation and segmentation errors are avoided. Removing the manual features also makes the model more flexible and speeds up prediction. Position Embeddings were removed because several tests showed that they brought no significant improvement. Other adjustments include training with the recently proposed RAdam optimizer, and so on.
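The point about segmentation errors can be made concrete with a small sketch. It shows that an answer span can always be labeled exactly with character indices, while a word-level labeling can only represent it when the segmenter happens to cut at the answer's boundaries. The passage, the two segmentations, and all function names below are hypothetical illustrations and are not taken from the released code.

```python
def char_span(passage, answer):
    """Return (start, end) character indices of the first occurrence
    of `answer` in `passage` (end inclusive), or None if absent."""
    start = passage.find(answer)
    if start == -1:
        return None
    return start, start + len(answer) - 1


def word_boundaries(words):
    """Return the character offsets at which words start and end."""
    starts, ends, pos = set(), set(), 0
    for w in words:
        starts.add(pos)
        pos += len(w)
        ends.add(pos - 1)
    return starts, ends


def span_is_word_aligned(words, span):
    """True if the span can be expressed as a run of whole words."""
    starts, ends = word_boundaries(words)
    return span[0] in starts and span[1] in ends


passage = "南京市长江大桥"
answer = "长江大桥"
span = char_span(passage, answer)

good_segmentation = ["南京市", "长江大桥"]     # assumed correct segmentation
bad_segmentation = ["南京", "市长", "江大桥"]  # assumed segmentation error

print(span)
print(span_is_word_aligned(good_segmentation, span))
print(span_is_word_aligned(bad_segmentation, span))
```

With the erroneous segmentation, the answer starts in the middle of a word, so no sequence of word-level labels can mark it exactly; the character-level span is unaffected.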

This release does not chase a higher score; its only purpose is to provide a Keras version for everyone's reference. I believe there is still plenty of room for improvement, and interested readers can tune it themselves (both the code and the dataset have been released).
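For readers who want to experiment with the character-level input from point 2, here is a minimal NumPy sketch of one way a mixed character-word embedding can be formed. It assumes the mixing works by repeating each word's pretrained vector once per character of that word, projecting it to the character-embedding size, and adding it to the character embedding. The function names, the dimensions, and the random stand-in vectors are all hypothetical; the released code may differ in its details.

```python
import numpy as np


def word_index_per_char(words):
    """Repeat each word's position once per character, so the word
    sequence is stretched to the same length as the character sequence."""
    return [i for i, w in enumerate(words) for _ in w]


def mixed_embedding(words, char_table, word_table, projection):
    """Character embedding plus the projected vector of the enclosing word.

    char_table: dict char -> vector of size d_char (trainable in a real model)
    word_table: dict word -> vector of size d_word (pretrained, assumed fixed)
    projection: array of shape (d_word, d_char)
    """
    chars = "".join(words)
    char_vecs = np.stack([char_table[c] for c in chars])   # (n_chars, d_char)
    word_vecs = np.stack([word_table[w] for w in words])   # (n_words, d_word)
    repeated = word_vecs[word_index_per_char(words)]       # (n_chars, d_word)
    return char_vecs + repeated @ projection               # (n_chars, d_char)


# Random stand-ins for the embedding tables (illustration only).
rng = np.random.default_rng(0)
words = ["南京市", "长江大桥"]
d_char, d_word = 4, 6
char_table = {c: rng.normal(size=d_char) for c in "".join(words)}
word_table = {w: rng.normal(size=d_word) for w in words}
projection = rng.normal(size=(d_word, d_char))

out = mixed_embedding(words, char_table, word_table, projection)
print(out.shape)  # (7, 4): one mixed vector per character
```

The output has one vector per character, so the model keeps character-level labeling while still receiving word-level information from the pretrained vectors.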

Open Source Addresses

GitHub Address: https://github.com/bojone/dgcnn_for_reading_comprehension

(Running environment: Python 2.7 + TensorFlow 1.8 + Keras 2.2.4. Please do not contact me about environment setup issues, thank you!)

Word Vectors: https://pan.baidu.com/s/1YYE2T3f-lPyLBrJuUowAsA, Password: 5p0h

Dataset: https://pan.baidu.com/s/11C21BAupOpiYWoOx23J7Mg, Password: dh9w

(If releasing the dataset is inappropriate in any way, please let me know by email and I will remove it as soon as possible.)

Final Words

I hope you enjoy trying it out, and I look forward to further discussion~

Citation: Su Jianlin, Aug 2019, https://kexue.fm/archives/6906