PyTorch seq2seq chatbot
MM 0406 2018
Formosa Speech Grand Challenge
Fig. 1 File structure of the PyTorch chatbot code example.
Fig. 1 shows the file structure of the PyTorch chatbot code example.
README.md
Get started
Clone the repository
git clone https://github.com/ywk991112/pytorch-chatbot
Corpus
In the corpus file, each input-output sequence pair must be on adjacent lines. For example,
a1 I'll see you next time.
a2 Sure. Bye.
b1 How are you?
b2 Better than ever.
Here a1 and a2 form one sequence pair; b1 and b2 form another.
The corpus files should be placed under a path like
pytorch-chatbot/data/<corpus file name>
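The adjacent-line convention can be illustrated with a minimal sketch. This is not the repository's own loader; the function name `read_pairs` is hypothetical, and skipping blank lines is an assumption made for the example.

```python
from typing import Iterable, List, Tuple


def read_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Group adjacent non-empty lines into (input, output) pairs."""
    # Assumption: blank lines carry no meaning and are skipped.
    cleaned = [line.strip() for line in lines if line.strip()]
    if len(cleaned) % 2 != 0:
        raise ValueError("corpus must contain an even number of non-empty lines")
    # Lines 0 and 1 form the first pair, lines 2 and 3 the second, and so on.
    return [(cleaned[i], cleaned[i + 1]) for i in range(0, len(cleaned), 2)]


corpus = [
    "I'll see you next time.",
    "Sure. Bye.",
    "How are you?",
    "Better than ever.",
]
pairs = read_pairs(corpus)
```

Each tuple in `pairs` holds an input sequence followed by its reply, in the same order as the corpus file.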
Otherwise, the corpus file will be tracked by git.
Training
Training is started with the following command:
python3 main.py -tr <CORPUS_FILE_PATH> -la 1 -hi 512 -lr 0.0001 -it 50000 -b 64 -p 500 -s 1000
The argument values shown here are examples and can be changed as needed.
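To show how such a command line is split into values, here is a small `argparse` sketch. Only the short flags come from the command above. The long option names (`--layers`, `--hidden`, and so on) are guesses based on the abbreviations and typical values, not confirmed by the source, and `data/corpus.txt` is a placeholder path. Run `python3 main.py -h` for the authoritative descriptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Short flags match the training command; every long name below is a
    # hypothetical label chosen for readability, not the repository's own.
    parser = argparse.ArgumentParser(description="Illustrative mirror of the training flags")
    parser.add_argument("-tr", "--train", metavar="CORPUS_FILE_PATH")
    parser.add_argument("-l", "--load", metavar="MODEL_FILE_PATH")
    parser.add_argument("-la", "--layers", type=int)
    parser.add_argument("-hi", "--hidden", type=int)
    parser.add_argument("-lr", "--learning-rate", type=float)
    parser.add_argument("-it", "--iterations", type=int)
    parser.add_argument("-b", "--batch-size", type=int)
    parser.add_argument("-p", "--print-every", type=int)
    parser.add_argument("-s", "--save-every", type=int)
    return parser


command = "-tr data/corpus.txt -la 1 -hi 512 -lr 0.0001 -it 50000 -b 64 -p 500 -s 1000"
args = build_parser().parse_args(command.split())
```

After parsing, each value is available as a typed attribute, for example `args.hidden` as an integer and `args.learning_rate` as a float.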
If a saved model exists, training can be resumed from it with the following command:
python3 main.py -tr <CORPUS_FILE_PATH> -l <MODEL_FILE_PATH> -lr 0.0001 -it 50000 -b 64 -p 500 -s 1000
More options are listed by the following command:
python3 main.py -h
Testing
Models will be saved in pytorch-chatbot/save/model while training, and this can be changed in config.py.
The saved model can be evaluated with input sequences in the corpus.
python3 main.py -te <MODEL_FILE_PATH> -c <CORPUS_FILE_PATH>
To test the model with manually entered input sequences, add the -i flag:
python3 main.py -te <MODEL_FILE_PATH> -c <CORPUS_FILE_PATH> -i
Beam search with beam size k is enabled with the -be option (the -i flag is optional):
python3 main.py -te <MODEL_FILE_PATH> -c <CORPUS_FILE_PATH> -be k [-i]
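For readers unfamiliar with beam search, the idea behind the -be option can be sketched in plain Python. This is a generic illustration, not the repository's decoder: the `beam_search` function, the `toy_step` probability table, and the `<eos>` token name are all assumptions made for the example. At each step only the k highest-scoring partial replies are kept, scored by summed log-probability.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens, summed log-probability)


def beam_search(step: Callable[[Sequence[str]], Dict[str, float]],
                k: int, max_len: int, eos: str = "<eos>") -> List[Hypothesis]:
    """Return up to k hypotheses, best first."""
    beams: List[Hypothesis] = [([], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            # step() returns next-token probabilities (assumed non-zero).
            for token, prob in step(tokens).items():
                candidates.append((tokens + [token], score + math.log(prob)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:k]:  # prune to the k best
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return sorted(finished, key=lambda c: c[1], reverse=True)[:k]


def toy_step(prefix: Sequence[str]) -> Dict[str, float]:
    # Hypothetical stand-in for a trained decoder.
    table = {
        (): {"better": 0.6, "fine": 0.4},
        ("better",): {"than": 0.9, "<eos>": 0.1},
        ("better", "than"): {"ever": 0.8, "<eos>": 0.2},
        ("fine",): {"<eos>": 1.0},
    }
    return table.get(tuple(prefix), {"<eos>": 1.0})


results = beam_search(toy_step, k=2, max_len=5)
```

With k = 1 this reduces to greedy decoding; larger k explores more alternatives at the cost of more decoder calls per step.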