Neural Machine Translation (seq2seq)

Shrimp Chen 2018/04/17

Installing the Tutorial

To install this tutorial, we need to have TensorFlow installed on our system; this tutorial requires TensorFlow Nightly. To install TensorFlow, follow the instructions here.

Then, we can download the source code of this tutorial by running:

git clone https://github.com/tensorflow/nmt

Training - How to build our first NMT system

Let's first dive into the heart of building an NMT model with concrete code snippets, through which we will explain Figure 2 in more detail. We defer data preparation and the full code to later. This part refers to the file model.py.

At the bottom layer, the encoder and decoder RNNs receive as input the following: first, the source sentence, then a boundary marker "<s>" which indicates the transition from the encoding to the decoding mode, and the target sentence. For _training_, we will feed the system the following tensors, which are in time-major format and contain word indices:

  • encoder inputs [max_encoder_time, batch_size]: source input words.
  • decoder inputs [max_decoder_time, batch_size]: target input words.
  • decoder outputs [max_decoder_time, batch_size]: target output words; these are the decoder inputs shifted to the left by one time step with an end-of-sentence tag appended on the right.

Here, for efficiency, we train with multiple sentences (batch_size) at once. Testing is slightly different, so we will discuss it later.
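As a rough illustration (not the tutorial's exact code), these time-major tensors could be declared as int32 placeholders of word indices, with shapes mirroring the list above:

    import tensorflow as tf

    # Time-major tensors of word indices (illustrative placeholders).
    encoder_inputs  = tf.placeholder(tf.int32, shape=[None, None])  # [max_encoder_time, batch_size]
    decoder_inputs  = tf.placeholder(tf.int32, shape=[None, None])  # [max_decoder_time, batch_size]
    decoder_outputs = tf.placeholder(tf.int32, shape=[None, None])  # [max_decoder_time, batch_size]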

Embedding

Given the categorical nature of words, the model must first look up the source and target embeddings to retrieve the corresponding word representations. For this _embedding_ layer to work, a vocabulary is first chosen for each language. Usually, a vocabulary size V is selected, and only the V most frequent words are treated as unique. All other words are converted to an "unknown" token and all get the same embedding. The embedding weights, one set per language, are usually learned during training.

There is also a table here.
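In the original tutorial, the snippet at this point builds the source embedding matrix and looks up the word vectors; a minimal sketch along those lines (src_vocab_size and embedding_size are hyperparameters assumed to be chosen beforehand):

    # One embedding vector per source-vocabulary word.
    embedding_encoder = tf.get_variable(
        "embedding_encoder", [src_vocab_size, embedding_size])
    # Look up embeddings:
    #   encoder_inputs:  [max_encoder_time, batch_size]
    #   encoder_emb_inp: [max_encoder_time, batch_size, embedding_size]
    encoder_emb_inp = tf.nn.embedding_lookup(
        embedding_encoder, encoder_inputs)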

Similarly, we can build _embedding_decoder_ and _decoder_emb_inp_. Note that one can choose to initialize the embedding weights with pretrained word representations such as word2vec or GloVe vectors. In general, given a large amount of training data, we can learn these embeddings from scratch.
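A corresponding sketch for the target side, under the same assumptions (tgt_vocab_size is the target-language vocabulary size):

    # One embedding vector per target-vocabulary word.
    embedding_decoder = tf.get_variable(
        "embedding_decoder", [tgt_vocab_size, embedding_size])
    # decoder_emb_inp: [max_decoder_time, batch_size, embedding_size]
    decoder_emb_inp = tf.nn.embedding_lookup(
        embedding_decoder, decoder_inputs)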

[0] https://www.tensorflow.org/tutorials/seq2seq
