Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation
P. Bahar, C. Brix, H. Ney. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3009–3015. Association for Computational Linguistics.
We describe a novel network architecture: a two-dimensional LSTM that processes two separate sequences at the same time. By feeding in both the source sentence and the partial target hypothesis, the 2D-LSTM can internally attend to the relevant source positions. We are therefore able to use the final 2D-LSTM output directly to predict the next target token, without an explicit decoder.
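To make the grid structure concrete, the following is a minimal NumPy sketch of a generic 2D-LSTM over a source/target grid. It is an illustration under stated assumptions, not the paper's exact cell: the gate layout, the lambda gate interpolating the two predecessor cell states, and the choice of reading out the last source column per target row are all assumptions here; class and function names (`TwoDLSTM`, `cell`, `forward`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TwoDLSTM:
    """Illustrative 2D-LSTM grid (a sketch, not the paper's exact cell).

    The state at grid position (i, j) depends on the vertical neighbour
    (i-1, j), the horizontal neighbour (i, j-1), and the concatenated
    source/target embeddings at that position.
    """
    def __init__(self, d_in, d_hid, seed=0):
        rng = np.random.default_rng(seed)
        d = d_in + 2 * d_hid  # input plus two predecessor hidden states
        # one weight matrix and bias per gate:
        # input, forget, output, cell candidate, lambda (assumed layout)
        self.W = {g: rng.normal(0, 0.1, (d_hid, d)) for g in "ifocl"}
        self.b = {g: np.zeros(d_hid) for g in "ifocl"}
        self.d_hid = d_hid

    def cell(self, x, h_vert, c_vert, h_horiz, c_horiz):
        z = np.concatenate([x, h_vert, h_horiz])
        i = sigmoid(self.W["i"] @ z + self.b["i"])
        f = sigmoid(self.W["f"] @ z + self.b["f"])
        o = sigmoid(self.W["o"] @ z + self.b["o"])
        g = np.tanh(self.W["c"] @ z + self.b["c"])
        # lambda gate interpolates between the two predecessor cell states
        lam = sigmoid(self.W["l"] @ z + self.b["l"])
        c = f * (lam * c_vert + (1.0 - lam) * c_horiz) + i * g
        h = o * np.tanh(c)
        return h, c

    def forward(self, src, tgt):
        """src: (J, d_src), tgt: (I, d_tgt); d_in must equal d_src + d_tgt.

        Returns the hidden states of the last source column, one per target
        position: the state after reading the whole source for target row i
        can serve to predict token i+1, with no separate decoder.
        """
        I, J = len(tgt), len(src)
        H = np.zeros((I + 1, J + 1, self.d_hid))
        C = np.zeros((I + 1, J + 1, self.d_hid))
        for i in range(1, I + 1):
            for j in range(1, J + 1):
                x = np.concatenate([src[j - 1], tgt[i - 1]])
                H[i, j], C[i, j] = self.cell(
                    x, H[i - 1, j], C[i - 1, j], H[i, j - 1], C[i, j - 1])
        return H[1:, J]
```

Because every cell sees one source position and one target position, sweeping a full row over the source lets the network weigh source positions internally, which plays the role an explicit attention mechanism would otherwise fill.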