Exploring Pair-Wise NMT for Indian Languages

Published in International Conference on Natural Language Processing (ICON-2020), 2020

Recommended citation: K. Akella, S.H. Allu, S.S. Ragupathi, A. Singhal, Z. Khan, V.P. Namboodiri, C.V. Jawahar, "Exploring Pair-Wise NMT for Indian Languages", International Conference on Natural Language Processing (ICON-2020), short paper. https://arxiv.org/abs/2012.05786


In this paper, we address the task of improving pair-wise machine translation for specific low-resource Indian languages. Multilingual NMT models have proven reasonably effective on resource-poor languages. In this work, we show that their performance can be significantly improved through a filtered back-translation process followed by fine-tuning on the limited pair-wise language corpora. Our analysis suggests that this method can significantly improve a multilingual model's performance over its baseline, yielding state-of-the-art results for various Indian languages.
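The filtered back-translation step mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the filtering criterion, so this example assumes a round-trip check in which each synthetic source sentence is translated forward again and kept only if the result stays close to the original target sentence. The names `filtered_back_translation`, `similarity`, `tgt_to_src`, `src_to_tgt`, and the `threshold` value are all hypothetical, and the token-overlap score is only a stand-in for whatever quality measure a real pipeline would use.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

# A translator is any callable mapping one sentence to another.
# In practice this would wrap an NMT model (assumption, not from the paper).
Translator = Callable[[str], str]


def similarity(a: str, b: str) -> float:
    """Token-level similarity in [0, 1]; a stand-in for a real quality score."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def filtered_back_translation(
    target_monolingual: List[str],
    tgt_to_src: Translator,
    src_to_tgt: Translator,
    threshold: float = 0.8,  # hypothetical cut-off, chosen for illustration
) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs and drop low-quality ones."""
    kept: List[Tuple[str, str]] = []
    for target in target_monolingual:
        # Back-translate a monolingual target sentence into the source language.
        synthetic_source = tgt_to_src(target)
        # Assumed filter: translate forward again and compare with the original.
        round_trip = src_to_tgt(synthetic_source)
        if similarity(round_trip, target) >= threshold:
            kept.append((synthetic_source, target))
    return kept
```

The pairs that survive the filter would then be used as additional training data, with fine-tuning on the limited pair-wise corpus as a subsequent step.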
