Neural vs Statistical Machine Translation: Revisiting the Bangla-English Language Pair
Machine translation systems facilitate our communication and access to information, taking down language barriers. It is a well-researched area of Natural Language Processing(NLP), especially for resource-rich languages (e.g., language pairs in Europarl Parallel corpus). Besides these languages, there is also work on other language pairs including the Bangla-English language pair. In the current study, we aim to revisit both Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) approaches using well-known, publicly avail-able corpora for the Bangla-English (Bangla to English) language pair. We reported how the performance of the models differ based on the data and modeling techniques; consequently, we also compared the results obtained with Google’s machine translation system. Our findings, across different corpora, indicates that NMT based approaches outperform SMT systems. Our results also outperform existing baselines by a large margin.