- Using the arz-ara part of Corpus-26 (a parallel corpus of 2,000 sentences taken from the Basic Traveling Expression Corpus (BTEC)).
- Publication: Bouamor, Houda, Nizar Habash, Mohammad Salameh, Wajdi Zaghouani, Owen Rambow, Dana Abdulrahim, Ossama Obeid, Salam Khalifa, Fadhl Eryani, Alexander Erdmann and Kemal Oflazer. The MADAR Arabic Dialect Corpus and Lexicon. In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 2018.
- A parallel corpus of arz-ara-apc/ajp (2,994 sentences). The data was manually translated by professional translators. Sentences were collected from Wikipedia and movie subtitles.
- Publication: Wael Abid. 2020. The SADID evaluation datasets for low-resource spoken language machine translation of Arabic dialects. In Proceedings of the 28th International Conference on Computational Linguistics, pages 6030–6043, Barcelona, Spain.
- An attempt to produce such a translation pair using rule-based machine translation.
- The system is not currently available online.
- Publication: Wael Salloum and Nizar Habash. 2012. Elissa: A dialectal to Standard Arabic machine translation system. In Proceedings of COLING 2012: Demonstration Papers, pages 385–392, Mumbai, India. The COLING 2012 Organizing Committee.