Wikipedia dumps

From Apertium
Revision as of 07:41, 15 May 2018 by Hectoralos (+ wikiextractor)

Wikipedia dumps are useful for quickly getting a corpus. They are also the best corpora for making your language pair useful for Wikipedia's Content Translation tool :-)

You download them from


There are several tools for turning dumps into useful plaintext, e.g.