Turkish and Kyrgyz/Kymorph article
Revision as of 07:29, 5 October 2011 by Firespeaker (talk | contribs)
Outline
Morphotactica
Morphophonologia
Corpora
- Which corpora to use?
  - Wikipedia
  - Azattyk
- Concerns
  - Wikipedia is messy; should we have an automated cleaning process or get stats as-is?
    - Use aq-wikicrp; this way it is reproducible.
Numbers
| | wikipedia | azattyk |
|---|---|---|
| num words | 271005 | |
| xml file size | >3.8MB | |
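
A word count like the one in the table depends on how "word" is defined, so it may help to pin the definition down in a script. The following is only a minimal sketch, not the method used to obtain the figure above: the function name `count_words`, the tokenisation rule (a word is a maximal run of letters), and the sample XML layout are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumed tokenisation: a "word" is a maximal run of letters.
# The pattern is Unicode-aware, so Cyrillic and Turkish letters count,
# while digits, underscores and punctuation do not.
WORD_RE = re.compile(r"[^\W\d_]+")


def count_words(xml_source: str) -> int:
    """Count word tokens in all text nodes of an XML document string."""
    root = ET.fromstring(xml_source)
    # itertext() walks every text node (element text and tails) in order.
    return sum(len(WORD_RE.findall(text)) for text in root.itertext())


# Hypothetical miniature corpus; a real dump would have a different layout.
sample = "<corpus><doc>Бишкек шаары</doc><doc>Bişkek şehri 1878</doc></corpus>"
print(count_words(sample))
```

Running the same script over both corpora would keep the two columns comparable, since both counts would then follow one tokenisation rule.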