Turkish and Kyrgyz/Kymorph article
		Revision as of 07:29, 5 October 2011 by Firespeaker (talk | contribs)
Outline
Morphotactica
Morphophonologia
Corpora
- Which corpora to use?
 - Wikipedia
 - Azattyk
- Concerns
 - Wikipedia is messy; should we have an automated cleaning process or get stats as-is?
  - Use aq-wikicrp; this way it is reproducible.
 
Numbers
| | wikipedia | azattyk |
|---|---|---|
| num words | 271005 | |
| xml file size | >3.8MB | |
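
The "num words" figure depends on what counts as a word, so it may help to pin down one definition and apply it to both corpora. The following is a minimal sketch, not the method used to get the number above: it assumes the corpus has already been extracted to plain UTF-8 text, and it assumes a word is a run of letters, optionally joined by an apostrophe or hyphen. The names `WORD_RE`, `count_words` and `count_words_in_file` are hypothetical.

```python
import re

# Assumption: a "word" is a maximal run of Unicode letters, with internal
# apostrophes or hyphens allowed (e.g. Turkish "İstanbul'da").
# Tokens made only of digits are not counted.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def count_words(text: str) -> int:
    """Count word tokens in a string under the definition above."""
    return sum(1 for _ in WORD_RE.finditer(text))


def count_words_in_file(path: str, encoding: str = "utf-8") -> int:
    """Count word tokens in a plain-text corpus file, line by line."""
    total = 0
    with open(path, encoding=encoding) as handle:
        for line in handle:
            total += count_words(line)
    return total
```

Running the same function over the Wikipedia and Azattyk text would make the two columns comparable, whichever cleaning step is chosen beforehand.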