Automatic text normalisation
Revision as of 12:48, 23 March 2014 by Francis Tyers
General ideas
- Diacritic restoration
- Reduplicated character reduction
  - How can language-specific settings be learned? In English, for example, only certain consonants and vowels can be doubled. Can the permitted doublings be learned by looking at a corpus?
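One possible way to approach the corpus question for reduplicated character reduction is sketched below. It is only an illustration under stated assumptions, not a settled design: the function names `learn_doubles` and `reduce_repeats`, the `min_count` threshold, and the toy word list are all hypothetical. The idea is to record which characters appear doubled in clean corpus text, then collapse runs of three or more identical characters to two when the character is known to double, and to one otherwise.

```python
import re
from collections import Counter


def learn_doubles(corpus_words, min_count=2):
    """Return the set of characters seen exactly doubled in the corpus.

    min_count is an assumed noise threshold: a character must be seen
    doubled in at least that many places before it is trusted.
    """
    counts = Counter()
    for word in corpus_words:
        # Find every run of two or more identical characters.
        for match in re.finditer(r"(.)\1+", word.lower()):
            if len(match.group(0)) == 2:
                counts[match.group(1)] += 1
    return {ch for ch, n in counts.items() if n >= min_count}


def reduce_repeats(token, allowed_doubles):
    """Collapse runs of 3+ identical characters.

    Runs are reduced to two characters if the character may double in
    the language, otherwise to one. Runs of exactly two are left alone.
    """
    def collapse(match):
        ch = match.group(1)
        return ch * 2 if ch.lower() in allowed_doubles else ch

    return re.sub(r"(.)\1{2,}", collapse, token)


# Toy corpus standing in for a real, clean word list (assumption).
corpus = ["letter", "better", "good", "food", "happy", "apple", "see", "tree"]
allowed = learn_doubles(corpus)

print(reduce_repeats("goooood", allowed))  # good
print(reduce_repeats("heyyyy", allowed))   # hey
```

This sketch cannot tell whether a reduced run should end up single or double when both are possible for that character (for example "sooo" becomes "soo" rather than "so"), so in practice the output would probably need to be checked against a word list or ranked by a language model.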